Subject: Re: [virtio-dev] [PATCH v4] content: Introduce VIRTIO_NET_F_STANDBY feature
On Mon, Nov 26, 2018 at 12:22:56PM -0800, Samudrala, Sridhar wrote:

> Update: I have just set the vf mac address to 0 (ip link set ens2f0 vf 1 mac 00:00:00:00:00:00) after unplugging it (the primary device), and the pings started working again on the failover interface. So it seems like the frames were arriving at the VF on the host.

Yes. When the VF is unplugged, you need to reset the VF's MAC so that the packets with the VM's MAC start flowing via the VF, bridge and the virtio interface.

Have you looked at this documentation that shows a sample script to initiate live migration?
https://www.kernel.org/doc/html/latest/networking/net_failover.html

-Sridhar

On 11/28/2018 9:08 AM, Michael S. Tsirkin wrote:

Interesting, I didn't notice it does this. So in fact just defining the VF MAC will immediately divert packets to the VF? Given that the guest driver has not initialized the VF yet, won't a bunch of packets be dropped?

On Wed, Nov 28, 2018 at 09:31:32AM -0800, Samudrala, Sridhar wrote:

There is a typo in my statement above (VF -> PF): when the VF is unplugged, you need to reset the VF's MAC so that the packets with the VM's MAC start flowing via the PF, bridge and the virtio interface.

When the VF is plugged in, ideally the MAC filter for the VF should be added to the HW only once the guest driver comes up and can receive packets. Currently with the Intel drivers, the filter gets added to the HW as soon as the host admin sets the VF's MAC via the ndo_set_vf_mac() API. So potentially there could be packet drops until the VF driver comes up in the VM.

On 11/28/2018 9:35 AM, Michael S. Tsirkin wrote:

Can this be fixed in the intel drivers?

On Wed, Nov 28, 2018 at 10:39:55AM -0800, Samudrala, Sridhar wrote:

I just checked, and this seems to have been addressed in the ice 100Gb driver. Will bring this issue up internally to see if we can change this behavior in the i40e/ixgbe drivers.

On 11/28/2018 12:06 PM, Michael S. Tsirkin wrote:

Also, what happens if the MAC is programmed both in the PF (e.g. with macvtap) and in the VF? Ideally the VF will take precedence.

On Wed, Nov 28, 2018 at 12:28:42PM -0800, si-wei liu wrote:

I'm seriously doubtful that legacy Intel NIC hardware can do that, short of mucking around with a software workaround in the PF driver. Actually, the same applies to other NIC vendors when the hardware sees duplicate filters: there's no such control of precedence of one over the other.

-Siwei

On 11/28/2018 5:15 PM, Michael S. Tsirkin wrote:

Well, removing a MAC from the PF filter when we are adding it to the VF filter should always be possible. Need to keep it in a separate list and re-add it when removing the MAC from the VF filter. This can be handled in the net core, no need for driver-specific hacks.

On Thu, Nov 29, 2018 at 12:14:46PM -0800, si-wei liu wrote:

So that is what I said before - essentially what you need is a netdev API, rather than adding dirty hacks to each driver. That is fine, but how would you implement it? Note there's no equivalent driver-level .ndo API to "move" filters, and all the existing .ndo APIs manipulate at the MAC address level as opposed to the filter level. Are you going to convince netdev that this is the right thing to do and that we should add such an API to the net core and to each individual driver?

On 11/29/2018 1:17 PM, Michael S. Tsirkin wrote:

There's no need for a new API IMO. You drop it from the list of uc macs, then call .ndo_set_rx_mode.

On 11/29/2018 2:53 PM, si-wei liu wrote:

Then you still need a new netlink API - effectively it alters the running state of macvtap, as it steals certain filters out from the NIC that affect the datapath of macvtap. I assume we are talking about a kernel mechanism that does automatic datapath switching without involving the userspace management stack/orchestration software. In the kernel's (net core's) view that also needs some weak binding/coordination between the VF and the macvtap as to which MAC filter needs to be activated. This still looks to me like a new API, rather than a tweak to the current, long-existing default behavior that works transparently just for this case. Otherwise, without introducing a new API, how does userspace infer that the running kernel supports this new behavior?

On 11/29/2018 3:53 PM, Samudrala, Sridhar wrote:

In the case of virtio backed by macvtap, you can change the MAC address of the macvtap interface. When the VF is plugged in, change macvtap's MAC to an unassigned MAC and bring the virtio link down. When the VF is unplugged, set macvtap's MAC to the VM's MAC and bring the virtio link up.

On 11/29/2018 4:24 PM, si-wei liu wrote:

This needs management software to orchestrate, right? What MST and I are discussing is how to do this switching automatically, without involving management software.

On 11/29/2018 07:08 PM, Samudrala, Sridhar wrote:

Yes. Isn't that a good option, as live migration is initiated and orchestrated via mgmt software anyway?

OK, I agree that it would be nice if we could do all this automatically via QEMU when the orchestration software initiates live migration, rather than the mgmt software having to do some pre- and post-migration steps. It may be possible to do these pre- and post-migration steps in QEMU via a netlink API to the kernel to update the MAC addresses, as we are now associating the primary and standby interfaces.

si-wei liu wrote:

The motivation is to reduce the downtime to zero, to get on par with Hyper-V, or maybe even better. But you won't be able to achieve that if datapath switching is initiated from userspace via mgmt software.

The number one blocker for that approach right now is: can the Intel ixgbe and i40e drivers be fixed to defer adding the MAC filter to the NIC until the VF is up and running in the guest? Particularly, we'd limit the fix to the PF side only, with the VF driver intact, using the existing mailbox or adminq interface.
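MST's bookkeeping suggestion in the thread above — drop the MAC from the device's unicast list when the VF claims it, stash it on a side list, then re-add it when the MAC leaves the VF filter — can be sketched as a small userspace model. This is plain C with invented structure and function names, not the actual net core API; `set_rx_mode()` stands in for the kernel's `.ndo_set_rx_mode` resync:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MACS 8
#define MAC_LEN  6

struct mac_list {
    unsigned char addr[MAX_MACS][MAC_LEN];
    int count;
};

static bool list_has(const struct mac_list *l, const unsigned char *mac)
{
    for (int i = 0; i < l->count; i++)
        if (!memcmp(l->addr[i], mac, MAC_LEN))
            return true;
    return false;
}

static void list_add(struct mac_list *l, const unsigned char *mac)
{
    if (!list_has(l, mac) && l->count < MAX_MACS)
        memcpy(l->addr[l->count++], mac, MAC_LEN);
}

static void list_del(struct mac_list *l, const unsigned char *mac)
{
    for (int i = 0; i < l->count; i++) {
        if (!memcmp(l->addr[i], mac, MAC_LEN)) {
            /* swap in the last entry; memmove tolerates i == last */
            memmove(l->addr[i], l->addr[--l->count], MAC_LEN);
            return;
        }
    }
}

struct pf_dev {
    struct mac_list uc;      /* active unicast filters programmed in HW */
    struct mac_list stashed; /* filters temporarily ceded to the VF     */
};

/* Stand-in for .ndo_set_rx_mode: here "HW state" is just dev->uc. */
static void set_rx_mode(struct pf_dev *dev) { (void)dev; }

/* VF comes up with the guest MAC: steal the filter from the PF side. */
static void vf_claim_mac(struct pf_dev *dev, const unsigned char *mac)
{
    if (list_has(&dev->uc, mac)) {
        list_del(&dev->uc, mac);
        list_add(&dev->stashed, mac);
        set_rx_mode(dev);
    }
}

/* VF is unplugged: give the filter back to the PF. */
static void vf_release_mac(struct pf_dev *dev, const unsigned char *mac)
{
    if (list_has(&dev->stashed, mac)) {
        list_del(&dev->stashed, mac);
        list_add(&dev->uc, mac);
        set_rx_mode(dev);
    }
}
```

The side list is exactly the "keep it in a separate list and re-add it" step: without it, the PF would forget the MAC entirely once the VF claims it and failover back to the PF/macvtap path would break.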
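The fix si-wei asks of the Intel PF drivers amounts to splitting the effect of `ndo_set_vf_mac()` in two: record the admin-assigned MAC immediately, but program the HW filter only when the guest VF driver reports up over the PF/VF mailbox. A hedged userspace model of that state machine — all names here are illustrative, not the actual ixgbe/i40e code:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAC_LEN 6

struct vf_state {
    unsigned char mac[MAC_LEN];
    bool mac_set;       /* host admin has assigned a MAC             */
    bool driver_up;     /* guest VF driver signalled up via mailbox  */
    bool hw_filter_on;  /* MAC filter actually programmed in the HW  */
};

/* Host admin path (ip link set ... vf N mac ...): record only,
 * instead of programming the HW filter right away as today. */
static void set_vf_mac_deferred(struct vf_state *vf, const unsigned char *mac)
{
    memcpy(vf->mac, mac, MAC_LEN);
    vf->mac_set = true;
    if (vf->driver_up)            /* VF already running: program now */
        vf->hw_filter_on = true;
}

/* Mailbox event: guest VF driver is up, program the deferred filter. */
static void vf_mailbox_up(struct vf_state *vf)
{
    vf->driver_up = true;
    if (vf->mac_set)
        vf->hw_filter_on = true;
}

/* VF unplugged or reset: drop the HW filter so traffic with the
 * guest MAC falls back to the PF/macvtap (standby) path. */
static void vf_mailbox_down(struct vf_state *vf)
{
    vf->driver_up = false;
    vf->hw_filter_on = false;
}
```

The point of the deferral is visible in the model: between `set_vf_mac_deferred()` and `vf_mailbox_up()` the filter stays off, so there is no window where the NIC steers packets to a VF that no driver is servicing yet — the packet-drop window Sridhar describes for the current i40e/ixgbe behavior.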