virtio-dev message

Subject: Re: [virtio-dev] [PATCH v4] content: Introduce VIRTIO_NET_F_STANDBY feature

From: si-wei liu <si-wei.liu@oracle.com>
To: "Samudrala, Sridhar" <sridhar.samudrala@intel.com>, "Michael S. Tsirkin" <mst@redhat.com>
Date: Thu, 29 Nov 2018 20:46:14 -0800



On 11/29/2018 07:08 PM, Samudrala, Sridhar wrote:

On 11/29/2018 4:24 PM, si-wei liu wrote:
On 11/29/2018 3:53 PM, Samudrala, Sridhar wrote:
On 11/29/2018 2:53 PM, si-wei liu wrote:
On 11/29/2018 1:17 PM, Michael S. Tsirkin wrote:
On Thu, Nov 29, 2018 at 12:14:46PM -0800, si-wei liu wrote:
On 11/28/2018 5:15 PM, Michael S. Tsirkin wrote:
On Wed, Nov 28, 2018 at 12:28:42PM -0800, si-wei liu wrote:
On 11/28/2018 12:06 PM, Michael S. Tsirkin wrote:
On Wed, Nov 28, 2018 at 10:39:55AM -0800, Samudrala, Sridhar wrote:
On 11/28/2018 9:35 AM, Michael S. Tsirkin wrote:
On Wed, Nov 28, 2018 at 09:31:32AM -0800, Samudrala, Sridhar wrote:
On 11/28/2018 9:08 AM, Michael S. Tsirkin wrote:
On Mon, Nov 26, 2018 at 12:22:56PM -0800, Samudrala, Sridhar wrote:
Update:
I have just set the VF's MAC address to 0 (ip link set ens2f0 vf 1 mac 00:00:00:00:00:00) after unplugging it (the primary device) and the pings started working again on the failover interface. So it seems
like the frames were arriving at the VF on the host.
Yes. When the VF is unplugged, you need to reset the VF's MAC so that the packets with the VM's MAC start flowing via VF, bridge and the virtio interface.
Have you looked at this documentation that shows a sample script to initiate live
migration?
https://www.kernel.org/doc/html/latest/networking/net_failover.html
-Sridhar
Interesting I didn't notice it does this. So in fact
just defining VF mac will immediately divert packets
to the VF? Given guest driver did not initialize VF
yet won't a bunch of packets be dropped?
There is a typo in my statement above (VF->PF).
When the VF is unplugged, you need to reset the VF's MAC so that the packets with the VM's MAC start flowing via PF, bridge and the virtio interface.
When the VF is plugged in, ideally the MAC filter for the VF should be added to the HW once the guest driver comes up and can receive packets. Currently with Intel drivers, the filter gets added to HW as soon as the host admin sets the VF's MAC via the ndo_set_vf_mac() API. So potentially there could be packet drops until the VF driver
comes up in the VM.
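
To make the drop window concrete, here is a minimal sketch that replays a timeline of events and counts packets lost to the VF filter. It is a toy user-space model under stated assumptions, not driver code: the event names and the `count_drops` function are hypothetical, and "dropped" simply means the filter steered a packet to a VF whose guest driver was not yet up.

```python
def count_drops(events, defer_filter):
    """Count packets steered to a VF that cannot receive them yet.

    events: ordered list of "set_vf_mac", "vf_driver_up" or "packet"
            (hypothetical event names, for illustration only).
    defer_filter: if True, the MAC filter is only activated once the
            guest VF driver is up; if False, it is activated as soon
            as the host admin sets the VF MAC.
    """
    mac_set = False
    driver_up = False
    dropped = 0
    for ev in events:
        if ev == "set_vf_mac":
            mac_set = True
        elif ev == "vf_driver_up":
            driver_up = True
        elif ev == "packet":
            # The filter is live if the MAC was set and either we do not
            # defer, or the guest driver has already come up.
            filter_active = mac_set and (driver_up or not defer_filter)
            if filter_active and not driver_up:
                dropped += 1
        else:
            raise ValueError(f"unknown event: {ev}")
    return dropped


timeline = ["packet", "set_vf_mac", "packet", "packet", "vf_driver_up", "packet"]
eager = count_drops(timeline, defer_filter=False)
deferred = count_drops(timeline, defer_filter=True)
```

In this model the two packets that arrive between `set_vf_mac` and `vf_driver_up` are lost when the filter is added eagerly, and none are lost when it is deferred.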
Can this be fixed in the intel drivers?
I just checked and it looks like this has been addressed in the ice 100Gb driver. Will bring this issue up internally to see if we can change this
behavior in i40e/ixgbe drivers.
Also what happens if the mac is programmed both in PF (e.g. with
macvtap) and VF? Ideally VF will take precedence.
I'm seriously doubtful that legacy Intel NIC hardware can do that, as opposed to mucking around with a software workaround in the PF driver. Actually, the same applies to other NIC vendors when hardware sees duplicate filters. There's
no such control of precedence of one over the other.


-Siwei
Well, removing a MAC from the PF filter when we are adding it to the VF filter should always be possible. Need to keep it in a separate list and re-add it when removing the MAC from the VF filter. This can be handled in
the net core, no need for driver specific hacks.
So that is what I said all along: essentially what you need is a netdev API, rather than adding dirty hacks to each driver. That is fine, but how would you implement it? Note there's no equivalent driver level .ndo API to "move" filters, and all existing .ndo APIs operate at the MAC address level as opposed to filters. Are you going to convince netdev this is the right thing to do and we should add such an API to the net core and each individual driver?
There's no need for a new API IMO.
You drop it from list of uc macs, then call .ndo_set_rx_mode.
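
The bookkeeping discussed here (drop the MAC from the PF's unicast list when a VF claims it, keep it on a separate list, re-add it when the VF filter goes away) can be sketched as a small model. This is a user-space Python illustration only; the class and method names are hypothetical, and the `rx_mode_syncs` counter merely stands in for the points where a real implementation would call `.ndo_set_rx_mode`.

```python
class PfFilterModel:
    """Toy model of a PF unicast MAC list plus per-VF MAC filters."""

    def __init__(self):
        self.uc_macs = set()    # MACs currently programmed on the PF
        self.parked = set()     # MACs taken off the PF while a VF owns them
        self.vf_macs = {}       # VF index -> MAC assigned to that VF
        self.rx_mode_syncs = 0  # stand-in for .ndo_set_rx_mode invocations

    def _sync_rx_mode(self):
        self.rx_mode_syncs += 1

    def add_pf_mac(self, mac):
        mac = mac.lower()
        if mac in self.vf_macs.values():
            # A VF already owns this MAC: remember it, but do not program it.
            self.parked.add(mac)
        else:
            self.uc_macs.add(mac)
            self._sync_rx_mode()

    def set_vf_mac(self, vf, mac):
        mac = mac.lower()
        self.clear_vf_mac(vf)
        if mac in self.uc_macs:
            # Move the MAC from the PF list to the parked list.
            self.uc_macs.discard(mac)
            self.parked.add(mac)
            self._sync_rx_mode()
        self.vf_macs[vf] = mac

    def clear_vf_mac(self, vf):
        mac = self.vf_macs.pop(vf, None)
        if mac is None:
            return
        if mac in self.parked and mac not in self.vf_macs.values():
            # No VF owns the MAC any more: restore it on the PF.
            self.parked.discard(mac)
            self.uc_macs.add(mac)
            self._sync_rx_mode()

    def receiver(self, mac):
        """Return which side would receive frames for this MAC, if any."""
        mac = mac.lower()
        for vf, vf_mac in self.vf_macs.items():
            if vf_mac == mac:
                return f"vf{vf}"
        return "pf" if mac in self.uc_macs else None


model = PfFilterModel()
vm_mac = "52:54:00:aa:bb:cc"   # hypothetical VM MAC
model.add_pf_mac(vm_mac)       # e.g. programmed on behalf of macvtap
model.set_vf_mac(1, vm_mac)    # VF plugged in: the VF takes precedence
owner_while_plugged = model.receiver(vm_mac)
model.clear_vf_mac(1)          # VF unplugged: the MAC returns to the PF
owner_after_unplug = model.receiver(vm_mac)
```

The model never holds the same MAC on both the PF and a VF at once, so it sidesteps the question of which duplicate hardware filter would win.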
Then you still need a new netlink API: effectively it alters the running state of macvtap, as it steals certain filters from the NIC, which affects the datapath of macvtap. I assume we are talking about some kernel mechanism to do automatic datapath switching without involving the userspace management stack/orchestration software. In the kernel's (net core's) view that also needs some weak binding/coordination between the VF and the macvtap over which MAC filter needs to be activated. Still, this looks to me like a new API rather than tweaking the current and long-existing default behavior and making it work transparently just for this case. Otherwise, without introducing a new API, how does the userspace infer that the running kernel supports this new behavior?
In case of virtio backed by macvtap, you can change the MAC address of the macvtap interface. When VF is plugged in, change macvtap's MAC to an unassigned MAC and bring
the virtio link down.
When VF is unplugged, set macvtap's MAC to the VM's MAC and bring up the virtio link.
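
The plug/unplug sequence just described can be sketched as two ordered step lists. This is a minimal illustration, assuming a macvtap device named `macvtap0` and a locally administered placeholder MAC, both hypothetical. Only the `ip link set dev ... address ...` form is a real command here; the `VIRTIO_LINK` entries are symbolic placeholders for however the management layer toggles the virtio link.

```python
def plan_vf_plug(macvtap_dev, unassigned_mac):
    """Steps when the VF is plugged in: park macvtap on an unused MAC
    so it stops claiming the VM's traffic, then take the virtio link down."""
    return [
        ["ip", "link", "set", "dev", macvtap_dev, "address", unassigned_mac],
        ["VIRTIO_LINK", "down"],  # symbolic step, not a real command
    ]


def plan_vf_unplug(macvtap_dev, vm_mac):
    """Steps when the VF is unplugged: give macvtap the VM's MAC back,
    then bring the virtio link up."""
    return [
        ["ip", "link", "set", "dev", macvtap_dev, "address", vm_mac],
        ["VIRTIO_LINK", "up"],    # symbolic step, not a real command
    ]


plug_steps = plan_vf_plug("macvtap0", "02:00:00:00:00:01")
unplug_steps = plan_vf_unplug("macvtap0", "52:54:00:aa:bb:cc")
```

Whoever executes these steps (management software or QEMU) is left open, which is the point under discussion in the thread.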
This needs management software to orchestrate, right?
Yes. Isn't that a good option, as live migration is initiated and orchestrated via mgmt. software?

The motivation is to reduce the down time to zero to get on par with Hyper-V. Or maybe even better. But you won't be able to achieve that if datapath switching is initiated from userspace via mgmt software.

What MST and I are discussing is how to do this switching automatically without involving management software.
OK. I agree that it would be nice if we could do all this automatically via Qemu when the orchestration sw initiates live migration, rather than the mgmt. sw having to do some pre and post migration steps. It may be possible to do these pre and post migration steps in qemu via the netlink API to the kernel to update the MAC addresses, as we are now associating the primary and standby interfaces.

The number one blocker for that approach now is whether the Intel ixgbe and i40e drivers can be fixed to defer adding the MAC filter to the NIC until the VF is up and running in the guest. In particular, we'd limit the fix to the PF side only, with the VF driver intact, using the existing mailbox or adminq interface.


Thanks,
-Siwei
