Subject: RE: [virtio-comment] [PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, September 11, 2023 12:01 PM
> 
> On Mon, Sep 11, 2023 at 12:12 PM Parav Pandit <parav@nvidia.com> wrote:
> >
> > Hi Michael,
> >
> > > From: virtio-comment@lists.oasis-open.org
> > > <virtio-comment@lists.oasis-open.org> On Behalf Of Jason Wang
> > > Sent: Monday, September 11, 2023 8:31 AM
> > >
> > > On Wed, Sep 6, 2023 at 4:33 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > >
> > > > On Wed, Sep 06, 2023 at 04:16:37PM +0800, Zhu Lingshan wrote:
> > > > > This patch adds two new le16 fields to the common configuration
> > > > > structure to support VIRTIO_F_QUEUE_STATE in the PCI transport layer.
> > > > >
> > > > > Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
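For context, the change described above would extend the standard common
configuration layout along these lines. This is a minimal sketch: the
two new field names are illustrative assumptions, not names taken from
the patch itself.

struct virtio_pci_common_cfg {
        /* About the whole device. */
        le32 device_feature_select;     /* read-write */
        le32 device_feature;            /* read-only for driver */
        le32 driver_feature_select;     /* read-write */
        le32 driver_feature;            /* read-write */
        le16 msix_config;               /* read-write */
        le16 num_queues;                /* read-only for driver */
        u8 device_status;               /* read-write */
        u8 config_generation;           /* read-only for driver */

        /* About a specific virtqueue. */
        le16 queue_select;              /* read-write */
        le16 queue_size;                /* read-write */
        le16 queue_msix_vector;         /* read-write */
        le16 queue_enable;              /* read-write */
        le16 queue_notify_off;          /* read-only for driver */
        le64 queue_desc;                /* read-write */
        le64 queue_driver;              /* read-write */
        le64 queue_device;              /* read-write */

        /* New for VIRTIO_F_QUEUE_STATE (illustrative names): state of
         * the currently selected virtqueue, accessed while the device
         * is suspended. */
        le16 queue_avail_state;         /* e.g. next available index */
        le16 queue_used_state;          /* e.g. next used index */
};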
> > > >
> > > >
> > > > I do not see why this would be PCI-specific at all.
> > >
> > > This is the PCI interface for live migration. The facility is not specific to PCI.
> > >
> > > It can choose to reuse the common configuration or not, but the
> > > semantics are general enough to be used by other transports. We can
> > > introduce one for MMIO for sure.
> > >
> > > >
> > > > But besides, I thought work on live migration would use the admin queue.
> > > > This was explicitly one of the motivators.
> > >
> > Please find the proposal that uses administration commands for device
> > migration at [1] for passthrough devices.
> >
> > [1] https://lists.oasis-open.org/archives/virtio-comment/202309/msg00061.html
> 
> This proposal couples live migration with several requirements, and suffers from
> the exact issues I've mentioned below.
>
It does not.
Can you please list which ones?
 
> In some cases, it's even worse (coupling with PCI/SR-IOV, a second state
> machine besides the device status).
> 
There is no state machine in [1].
It is not coupled with PCI/SR-IOV either.
It supports the PCI/SR-IOV transport and, in the future, other transports as they evolve.

> >
> > > I think not. Using an admin virtqueue will end up with several problems:
> > >
> > > 1) the feature is not self contained so at the end we need transport
> > > specific facility to migrate the admin virtqueue
> >
> > You mixed things up.
> > The admin queue of the owner device is not migrated.
>
If you actually read further, it is about migrating the member device, not the owner.
Hence, the owner device's admin queue is not migrated.
 
> Why not? Ling Shan's proposal makes everything work, including migrating the
> owner, or the case where there is no owner at all.
> 
I don't see how his proposal achieves all of the supported features and functionality.

> In this proposal, the facility (suspending, queue state, inflight
> descriptors) is decoupled from the transport-specific API. Each transport can
> implement one or more types of interfaces. An MMIO-based interface is
> proposed, but it doesn't prevent you from adding admin commands for those
> facilities on top.
>
Even in proposal [1], most things are transport agnostic.
The member device proposal already covers several aspects: downtime, peer-to-peer support, dirty page tracking, efficient querying of VQ state, and more.
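To make the decoupling being argued about concrete, here is a
hypothetical sketch of a transport-agnostic migration facility; every
name in it is an assumption for illustration, and nothing below is
taken verbatim from either proposal.

#include <stddef.h>
#include <stdint.h>

/* Per-virtqueue state to save on suspend and restore on resume. */
struct vq_state {
        uint16_t avail_idx;     /* next available index to process */
        uint16_t used_idx;      /* next used index to be written */
};

/* Transport-agnostic migration facility; hypothetical names only. */
struct virtio_migration_ops {
        int (*suspend)(void *dev);
        int (*resume)(void *dev);
        int (*get_vq_state)(void *dev, uint16_t vq, struct vq_state *st);
        int (*set_vq_state)(void *dev, uint16_t vq,
                            const struct vq_state *st);
        int (*get_inflight)(void *dev, uint16_t vq, void *buf, size_t len);
};

A PCI transport could back these operations with common-configuration
registers, an MMIO transport with MMIO registers, and an owner device
could back them with administration commands on behalf of its member
devices.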


> > Admin queue of the member device is migrated like any other queue using
> > above [1].
> >
> > > 2) won't work in the nested environment, or we need complicated
> > > SR-IOV emulation in order to work
> > >
> > > >
> > > > Poking at the device from the driver to migrate it is not going to
> > > > work if the driver lives within the guest.
> > >
> > > This is by design, to allow live migration to work in the nested layer.
> > > And it's the approach we've used for CPU and MMU. Is anything
> > > different for virtio here?
> >
> > Nested and non-nested use cases likely cannot be addressed by a single
> > solution/interface.
> 
> I think Ling Shan's proposal addressed them both.
>
I don't see how all of the above points are covered.

 
> > So both are orthogonal requirements to me.
> >
> > One can define some administration commands to issue on the AQ of the
> > member device itself for the nested case.
> 
> This is not easy: DMA needs to be isolated, so you need to either
> emulate SR-IOV and use an AQ on a virtual PF in the guest, or use PASID.
>
This is why nested and non-nested cannot be treated equally, and I don't see all of this covered in Ling Shan's proposal either.
For the passthrough device use case, [1] covers the necessary pieces.

> Customers don't want to have admin stuff, SR-IOV, or PASID in the guest in
> order to migrate a single virtio device in the nested case.

As proposed in [1], for passthrough devices in the non-nested case, no customer needs to do SR-IOV or PASID in the guest.

Nested is a special case and likely needs a mediation-based scheme using administration commands.
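For reference, administration commands share the generic envelope below
(per the virtio specification's administration virtqueue support); any
migration-specific opcodes a mediation-based scheme defines would be
carried inside this envelope and are not shown here, since they are not
part of the current specification.

struct virtio_admin_cmd {
        /* Device-readable part */
        le16 opcode;                    /* command to execute */
        le16 group_type;                /* e.g. 1 = SR-IOV group */
        u8 reserved1[12];
        le64 group_member_id;           /* which member device to address */
        u8 command_specific_data[];

        /* Device-writable part */
        le16 status;                    /* command result */
        le16 status_qualifier;
        u8 reserved2[4];
        u8 command_specific_result[];
};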

In the best case, we can produce common commands, if that fits.
Otherwise, the two proposals are orthogonal, addressing different use cases.

