

Subject: Re: [virtio-comment] [PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE




On 9/11/2023 2:47 PM, Parav Pandit wrote:

From: Jason Wang <jasowang@redhat.com>
Sent: Monday, September 11, 2023 12:01 PM

On Mon, Sep 11, 2023 at 12:12 PM Parav Pandit <parav@nvidia.com> wrote:
Hi Michael,

From: virtio-comment@lists.oasis-open.org
<virtio-comment@lists.oasis-open.org> On Behalf Of Jason Wang
Sent: Monday, September 11, 2023 8:31 AM

On Wed, Sep 6, 2023 at 4:33 PM Michael S. Tsirkin <mst@redhat.com>
wrote:
On Wed, Sep 06, 2023 at 04:16:37PM +0800, Zhu Lingshan wrote:
This patch adds two new le16 fields to the common configuration
structure to support VIRTIO_F_QUEUE_STATE in the PCI transport layer.

Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
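
For context, the change under discussion would be on the order of the
following sketch (field names are placeholders, not necessarily the ones in
the patch; le16 is the spec's little-endian 16-bit notation): two per-queue
le16 fields appended to the PCI common configuration structure, meaningful
once VIRTIO_F_QUEUE_STATE is negotiated.

struct virtio_pci_common_cfg {
        /* ... existing fields (device_feature_select, device_feature,
         * queue_select, queue_size, queue_enable, ...) unchanged ... */

        /* New (illustrative names): state of the queue selected by
         * queue_select, valid when VIRTIO_F_QUEUE_STATE is negotiated. */
        le16 queue_avail_state;   /* next index to read from the available ring */
        le16 queue_used_state;    /* next index to write to the used ring */
};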

I do not see why this would be pci specific at all.
This is the PCI interface for live migration. The facility is not specific to PCI.

It can choose to reuse the common configuration or not, but the
semantics are general enough to be used by other transports. We can
introduce one for MMIO for sure.

But besides, I thought work on live migration would use the admin queue.
This was explicitly one of the motivators.
Please find the proposal that uses administration commands for device
migration at [1] for passthrough devices.
[1]
https://lists.oasis-open.org/archives/virtio-comment/202309/msg00061.html
This proposal couples live migration with several requirements, and suffers from
the exact issues I've mentioned below.

It does not.
Can you please list which ones?
In some cases, it's even worse (coupling with PCI/SR-IOV, a second state
machine in addition to the device status).

There is no state machine in [1].
It is not coupled with PCI/SR-IOV either.
It supports the PCI/SR-IOV transport and, in the future, other transports as they evolve.

I think not. Using an admin virtqueue will end up with several problems:
1) the feature is not self-contained, so in the end we need a
transport-specific facility to migrate the admin virtqueue
You are mixing things up.
The admin queue of the owner device is not migrated.
If you read further, [1] is about member device migration, not owner migration.
Hence, the owner device's admin queue is not migrated.
Then how do you serve bare-metal migration? Does the device migrate itself?
Why not? Ling Shan's proposal makes everything work, including migrating the
owner, and even the case where there is no owner at all.

I don't see in his proposal how all of the supported features and functionality are achieved.
I will include the in-flight descriptor tracker and dirty-page tracking in V2; is anything else missing? It can migrate the device itself; why don't you think so? Can you name some issues we can work on
for improvements?

In this proposal, the facility (suspending, queue state, in-flight
descriptors) is decoupled from the transport-specific API. Each transport can
implement one or more types of interfaces. An MMIO-based interface is
proposed, but it doesn't prevent you from adding admin commands for those
facilities on top.
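
As a rough illustration of that decoupling (names and signatures below are
hypothetical, not taken from either proposal), the facility can be expressed
as transport-neutral operations that a PCI common-config, MMIO, or
admin-command interface could each back:

#include <stdint.h>

/* Illustrative only: a transport-neutral view of the migration facility. */
struct vq_state {
        uint16_t avail_idx;     /* next available-ring index to process */
        uint16_t used_idx;      /* next used-ring index to write */
};

struct virtio_migration_ops {
        /* Suspend/resume the device, e.g. via a status bit. */
        int (*suspend)(void *dev);
        int (*resume)(void *dev);
        /* Save/restore per-queue state while the device is suspended. */
        int (*get_vq_state)(void *dev, uint16_t vq, struct vq_state *st);
        int (*set_vq_state)(void *dev, uint16_t vq, const struct vq_state *st);
};

A PCI transport would back these ops with common-configuration registers, an
MMIO transport with MMIO registers, and an admin-command implementation could
forward them to owner-device commands.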

Even in proposal [1] most things are transport agnostic.
The member device proposal already covers several aspects: downtime, peer-to-peer, dirty page tracking, efficient querying of VQ state, and more.
If you want to implement LM via the admin vq, the facilities in my series can be reused, e.g., forward your suspend command to the SUSPEND bit.
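
A minimal sketch of that layering, assuming a hypothetical SUSPEND status bit
and a hypothetical admin opcode (neither value comes from the posted series):

#include <stdint.h>

#define VIRTIO_CONFIG_S_SUSPEND   0x40    /* hypothetical status bit */
#define VIRTIO_ADMIN_CMD_SUSPEND  0x0100  /* hypothetical admin opcode */

struct member_dev {
        uint8_t status;                   /* device status register */
};

struct admin_cmd {
        uint16_t opcode;
        uint16_t member_id;               /* member device to act on */
};

static struct member_dev members[4];      /* toy member-device table */

/* Device-side handling: the admin command simply reuses the same suspend
 * facility a driver would otherwise reach through its own transport. */
int handle_admin_cmd(const struct admin_cmd *cmd)
{
        if (cmd->member_id >= 4)
                return -1;
        if (cmd->opcode != VIRTIO_ADMIN_CMD_SUSPEND)
                return -1;                /* unsupported command */
        members[cmd->member_id].status |= VIRTIO_CONFIG_S_SUSPEND;
        return 0;
}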


The admin queue of the member device is migrated like any other queue using
[1] above.
2) it won't work in the nested environment, or we need complicated
SR-IOV emulation in order to make it work

Poking at the device from the driver to migrate it is not going to
work if the driver lives within the guest.
This is by design to allow live migration to work in the nested layer.
And it's the way we've done it for CPU and MMU. Is virtio any
different here?
Nested and non-nested use cases likely cannot be addressed by a single
solution/interface.

I think Ling Shan's proposal addressed them both.

I don't see how all the above points are covered.
Why?


And how do you migrate nested VMs by admin vq?

How many admin vqs, and how much bandwidth, are
reserved for migrating all the VMs?

Remember, a CSP migrates all VMs on a host for power saving or upgrades.

So both are orthogonal requirements to me.

One can define some administration commands to issue on the AQ of the
member device itself for the nested case.

This is not easy: DMA needs to be isolated, so you need to either
emulate SR-IOV and use an AQ on a virtual PF in the guest, or use PASID.

This is why nested and non-nested cannot be treated equally, and I don't see this all covered in Ling Shan's proposal either.
For the passthrough device use case, [1] covers the necessary pieces.

Customers don't want to have admin stuff, SR-IOV, or PASID in the guest in order
to migrate a single virtio device in the nested case.
As proposed in [1], for passthrough devices no customer needs SR-IOV or PASID in the guest for the non-nested case.

Nested is a special case and likely needs a mediation-based scheme using administration commands.

In the best case we can produce common commands, if that fits.
Otherwise the two proposals are orthogonal, addressing different use cases.


