Subject: Re: [virtio-comment] [PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE




On 9/12/2023 1:58 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Tuesday, September 12, 2023 9:37 AM

On 9/11/2023 6:21 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Monday, September 11, 2023 3:03 PM
So implement AQ on the "admin" VF? This requires the HW to reserve dedicated resources for every VF?
So expensive, overkill?

And a VF may be managed by the PF and its admin "vf"?
Yes.
It's a bit chaotic: as you can see, if the nested (L2 guest) VF can be managed by both the L1 guest VF and the host PF, that means two owners of the L2 VF.
This is the nesting.
When you do M-level nesting, does any CPU in the world handle its own page tables in isolation from the next level and also perform equally well?
Not exactly. In nesting, the L1 guest is the host/infrastructure emulator for L2, so L2 is expected to have nothing to do with the host; wouldn't an L2 VF managed by both the L1 VF and the host PF lead to operational and security issues?

If UDP packets are dropped, even an application that does not retry can fail.
UDP is not reliable, and performance overhead does not mean failure.
It largely depends on the application.
I have seen iperf over UDP fail on packet drops and never recover.
A retransmission over UDP can fail.
That depends on the workload; if it chooses UDP, it is aware of the possibility of losing packets. But anyway, LM is expected to complete successfully within the due time.
And LM also depends on the workload. :)
Exactly! That's the point: how to meet the requirements!
It is pointless to discuss performance characteristics as an argument for or against using AQ.
How do we meet the QoS requirement during LM?

But too few AQs serving too high a volume of VMs may be a problem.
It is left to the device to implement the needed scale.
Yes, so how many HW resources should the HW implementation reserve to serve the worst case? Half of the board's resources?
The board designer can decide how to manage the resources.
Administration commands are explicit instructions to the device.
It knows how many member devices' dirty tracking is ongoing and which device context is being read/written.
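(For illustration only: a rough C sketch of the kind of command header that makes this targeting explicit. The field names approximate the spec's admin command layout from memory and should not be read as authoritative.)

    #include <stdint.h>

    /* Each admin command names the group and the member (VF) it
     * operates on, so the owner device always knows exactly which
     * member's dirty tracking or device context a command refers to. */
    struct admin_cmd_hdr {
        uint16_t opcode;          /* e.g. start dirty tracking,
                                     read device context */
        uint16_t group_type;      /* e.g. the SR-IOV group */
        uint8_t  reserved[12];
        uint64_t group_member_id; /* which member device is targeted */
    };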
Still, does the board designer need to prepare for the worst case? How does it meet that challenge?
No, the board designer does not need to.
As explained already, if a board wants to support only a single AQ command at a time, sure.
Same as above, the QoS question. For example, how do we avoid the situation where half the VMs can be migrated and the others time out?

An admin command can even fail with an EAGAIN error code when the device is out of resources, and software can retry the command.
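(A minimal sketch of that retry pattern; submit_admin_cmd() and its return convention are hypothetical here, not a real driver API.)

    #include <errno.h>
    #include <unistd.h>

    int submit_admin_cmd(void *cmd); /* hypothetical driver helper */

    static int submit_with_retry(void *cmd, int max_tries)
    {
        for (int i = 0; i < max_tries; i++) {
            int rc = submit_admin_cmd(cmd);
            if (rc != -EAGAIN)
                return rc;  /* success, or an error a retry won't fix */
            usleep(1000);   /* device out of resources: back off, retry */
        }
        return -ETIMEDOUT;
    }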
As demonstrated, this series is as reliable as the config space functionality, so maybe fewer possibilities of failure?
Huh. Config space has a far higher failure rate for the PCI transport due to the inherent nature of PCI timeouts, reads, and polling.
For any bulk data transfer, a virtqueue is the spec-defined approach.
This was debated for more than a year; you can check some 2021 emails.

You can see from the patches that the data transfer done in [1] over registers is snail slow.
Do you often observe virtio PCI config space failures? Or does the admin vq need to transfer data through PCI?
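(Back-of-envelope only, with assumed numbers, to show why a register-based transfer is slow: reading a device context through 4-byte config registers versus one bulk virtqueue transfer.)

    #include <stdio.h>

    int main(void)
    {
        /* All figures below are assumptions for illustration. */
        const double ctx_bytes  = 64 * 1024; /* assumed context size */
        const double reg_bytes  = 4;         /* one config register read */
        const double reg_rtt_us = 1.0;       /* assumed PCI read round trip */

        double reads = ctx_bytes / reg_bytes; /* 16384 reads */
        printf("register path: %.0f reads, ~%.1f ms\n",
               reads, reads * reg_rtt_us / 1000.0); /* ~16.4 ms */
        /* A virtqueue instead posts one descriptor and lets the device
         * DMA the whole context in a single bulk transfer. */
        return 0;
    }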

The key part is that all of this happens outside of the VM's downtime.
The majority of the work in proposal [1] is done while the VM is _live_.
Hence, the resource consumption or reservation is significantly less.
It still depends on the volume of VMs and devices; the orchestration layer needs to migrate the last round of dirty pages and state even after the VM has been suspended.
That has nothing to do with the admin virtqueue.
And the migration layer already does this; it is used by multiple devices.
Same as above: QoS.


Naming a number or an algorithm for the ratio of devices / num_of_AQs is beyond this topic, but I made my point clear.
Sure. It is beyond.
And it is not a concern either.
It is; the user expects the LM process to succeed rather than fail.
I still fail to understand why the LM process would fail.
The migration process is slow, but the downtime in [1] is not.
If I recall correctly, the downtime budget is around 300 ms, so don't let the bandwidth or the number of admin vqs become a bottleneck that may introduce more possibilities of failure.
Can a depth of 1K introduce significant latency?
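(Illustrative arithmetic only; apart from the ~300 ms budget mentioned above, every number below is an assumption.)

    #include <stdio.h>

    int main(void)
    {
        const double budget_ms   = 300.0; /* downtime budget from the thread */
        const double vms         = 1000;  /* assumed concurrently migrating VMs */
        const double cmds_per_vm = 4;     /* assumed final-round commands per VM */

        /* If a single AQ executed commands strictly serially and all VMs
         * were suspended at once, the whole final round would have to fit
         * in one downtime budget: */
        printf("serial budget per command: %.3f ms\n",
               budget_ms / (vms * cmds_per_vm)); /* 0.075 ms */
        /* This is why parallel command execution and/or multiple AQs
         * matter for the QoS question. */
        return 0;
    }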
AQ command execution is not done serially. There is enough text in the AQ chapter, as I recall.
Then it requires more HW resources; I don't see the difference.
Difference compared to what, multiple AQs?
If so, sure.
A device that prefers to execute only one AQ command at a time can work with fewer resources and do one at a time.
I think we are discussing the same "resources for the worst case" problem as above.
Frankly, I am not seeing any issue.
AQ is just another virtqueue, a basic construct in the spec used by 30+ device types.
As explained above, when migrating a VM, the migration has to converge and the total downtime has a deadline; I remember it is less than 300 ms. That is the QoS requirement.


