virtio-comment message

Subject: RE: [virtio-comment] [PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE

From: Parav Pandit <parav@nvidia.com>
To: "Zhu, Lingshan" <lingshan.zhu@intel.com>, Jason Wang <jasowang@redhat.com>
Date: Tue, 12 Sep 2023 05:58:02 +0000

> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> Sent: Tuesday, September 12, 2023 9:37 AM
> 
> On 9/11/2023 6:21 PM, Parav Pandit wrote:
> >> From: Zhu, Lingshan <lingshan.zhu@intel.com>
> >> Sent: Monday, September 11, 2023 3:03 PM
> >>
> >> So implement AQ on the "admin" VF? This require the HW reserve dedicated
> >> resource for every VF?
> >> So expensive, Overkill?
> >>
> >> And a VF may be managed by the PF and its admin "vf"?
> > Yes.
> it's a bit chaos, as you can see if the nested(L2 guest) VF can be managed by
> both L1 guest VF and the host PF, that means two owners of the L2 VF.
That is what nesting is.
With M levels of nesting, does any CPU in the world handle its own page tables in isolation from the next level and still perform equally well?

> >
> >>> If UDP packets are dropped, even application can fail who do no retry.
> >> UDP is not reliable, and performance overhead does not mean fail.
> > It largely depends on application.
> > I have seen iperf UDP failing on packet drop and never recovered.
> > A retransmission over UDP can fail.
> That depends on the workload, if it choose UDP, it is aware of the possibilities of
> losing packets. But anyway, LM are expected to perform successfully in the due
> time
And LM also depends on the workload. :)
It is pointless to use performance characteristics as an argument for or against using the AQ.

> >
> >>>> But too few AQ to serve too high volume of VMs may be a problem.
> >>> It is left for the device to implement the needed scale requirement.
> >> Yes, so how many HW resource should the HW implementation reserved to
> >> serve the worst case? Half of the board resource?
> > The board designer can decide how to manage the resource.
> > Administration commands are explicit instructions to the device.
> > It knows how many members device's dirty tracking is ongoing, which device
> > context is being read/written.
> Still, does the board designer need to prepare for the worst case? How to meet
> that challenge?
No, the board designer does not need to.
As already explained, if the board wants to support only a single AQ command at a time, that is fine.

> >
> > Admin command can even fail with EAGAIN error code when device is out of
> > resource and software can retry the command.
> As demonstrated, this series is reliable as the config space functionalities, so
> maybe less possibilities to fail?
Huh. Config space has a far higher failure rate on the PCI transport, due to the inherent nature of PCI timeouts, reads, and polling.
For any bulk data transfer, the virtqueue is the spec-defined approach.
This was debated for more than a year; you can check some of the 2021 emails.

You can see from the patches in [1] that data transfer done over registers is snail slow.
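
To illustrate the retry behavior mentioned in the quoted text above (an admin command failing with EAGAIN when the device is out of resources), here is a minimal driver-side sketch. It is only an illustration: AdminCommandError, submit_with_retry, FakeDevice, and the backoff parameters are hypothetical names and values, not taken from the spec or from any existing driver.

```python
import errno
import time


class AdminCommandError(Exception):
    """Hypothetical error carrying an errno-style status from the device."""

    def __init__(self, status):
        super().__init__(f"admin command failed with status {status}")
        self.status = status


def submit_with_retry(submit, max_attempts=5, base_delay=0.001, sleep=time.sleep):
    """Call submit() until it succeeds; retry only on EAGAIN.

    Any other status, or EAGAIN on the last allowed attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return submit()
        except AdminCommandError as err:
            if err.status != errno.EAGAIN or attempt == max_attempts - 1:
                raise
            # Exponential backoff gives the device time to free resources.
            sleep(base_delay * (2 ** attempt))


class FakeDevice:
    """Stand-in device (assumption): reports EAGAIN for the first busy_count calls."""

    def __init__(self, busy_count):
        self.busy_count = busy_count
        self.calls = 0

    def command(self):
        self.calls += 1
        if self.calls <= self.busy_count:
            raise AdminCommandError(errno.EAGAIN)
        return "ok"


if __name__ == "__main__":
    dev = FakeDevice(busy_count=2)
    result = submit_with_retry(dev.command)
    print(result, "after", dev.calls, "attempts")
```

The retry cap is a design choice in this sketch: a caller that keeps seeing EAGAIN gets the error back and can decide whether to wait longer or give up.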

> >
> > They key part is all of these happens outside of the VM's downtime.
> > Majority of the work in proposal [1] is done when the VM is _live_.
> > Hence, the resource consumption or reservation is significantly less.
> Still depends on the volume of VMs and devices, the orchestration layer needs
> to migrate the last round of dirty pages and states even when the VM has been
> suspended.
That has nothing to do with the admin virtqueue.
The migration layer already does this, and it is used by multiple devices.

> >
> >
> >>>> Naming a number or an algorithm for the ratio of devices /
> >>>> num_of_AQs is beyond this topic, but I made my point clear.
> >>> Sure. It is beyond.
> >>> And it is not a concern either.
> >> It is, the user expect the LM process success than fail.
> > I still fail to understand why LM process fails.
> > The migration process is slow, but downtime is not in [1].
> If I recall it clear, the downtime is around 300ms, so don't let the bandwidth or
> num of admin vqs become a bottle neck which may introduce more possibilities
> to fail.
> >
> >>>> can depth = 1K introduce significant latency?
> >>> AQ command execution is not done serially. There is enough text on
> >>> the AQ chapter as I recall.
> >> Then require more HW resource, I don't see difference.
> > Difference compared to what, multiple AQs?
> > If so, sure.
> > The device who prefers to do only one AQ command at a time, sure it can
> > work with less resource and do one at a time.
> I think we are discussing the same issue as above "resource for the worst case"
> problem
Frankly, I am not seeing any issue.
The AQ is just another virtqueue, a basic construct in the spec that is used by 30+ device types.
