Subject: Re: [virtio-comment] [PATCH 5/5] virtio-pci: implement VIRTIO_F_QUEUE_STATE
On 9/12/2023 2:47 PM, Parav Pandit wrote:
We still need to migrate the last round of dirty pages and the device states when the VM freezes. That can still be large if you take a big number of VMs into consideration, and that is where the ~300 ms due time rules.

From: Zhu, Lingshan <lingshan.zhu@intel.com> Sent: Tuesday, September 12, 2023 12:04 PM
On 9/12/2023 1:58 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com> Sent: Tuesday, September 12, 2023 9:37 AM
On 9/11/2023 6:21 PM, Parav Pandit wrote:
From: Zhu, Lingshan <lingshan.zhu@intel.com> Sent: Monday, September 11, 2023 3:03 PM

So implement the AQ on the "admin" VF? That requires the hardware to reserve dedicated resources for every VF. So expensive, overkill? And a VF may then be managed by both the PF and its admin "VF"?

> Yes.

It's a bit chaotic: as you can see, if the nested (L2 guest) VF can be managed by both the L1 guest VF and the host PF, that means two owners of the L2 VF.

> This is the nesting. When you do M-level nesting, does any CPU in the world handle its own page tables in isolation from the next level and also perform equally well?

Not exactly. In nesting, the L1 guest is the host/infrastructure emulator for L2, so L2 is expected to do nothing with the host; otherwise something like an L2 VF managed by both the L1 VF and the host PF can lead to operational and security issues.

> If UDP packets are dropped, even an application that does not retry can fail.

UDP is not reliable, and performance overhead does not mean failure.

> It largely depends on the application. I have seen iperf UDP failing on packet drop and never recovering. A retransmission over UDP can fail.

That depends on the workload: if it chooses UDP, it is aware of the possibility of losing packets. But anyway, LM is expected to complete successfully within the due time.

> And LM also depends on the workload. :)

Exactly! That's the point: how to meet the requirements!

> It is pointless to discuss performance characteristics as a point for using the AQ or not.

How do we meet the QOS requirement during LM?

> By following [1], where a large part of the device context transfer and dirty page tracking is done while the VM is running.
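The disagreement above hinges on how much work an iterative pre-copy migration leaves for the stop-and-copy phase. A toy sketch of that loop may make the trade-off concrete; the function names, the redirty rate, and all numbers here are illustrative assumptions, not anything from the virtio spec or the patches under discussion:

```python
# Hypothetical sketch of iterative pre-copy live migration: most dirty-page
# transfer happens while the VM is still running, and only the final round
# (plus device state) must fit in the downtime budget (~300 ms per the thread).

def redirty(batch):
    # Toy model: assume 10% of just-copied pages get dirtied again by the guest.
    return set(list(batch)[:len(batch) // 10])

def precopy_migrate(dirty_pages, copy_rate_pages, stop_threshold):
    """Copy dirty pages in rounds while the VM runs; freeze for the last round.

    dirty_pages      -- initial set of dirty page numbers
    copy_rate_pages  -- pages we can transfer per round
    stop_threshold   -- remaining-dirty count at which we freeze the VM
    Returns (rounds_while_live, pages_left_for_final_round).
    """
    rounds = 0
    remaining = set(dirty_pages)
    while len(remaining) > stop_threshold:
        batch = set(list(remaining)[:copy_rate_pages])
        remaining -= batch              # transferred while the VM is live
        remaining |= redirty(batch)     # guest keeps dirtying some pages
        rounds += 1
    # The VM is now frozen: only `remaining` pages plus device state move
    # here, and only this part counts against the downtime budget.
    return rounds, len(remaining)
```

Under these made-up rates, 1000 dirty pages converge to a small final round after a handful of live iterations, which is the convergence behaviour both sides of the thread refer to.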
> No. A board designer does not need to. As explained already, if a board wants to support a single AQ command at a time, sure.

Same as above, the QOS question: for example, how do we avoid the situation where half of the VMs migrate and the others time out?

> Why would this happen? A timeout is not related to the AQ, in case that happens.
explained above
> A timeout can happen to config registers too. And it can be even far harder for board designers to support 384 PCI reads in parallel, each completing within a timeout.

When the VM freezes, the virtio functionality, for example virtio-net transactions, is suspended as well, so there are no TLPs for networking traffic buffers. The on-device live migration facility can then use the full PCI device bandwidth for migration. That is the difference from the admin vq.
> I am still not able to follow your point in asking these unrelated QOS questions.

Explained above: it has to meet the due-time requirement, and many VMs can be migrated simultaneously; in that situation, they have to race for the admin vq's resources/bandwidth.
So you agree that PCI config space is actually very unlikely to fail? It is reliable.

> An admin command can even fail with an EAGAIN error code when the device is out of resources, and software can retry the command.

As demonstrated, this series is as reliable as the config space functionality, so maybe there are fewer possibilities of failure?

> Huh. Config space has a far higher failure rate for the PCI transport due to the inherent nature of PCI timeouts, reads, and polling. For any bulk data transfer, a virtqueue is the spec-defined approach. This was debated for more than a year; you can check some 2021 emails. You can see the patches: data transfer done over registers in [1] is snail slow.

Do you often observe virtio PCI config space failing? Or does the admin vq need to transfer data through PCI?

> Admin commands need to transfer bulk data across thousands of VFs in parallel without baking registers into PCI.
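The EAGAIN behaviour described above (a busy device rejects the command and software resubmits) can be sketched as a simple retry loop. The queue object and its `submit` method are hypothetical stand-ins for a driver's admin-queue interface, not a real virtio API:

```python
# Minimal sketch of retrying an admin command that fails with EAGAIN when the
# device is temporarily out of resources. All names here are illustrative.
import errno
import time

def submit_admin_cmd(queue, cmd, max_retries=5, backoff_s=0.0):
    """Submit `cmd` on the admin virtqueue, retrying on an EAGAIN status."""
    for attempt in range(max_retries):
        status = queue.submit(cmd)
        if status != errno.EAGAIN:
            return status                       # success or a hard error
        time.sleep(backoff_s * (attempt + 1))   # device busy: back off, retry
    raise TimeoutError("admin command still EAGAIN after retries")

class FakeAdminQueue:
    """Toy device model: busy (EAGAIN) for the first `busy_count` submissions."""
    def __init__(self, busy_count):
        self.busy = busy_count
    def submit(self, cmd):
        if self.busy > 0:
            self.busy -= 1
            return errno.EAGAIN
        return 0  # success status
```

The point being debated is only whether this transient-failure-plus-retry model is better or worse than register reads that can hit PCI completion timeouts; the loop itself is uncontroversial.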
Please allow me to give an extreme example: is a single admin vq limitless, so that it can serve hundreds to thousands of VM migrations? If not, then two, or three, or what number?
> The key part is that all of this happens outside of the VM's downtime. The majority of the work in proposal [1] is done while the VM is _live_. Hence, the resource consumption or reservation is significantly less.

It still depends on the volume of VMs and devices; the orchestration layer needs to migrate the last round of dirty pages and states even after the VM has been suspended.

> That has nothing to do with the admin virtqueue. The migration layer already does this, and it is used by multiple devices.

Same as above: QOS. Naming a number, or an algorithm for the ratio of devices to the number of AQs, is beyond this topic, but I have made my point clear.

> Sure. It is beyond. And it is not a concern either.

It is: the user expects the LM process to succeed rather than fail.

> I still fail to understand why the LM process would fail. The migration process is slow, but the downtime is not, in [1].

If I recall correctly, the downtime budget is around 300 ms, so don't let the bandwidth or the number of admin vqs become a bottleneck that may introduce more possibilities of failure. Can a queue depth of 1K introduce significant latency?

> AQ command execution is not done serially. There is enough text on that in the AQ chapter, as I recall.

Then it requires more HW resources; I don't see the difference.

> Difference compared to what, multiple AQs? If so, sure. A device that prefers to do only one AQ command at a time can work with fewer resources and do one at a time.

I think we are discussing the same "resources for the worst case" issue as above.

> Frankly, I am not seeing any issue. The AQ is just another virtqueue, a basic construct in the spec used by 30+ device types.

Explained above: when migrating a VM, the migration has to converge in time, and the total downtime has a due time; I remember it is less than 300 ms. That is the QOS requirement.

> And admin commands can easily serve that, as the majority of the work is done while the VM is running and the member device is in an active state, in proposal [1].
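The "what number of AQs" question above is really a budget calculation: given the downtime budget and the fact that AQ commands can execute in parallel, how many member devices can one queue drain in time? A back-of-envelope sketch, where the per-command latency, commands per device, and in-flight count are all made-up illustrative numbers:

```python
# Back-of-envelope estimate of how many member devices one admin queue can
# serve within the stop-and-copy downtime budget (~300 ms per the thread).
# Integer milliseconds are used to keep the arithmetic exact.

def max_devices_per_aq(budget_ms, cmds_per_device, cmd_latency_ms, inflight):
    """Devices one AQ can drain within `budget_ms`, assuming `inflight`
    commands execute in parallel (AQ commands need not complete serially)."""
    total_cmds = budget_ms // cmd_latency_ms * inflight
    return total_cmds // cmds_per_device
```

For example, with a 300 ms budget, 4 commands per device, 1 ms per command, and 16 commands in flight, one AQ covers 1200 devices; with strictly serial execution (in flight = 1) the same queue covers only 75. This is exactly the parallel-vs-serial distinction the two sides argue over, with invented numbers.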
Explained above: it depends on the number of VMs being migrated.