
virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration




On 10/10/2023 5:40 PM, Parav Pandit wrote:
Hi Lingshan,

From: Zhu, Lingshan <lingshan.zhu@intel.com>
Sent: Tuesday, October 10, 2023 2:28 PM

On 10/10/2023 1:21 AM, Parav Pandit wrote:
From: Michael S. Tsirkin <mst@redhat.com>
Sent: Monday, October 9, 2023 9:50 PM
One or more passthrough PCI VF devices are ubiquitous for virtual
machines usage using generic kernel framework such as vfio [1].
Mentioning a specific subsystem in a specific OS may mislead the
user to think it can only work in that setup. Let's not do that,
virtio is not only used for Linux and VFIO.
This is just one example of how these commands are useful.
They can be useful in more ways and in more OSes too.
I will drop it from the patch commit log and keep it for informational
purposes in the cover letter.
Would that work for you?

I don't have any strong opinion on keeping or removing it, as most
stakeholders have a clear view of the requirements now.
Let me know.
So some people use VFs with VFIO. Hence the module name.  This
sentence by itself seems to have zero value for the spec. Just drop it.
Ok. Will drop.
So why not build your admin vq live migration on our config space solution,
to get out of these troubles and make your life easier?

This question is completely unrelated to this reply, or you misunderstood what dropping the text from the commit log means.
If you can rebase admin vq LM on our basic facilities, I think you don't need to talk about vfio in the first place,
so I ask you to reconsider Jason's proposal.

Dropping the link to vfio does not drop the requirement.
I am OK with dropping it because the requirements for passthrough of a member device are clear.
Vfio is not a trouble at all.
Admin command is not a trouble either.

The purely technical reason is that the proposed functionality cannot be done in any other existing way.
Why? For the reasons below.
1. Device context and write records (aka dirty page addresses) are huge and cannot be shared using config registers at the scale of 4000 member devices.
Dirty page tracking will be implemented in V2; actually, I have the patch right now.
Inflight descriptor tracking will be implemented by Eugenio in V2.
There is no scale problem, as I have repeated many times: these are per-device basic facilities. Just migrate the VF by its own facility,
so there are no 4000 member devices to handle; this is not per PF.

The device context can be read from config space or trapped, like the shadow control vq, which is already done; that is basic virtualization. If you want to migrate device context, you need to specify the device context for every type of device. Net may be easy, but how do you see virtio-fs? And are we migrating stateless devices or not? How do you migrate virtio-fs?
2. Sharing such large context and write addresses in parallel for multiple devices cannot be done using a single register file.
see above
3. These registers cannot reside in the VF, because the VF can undergo FLR and device reset, which must clear these registers.

Do you mean you want to audit all PCI features? On FLR the device is reset; do you expect a device to remember anything after FLR?
Do you want to trap FLR? Why?

Why would FLR block or conflict with live migration?

4. When the VF does DMA, all DMA occurs in the guest address space, not in the hypervisor space; any FLR and device reset must stop such DMA.
And device reset and FLR are controlled by the guest (not mediated by the hypervisor).
If the guest resets the device, that is a totally reasonable operation, and the guest owns the risk, right? And still, do you want to audit every PCI feature? At least you didn't do that in your series. For migration, you know the hypervisor takes ownership of the device in the stop_window.
5. Any PASID to separate out the admin vq on the VF does not work, for the following reasons.
R_1: Device FLR and device reset must stop all DMA.
R_2: PASID support from most leading vendors is still not mature enough.
R_3: One also needs to do inversion to not expose the PASID capability of the member PCI device.
See above. And what if the guest shuts down? The same answer, right?

Actually you don't see any technical problems in our config space proposal,
right?
With the config registers method, for passthrough I clearly see the technical problems (functional and scale) listed above,
due to which the config registers cannot reside on the VF and cannot scale either.
So see the answers above.
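
For readers following point 1, a "write record" (dirty page) log can be pictured as one bit per guest page that the device has written. The sketch below is only an illustration of that idea, assuming 4 KiB pages; the names `dirty_log`, `dirty_log_mark`, and `dirty_log_collect` are hypothetical and come from neither proposal in this thread nor the virtio specification.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB guest pages */

/* Hypothetical per-device write-record log: one bit per guest page. */
struct dirty_log {
    uint64_t *bits;  /* bitmap, 64 pages per word */
    size_t npages;   /* number of guest pages covered */
};

/* Allocate a zeroed bitmap covering mem_bytes of guest memory. */
static int dirty_log_init(struct dirty_log *log, uint64_t mem_bytes)
{
    log->npages = (size_t)((mem_bytes + (1ULL << PAGE_SHIFT) - 1) >> PAGE_SHIFT);
    log->bits = calloc((log->npages + 63) / 64, sizeof(uint64_t));
    return log->bits ? 0 : -1;
}

/* Record a device write of len bytes at guest address addr. */
static void dirty_log_mark(struct dirty_log *log, uint64_t addr, uint64_t len)
{
    assert(log && log->bits);
    if (len == 0)
        return;
    uint64_t first = addr >> PAGE_SHIFT;
    uint64_t last = (addr + len - 1) >> PAGE_SHIFT;
    /* Writes beyond the tracked range are ignored in this sketch. */
    for (uint64_t p = first; p <= last && p < log->npages; p++)
        log->bits[p / 64] |= 1ULL << (p % 64);
}

/* Return the number of dirty pages and clear the log (read-and-clear). */
static size_t dirty_log_collect(struct dirty_log *log)
{
    assert(log && log->bits);
    size_t count = 0;
    for (size_t i = 0; i < (log->npages + 63) / 64; i++) {
        uint64_t w = log->bits[i];
        while (w) {          /* clear the lowest set bit each iteration */
            w &= w - 1;
            count++;
        }
        log->bits[i] = 0;
    }
    return count;
}
```

Under these assumptions, a guest with 1 GiB of memory needs a 32 KiB bitmap per device. How such a log is exposed to the hypervisor, and how it scales across many devices, is what the two sides above disagree on.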


