[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration
On Wed, Nov 22, 2023 at 2:05âPM Parav Pandit <parav@nvidia.com> wrote: > > > > From: Jason Wang <jasowang@redhat.com> > > Sent: Wednesday, November 22, 2023 10:57 AM > > > > On Wed, Nov 22, 2023 at 12:32âAM Parav Pandit <parav@nvidia.com> wrote: > > > > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > Sent: Tuesday, November 21, 2023 12:54 PM > > > > > > > > On Thu, Nov 16, 2023 at 1:28âPM Parav Pandit <parav@nvidia.com> > > wrote: > > > > > > > > > > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > > > Sent: Thursday, November 16, 2023 9:50 AM > > > > > > > > > > > > On Thu, Nov 16, 2023 at 1:39âAM Parav Pandit <parav@nvidia.com> > > > > wrote: > > > > > > > > > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > > > > > Sent: Monday, November 13, 2023 9:03 AM > > > > > > > > > > > > > > > > On Thu, Nov 9, 2023 at 2:25âPM Parav Pandit > > > > > > > > <parav@nvidia.com> > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > > > > > > > Sent: Tuesday, November 7, 2023 9:35 AM > > > > > > > > > > > > > > > > > > > > On Mon, Nov 6, 2023 at 3:05âPM Parav Pandit > > > > > > > > > > <parav@nvidia.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From: Jason Wang <jasowang@redhat.com> > > > > > > > > > > > > Sent: Monday, November 6, 2023 12:05 PM > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Nov 2, 2023 at 2:10âPM Parav Pandit > > > > > > > > > > > > <parav@nvidia.com> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > [...] > > > > > > > > > > > > > I disagree. > > > > > There are two different functionalities. > > > > > > > > > > Functionality_1: explicit ask for read or write > > > > > Functionality_2: read what has changed > > > > > > > > This needs to be justified. I won't repeat the questions again here. > > > > > > > As explained the use case in theory of operation already. > > > > > > > > > > > > > Should one merge 1 and 2 and complicate the command? > > > > > I prefer not to. > > > > > > > > Again there're functional duplications. E.g your command duplicates > > > > common_cfg for sure. > > > Nop. it is not. > > > Common cfg is accessed directly by guest member driver. > > > > It can be accessed directly, if we have adminq per VF. > Sure, instead I proposed the cvq as there is no need to burn another queue. Please, explain how CVQ is related here. > > > > > > > > > > > > > > > > > > > > Now having two different commands help for debugging to > > > > > differentiate between mgmt. commands and guest initiated commands. > > > > > :) > > > > > > > > > > > > > > > > > > > > Guest configure the following one by one: > > > > > > > > > > > > > > > > 1) vq size > > > > > > > > 2) vq addresses > > > > > > > > 3) MSI-X > > > > > > > > > > > > > > > > etc? > > > > > > > > > > > > > > > I think you interpreted "incremental" differently than I described. > > > > > > > In the device context read, the incremental is: > > > > > > > > > > > > > > If the hypervisor driver has read the device context twice, > > > > > > > the second read > > > > > > won't return any new data if nothing changed. > > > > > > > > > > > > See above. > > > > > > > > > > > Yeah, two separate commands needed. > > > > > > > > > > > > For example, if RSS configuration didnât change between two > > > > > > > reads, the > > > > > > second read wont return the TLV for RSS Context. > > > > > > > > > > > > > > While for transport the need is, when guest asked, one device > > > > > > > must read it > > > > > > regardless of the change. > > > > > > > > > > > > > > So notion of incremental is not by address, but by the value. > > > > > > > > > > > > > > > > For example, VQ configuration is exchanged once between > > > > > > > > > src and > > > > dst. > > > > > > > > > But VQ avail and used index may be updated multiple times. > > > > > > > > > > > > > > > > If it can work with multiple times of updating, why can't it > > > > > > > > work if we just update it once? > > > > > > > Functionally it can work. > > > > > > > > > > > > I think you answer yourself. > > > > > > > > > > > Yes, I donât like abuse of command. > > > > > > > > How did you define abuse or can spec ever need to define that? > > > I donât have any different definition than dictionary definition for > > > abuse. :) > > > > "Abuse" is pretty subjective. Spec have driver normative, please explain how > > and if you can use that. > > > :) > > So be it. > > Spec does not need to define the dictionary. > > For explicit read and write simple read/write commands are right to me. > > > > > > > > > > > > > > > > > > > > Performance wise, one does not want to update multiple times, > > > > > > > unless there > > > > > > is a change. > > > > > > > > > > > > > > Read as explained above is not meant to return same content again. > > > > > > > > > > > > > > > > > > > > > > > > So here hypervisor do not want to read any specific set of > > > > > > > > > fields and > > > > > > > > hypervisor is not parsing them either. > > > > > > > > > It is just a byte stream for it. > > > > > > > > > > > > > > > > Firstly, spec must define the device context format, so > > > > > > > > hypervisor can understand which byte is what otherwise you > > > > > > > > can't maintain migration compatibility. > > > > > > > Device context is defined already in the latest version. > > > > > > > > > > > > > > > Secondly, you can't mandate how the hypervisor is written. > > > > > > > > > > > > > > > > > > > > > > > > > > As opposed to that, in case of transport, the guest > > > > > > > > > explicitly asks to read or > > > > > > > > write specific bytes. > > > > > > > > > Therefore, it is not incremental. > > > > > > > > > > > > > > > > I'm totally lost. Which part of the transport is not incremental? > > > > > > > > > > > > > > > > > > > > > > > > > > Additionally, if hypervisor has put the trap on virtio > > > > > > > > > config, and because the memory device already has the > > > > > > > > > interface for virtio config, > > > > > > > > > > > > > > > > > > Hypervisor can directly write/read from the virtual config > > > > > > > > > to the member's > > > > > > > > config space, without going through the device context, right? > > > > > > > > > > > > > > > > If it can do it or it can choose to not. I don't see how it > > > > > > > > is related to the discussion here. > > > > > > > > > > > > > > > It is. I donât see a point of hypervisor not using the native > > > > > > > interface provided > > > > > > by the member device. > > > > > > > > > > > > It really depends on the case, and I see how it duplicates with > > > > > > the functionality that is provided by both: > > > > > > > > > > > > 1) The existing PCI transport > > > > > > > > > > > > or > > > > > > > > > > > > 2) The transport virtqueue > > > > > > > > > > > I would like to conclude that we disagree in our approaches. > > > > > PCI transport is for member device to directly communicate from > > > > > guest > > > > driver to the device. > > > > > This is uniform across PF, VFs, SIOV. > > > > > > > > For "PCi transport" did you mean the one defined in spec? If yes, > > > > how can it work with SIOV with what you're saying here (a direct > > > > communication channel)? > > > > > > > SIOV device may have same MMIO as VF. > > > > We circle back, SIOV is for scalability. If you claim registers don't scale, SVIO via > > MMIO doesn't scale either. > Hence, any new and slow things to stay off the MMIO. > And for currently defined things, Lingshan will show the performance numbers of why they should be transported via a virtqeueue. > > > > > > > > > > > > > > > > Admin commands are transport independent and their task is device > > > > migration. > > > > > One is not replacing the other. > > > > > > > > > > Transport virtqueue will never transport driver notifications, > > > > > hence it does > > > > not qualify at "transport". > > > > > > > > Another double standard. > > > I disagree. You coined the term transport vq, so stand behind it to transport > > everything. > > > > Nope, it works like other transport. It doesn't aim to replace any existing > > transport. > > > > > > > > > > > > > MMIO will never transport device notification, hence it does not > > > > qualify as "transport"? > > > > > > > How does interrupts work? > > > > It depends on the platform, no? > I donât know, how does a MMIO virtio hw device deliver an interrupt for x86 cpu? It uses platform or architecture specific mechanisms. > > > > > > Seems like missing basic functionality in transport. > > > > Not necessarily the charge of a transport. Virtio transport can't fly without a > > platform. > What is the equivalent of msix in mmio? MSI is not unique to PCI, it can be used for MMIO as well if the platform or architecture supports that. > > > > > > > > > > > > > > > For the vdpa case, there is no need for extra admin commands as > > > > > the > > > > mediation layer can directly use the interface available from the > > > > member device itself. > > > > > > > > > > You continue to want to overload admin commands for dual purpose, > > > > > does > > > > not make sense to me. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > it is not good idea to overload management > > > > > > > > > > > > > commands with actual run time > > > > > > > > > > > > guest commands. > > > > > > > > > > > > > The device context read writes are largely for > > > > > > > > > > > > > incremental > > > > updates. > > > > > > > > > > > > > > > > > > > > > > > > It doesn't matter if it is incremental or not but > > > > > > > > > > > > > > > > > > > > > > > It does because you want different functionality only > > > > > > > > > > > for purpose of backward > > > > > > > > > > compatibility. > > > > > > > > > > > That also if the device does not offer them as portion > > > > > > > > > > > of MMIO > > > > BAR. > > > > > > > > > > > > > > > > > > > > I don't see how it is related to the "incremental part". > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > 1) the function is there > > > > > > > > > > > > 2) hypervisor can use that function if they want and > > > > > > > > > > > > virtio > > > > > > > > > > > > (spec) can't forbid that > > > > > > > > > > > > > > > > > > > > > > > It is not about forbidding or supporting. > > > > > > > > > > > Its about what functionality to use for management > > > > > > > > > > > plane and guest > > > > > > > > plane. > > > > > > > > > > > Both have different needs. > > > > > > > > > > > > > > > > > > > > People can have different views, there's nothing we can > > > > > > > > > > prevent a hypervisor from using it as a transport as far as I can > > see. > > > > > > > > > For device context write command, it can be used (or > > > > > > > > > probably > > > > > > > > > abused) to do > > > > > > > > write but I fail to see why to use it. > > > > > > > > > > > > > > > > The function is there, you can't prevent people from doing that. > > > > > > > > > > > > > > > One can always mess up itself. :) It is not prevented. It is > > > > > > > just not right way to use the interface. > > > > > > > > > > > > > > > > Because member device already has the interface to do > > > > > > > > > config read/write and > > > > > > > > it is accessible to the hypervisor. > > > > > > > > > > > > > > > > Well, it looks self-contradictory again. Are you saying > > > > > > > > another set of commands that is similar to device context is > > > > > > > > needed for non-PCI > > > > > > transport? > > > > > > > > > > > > > > > All these non pci transport discussion is just meaning less. > > > > > > > Let MMIO bring the concept of member device at that point > > > > > > > something make > > > > > > sense to discuss. > > > > > > > > > > > > It's not necessarily MMIO. For example the SIOV, which I don't > > > > > > think can use the existing PCI transport. > > > > > > > > > > > > > PCI SIOV is also the PCI device at the end. > > > > > > > > > > > > We don't want to end up with two sets of commands to save/load > > > > > > SRIOV and SIOV at least. > > > > > > > > > > > This proposal ensures that SRIOV and SIOV devices are treated equally. > > > > > > > > How? Did you mean your proposal can work for SIOV? What's the > > > > transport then? > > > Yes. All majority of the device contexts should work for SIOV device as_is. > > > Member id would be different. > > > Some device context TLVs may be new as SIOV may have some > > simplifications as it may not have the giant register space like current one. > > > > You only explain the migration part but not the transport part. > > > Because SIOV is still under construction in community. Transport virtqueue is not just designed for SIOV. > There is no point of defining SIOV transport for some half-cooked spec. Then let's not claim your proposal suits for TDISP. The live migration of TDISP is even far beyond half-cooked. Thanks
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]