OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: RE: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration


> From: Jason Wang <jasowang@redhat.com>
> Sent: Wednesday, November 22, 2023 10:57 AM
> 
> On Wed, Nov 22, 2023 at 12:32âAM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Tuesday, November 21, 2023 12:54 PM
> > >
> > > On Thu, Nov 16, 2023 at 1:28âPM Parav Pandit <parav@nvidia.com>
> wrote:
> > > >
> > > >
> > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > Sent: Thursday, November 16, 2023 9:50 AM
> > > > >
> > > > > On Thu, Nov 16, 2023 at 1:39âAM Parav Pandit <parav@nvidia.com>
> > > wrote:
> > > > > >
> > > > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > > > Sent: Monday, November 13, 2023 9:03 AM
> > > > > > >
> > > > > > > On Thu, Nov 9, 2023 at 2:25âPM Parav Pandit
> > > > > > > <parav@nvidia.com>
> > > wrote:
> > > > > > > >
> > > > > > > >
> > > > > > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > > > > > Sent: Tuesday, November 7, 2023 9:35 AM
> > > > > > > > >
> > > > > > > > > On Mon, Nov 6, 2023 at 3:05âPM Parav Pandit
> > > > > > > > > <parav@nvidia.com>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > > > > > > > Sent: Monday, November 6, 2023 12:05 PM
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Nov 2, 2023 at 2:10âPM Parav Pandit
> > > > > > > > > > > <parav@nvidia.com>
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > >
> 
> [...]
> 
> > > > >
> > > > I disagree.
> > > > There are two different functionalities.
> > > >
> > > > Functionality_1: explicit ask for read or write
> > > > Functionality_2: read what has changed
> > >
> > > This needs to be justified. I won't repeat the questions again here.
> > >
> > As explained the use case in theory of operation already.
> >
> > > >
> > > > Should one merge 1 and 2 and complicate the command?
> > > > I prefer not to.
> > >
> > > Again there're functional duplications. E.g your command duplicates
> > > common_cfg for sure.
> > Nop. it is not.
> > Common cfg is accessed directly by guest member driver.
> 
> It can be accessed directly, if we have adminq per VF.
Sure, instead I proposed the cvq as there is no need to burn another queue.

> 
> >
> > >
> > > >
> > > > Now having two different commands help for debugging to
> > > > differentiate between mgmt. commands and guest initiated commands.
> > > > :)
> > > >
> > > > > >
> > > > > > > Guest configure the following one by one:
> > > > > > >
> > > > > > > 1) vq size
> > > > > > > 2) vq addresses
> > > > > > > 3) MSI-X
> > > > > > >
> > > > > > > etc?
> > > > > > >
> > > > > > I think you interpreted "incremental" differently than I described.
> > > > > > In the device context read, the incremental is:
> > > > > >
> > > > > > If the hypervisor driver has read the device context twice,
> > > > > > the second read
> > > > > won't return any new data if nothing changed.
> > > > >
> > > > > See above.
> > > > >
> > > > Yeah, two separate commands needed.
> > > >
> > > > > > For example, if RSS configuration didnât change between two
> > > > > > reads, the
> > > > > second read wont return the TLV for RSS Context.
> > > > > >
> > > > > > While for transport the need is, when guest asked, one device
> > > > > > must read it
> > > > > regardless of the change.
> > > > > >
> > > > > > So notion of incremental is not by address, but by the value.
> > > > > >
> > > > > > > > For example, VQ configuration is exchanged once between
> > > > > > > > src and
> > > dst.
> > > > > > > > But VQ avail and used index may be updated multiple times.
> > > > > > >
> > > > > > > If it can work with multiple times of updating, why can't it
> > > > > > > work if we just update it once?
> > > > > > Functionally it can work.
> > > > >
> > > > > I think you answer yourself.
> > > > >
> > > > Yes, I donât like abuse of command.
> > >
> > > How did you define abuse or can spec ever need to define that?
> > I donât have any different definition than dictionary definition for
> > abuse. :)
> 
> "Abuse" is pretty subjective. Spec have driver normative, please explain how
> and if you can use that.
> 
:)

So be it.

Spec does not need to define the dictionary.

For explicit read and write simple read/write commands are right to me.

> >
> > >
> > > >
> > > > > > Performance wise, one does not want to update multiple times,
> > > > > > unless there
> > > > > is a change.
> > > > > >
> > > > > > Read as explained above is not meant to return same content again.
> > > > > >
> > > > > > >
> > > > > > > > So here hypervisor do not want to read any specific set of
> > > > > > > > fields and
> > > > > > > hypervisor is not parsing them either.
> > > > > > > > It is just a byte stream for it.
> > > > > > >
> > > > > > > Firstly, spec must define the device context format, so
> > > > > > > hypervisor can understand which byte is what otherwise you
> > > > > > > can't maintain migration compatibility.
> > > > > > Device context is defined already in the latest version.
> > > > > >
> > > > > > > Secondly, you can't mandate how the hypervisor is written.
> > > > > > >
> > > > > > > >
> > > > > > > > As opposed to that, in case of transport, the guest
> > > > > > > > explicitly asks to read or
> > > > > > > write specific bytes.
> > > > > > > > Therefore, it is not incremental.
> > > > > > >
> > > > > > > I'm totally lost. Which part of the transport is not incremental?
> > > > > > >
> > > > > > > >
> > > > > > > > Additionally, if hypervisor has put the trap on virtio
> > > > > > > > config, and because the memory device already has the
> > > > > > > > interface for virtio config,
> > > > > > > >
> > > > > > > > Hypervisor can directly write/read from the virtual config
> > > > > > > > to the member's
> > > > > > > config space, without going through the device context, right?
> > > > > > >
> > > > > > > If it can do it or it can choose to not. I don't see how it
> > > > > > > is related to the discussion here.
> > > > > > >
> > > > > > It is. I donât see a point of hypervisor not using the native
> > > > > > interface provided
> > > > > by the member device.
> > > > >
> > > > > It really depends on the case, and I see how it duplicates with
> > > > > the functionality that is provided by both:
> > > > >
> > > > > 1) The existing PCI transport
> > > > >
> > > > > or
> > > > >
> > > > > 2) The transport virtqueue
> > > > >
> > > > I would like to conclude that we disagree in our approaches.
> > > > PCI transport is for member device to directly communicate from
> > > > guest
> > > driver to the device.
> > > > This is uniform across PF, VFs, SIOV.
> > >
> > > For "PCi transport" did you mean the one defined in spec? If yes,
> > > how can it work with SIOV with what you're saying here (a direct
> > > communication channel)?
> > >
> > SIOV device may have same MMIO as VF.
> 
> We circle back, SIOV is for scalability. If you claim registers don't scale, SVIO via
> MMIO doesn't scale either.
Hence, any new and slow things to stay off the MMIO.
And for currently defined things, Lingshan will show the performance numbers of why they should be transported via a virtqeueue.

> 
> >
> > > >
> > > > Admin commands are transport independent and their task is device
> > > migration.
> > > > One is not replacing the other.
> > > >
> > > > Transport virtqueue will never transport driver notifications,
> > > > hence it does
> > > not qualify at "transport".
> > >
> > > Another double standard.
> > I disagree. You coined the term transport vq, so stand behind it to transport
> everything.
> 
> Nope, it works like other transport. It doesn't aim to replace any existing
> transport.
> 
> >
> > >
> > > MMIO will never transport device notification, hence it does not
> > > qualify as "transport"?
> > >
> > How does interrupts work?
> 
> It depends on the platform, no?
I donât know, how does a MMIO virtio hw device deliver an interrupt for x86 cpu?

> 
> > Seems like missing basic functionality in transport.
> 
> Not necessarily the charge of a transport. Virtio transport can't fly without a
> platform.
What is the equivalent of msix in mmio?
> 
> >
> > > >
> > > > For the vdpa case, there is no need for extra admin commands as
> > > > the
> > > mediation layer can directly use the interface available from the
> > > member device itself.
> > > >
> > > > You continue to want to overload admin commands for dual purpose,
> > > > does
> > > not make sense to me.
> > > >
> > > > > >
> > > > > >  > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > it is not good idea to overload management
> > > > > > > > > > > > commands with actual run time
> > > > > > > > > > > guest commands.
> > > > > > > > > > > > The device context read writes are largely for
> > > > > > > > > > > > incremental
> > > updates.
> > > > > > > > > > >
> > > > > > > > > > > It doesn't matter if it is incremental or not but
> > > > > > > > > > >
> > > > > > > > > > It does because you want different functionality only
> > > > > > > > > > for purpose of backward
> > > > > > > > > compatibility.
> > > > > > > > > > That also if the device does not offer them as portion
> > > > > > > > > > of MMIO
> > > BAR.
> > > > > > > > >
> > > > > > > > > I don't see how it is related to the "incremental part".
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > 1) the function is there
> > > > > > > > > > > 2) hypervisor can use that function if they want and
> > > > > > > > > > > virtio
> > > > > > > > > > > (spec) can't forbid that
> > > > > > > > > > >
> > > > > > > > > > It is not about forbidding or supporting.
> > > > > > > > > > Its about what functionality to use for management
> > > > > > > > > > plane and guest
> > > > > > > plane.
> > > > > > > > > > Both have different needs.
> > > > > > > > >
> > > > > > > > > People can have different views, there's nothing we can
> > > > > > > > > prevent a hypervisor from using it as a transport as far as I can
> see.
> > > > > > > > For device context write command, it can be used (or
> > > > > > > > probably
> > > > > > > > abused) to do
> > > > > > > write but I fail to see why to use it.
> > > > > > >
> > > > > > > The function is there, you can't prevent people from doing that.
> > > > > > >
> > > > > > One can always mess up itself. :) It is not prevented. It is
> > > > > > just not right way to use the interface.
> > > > > >
> > > > > > > > Because member device already has the interface to do
> > > > > > > > config read/write and
> > > > > > > it is accessible to the hypervisor.
> > > > > > >
> > > > > > > Well, it looks self-contradictory again. Are you saying
> > > > > > > another set of commands that is similar to device context is
> > > > > > > needed for non-PCI
> > > > > transport?
> > > > > > >
> > > > > > All these non pci transport discussion is just meaning less.
> > > > > > Let MMIO bring the concept of member device at that point
> > > > > > something make
> > > > > sense to discuss.
> > > > >
> > > > > It's not necessarily MMIO. For example the SIOV, which I don't
> > > > > think can use the existing PCI transport.
> > > > >
> > > > > > PCI SIOV is also the PCI device at the end.
> > > > >
> > > > > We don't want to end up with two sets of commands to save/load
> > > > > SRIOV and SIOV at least.
> > > > >
> > > > This proposal ensures that SRIOV and SIOV devices are treated equally.
> > >
> > > How? Did you mean your proposal can work for SIOV? What's the
> > > transport then?
> > Yes. All majority of the device contexts should work for SIOV device as_is.
> > Member id would be different.
> > Some device context TLVs may be new as SIOV may have some
> simplifications as it may not have the giant register space like current one.
> 
> You only explain the migration part but not the transport part.
> 
Because SIOV is still under construction in community. There is no point of defining SIOV transport for some half-cooked spec.
It is just logical to finish the device migration for well-defined member device.

> >
> > >
> > > > How brand new non-compatible SIOV device to transport this, is
> > > > outside of
> > > the scope of this work.
> > >
> > > You invented one that can be used for doing this. If you disagree,
> > > how can we know your proposal can work for SIOV without a transport
> then?
> >
> > I donât understand your comment.
> >
> > All I am saying is, most pieces of device contexts are reusable across VFs and
> SIOVs.
> > When SIOV is defined, we can relook at what may need to be added.
> 
> I've explained several times. Transport virtqueue is not solely designed for
> SIOV. SIOV could be one of the use cases.

Seems like a blander...


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]