virtio-comment message

Subject: RE: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration



> From: Jason Wang <jasowang@redhat.com>
> Sent: Thursday, October 19, 2023 8:11 AM
> > > From: Jason Wang <jasowang@redhat.com>
> > > Sent: Wednesday, October 18, 2023 6:23 AM
> >
> > > Why can suspend go directly from guest to device then?
> > >
> > Because all the virtio registers are treated equally by the live migration driver.
> > So why not?
> 
> Good, so you agree that if a new status bit is introduced then it can work with
> passthrough? So did the register for indices.
>
No. I replied to Michael about the use cases where the status bit method can work for non_passthrough.
 
> > As explained device synchronizing all the operations which are not mediated
> by VMM.
> >
> > If somehow you claim that all the synchronization is possible _only_
> > in software in various mediation layer, And it is impossible in single place in
> device, than I do not agree.
> > V2 listed most of the synchronization points of the device.
> >
> > > > When such device reset is done, it does not reset the device
> > > > context, nor it
> > > clears the dirty page records, because they are done by the controlling
> function.
> > > >
> > > > > I'm not convinced that the scalability is broken by just having
> > > > > 2 or
> > > > > 3 more registers. We all know MSIX requires much more than this.
> > > > >
> > > > Saying there is some other high resource consumer method exists,
> > > > so lets
> > > consume more in new interface we do now, is not good approach.
> > >
> > > This is self-contradictory and double standard. You allow #MSI-X
> > > vectors to grow but not config?
> > >
> >
> >
> > > > MSI-X on its v2 is underway. Hopefully it will be finished this
> > > > year, which is
> > > already cutting down O(N) resources.
> > >
> > > Why couldn't such a method be applied to config registers and others?
> > >
> > Because config registers are not de-duplicating type.
> > Meaning each config register is unique in nature.
> 
> In nature from the view of the device logic but not the transport like PCIE.
Does not matter.

> 
> > For such work a queue/dma approach is taken.
> > So it is applied to config registers and others to not place as _always_available
> registers.
> > We just need to do that in virtio too.
> > All the recent work of flow filters, counters no longer rely on the
> > config registers as we both agreed in discussion [1] with your comment
> 
> Yes but it serves a different purpose CVQ can't work before probing, no? Or are
> you saying you want to trap CVQ for migration?
>
For passthrough, nothing of the virtio interface (config, cvq, data vqs) is trapped.
For non_passthrough, the cvq, config registers, etc. need to be software composed.
 
> >
> > "Adding cvq is much easier than inventing(duplicating) the work of a
> transport."
If that cvq is on the PF, yes; that is the aq.
A second cvq cannot be placed on the VF, which the hypervisor does not have access to.

> >
> > [1]
> > https://lore.kernel.org/virtio-comment/CACGkMEseZeT4VX8Ut-
> 7KraxLKNOMKO
> > gFDNxqKofXSFT8yHfg-w@mail.gmail.com/#t
> > > >
> > > > > You can't solve all the issues in one series. As stated many
> > > > > times,
> > > > I am not solving all issues in one series. This series builds the
> infrastructure.
> > >
> > > It's you that is raising the scalability issue, and what you've
> > > ignored is that such an "issue" has existed for many years and
> > > various hardware has been built on top of that.
> > >
> > That does not mean one should continue with such issue.
> 
> Then how many MSI-X vectors do you expect to have for each VF? Why did you
> choose that value? Why forbid a vendor to have more than that?
>
The device has a finite number of MSI-X vectors. Cloud operators running VMs provision MSI-X vectors to the VF depending on its SLA configuration: vcpus, queues, etc.
 
> > And that hardware consumes more power and memory that results in overall
> device inefficiency.
> > You should have objected the IMS patches in Linux kernel, you should also
> object new MSI-X proposal and say just use registers.
> 
> Actually the reverse, why can't we use something similar to IMS? IMS allows
> storage in arbitrary places, no?
> 
One wants to stay away from registers. You didn't object to a non_register interface there with the argument "registers have been done for years, so why IMS, why MSI-X improvement?".
Here, you are objecting to a non_register interface.

> >
> > > Again, we should make sure the function is correct before we can
> > > talk about others, otherwise it would be an endless discussion.
> > >
> > And using registers is not the way to make it correct.
> 
> Correct in what sense? You are actually
> 
> 1) doing a complete NACK of the existing PCI transport design

No such design exists.

> 2) prevent new features from developing based on the mature of a software
> device implementation

> 3) tie all new features to the owner/member structure which doesn't exist and
> unnecessary for transport other than PCI
>
That other transports have missed the notion of owner and member is purely their shortcoming.
Once they improve, it will work fine.
 
> > Lets make sure that basic function for the member device to first level is
> correct.
> 

> Isn't this what I'm currently doing?
>
You are doing it only for non_passthrough mode.
We are doing it at the first level for passthrough mode.

Your claim, as I see it, is that the only way to do this is non_passthrough via registers, because you suggest that "cvq is easy" etc.
And this is where the main disagreement lies.

Let me ask a basic question.
Do you agree that there are two use cases?
1. Device passthrough that does not trap the virtio interface (= config space, cvq, data vqs, flow filter vqs, FLR and more).

2. Only the data path is in the device, and the rest is software composed.

> >
> >
> > > >
> > > > > if you really
> > > > > care about the scalability, the correct way is to behave like a
> > > > > real transport through virtqueue instead of trying to duplicate
> > > > > the functionality of transport virtqueue slowly.
> > > > >
> > > > This will be impossible because no device will transport driver
> > > > notifications
> > > using a virtqueue.
> > > > Therefore, virtqueue is not some generic transport that does
> > > > everything - as
> > > simple as that. hence there is no transport virtqueue.
> > >
> > > You won't get such a wrong conclusion if you read that proposal.
> > I have read those 4 or 5 patches posted by Lingshan and showed you that
> time that driver notifications are not coming via virtqueue.
> > And if I missed it, and if they are coming via virtqueue, it does not meet the
> performance and "efficiency principle" from the paper you pointed.
> 
> Good and if you read the patches, you should know it allows transport specific
> notification or you meant:
> 
If you claim the vq as a transport, driver notifications must come through vq descriptors.
In the transport vq proposal, that is not the case.

> MMIO device notification is not coming via the MMIO, so "MMIO"
> transport is not efficient?
> 
> >
> > >
> > > >
> > > > And virtqueue for bulk data transfer exists so no need to invent
> > > > yet another
> > > thing without a good reason.
> > >
> > > I don't understand why this is related to transport virtqueue
> > > anyhow, it's also a queue interface, no?
> > Transport virtqueue is diversion of unrelated topic here.
> 
> You're talking about scalability, and you are saying registers are the major
> blocker, but you stick to registers for guest to use and think it is scalable.
>
We cannot change what is already present in the spec.
Which new registers?
We are not introducing any new registers in this proposal.

> How can I understand your point from the self contradictory statement above?
> 
> My point is simple, if your complaint about registers is true and if you really care
> about scalability, you should not use any register for virtio at all, and then you
> will end up with the transport virtqueue.
Perfect. We are not introducing any new registers; what is there is there, and that cannot be changed.
The SIOV proposal that we are building will not have any new registers other than those needed at init time.

> 
> >
> > A guest vm driver must be able to talk to the member device for all queue
> configuration etc through its own channel not mediated by the hypervisor.
> 
> You just invented a legacy tunnel which requires mediation, no?
> 
It is for legacy.
Nothing is done for the new interface.

> > Otherwise such plumbing does not work for any confidential compute
> workload.
> 
> I'm pretty sure your conclusion here is wrong. Let's not keep raising unrelated
> issues here or you don't want to converge this series. Let's open a new thread
> with CC guys if you wish.
> 
> > Hence, I wouldn't discuss transport virtqueue for now.
> 
> Again, if you don't want to talk about transport virtqueue, that's fine. But let's
> leave the scalability issue aside as well.
>
Registers are related to both functionality and scale.

Let's first agree on the use cases that I asked about above, before the design.

I will wait to respond to any other emails until we agree on the use case requirements.

