
virtio-comment message


Subject: Re: [virtio-comment] [PATCH v1 1/8] admin: Add theory of operation for device migration

On Wed, Oct 18, 2023 at 12:30 PM Parav Pandit <parav@nvidia.com> wrote:
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Wednesday, October 18, 2023 6:23 AM
> > Why can suspend go directly from guest to device then?
> >
> Because all the virtio registers are treated equally by the live migration driver.
> So why not?

Good, so you agree that if a new status bit is introduced then it can
work with passthrough? The same was true of the register for the indices.

> As explained, the device synchronizes all the operations that are not mediated by the VMM.
> If you claim that all the synchronization is possible _only_ in software in various mediation layers,
> and that it is impossible in a single place in the device, then I do not agree.
> V2 listed most of the synchronization points of the device.
> > When such a device reset is done, it does not reset the device context, nor does it
> > clear the dirty page records, because those are handled by the controlling function.
> > >
> > > > I'm not convinced that the scalability is broken by just having 2 or
> > > > 3 more registers. We all know MSIX requires much more than this.
> > > >
> > > Saying that some other resource-hungry method exists, so let's
> > consume more in the new interface we do now, is not a good approach.
> >
> > This is self-contradictory and a double standard. You allow the number of MSI-X vectors to
> > grow but not config?
> >
> > > MSI-X v2 is underway. Hopefully it will be finished this year; it is
> > already cutting down the O(N) resources.
> >
> > Why couldn't such a method be applied to config registers and others?
> >
> Because config registers are not the de-duplicating type;
> each config register is unique in nature.

Unique from the view of the device logic, but not from that of a transport like PCIe.

> For such work a queue/DMA approach is taken.
> So it is applied to config registers and others, so that they are not exposed as _always-available_ registers.
> We just need to do that in virtio too.
> All the recent work on flow filters and counters no longer relies on config registers, as we both agreed in discussion [1] with your comment

Yes, but it serves a different purpose: CVQ can't work before probing,
no? Or are you saying you want to trap CVQ for migration?

> "Adding cvq is much easier than inventing(duplicating) the work of a transport."
> [1] https://lore.kernel.org/virtio-comment/CACGkMEseZeT4VX8Ut-7KraxLKNOMKOgFDNxqKofXSFT8yHfg-w@mail.gmail.com/#t
> > >
> > > > You can't solve all the issues in one series. As stated many times,
> > > I am not solving all issues in one series. This series builds the infrastructure.
> >
> > It's you that is raising the scalability issue, and what you've ignored is that such
> > an "issue" has existed for many years and various hardware has been built on
> > top of that.
> >
> That does not mean one should continue with such an issue.

Then how many MSI-X vectors do you expect each VF to have? Why did
you choose that value? Why forbid a vendor from having more than that?

> And that hardware consumes more power and memory, which results in overall device inefficiency.
> You should have objected to the IMS patches in the Linux kernel; you should also object to the new MSI-X proposal and say just use registers.

Actually the reverse: why can't we use something similar to IMS? IMS
allows message storage in arbitrary places, no?

> > Again, we should make sure the function is correct before we can talk about
> > others, otherwise it would be an endless discussion.
> >
> And using registers is not the way to make it correct.

Correct in what sense? You are actually

1) doing a complete NACK of the existing PCI transport design
2) preventing new features from developing based on the maturity of a
software device implementation
3) tying all new features to the owner/member structure, which doesn't
exist and is unnecessary for transports other than PCI

> Let's make sure that the basic function for the member device is correct at the first level.

Isn't this what I'm currently doing?

> > >
> > > > if you really
> > > > care about the scalability, the correct way is to behave like a real
> > > > transport through virtqueue instead of trying to duplicate the
> > > > functionality of transport virtqueue slowly.
> > > >
> > > This will be impossible because no device will transport driver notifications
> > using a virtqueue.
> > > Therefore, a virtqueue is not some generic transport that does everything - as
> > simple as that. Hence there is no transport virtqueue.
> >
> > You won't get such a wrong conclusion if you read that proposal.
> I have read those 4 or 5 patches posted by Lingshan and showed you at the time that driver notifications do not come via a virtqueue.
> And if I missed it, and they do come via a virtqueue, that does not meet the performance and "efficiency principle" from the paper you pointed to.

Good, and if you read the patches, you should know they allow
transport-specific notification. Or did you mean:

MMIO device notification does not come via MMIO, so the "MMIO"
transport is not efficient?

> >
> > >
> > > And a virtqueue for bulk data transfer exists, so there is no need to invent yet
> > another thing without a good reason.
> >
> > I don't understand why this is related to transport virtqueue anyhow, it's also a
> > queue interface, no?
> Transport virtqueue is a diversion into an unrelated topic here.

You're talking about scalability, and you are saying registers are the
major blocker, but you stick to registers for the guest to use and think
that is scalable.

How can I understand your point from the self-contradictory statement above?

My point is simple, if your complaint about registers is true and if
you really care about scalability, you should not use any register for
virtio at all, and then you will end up with the transport virtqueue.

> A guest VM driver must be able to talk to the member device for all queue configuration, etc., through its own channel, not mediated by the hypervisor.

You just invented a legacy tunnel which requires mediation, no?

> Otherwise such plumbing does not work for any confidential compute workload.

I'm pretty sure your conclusion here is wrong. Let's not keep raising
unrelated issues here, or it will look like you don't want to converge
this series. Let's open a new thread with the CC guys if you wish.

> Hence, I wouldn't discuss transport virtqueue for now.

Again, if you don't want to talk about transport virtqueue, that's
fine. But let's leave the scalability issue aside as well.

