virtio-comment message

Subject: Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH 0/5] virtio: introduce SUSPEND bit and vq state


On Thu, Oct 12, 2023 at 06:49:51PM +0800, Zhu, Lingshan wrote:
> 
> 
> On 10/12/2023 5:59 PM, Michael S. Tsirkin wrote:
> > On Wed, Oct 11, 2023 at 06:38:32PM +0800, Zhu, Lingshan wrote:
> > > 
> > > On 10/11/2023 6:20 PM, Michael S. Tsirkin wrote:
> > > > On Mon, Oct 09, 2023 at 06:01:42PM +0800, Zhu, Lingshan wrote:
> > > > > On 9/27/2023 11:40 PM, Michael S. Tsirkin wrote:
> > > > > > On Wed, Sep 27, 2023 at 04:20:01PM +0800, Zhu, Lingshan wrote:
> > > > > > > On 9/26/2023 6:48 PM, Michael S. Tsirkin wrote:
> > > > > > > > On Tue, Sep 26, 2023 at 05:25:42PM +0800, Zhu, Lingshan wrote:
> > > > > > > > > > We don't want to repeat the discussions; it looks like an endless circle with
> > > > > > > > > > no direction.
> > > > > > > > OK let me try to direct this discussion.
> > > > > > > > You guys were speaking past each other, no dialog is happening.
> > > > > > > > And as long as it goes on no progress will be made and you
> > > > > > > > will keep going in circles.
> > > > > > > > 
> > > > > > > > Parav here made an effort and attempted to summarize
> > > > > > > > use-cases addressed by your proposal but not his.
> > > > > > > > He couldn't resist adding "a yes but" in there oh well.
> > > > > > > > But now I hope you know he knows about your use-cases?
> > > > > > > > 
> > > > > > > > So please do the same. Do you see any advantages to Parav's
> > > > > > > > proposal as compared to yours? Try to list them and
> > > > > > > > if possible try not to accompany the list with "yes but"
> > > > > > > > (put it in a separate mail if you must ;) ).
> > > > > > > > If you won't be able to see any, let me know and I'll try to help.
> > > > > > > > 
> > > > > > > > Once each of you and Parav have finally heard the other and
> > > > > > > > the other also knows he's been heard, that's when we can
> > > > > > > > try to make progress by looking for something that addresses
> > > > > > > > all use-cases as opposed to endlessly repeating same arguments.
> > > > > > > Sure Michael, I will not say "yes but" here.
> > > > > > > 
> > > > > > >    From Parav's proposal, he intends to migrate a member device by its owner
> > > > > > > device through the admin vq,
> > > > > > > thus necessary admin vq commands are introduced in his series.
> > > > > > > 
> > > > > > > 
> > > > > > > > I see that his proposal can:
> > > > > > > > 1) meet some customers' requirements that do not involve nested or bare-metal
> > > > > > > > 2) align with Nvidia production
> > > > > > > > 3) be easier to emulate with an onboard SoC
> > > > > > Is that all you can see?
> > > > > > 
> > > > > > Hint: there's more.
> > > > > please help provide more.
> > > > Just a small subset off the top of my head:
> > > > Error handling.
> > > handle failed live migration? how?
> > For example you can try restarting VM on source.
> > Or at least report an error to hypervisor.
> I am not sure resetting a VM due to failed live migration is
> a good idea, should we resume the VM instead?

Yes - when I said restarting I meant resuming not resetting.

> Then try other
> convergence algorithm?

I am talking about device failures here, which have nothing to do with convergence.
But yes, you can e.g. try a different destination.

> 
> And I think the current live migration solution already implements an error
> detector, e.g. it sees a timeout?

It is extremely hard to predict how long it will take a random
piece of hardware from a random vendor to respond. Even if you
do, timeouts break nested, don't they ;) And finally, they
provide no indication of what went wrong whatsoever.
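
To make the contrast concrete, here is a minimal Python sketch of the two failure paths. It is not taken from either proposal: `poll_register`, `CommandResult`, the status value 22 and the detail string are all hypothetical names and values chosen for illustration. The point is only that a timeout yields a bare yes/no, while a command-style completion has room to carry a reason.

```python
from dataclasses import dataclass
from typing import Callable


def poll_register(read_done: Callable[[], bool], max_polls: int) -> bool:
    """Poll a 'done' flag; the only failure signal is running out of polls."""
    for _ in range(max_polls):
        if read_done():
            return True
    return False  # timed out: nothing says what went wrong


@dataclass
class CommandResult:
    """Hypothetical completion record for a command-style interface."""
    status: int       # 0 means success; nonzero values are illustrative
    detail: str = ""  # device-supplied reason, absent in the polling model


def describe(result: CommandResult) -> str:
    """Turn a completion record into something a hypervisor could log."""
    if result.status == 0:
        return "ok"
    return f"failed: status={result.status} ({result.detail})"


# A device that never finishes: polling can only report that it gave up.
timed_out = not poll_register(lambda: False, max_polls=3)

# A failed command: the completion says which error occurred.
report = describe(CommandResult(status=22, detail="unsupported vq state"))
```

Choosing `max_polls` is exactly the guess about vendor response time described above; the command-style path does not need that guess to report an error.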

> > 
> > 
> > > and for other errors, we have had mature error handling solutions
> > > in virtio for years, like re-read and NEEDS_RESET.
> > facepalm
> > 
> > Are you aware of the fact that Linux still doesn't support
> > it since it turned out to be an extremely awkward interface
> > to use?
> I think we have implemented this in virtio driver,
> like re-read to check FEATURES.

grep for NEEDS_RESET in drivers/virtio and weep.
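
For readers who have not looked at it: DEVICE_NEEDS_RESET is one bit in the device status field, set by the device when it hits an error it cannot recover from. A minimal sketch of the driver-side check follows. The bit values are the ones the virtio specification assigns to the status field; the helper name `needs_reset` is made up for this example, and the sketch covers detection only, not recovery.

```python
# Device status bits as defined by the virtio specification.
ACKNOWLEDGE = 1
DRIVER = 2
DRIVER_OK = 4
FEATURES_OK = 8
DEVICE_NEEDS_RESET = 64
FAILED = 128


def needs_reset(device_status: int) -> bool:
    """Return True if the device has flagged an unrecoverable error."""
    return bool(device_status & DEVICE_NEEDS_RESET)


# A fully initialized device, and the same device after it flags an error.
healthy = ACKNOWLEDGE | DRIVER | FEATURES_OK | DRIVER_OK
broken = healthy | DEVICE_NEEDS_RESET
```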

> > 
> > > If that is not good enough, then the corollary is:
> > > admin vq is better than config space,
> > 
> > You keep confusing admin vq with admin commands.
> OK, so are admin commands better than registers?

They have more functionality for sure.

> > 
> > 
> > > then the further corollary could be:
> > > we should refactor virtio-pci interfaces to admin vq commands,
> > > like how we handle features
> > > 
> > > Is that true?
> > > > Extendable to other group types such as SIOV.
> > > For SIOV, the admin vq is a transport, but for SR-IOV
> > > the admin vq is a control channel, that is different,
> > > and admin vq can be a side channel.
> > > 
> > > For example, for SIOV, we config and migrate MSIX through
> > > admin vq. For SRIOV, they are in config space.
> > And that's a mess. FYI we already got feedback from Linux devs
> > who are wondering why we can't come up with a consistent
> > interface that does everything.
> I believe config space is a consistent interface for PCI.
> For SIOV, we need a new transport layer anyway.
> > 
> > 
> > > > Batching of commands
> > > > Fewer PCI transactions
> > > so this can still be a QoS issue.
> > > With batching, could others starve?
> > And if you block the CPU because you are not accepting
> > a posted write, is that better?
> I don't get it, block guest CPU?

Host CPU, in fact. If you flood PCI Express with transactions,
this is exactly what happens.
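
A toy model of the transaction-count argument, in Python. Everything here is an assumption made for illustration: `ToyBus` simply counts CPU-initiated writes to the device, and the `ring` list stands in for a command queue in host memory. It does not model real PCI Express flow control, only the difference between one device write per command and one notification per batch.

```python
class ToyBus:
    """Counts CPU-initiated writes that reach the device (toy stand-in for PCI)."""

    def __init__(self):
        self.cpu_writes = 0

    def write(self, reg, value):
        self.cpu_writes += 1


def submit_via_registers(bus, commands):
    """One device write per command."""
    for cmd in commands:
        bus.write("cmd", cmd)


def submit_batched(bus, ring, commands):
    """Queue commands in host memory, then notify the device once."""
    ring.extend(commands)  # ordinary memory stores, not device writes
    if commands:
        bus.write("notify", len(ring))


commands = list(range(16))

reg_bus = ToyBus()
submit_via_registers(reg_bus, commands)

batch_bus = ToyBus()
ring = []
submit_batched(batch_bus, ring, commands)
```

In this model the register path issues 16 device writes and the batched path issues one; in the batched case the device would then fetch the queued commands itself.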

> > 
> > > > Support for keeping some data off-device
> > > I don't get it, what is off-device?
> > > The live migration facilities need to fetch data from the device anyway
> > Heh this is what was driving nvidia to use DMA so heavily all this time.
> > no - if data is not in registers, device can fetch the data from
> > across pci express link, presumably with a local cache.
> For PCI based configuration, like MSI, we need to fetch from config space
> anyway.
> For others like dirty page, we can store the bitmap in host memory, and use
> PASID for isolation.

Oh really? What do we get by not using the same mechanism for
device state then? This begins to look exactly like admin vq.
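
For reference, the kind of host-memory dirty bitmap mentioned in the quoted text can be sketched as below. This is a generic illustration, not the facility from either series: the 4096-byte page size, the `DirtyBitmap` class and its method names are assumptions, and PASID isolation is not modeled at all. In a real setup the device would set the bits via DMA and the hypervisor would read and clear them.

```python
PAGE_SIZE = 4096  # assumed page size


class DirtyBitmap:
    """One bit per guest page, kept in (simulated) host memory."""

    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.bits = bytearray((num_pages + 7) // 8)

    def mark(self, addr, length=1):
        """Mark every page touched by the byte range [addr, addr + length)."""
        first = addr // PAGE_SIZE
        last = (addr + length - 1) // PAGE_SIZE
        for page in range(first, last + 1):
            self.bits[page // 8] |= 1 << (page % 8)

    def fetch_and_clear(self):
        """Return the sorted list of dirty pages and reset the bitmap."""
        dirty = [p for p in range(self.num_pages)
                 if self.bits[p // 8] & (1 << (p % 8))]
        for i in range(len(self.bits)):
            self.bits[i] = 0
        return dirty


bitmap = DirtyBitmap(num_pages=16)
bitmap.mark(0x1000)        # a write inside page 1
bitmap.mark(0x2FF0, 0x20)  # a write that straddles pages 2 and 3
first_pass = bitmap.fetch_and_clear()
second_pass = bitmap.fetch_and_clear()
```

Each migration iteration would call something like `fetch_and_clear`, resend the listed pages, and repeat until the list is small enough to stop the device.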

> > 
> > 
> > > > which does not mean it's better unconditionally.
> > > > are above points clear?
> > > The thing is, what blocks the config space solution?
> > > Why is admin vq a must for live migration?
> > > What's wrong with the config space solution?
> > When you say "what's wrong", do you mean you still see no
> > advantages to doing DMA at all? Config space is just better,
> > with no drawbacks?
> still, if admin vq or admin commands are better than config space,
> we should refactor the whole virtio-pci interfaces to admin vq.

Mixing up admin vq and admin commands again, apparently.
We want to support virtio over admin commands for SIOV, yes.
And once that's supported nothing should prevent using that
for SRIOV too.

> And Jason once proposed building admin vq LM on our basic
> facilities, but I see this has been rejected.

Please do not conclude that you just need to resubmit.

> > 
> > > Shall we refactor everything in virtio-pci to use admin vq?
> > > > as long as you guys keep not hearing each other we will keep
> > > > seeing these flame wars. if you expect everyone on virtio-comment
> > > > to follow a 300 message thread you are imo very much mistaken.
> > > I am sure I have not ignored any questions.
> > > I am saying admin vq is problematic for live migration,
> > > at least it doesn't work for nested, so why admin vq is a must for live
> > > migration?
> > 
> > My suggestion for you was to add admin command support to
> > VF memory, as an alternative to admin vq. It looks like that
> > will address the nested virt usecase.
> If you mean carrying some big bulk of data like dirty page information,
> we implemented a facility in host memory which is isolated by PASID.
> 
> I should send a new series soon, so we can work on the patch.

I hope that one does not just restart the same flame war.
As it will if people keep talking past each other and
not listening.

> Thanks for your suggestions and efforts anyway.
> > 
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > The general purpose of his proposal and mine are aligned: migrate virtio
> > > > > > > devices.
> > > > > > > 
> > > > > > > > Jason once proposed to collaborate; please allow me to quote his proposal:
> > > > > > > 
> > > > > > > "
> > > > > > > Let me repeat once again here for the possible steps to collaboration:
> > > > > > > 
> > > > > > > 1) define virtqueue state, inflight descriptors in the section of
> > > > > > > basic facility but not under the admin commands
> > > > > > > 2) define the dirty page tracking, device context/states in the
> > > > > > > section of basic facility but not under the admin commands
> > > > > > > 3) define transport specific interfaces or admin commands to access them
> > > > > > > "
> > > > > > > 
> > > > > > > I totally agree with his proposal.
> > > > > > > 
> > > > > > > Does this work for you Michael?
> > > > > > > 
> > > > > > > Thanks
> > > > > > > Zhu Lingshan
> > > > > > > I just doubt very much this will work. What will "define" mean then -
> > > > > > > not an interface, just a description in English? I think you
> > > > > > > underestimate the difficulty of creating such definitions that
> > > > > > > are robust and precise.
> > > > > I think we can review the patch to correct the words.
> > > > > > > Instead I suggest you define a way to submit admin commands that works
> > > > > > > for nested and bare-metal (i.e. not admin vq, and not with sriov group
> > > > > > > type). And work with Parav to make live migration admin commands work
> > > > > > > reasonably well through this interface and with this type.
> > > > > why admin commands are better than registers?
> > > > > 
> > > > > This publicly archived list offers a means to provide input to the
> > > > > OASIS Virtual I/O Device (VIRTIO) TC.
> > > > > 
> > > > > In order to verify user consent to the Feedback License terms and
> > > > > to minimize spam in the list archive, subscription is required
> > > > > before posting.
> > > > > 
> > > > > Subscribe: virtio-comment-subscribe@lists.oasis-open.org
> > > > > Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org
> > > > > List help: virtio-comment-help@lists.oasis-open.org
> > > > > List archive: https://lists.oasis-open.org/archives/virtio-comment/
> > > > > Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf
> > > > > List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists
> > > > > Committee: https://www.oasis-open.org/committees/virtio/
> > > > > Join OASIS: https://www.oasis-open.org/join/
> > > > > 


