OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] RE: [PATCH V2 3/6] virtio: dont reset vqs when SUSPEND


On Fri, Nov 24, 2023 at 09:01:53PM +0800, Jason Wang wrote:
> On Fri, Nov 24, 2023 at 8:17âPM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Fri, Nov 24, 2023 at 07:50:42PM +0800, Jason Wang wrote:
> > > > I frankly don't see how a bit which is completely non-orthogonal with
> > > > device and driver state can be useful for debug.
> > >
> > > The bit is to make sure the state of a device doesn't change. It may
> > > help or not just like if you want to pause a cpu/process during the
> > > debug.
> >
> > Heh. But it would be much better to have an orthogonal state
> > that driver can just set without worrying much about
> > device being broken somehow.
> 
> In that case we still need a preise definition of the state. So I
> don't see the difference here.

You don't? If you can inspect state without perturbing it, that is
better for debugging than an intrusive interface that perturbs state.


> Anyhow, we can leave debugging aside.
> 
> >
> >
> > > > For debug you want something that just always works.
> > > > Not an interface that has so many requirements it will break if
> > > > you look at it sidewise.
> > > >
> > > >
> > > > > > >
> > > > > > I donât see wasting time here.
> > > > > > If its debug, its debug.
> > > > > > If its migration, it is migration.
> > > > > > If its pm, its pm.
> > > > >
> > > > > Obviously not. Migration should leverage existing facilities as much
> > > > > as possible instead of duplicating them.
> > > >
> > > > I don't think there's anything obvious here.  A lot of device state
> > > > can't be easily accessed with existing facilities.
> > >
> > > Then we can invent new things.
> >
> > So the approach this patchset takes is a single interface for
> > all state introspection. Some duplication of functionality
> > for the sake of consistency.
> > You don't like it fine but
> 
> The (partial) duplication should be fine as long as it doesn't have
> any issue, but I do see a lot of issues. So I can't say I like it.

Let's focus on issue really instead of endless high level
architecture discussions.

> > there is nothing obvious that it's a bad thing. It's a tradeoff.
> >
> > > > The logical
> > > > continuation of your reasoning would be to add state introspection
> > > > commands e.g. to cvq in virtio net and then use tricks like shadow vq to
> > > > issue these.
> > >
> > > For the device state, yes. Because it is device logic and it is not
> > > what platform or transport can know.
> >
> > Exactly as I thought. Don't think shadow VQ is something we
> > can reasonably make a single migration mechanism though.
> 
> That's my understanding as well. It's up to the hypervisor, spec needs
> to focus on the mechanism but not policy.
> 
> > It feels fragile and heavyweight. It's more of a work
> > around hardware limitations.
> >
> >
> > > > Yea, possible, but can we not go there please?  Nothing is
> > > > wrong with just building commands that do exactly what we want them to
> > > > do instead of trying to build a ship in a bottle.
> > >
> > > But it's not the case for others.
> > >
> > > E.g in Parav's proposal, it tries to rule P2P behaviour via virtio
> > > admin commands when there is a duplication
> >
> > Yes there's some duplication. Advantage is consistency.
> 
> Just to clarify, we do need things like suspend but I don't see
> dealing with PCI P2P in virtio as consistent (e.g we don't imply any
> PCI stuff in virtio reset).

I think I can clarify.  The consistency is that there's a single chapter
that deals with migration, also hypervisors will all do exactly the same
instead of each hypervisor guessing how to do migration and going its
own way.




> >  I actually
> > suggested ways to reduce duplication, by using transport offsets as
> > tags.
> 
> I don't see a connection with P2P here. Or I may miss something.

I don't even know what P2P is in this context. Or why we are
discussing it. Is this going to be another distraction that
no one knows how it will work, just like mentioning TD randomly?

> > Finding a right balance means we all need to stop going to
> > extremes, I wish you and lingshan would stop trying to force everyone
> > to use registers and parav would stop trying to force dma.
> 
> There is a misunderstanding. Actually. I don't want to force
> registers.

You keep insisting on overlaying suspend functionality
over the existing transport. For pci that is going to be
in a register.

> Which kind of interface is the best way to go is really
> implementation specific. In some implementations, registers are cheap
> but not virtqueue, in others, virtqueue is cheap but not register.
> They are all fine. But we can't not claim one proposal that is
> optimized for a specific implementation to be the best way. What I
> want to do is not limit the facilities that are used for live
> migration to any specific transport. I think Ling Shan agreed with me
> in this part. It needs to work on all interfaces regardless of
> registers, DMA, CMA or others. That's why I suggest we focus on what
> needs to be migrated first. That is define the following things
> 
> 1) way to suspend/resume a device
> 2) virtqueue states, indices or inflight ones
> 3) device states
> 
> It is just like how we define virtqueues/features/status and other
> basic facilities where we do not tie it to any specific interfaces
> like DMA, CMA, registers or admin commands. Virtio benefited from the
> flexibility like this in the past so why not stick to that?

Because migration is a complex enough topic that we simply know
from experience that
- things like error handling are needed
- passing big arrays around is needed
these just do not work reasonably well over registers and
this is why admin commands were invented.



> After this, each transport can choose to implement it in their own
> way. For example, if there's a proposal to use those via admin
> virtqueue that's fine.
> And registers also fine and other transports.
> That's why only the last patch of this series is dealing with the PCI
> specific part.

Maybe they are fine in theory. So far I didn't see anything cohesive
that is close functionally being even to Parav's proposal of migration
over admin commands. There's no specific *reason* not to do that,
practically - I don't see why 


> 
> >
> > > and a layer violation.
> >
> > layers are only good if they make sense.
> 
> Case by case, for example, PCI PM has defined the state and the
> interaction with P2P. Reusing that seems much cleaner than inventing a
> mechanism in the virtio layer. Or if it needs a side channel, it needs
> to be invented in PCI not virtio.
> 
> Thanks

There's no *if* in my opinion - migration is way easier for hypervisors
to implement as a side channel so any device state can be migrated. And
this is uniform across transports. Is it harder or easier for hardware
to implement? We have a hardware vendor pushing a side channel approach
so it seems likely they know what is good for hardware? If it is somehow
nvidia specific - can we please have other vendors with actual
plans implementing this hardware (and note this has nothing to do with
VDPA - all this SUSPEND bit work is only useful for full offload) come
forward and say "we don't support this, we support that"?  Or is this
for some unnamed vendors not on the TC? Maybe they should join the TC to
influence the direction then.  Because right now it looks like software
guys telling hardware guys what is good for hardware and I don't see how
this makes any sense.



> >
> > > Thanks
> > >
> > >
> > > >
> > > > --
> > > > MST
> > > >
> >



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]