OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] RE: [PATCH V2 3/6] virtio: dont reset vqs when SUSPEND


On Fri, Nov 24, 2023 at 11:35:59AM +0800, Jason Wang wrote:
> > > > > > I donât see a need for hypervisor to reset the device in passthrough
> > > mode.
> > > > > Can you explain why is it needed?
> > > > >
> > > > > Qemu has a command "system_reset".
> > > > >
> > > > I mean, what does this translate to reset the device in passthrough mode?
> > >
> > > It needs to reset the virtio device.
> > >
> > > > If this is FLR, it is there.
> > >
> > > Please explain how it works. (It's not only a FLR, it also need virtio level reset)
> > >
> > FLR obviously covers the virtio level reset as FLR covers the PCI + virtio reset.
> 
> Which part of the spec says this? And at least Qemu is not implemented
> in this way.

PCI spec says this. And yes I believe qemu will fully reset the function
on FLR.


> And there's another conflict, you said FLR is under the control of guests ...
> 
> >
> > > >
> > > > > > Do you mean, it is needed in vdpa mode? If yes, the registers are
> > > > > > emulated
> > > > > anyway, so why the member device's native channel cannot be used in
> > > > > vdpa mode?
> > > > > >
> > > > > > > >
> > > > > > > > > > For large device I could imagine it could be short.
> > > > > > > > > >
> > > > > > > > > > In that case if there is suspend the device available, it
> > > > > > > > > > will be used by the guest
> > > > > > > > > driver itself, hypervisor wouldnât know about it when those
> > > > > > > > > registers are not trapped.
> > > > > > > > > > So we need two ways to suspend.
> > > > > > > > > > One is guest visible, and guest controlled.
> > > > > > > > > > Second is hypervisor control to fulfill the device migration needs.
> > > > > > > > >
> > > > > > > > > Can you explain why suspend is special but not reset or why
> > > > > > > > > reset can work but not suspend? If reset can work, so does
> > > > > > > > > suspend. If reset can't, neither does suspend.
> > > > > > > > >
> > > > > > > > As long as reset and suspend both are under guest control, I am fine.
> > > > > > >
> > > > > > > Well, you seem to ignore my question below. Hypervisor needs to
> > > > > > > reset the device as well.
> > > > > > >
> > > > > > Why is it needed in passthrough mode?
> > > > > >
> > > > > > > >
> > > > > > > > > For example, can you explain how a system_reset in Qemu can
> > > > > > > > > work with your proposal?
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > So if you can please take a look if the proposed admin
> > > > > > > > > > command to
> > > > > > > > > freeze/stop mode can be used in the emulated register case or not.
> > > > > > > > >
> > > > > > > > > Again, if you design those for PCI, it's a layer violation.
> > > > > > > > > You have answered
> > > > > > > > They are used by the PCI layer, just like your suspend bit.
> > > > > > > > Andy other transport can also use it.
> > > > > > > >
> > > > > > > > > yourself that PM is the right way to go.
> > > > > > > > >
> > > > > > > > > > It helps to have the suspend bit in guest control as well
> > > > > > > > > > with/without
> > > > > > > > > emulation mode.
> > > > > > > > >
> > > > > > > > > I won't repeat it again. You will find you need a full
> > > > > > > > > transport to satisfy all the requirements.
> > > > > > > > I disagree for full transport.
> > > > > > >
> > > > > > > See above and the discussion in another thread.
> > > > > > >
> > > > > > > > If you want to get discuss transport for sure it is some other
> > > > > > > > thread and I want to see "driver notifications via such
> > > > > > > > transport VQ" to fully qualify it
> > > > > > > as transport, And that would be just sub-optimal for actual working.
> > > > > > >
> > > > > > > Sub-optimal since the function is duplicated with a transport
> > > > > > > but it doesn't claim or design as a transport.
> > > > > > >
> > > > > > It is not sub-optimal because of duplication. It is because you
> > > > > > want to
> > > > > transport notifications via virtqueue.
> > > > >
> > > > > Have you ever read the series of tvq? You won't get this conclusion
> > > > > if you do that.
> > > > >
> > > > I have read those 4 patches and I have seen that transportvq do not want to
> > > transport notifications.
> > > > Hence it does not qualify as transport vq.
> > >
> > > It exposes the platform MMIO area for driver notification. This is sufficient.
> > > Any issue you see?
> > Yes, the issue is, it is not transporting the driver notifications.
> > Hence, it is not a transport virtqueue.
> 
> Please read how driver/device notification is done in the spec for
> existing transports.

Yea Parav I don't really see what you are driving at here.
But the bigger problem with tvq is that it was supposed to
add a new group type and be reworked on top of admin command
infrastructure and it never was.

> >
> > >
> > > >
> > > > Frankly, transport vq seems a way to formalize mediation forever in virtio.
> > >
> > > Nope, it can be accessed by a guest driver directly.
> > >
> > > > It is very weird way to build new SIOV device.
> > > > For most things it should be the direct channel that virtio has already from
> > > driver to the device.
> > >
> > > See above. SIOV might require a new transport or not.
> > >
> > It depends on the performance tests that Lingshan will show at scale.
> >
> > > >
> > > >
> > > > > >
> > > > > > > > And hence, I wouldnât call it a transport anymore.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > This can also be used for debugging I think.
> > > > > > > > > >
> > > > > > > > > > As Michael listed, a dedicated debug interface is usually
> > > > > > > > > > more useful instead
> > > > > > > > > of in-band.
> > > > > > > > >
> > > > > > > > > Well, I've shown you the in-band facilities like debugging
> > > > > > > > > via ethtool and kernel has a lot of other ones. If you have
> > > > > > > > > ever tried to debug in a real production environment, you
> > > > > > > > > will find how useful such handy information is where out-of-
> > > > > > > > > band facilities are often dangerous
> > > > > > > and usually prohibited or even unsupported.
> > > > > > > > Guest driver can always read and write the device status
> > > > > > > > without adding a
> > > > > > > suspend bit.
> > > > > > >
> > > > > > > I don't get here. Suspend make sure the device state is frozen
> > > > > > > which helps for debugging for sure.
> > > > > > You wanted to debug some vq live, you suspend the device, the vq
> > > > > > state got
> > > > > changed.
> > > > > >
> > > > > > I just donât see that suspend is a debug tool.
> > > > >
> > > > > It's not a tool, it's a function that can be used as a debug tool.
> > > > >
> > > > > > Every feature is a debug feature literally.
> > > > > > Classic heisenbug effect.
> > > > > >
> > > > > > Once can change driver notification frequency to see if interrupt
> > > > > > rate
> > > > > changed for debugging.
> > > > > > One can disabled few RQs and see RSS...
> > > > > > Blk can change blk_size to higher value to perf debug..
> > > > > > The list continues..
> > > > >
> > > > > Let's not shift concepts.
> > > > >
> > > > Your comment to attribute device migration as debug feature is actually
> > > shifting the concept.
> > >
> > > It's not.
> > >
> > > Ling Shan put it in the basic facilities as part of device status. You wonder why,
> > > we explained it can be used beyond migration. You asked where, we told you
> > > for example things like debugging. We never claim it can only be used in debug.
> > > Then you shift the concept to say debug could be achieved by a lot of other
> > > facilities. For sure this is correct, but it doesn't have any relationship with the
> > > discussion here.

I frankly don't see how a bit which is completely non-orthogonal with
device and driver state can be useful for debug.
For debug you want something that just always works.
Not an interface that has so many requirements it will break if
you look at it sidewise.


> > >
> > I donât see wasting time here.
> > If its debug, its debug.
> > If its migration, it is migration.
> > If its pm, its pm.
> 
> Obviously not. Migration should leverage existing facilities as much
> as possible instead of duplicating them.

I don't think there's anything obvious here.  A lot of device state
can't be easily accessed with existing facilities. The logical
continuation of your reasoning would be to add state introspection
commands e.g. to cvq in virtio net and then use tricks like shadow vq to
issue these.  Yea, possible, but can we not go there please?  Nothing is
wrong with just building commands that do exactly what we want them to
do instead of trying to build a ship in a bottle.

-- 
MST



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]