OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: RE: [PATCH V2 3/6] virtio: dont reset vqs when SUSPEND



> From: Jason Wang <jasowang@redhat.com>
> Sent: Monday, November 13, 2023 9:05 AM
> 
> On Thu, Nov 9, 2023 at 6:16âPM Parav Pandit <parav@nvidia.com> wrote:
> >
> >
> > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > Sent: Thursday, November 9, 2023 3:28 PM
> > >
> > > On 11/9/2023 1:46 AM, Michael S. Tsirkin wrote:
> > > > On Tue, Nov 07, 2023 at 05:27:23PM +0800, Zhu, Lingshan wrote:
> > > >>
> > > >> On 11/6/2023 5:49 PM, Michael S. Tsirkin wrote:
> > > >>> On Fri, Nov 03, 2023 at 06:34:34PM +0800, Zhu Lingshan wrote:
> > > >>>> When SUSPEND is set, device states and virtqueue states should
> > > >>>> be stablized, therefore the driver should not reset vqs when
> > > >>>> SUSPEND is set in device status.
> > > >>>>
> > > >>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > >>>> ---
> > > >>>>    content.tex | 3 +++
> > > >>>>    1 file changed, 3 insertions(+)
> > > >>>>
> > > >>>> diff --git a/content.tex b/content.tex index bcc9d4b..060b5c2
> > > >>>> 100644
> > > >>>> --- a/content.tex
> > > >>>> +++ b/content.tex
> > > >>>> @@ -444,6 +444,9 @@ \subsubsection{Virtqueue
> > > >>>> Reset}\label{sec:Basic
> > > Facilities of a Virtio Device /
> > > >>>>    The device MUST reset any state of a virtqueue to the default state,
> > > >>>>    including the available state and the used state.
> > > >>>> +If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set in
> > > >>>> +\field{device status}, the driver SHOULD NOT reset any virtqueues.
> > > >>>> +
> > > >>>>    \drivernormative{\paragraph}{Virtqueue Reset}{Basic
> > > >>>> Facilities of a
> > > Virtio Device / Virtqueues / Virtqueue Reset / Virtqueue Reset}
> > > >>>>    After the driver tells the device to reset a queue, the
> > > >>>> driver MUST verify that
> > > >>> Seems somewhat arbitrary and breaks the claim that the feature
> > > >>> is orthogonal and can have uses besides migration.
> > > >> when suspended, the device is frozen.
> > > >> The driver is aware of this process and so should not reset the vqs I think.
> > > > Again that is only true because you want to use it for migration.
> > > > But then you can't claim it's a generic facility.
> > > I don't get it. The device status is a basic facility.
> > >
> > > We need to SUSPEND the device by setting SUSPEND bit, to stabilize
> > > the device states for migration.
> > Is the PCI's PM time not enough to suspend the device?
> 
> Are you saying we don't need virtio reset assuming we had FLR?
>
No. often FLR timing is not enough. Hence every PCI level device has some sort of its own reset mechanism.
 
> Suspending at different layers like rest at different layers.
> 
> We have both FLR and virtio reset. The Virtio level function could be reset
> without FLR. So did suspend.
> 
> That's it.
Sure, but wrapping it under some "basic facility" is just does not make sense.

> 
> And if you want to rule P2P behaviours, PCI PM is really the correct way to go
> instead of trying to do it at the virtio layer.
>
PCI PM is supposed to be controlled by the guest and so the suspend.

Hypervisor needs its channel to suspend the device, as fundamentally guest is unaware of device migration flow.
 
> > For large device I could imagine it could be short.
> >
> > In that case if there is suspend the device available, it will be used by the guest
> driver itself, hypervisor wouldnât know about it when those registers are not
> trapped.
> > So we need two ways to suspend.
> > One is guest visible, and guest controlled.
> > Second is hypervisor control to fulfill the device migration needs.
> 
> Can you explain why suspend is special but not reset or why reset can work but
> not suspend? If reset can work, so does suspend. If reset can't, neither does
> suspend.
> 
As long as reset and suspend both are under guest control, I am fine.

> For example, can you explain how a system_reset in Qemu can work with your
> proposal?
> 
> >
> > So if you can please take a look if the proposed admin command to
> freeze/stop mode can be used in the emulated register case or not.
> 
> Again, if you design those for PCI, it's a layer violation. You have answered
They are used by the PCI layer, just like your suspend bit.
Andy other transport can also use it.

> yourself that PM is the right way to go.
> 
> > It helps to have the suspend bit in guest control as well with/without
> emulation mode.
> 
> I won't repeat it again. You will find you need a full transport to satisfy all the
> requirements.
I disagree for full transport.
If you want to get discuss transport for sure it is some other thread and I want to see "driver notifications via such transport VQ" to fully qualify it as transport,
And that would be just sub-optimal for actual working.
And hence, I wouldnât call it a transport anymore.

> 
> >
> > > This can also be used for debugging I think.
> >
> > As Michael listed, a dedicated debug interface is usually more useful instead
> of in-band.
> 
> Well, I've shown you the in-band facilities like debugging via ethtool and kernel
> has a lot of other ones. If you have ever tried to debug in a real production
> environment, you will find how useful such handy information is where out-of-
> band facilities are often dangerous and usually prohibited or even unsupported.
Guest driver can always read and write the device status without adding a suspend bit.



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]