OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] RE: [PATCH V2 3/6] virtio: dont reset vqs when SUSPEND


On Thu, Nov 16, 2023 at 1:27âPM Parav Pandit <parav@nvidia.com> wrote:
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Thursday, November 16, 2023 9:50 AM
> >
> > On Thu, Nov 16, 2023 at 1:39âAM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > >
> > >
> > > > From: Jason Wang <jasowang@redhat.com>
> > > > Sent: Monday, November 13, 2023 9:05 AM
> > > >
> > > > On Thu, Nov 9, 2023 at 6:16âPM Parav Pandit <parav@nvidia.com> wrote:
> > > > >
> > > > >
> > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > Sent: Thursday, November 9, 2023 3:28 PM
> > > > > >
> > > > > > On 11/9/2023 1:46 AM, Michael S. Tsirkin wrote:
> > > > > > > On Tue, Nov 07, 2023 at 05:27:23PM +0800, Zhu, Lingshan wrote:
> > > > > > >>
> > > > > > >> On 11/6/2023 5:49 PM, Michael S. Tsirkin wrote:
> > > > > > >>> On Fri, Nov 03, 2023 at 06:34:34PM +0800, Zhu Lingshan wrote:
> > > > > > >>>> When SUSPEND is set, device states and virtqueue states
> > > > > > >>>> should be stablized, therefore the driver should not reset
> > > > > > >>>> vqs when SUSPEND is set in device status.
> > > > > > >>>>
> > > > > > >>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > >>>> ---
> > > > > > >>>>    content.tex | 3 +++
> > > > > > >>>>    1 file changed, 3 insertions(+)
> > > > > > >>>>
> > > > > > >>>> diff --git a/content.tex b/content.tex index
> > > > > > >>>> bcc9d4b..060b5c2
> > > > > > >>>> 100644
> > > > > > >>>> --- a/content.tex
> > > > > > >>>> +++ b/content.tex
> > > > > > >>>> @@ -444,6 +444,9 @@ \subsubsection{Virtqueue
> > > > > > >>>> Reset}\label{sec:Basic
> > > > > > Facilities of a Virtio Device /
> > > > > > >>>>    The device MUST reset any state of a virtqueue to the default
> > state,
> > > > > > >>>>    including the available state and the used state.
> > > > > > >>>> +If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set in
> > > > > > >>>> +\field{device status}, the driver SHOULD NOT reset any
> > virtqueues.
> > > > > > >>>> +
> > > > > > >>>>    \drivernormative{\paragraph}{Virtqueue Reset}{Basic
> > > > > > >>>> Facilities of a
> > > > > > Virtio Device / Virtqueues / Virtqueue Reset / Virtqueue Reset}
> > > > > > >>>>    After the driver tells the device to reset a queue, the
> > > > > > >>>> driver MUST verify that
> > > > > > >>> Seems somewhat arbitrary and breaks the claim that the
> > > > > > >>> feature is orthogonal and can have uses besides migration.
> > > > > > >> when suspended, the device is frozen.
> > > > > > >> The driver is aware of this process and so should not reset the vqs I
> > think.
> > > > > > > Again that is only true because you want to use it for migration.
> > > > > > > But then you can't claim it's a generic facility.
> > > > > > I don't get it. The device status is a basic facility.
> > > > > >
> > > > > > We need to SUSPEND the device by setting SUSPEND bit, to
> > > > > > stabilize the device states for migration.
> > > > > Is the PCI's PM time not enough to suspend the device?
> > > >
> > > > Are you saying we don't need virtio reset assuming we had FLR?
> > > >
> > > No. often FLR timing is not enough. Hence every PCI level device has some
> > sort of its own reset mechanism.
> > >
> > > > Suspending at different layers like rest at different layers.
> > > >
> > > > We have both FLR and virtio reset. The Virtio level function could
> > > > be reset without FLR. So did suspend.
> > > >
> > > > That's it.
> > > Sure, but wrapping it under some "basic facility" is just does not make sense.
> >
> > Why, device status (e.g reset) belongs to that part.
> >
> Lingshan claimed that suspending device is for live migration in commit log and in discussion he portray it as some basic facility unrelated to device migration such as debug etc.
> Instead of claiming it as some non_device_migration facility does not make sense.

It is used for migration for sure.

>
> > >
> > > >
> > > > And if you want to rule P2P behaviours, PCI PM is really the correct
> > > > way to go instead of trying to do it at the virtio layer.
> > > >
> > > PCI PM is supposed to be controlled by the guest and so the suspend.
> >
> > I've listed issues about D3cold and others, I can't believe it can't be controlled
> > totally by guests.
> >
> D3cold is not controlled by the driver as defined by the PCI spec hence it is not applicable.

Have you seen the link I give you? Even if you are right, there still
could be such a request from the firmware, no?

> D3hot is controlled by the driver.

So, it requires the device context to be preserved, which is not
documented in your patch.

> > >
> > > Hypervisor needs its channel to suspend the device, as fundamentally guest is
> > unaware of device migration flow.
> >
> > That's pretty fine, the hypervisor also needs its channel to reset the device. If
> > you think there's a conflict with suspend, there should be one for reset as well.
> >
> I donât see a need for hypervisor to reset the device in passthrough mode. Can you explain why is it needed?

Qemu has a command "system_reset".

> Do you mean, it is needed in vdpa mode? If yes, the registers are emulated anyway, so why the member device's native channel cannot be used in vdpa mode?
>
> > >
> > > > > For large device I could imagine it could be short.
> > > > >
> > > > > In that case if there is suspend the device available, it will be
> > > > > used by the guest
> > > > driver itself, hypervisor wouldnât know about it when those
> > > > registers are not trapped.
> > > > > So we need two ways to suspend.
> > > > > One is guest visible, and guest controlled.
> > > > > Second is hypervisor control to fulfill the device migration needs.
> > > >
> > > > Can you explain why suspend is special but not reset or why reset
> > > > can work but not suspend? If reset can work, so does suspend. If
> > > > reset can't, neither does suspend.
> > > >
> > > As long as reset and suspend both are under guest control, I am fine.
> >
> > Well, you seem to ignore my question below. Hypervisor needs to reset the
> > device as well.
> >
> Why is it needed in passthrough mode?
>
> > >
> > > > For example, can you explain how a system_reset in Qemu can work
> > > > with your proposal?
> > > >
> > > > >
> > > > > So if you can please take a look if the proposed admin command to
> > > > freeze/stop mode can be used in the emulated register case or not.
> > > >
> > > > Again, if you design those for PCI, it's a layer violation. You have
> > > > answered
> > > They are used by the PCI layer, just like your suspend bit.
> > > Andy other transport can also use it.
> > >
> > > > yourself that PM is the right way to go.
> > > >
> > > > > It helps to have the suspend bit in guest control as well
> > > > > with/without
> > > > emulation mode.
> > > >
> > > > I won't repeat it again. You will find you need a full transport to
> > > > satisfy all the requirements.
> > > I disagree for full transport.
> >
> > See above and the discussion in another thread.
> >
> > > If you want to get discuss transport for sure it is some other thread
> > > and I want to see "driver notifications via such transport VQ" to fully qualify it
> > as transport, And that would be just sub-optimal for actual working.
> >
> > Sub-optimal since the function is duplicated with a transport but it doesn't
> > claim or design as a transport.
> >
> It is not sub-optimal because of duplication. It is because you want to transport notifications via virtqueue.

Have you ever read the series of tvq? You won't get this conclusion if
you do that.

>
> > > And hence, I wouldnât call it a transport anymore.
> > >
> > > >
> > > > >
> > > > > > This can also be used for debugging I think.
> > > > >
> > > > > As Michael listed, a dedicated debug interface is usually more
> > > > > useful instead
> > > > of in-band.
> > > >
> > > > Well, I've shown you the in-band facilities like debugging via
> > > > ethtool and kernel has a lot of other ones. If you have ever tried
> > > > to debug in a real production environment, you will find how useful
> > > > such handy information is where out-of- band facilities are often dangerous
> > and usually prohibited or even unsupported.
> > > Guest driver can always read and write the device status without adding a
> > suspend bit.
> >
> > I don't get here. Suspend make sure the device state is frozen which helps for
> > debugging for sure.
> You wanted to debug some vq live, you suspend the device, the vq state got changed.
>
> I just donât see that suspend is a debug tool.

It's not a tool, it's a function that can be used as a debug tool.

> Every feature is a debug feature literally.
> Classic heisenbug effect.
>
> Once can change driver notification frequency to see if interrupt rate changed for debugging.
> One can disabled few RQs and see RSS...
> Blk can change blk_size to higher value to perf debug..
> The list continues..

Let's not shift concepts.

Obviously, suspend is not the only way to debug. But that's not the
context here.

Thanks

>
> >
> > Thanks
> >
> > >
>



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]