OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] RE: [PATCH V2 3/6] virtio: dont reset vqs when SUSPEND


On Wed, Nov 22, 2023 at 12:32âAM Parav Pandit <parav@nvidia.com> wrote:
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Tuesday, November 21, 2023 1:03 PM
> >
> > On Thu, Nov 16, 2023 at 1:27âPM Parav Pandit <parav@nvidia.com> wrote:
> > >
> > > > From: Jason Wang <jasowang@redhat.com>
> > > > Sent: Thursday, November 16, 2023 9:50 AM
> > > >
> > > > On Thu, Nov 16, 2023 at 1:39âAM Parav Pandit <parav@nvidia.com>
> > wrote:
> > > > >
> > > > >
> > > > >
> > > > > > From: Jason Wang <jasowang@redhat.com>
> > > > > > Sent: Monday, November 13, 2023 9:05 AM
> > > > > >
> > > > > > On Thu, Nov 9, 2023 at 6:16âPM Parav Pandit <parav@nvidia.com>
> > wrote:
> > > > > > >
> > > > > > >
> > > > > > > > From: Zhu, Lingshan <lingshan.zhu@intel.com>
> > > > > > > > Sent: Thursday, November 9, 2023 3:28 PM
> > > > > > > >
> > > > > > > > On 11/9/2023 1:46 AM, Michael S. Tsirkin wrote:
> > > > > > > > > On Tue, Nov 07, 2023 at 05:27:23PM +0800, Zhu, Lingshan
> > wrote:
> > > > > > > > >>
> > > > > > > > >> On 11/6/2023 5:49 PM, Michael S. Tsirkin wrote:
> > > > > > > > >>> On Fri, Nov 03, 2023 at 06:34:34PM +0800, Zhu Lingshan
> > wrote:
> > > > > > > > >>>> When SUSPEND is set, device states and virtqueue states
> > > > > > > > >>>> should be stablized, therefore the driver should not
> > > > > > > > >>>> reset vqs when SUSPEND is set in device status.
> > > > > > > > >>>>
> > > > > > > > >>>> Signed-off-by: Zhu Lingshan <lingshan.zhu@intel.com>
> > > > > > > > >>>> ---
> > > > > > > > >>>>    content.tex | 3 +++
> > > > > > > > >>>>    1 file changed, 3 insertions(+)
> > > > > > > > >>>>
> > > > > > > > >>>> diff --git a/content.tex b/content.tex index
> > > > > > > > >>>> bcc9d4b..060b5c2
> > > > > > > > >>>> 100644
> > > > > > > > >>>> --- a/content.tex
> > > > > > > > >>>> +++ b/content.tex
> > > > > > > > >>>> @@ -444,6 +444,9 @@ \subsubsection{Virtqueue
> > > > > > > > >>>> Reset}\label{sec:Basic
> > > > > > > > Facilities of a Virtio Device /
> > > > > > > > >>>>    The device MUST reset any state of a virtqueue to
> > > > > > > > >>>> the default
> > > > state,
> > > > > > > > >>>>    including the available state and the used state.
> > > > > > > > >>>> +If VIRTIO_F_SUSPEND is negotiated and SUSPEND is set
> > > > > > > > >>>> +in \field{device status}, the driver SHOULD NOT reset
> > > > > > > > >>>> +any
> > > > virtqueues.
> > > > > > > > >>>> +
> > > > > > > > >>>>    \drivernormative{\paragraph}{Virtqueue Reset}{Basic
> > > > > > > > >>>> Facilities of a
> > > > > > > > Virtio Device / Virtqueues / Virtqueue Reset / Virtqueue
> > > > > > > > Reset}
> > > > > > > > >>>>    After the driver tells the device to reset a queue,
> > > > > > > > >>>> the driver MUST verify that
> > > > > > > > >>> Seems somewhat arbitrary and breaks the claim that the
> > > > > > > > >>> feature is orthogonal and can have uses besides migration.
> > > > > > > > >> when suspended, the device is frozen.
> > > > > > > > >> The driver is aware of this process and so should not
> > > > > > > > >> reset the vqs I
> > > > think.
> > > > > > > > > Again that is only true because you want to use it for migration.
> > > > > > > > > But then you can't claim it's a generic facility.
> > > > > > > > I don't get it. The device status is a basic facility.
> > > > > > > >
> > > > > > > > We need to SUSPEND the device by setting SUSPEND bit, to
> > > > > > > > stabilize the device states for migration.
> > > > > > > Is the PCI's PM time not enough to suspend the device?
> > > > > >
> > > > > > Are you saying we don't need virtio reset assuming we had FLR?
> > > > > >
> > > > > No. often FLR timing is not enough. Hence every PCI level device
> > > > > has some
> > > > sort of its own reset mechanism.
> > > > >
> > > > > > Suspending at different layers like rest at different layers.
> > > > > >
> > > > > > We have both FLR and virtio reset. The Virtio level function
> > > > > > could be reset without FLR. So did suspend.
> > > > > >
> > > > > > That's it.
> > > > > Sure, but wrapping it under some "basic facility" is just does not make
> > sense.
> > > >
> > > > Why, device status (e.g reset) belongs to that part.
> > > >
> > > Lingshan claimed that suspending device is for live migration in commit log
> > and in discussion he portray it as some basic facility unrelated to device
> > migration such as debug etc.
> > > Instead of claiming it as some non_device_migration facility does not make
> > sense.
> >
> > It is used for migration for sure.
> This is why it is not working when device is directly mapped.

We circle back. It works for the case of trap/emulation.

For direct mapping:

You claim guest reset can work but suspend can't?

> The hypervisor messing this bit and guest is also doing power management with it.

So I don't see how it differs from: virtio reset and FLR is under the
control of guests.

>
> Both of them needs separate channel to do their own work.
>
> >
> > >
> > > > >
> > > > > >
> > > > > > And if you want to rule P2P behaviours, PCI PM is really the
> > > > > > correct way to go instead of trying to do it at the virtio layer.
> > > > > >
> > > > > PCI PM is supposed to be controlled by the guest and so the suspend.
> > > >
> > > > I've listed issues about D3cold and others, I can't believe it can't
> > > > be controlled totally by guests.
> > > >
> > > D3cold is not controlled by the driver as defined by the PCI spec hence it is
> > not applicable.
> >
> > Have you seen the link I give you? Even if you are right, there still could be such
> > a request from the firmware, no?
> I may have missed the link.
> You have 10 replies, so it is easy to miss important things in rest of the comments.

I meant there still could be D3cold requests from the guest via
virtual firmware.

So it's not necessarily related to the guest driver.

>
> >
> > > D3hot is controlled by the driver.
> >
> > So, it requires the device context to be preserved, which is not documented in
> > your patch.
> PCI PM interactions is covered in v4 in the device requirements section.
>
> >
> > > > >
> > > > > Hypervisor needs its channel to suspend the device, as
> > > > > fundamentally guest is
> > > > unaware of device migration flow.
> > > >
> > > > That's pretty fine, the hypervisor also needs its channel to reset
> > > > the device. If you think there's a conflict with suspend, there should be one
> > for reset as well.
> > > >
> > > I donât see a need for hypervisor to reset the device in passthrough mode.
> > Can you explain why is it needed?
> >
> > Qemu has a command "system_reset".
> >
> I mean, what does this translate to reset the device in passthrough mode?

It needs to reset the virtio device.

> If this is FLR, it is there.

Please explain how it works. (It's not only a FLR, it also need virtio
level reset)

>
> > > Do you mean, it is needed in vdpa mode? If yes, the registers are emulated
> > anyway, so why the member device's native channel cannot be used in vdpa
> > mode?
> > >
> > > > >
> > > > > > > For large device I could imagine it could be short.
> > > > > > >
> > > > > > > In that case if there is suspend the device available, it will
> > > > > > > be used by the guest
> > > > > > driver itself, hypervisor wouldnât know about it when those
> > > > > > registers are not trapped.
> > > > > > > So we need two ways to suspend.
> > > > > > > One is guest visible, and guest controlled.
> > > > > > > Second is hypervisor control to fulfill the device migration needs.
> > > > > >
> > > > > > Can you explain why suspend is special but not reset or why
> > > > > > reset can work but not suspend? If reset can work, so does
> > > > > > suspend. If reset can't, neither does suspend.
> > > > > >
> > > > > As long as reset and suspend both are under guest control, I am fine.
> > > >
> > > > Well, you seem to ignore my question below. Hypervisor needs to
> > > > reset the device as well.
> > > >
> > > Why is it needed in passthrough mode?
> > >
> > > > >
> > > > > > For example, can you explain how a system_reset in Qemu can work
> > > > > > with your proposal?
> > > > > >
> > > > > > >
> > > > > > > So if you can please take a look if the proposed admin command
> > > > > > > to
> > > > > > freeze/stop mode can be used in the emulated register case or not.
> > > > > >
> > > > > > Again, if you design those for PCI, it's a layer violation. You
> > > > > > have answered
> > > > > They are used by the PCI layer, just like your suspend bit.
> > > > > Andy other transport can also use it.
> > > > >
> > > > > > yourself that PM is the right way to go.
> > > > > >
> > > > > > > It helps to have the suspend bit in guest control as well
> > > > > > > with/without
> > > > > > emulation mode.
> > > > > >
> > > > > > I won't repeat it again. You will find you need a full transport
> > > > > > to satisfy all the requirements.
> > > > > I disagree for full transport.
> > > >
> > > > See above and the discussion in another thread.
> > > >
> > > > > If you want to get discuss transport for sure it is some other
> > > > > thread and I want to see "driver notifications via such transport
> > > > > VQ" to fully qualify it
> > > > as transport, And that would be just sub-optimal for actual working.
> > > >
> > > > Sub-optimal since the function is duplicated with a transport but it
> > > > doesn't claim or design as a transport.
> > > >
> > > It is not sub-optimal because of duplication. It is because you want to
> > transport notifications via virtqueue.
> >
> > Have you ever read the series of tvq? You won't get this conclusion if you do
> > that.
> >
> I have read those 4 patches and I have seen that transportvq do not want to transport notifications.
> Hence it does not qualify as transport vq.

It exposes the platform MMIO area for driver notification. This is
sufficient. Any issue you see?

>
> Frankly, transport vq seems a way to formalize mediation forever in virtio.

Nope, it can be accessed by a guest driver directly.

> It is very weird way to build new SIOV device.
> For most things it should be the direct channel that virtio has already from driver to the device.

See above. SIOV might require a new transport or not.

>
>
> > >
> > > > > And hence, I wouldnât call it a transport anymore.
> > > > >
> > > > > >
> > > > > > >
> > > > > > > > This can also be used for debugging I think.
> > > > > > >
> > > > > > > As Michael listed, a dedicated debug interface is usually more
> > > > > > > useful instead
> > > > > > of in-band.
> > > > > >
> > > > > > Well, I've shown you the in-band facilities like debugging via
> > > > > > ethtool and kernel has a lot of other ones. If you have ever
> > > > > > tried to debug in a real production environment, you will find
> > > > > > how useful such handy information is where out-of- band
> > > > > > facilities are often dangerous
> > > > and usually prohibited or even unsupported.
> > > > > Guest driver can always read and write the device status without
> > > > > adding a
> > > > suspend bit.
> > > >
> > > > I don't get here. Suspend make sure the device state is frozen which
> > > > helps for debugging for sure.
> > > You wanted to debug some vq live, you suspend the device, the vq state got
> > changed.
> > >
> > > I just donât see that suspend is a debug tool.
> >
> > It's not a tool, it's a function that can be used as a debug tool.
> >
> > > Every feature is a debug feature literally.
> > > Classic heisenbug effect.
> > >
> > > Once can change driver notification frequency to see if interrupt rate
> > changed for debugging.
> > > One can disabled few RQs and see RSS...
> > > Blk can change blk_size to higher value to perf debug..
> > > The list continues..
> >
> > Let's not shift concepts.
> >
> Your comment to attribute device migration as debug feature is actually shifting the concept.

It's not.

Ling Shan put it in the basic facilities as part of device status. You
wonder why, we explained it can be used beyond migration. You asked
where, we told you for example things like debugging. We never claim
it can only be used in debug. Then you shift the concept to say debug
could be achieved by a lot of other facilities. For sure this is
correct, but it doesn't have any relationship with the discussion
here.

>
> > Obviously, suspend is not the only way to debug. But that's not the context
> > here.
> >
> I have no further comments on the claim that suspending a device a debug feature.
> If it, add a debug section and put it under that.
> You also know that it is not, so let's not waste our time.
>
> I just donât suspend bit as debug interface that undergoes classic heisenbug effect.

I never say it can be used for solving all problems. Anyhow that's
another topic right?

Thanks

>
> > Thanks
> >
> > >
> > > >
> > > > Thanks
> > > >
> > > > >
> > >
>



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]