OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices


On Mon, Feb 26, 2024 at 3:42âPM Jason Wang <jasowang@redhat.com> wrote:
>
> On Fri, Feb 16, 2024 at 4:25âPM David Stevens <stevensd@chromium.org> wrote:
> >
> > Add a virtio power management PCI capability to allow drivers to suspend
> > virtio PCI devices. This allows drivers to suspend devices at the virtio
> > level before suspending them at the PCI transport layer. This allows
> > drivers to do a two phase suspend, which prevents notifications from
> > being ignored or lost if interrupts are reconfigured at the PCI
> > transport layer immediately before or after the device is put into the
> > PCI PM D3 low power state.
> > ---
> >  transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 51 insertions(+)
> >
>
> I would like to know how the two phase suspend can prevent the loss of
> notification or interrupt. Or for example, if it can be workaround at
> the driver level.

I'll preface this by saying I am not an expert in how Linux handles
interrupts or in PCI, so I might be missing some important
information.

Actually, after thinking about this some more, I'm not sure that the
two phase suspend is necessary per-se. Rather, I think relying only on
PCI power management is simply insufficient.

According to the kernel's system suspend and device interrupts
documentation [1], "... after the 'late' phase of device suspend there
is no legitimate reason why any interrupts from suspended devices
should trigger and if any devices have not been suspended properly
yet, it is better to block interrupts from them anyway." PCI power
management doesn't occur until the noirq phase of suspend, so that's
too late to be useful here. Other than that, as of right now, there's
nothing the driver can do to tell devices they need to stop sending
interrupts. Adding suspend support to the virtio spec would address
this deficiency.

More concretely, I ran into two problems when trying to get shallow
suspend working with only PCI power management. If there are
workarounds for this at the driver level, then that would be great.
But I can't think of any, at least.

First, there is a window between when Linux suspends interrupts with
suspend_device_irqs and when PCI power management occurs. If the
device triggers an interrupt in that window, then it won't be lost
since IRQS_PENDING gets added to the descriptor state. However, if the
originating event should have woken the guest up from sleep, then that
doesn't happen, since there is no way for the device to know it
actually needed to generate a PCI PME. I might be missing something,
but this seems like something that can't be worked around at the
driver level.

Second, the core interrupt code can migrate interrupts when CPUs are
offline'ed entering shallow suspend. Since PCI devices are already in
D3 at this point, __pci_write_msi_msg does not update the MSI routing.
When bringing the device back up, there is a window between when the
device is put into D0 and when the updated routing tables are written
by __pci_restore_msix_state. If the device sends an interrupt in this
window, then it will be sent to the completely wrong handler in the
guest (or dropped if there is no handler installed). I guess this
could be worked around in the core PCI code by masking MSIX interrupts
before suspending the device. However, the fact that Linux doesn't
already do that today suggests that we're wrong, rather than everybody
else happens to be wrong in just the right way.

[1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst

-David


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]