[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-comment] Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
On 2/26/2024 4:46 PM, David Stevens wrote:
On Mon, Feb 26, 2024 at 3:42âPM Jason Wang <jasowang@redhat.com> wrote:On Fri, Feb 16, 2024 at 4:25âPM David Stevens <stevensd@chromium.org> wrote:Add a virtio power management PCI capability to allow drivers to suspend virtio PCI devices. This allows drivers to suspend devices at the virtio level before suspending them at the PCI transport layer. This allows drivers to do a two phase suspend, which prevents notifications from being ignored or lost if interrupts are reconfigured at the PCI transport layer immediately before or after the device is put into the PCI PM D3 low power state. --- transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 51 insertions(+)I would like to know how the two phase suspend can prevent the loss of notification or interrupt. Or for example, if it can be workaround at the driver level.I'll preface this by saying I am not an expert in how Linux handles interrupts or in PCI, so I might be missing some important information. Actually, after thinking about this some more, I'm not sure that the two phase suspend is necessary per-se. Rather, I think relying only on PCI power management is simply insufficient. According to the kernel's system suspend and device interrupts documentation [1], "... after the 'late' phase of device suspend there is no legitimate reason why any interrupts from suspended devices should trigger and if any devices have not been suspended properly yet, it is better to block interrupts from them anyway." PCI power management doesn't occur until the noirq phase of suspend, so that's too late to be useful here. Other than that, as of right now, there's nothing the driver can do to tell devices they need to stop sending interrupts. Adding suspend support to the virtio spec would address this deficiency.
I think the SUSPEND bit which is a virtio level suspending can help here.
More concretely, I ran into two problems when trying to get shallow suspend working with only PCI power management. If there are workarounds for this at the driver level, then that would be great. But I can't think of any, at least. First, there is a window between when Linux suspends interrupts with suspend_device_irqs and when PCI power management occurs. If the device triggers an interrupt in that window, then it won't be lost since IRQS_PENDING gets added to the descriptor state. However, if the originating event should have woken the guest up from sleep, then that doesn't happen, since there is no way for the device to know it actually needed to generate a PCI PME. I might be missing something, but this seems like something that can't be worked around at the driver level. Second, the core interrupt code can migrate interrupts when CPUs are offline'ed entering shallow suspend. Since PCI devices are already in D3 at this point, __pci_write_msi_msg does not update the MSI routing. When bringing the device back up, there is a window between when the device is put into D0 and when the updated routing tables are written by __pci_restore_msix_state. If the device sends an interrupt in this window, then it will be sent to the completely wrong handler in the guest (or dropped if there is no handler installed). I guess this could be worked around in the core PCI code by masking MSIX interrupts before suspending the device. However, the fact that Linux doesn't already do that today suggests that we're wrong, rather than everybody else happens to be wrong in just the right way.
I am not a PCI expert, please correct me if I misunderstand anything. MSI interrupt are in-band DMA writing, so the data is always secure, and CPU should process them anyway even after waking up, the handler is always there since the driver register it. The device should not send any interrupts(both vq and config) once it presents SUSPEND. For PM, I think we should not SUSPEND a PCI device by PM, PM should be supplementary steps for SUSPENDING. While we are working on this patch, I will address the comments for the SUSPENDING bit patch and work our a V2 to wider review and more comments. Finally we will merge into a series. Thanks
[1] https://www.kernel.org/doc/Documentation/power/suspend-and-interrupts.rst -David This publicly archived list offers a means to provide input to the OASIS Virtual I/O Device (VIRTIO) TC. In order to verify user consent to the Feedback License terms and to minimize spam in the list archive, subscription is required before posting. Subscribe: virtio-comment-subscribe@lists.oasis-open.org Unsubscribe: virtio-comment-unsubscribe@lists.oasis-open.org List help: virtio-comment-help@lists.oasis-open.org List archive: https://lists.oasis-open.org/archives/virtio-comment/ Feedback License: https://www.oasis-open.org/who/ipr/feedback_license.pdf List Guidelines: https://www.oasis-open.org/policies-guidelines/mailing-lists Committee: https://www.oasis-open.org/committees/virtio/ Join OASIS: https://www.oasis-open.org/join/
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]