[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [PATCH v4 1/1] Add suspend support for virtio PCI devices
On Fri, Feb 16, 2024 at 5:56âPM Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Feb 16, 2024 at 05:24:32PM +0900, David Stevens wrote: > > Add a virtio power management PCI capability to allow drivers to suspend > > virtio PCI devices. This allows drivers to suspend devices at the virtio > > level before suspending them at the PCI transport layer. This allows > > drivers to do a two phase suspend, which prevents notifications from > > being ignored or lost if interrupts are reconfigured at the PCI > > transport layer immediately before or after the device is put into the > > PCI PM D3 low power state. > > --- > > transport-pci.tex | 51 +++++++++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 51 insertions(+) > > > > diff --git a/transport-pci.tex b/transport-pci.tex > > index a5c6719ea871..ce77708a9b69 100644 > > --- a/transport-pci.tex > > +++ b/transport-pci.tex > > @@ -188,6 +188,8 @@ \subsection{Virtio Structure PCI Capabilities}\label{sec:Virtio Transport Option > > #define VIRTIO_PCI_CAP_SHARED_MEMORY_CFG 8 > > /* Vendor-specific data */ > > #define VIRTIO_PCI_CAP_VENDOR_CFG 9 > > +/* Power management configuration */ > > +#define VIRTIO_PCI_CAP_PM_CFG 10 > > \end{lstlisting} > > > > Any other value is reserved for future use. > > @@ -804,6 +806,55 @@ \subsubsection{PCI configuration access capability}\label{sec:Virtio Transport O > > specified by some other Virtio Structure PCI Capability > > of type other than \field{VIRTIO_PCI_CAP_PCI_CFG}. > > > > +\subsubsection{Power management capability}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > + > > +The VIRTIO_PCI_CAP_PM_CFG capability refers to a single byte. The > > +driver can write to the byte to set the power state of the device, > > +and it can read from the byte to get the current power state of the > > +device. > > + > > +The valid power states are: > > + > > +\begin{lstlisting} > > +/* Device is operating normally */ > > +#define VIRTIO_PM_STATE_ACTIVE 0 > > +/* Device operation is suspended */ > > +#define VIRTIO_PM_STATE_SUSPENDED 1 > > +\end{lstlisting} > > + > > +The device power state has no effect when \field{device status} does > > +not have the DRIVER_OK bit set or does have the DEVICE_NEEDS_RESET bit > > +set. > > > Given this is only after DRIVER_OK, wouldn't a feature bit + status bit > make more sense? This will make it transport-independent and simplify > discovery. A feature bit + status bit would work as well. I've posted some questions on Zhu Lingshan's patch. > > +\devicenormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > + > > +A device MUST maintain its state while suspended such that all driver > > +visible state after resuming exactly matches driver visible state > > +before suspending. > > + > > +A device MUST NOT send notifications while suspended. > > + > > +A device MAY operate on any buffers in its virtqueue while suspended. > > How is this reconsiled with state matching exactly? buffers are > driver-visible ... Drivers can't atomically access buffers and the PM byte, so the device modifying a buffer while suspended is indistinguishable from the device modifying a buffer in the window between when it is resumed by the driver and when the driver accesses the buffer. But you are right that the wording is contradictory. Maybe the earlier requirement could be better phrased as: A device MUST maintain its state while suspended such that after the driver resumes the device, the driver can use the device as if it had never been suspended in the first place. > > +A device MUST set its power state to VIRTIO_PM_STATE_ACTIVE on reset. > > + > > +A device SHOULD take steps to minimize its resource consumption while > > +suspended, although what this involves is specific to the particular > > +device implementation. > > + > > +\drivernormative{\paragraph}{Power management capability}{Virtio Transport Options / Virtio Over PCI Bus / PCI Device Layout / Power management capability} > > + > > +A driver MUST NOT access a suspended device's BARs corresponding to > > +any virtio structures, except for the power management byte. > > + > > +A driver MAY suspend a device that has buffers in one of its > > +virtqueues, but it MUST NOT modify any such buffers while the device > > +is suspended. > > + > > +A driver MUST read from the power management byte after writing to the > > +byte to verify that the device successfully entered the target power > > +state. > > Verify how? By checking the value returned? And what should it do with the value > does not match? Yes, comparing the value returned to the one it just wrote. The three options I can think of would be to abort the suspend, reset the device, or retry. Retry only makes sense if suspend/resume might succeed in the future. An API is really only retry-friendly if it has a way to set a timeout, since the consumer is what should be deciding how long to wait but can't make that decision without a timeout. Personally, I would lean towards not allowing timeouts here, since it's simpler. So maybe something like this: After the driver writes a new value to the PM byte, if the device can transition to the requested state, then subsequent reads of the PM state byte MUST block until the transition completes. If the device cannot transition to the requested state, it MUST immediately return the prior value of the PM state byte for subsequent reads of the PM state byte. In that case, the only options are to abort the suspension or to reset the device. That's really a policy decision of the driver, so I don't know how it would fit into this spec. -David -David
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]