Subject: Re: [PATCH RFC] virtio: introduce VIRTIO_F_DEVICE_STOP

On 2020/12/18 6:15 PM, Stefano Garzarella wrote:
On Fri, Dec 18, 2020 at 12:23:02PM +0800, Jason Wang wrote:
This patch introduces a new status bit DEVICE_STOPPED. This will be
used by the driver to stop and resume a device. The main user will be
live migration support for virtio device.

Signed-off-by: Jason Wang <jasowang@redhat.com>
content.tex | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/content.tex b/content.tex
index 61eab41..4392b60 100644
--- a/content.tex
+++ b/content.tex
@@ -47,6 +47,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
\item[DRIVER_OK (4)] Indicates that the driver is set up and ready to
 drive the device.

+\item[DEVICE_STOPPED (32)] When VIRTIO_F_DEVICE_STOPPED is negotiated,
+ indicates that the device has been stopped by the driver.

Just out of curiosity, why 32 and not 16?
Is there any rule for high bits and low bits of Device Status field?

I'm not sure. I just spotted that DEVICE_NEEDS_RESET (with the DEVICE prefix) is using 64.

\item[DEVICE_NEEDS_RESET (64)] Indicates that the device has experienced
 an error from which it can't recover.
@@ -58,8 +61,9 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
\ref{sec:General Initialization And Device Operation / Device
The driver MUST NOT clear a
-\field{device status} bit. If the driver sets the FAILED bit,
-the driver MUST later reset the device before attempting to re-initialize.
+\field{device status} bit other than DEVICE_STOPPED. If the
+driver sets the FAILED bit, the driver MUST later reset the device
+before attempting to re-initialize.

The driver SHOULD NOT rely on completion of operations of a
device if DEVICE_NEEDS_RESET is set.
@@ -70,12 +74,28 @@ \section{\field{Device Status} Field}\label{sec:Basic Facilities of a Virtio Dev
recover by issuing a reset.

+The driver MUST NOT set or clear DEVICE_STOPPED when DRIVER_OK is not
+set. In order to stop the device, the driver MUST set DEVICE_STOPPED
+first and re-read status to check whether DEVICE_STOPPED is set by the
+device. In order to resume the device, the driver MUST clear
+DEVICE_STOPPED first and re-read status to check whether DEVICE_STOPPED
+is cleared by the device.
\devicenormative{\subsection}{Device Status Field}{Basic Facilities of a Virtio Device / Device Status Field}
The device MUST initialize \field{device status} to 0 upon reset.

The device MUST NOT consume buffers or send any used buffer
notifications to the driver before DRIVER_OK.

+The device MUST ignore DEVICE_STOPPED when DRIVER_OK is not set.
+When the driver is trying to set DEVICE_STOPPED, the device MUST NOT
+process new avail requests and MUST complete all requests that it is
+currently processing before setting DEVICE_STOPPED.
+The device MUST keep the config space unchanged when DEVICE_STOPPED is

Maybe here we can specify whether this applies once "DEVICE_STOPPED is set by the driver" or only once the device sets it after completing all the requests.

Good point. Since the bit is offered by the device, "set by the device" might be better.
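
To make the "set by the device" reading concrete, here is a minimal sketch of the device side, assuming the semantics proposed in this patch. The names struct dev_model, dev_write_status, dev_can_take_new_request and dev_complete_one are hypothetical and only for illustration; the value 32 is the one proposed above and may change. The sketch treats "ignore DEVICE_STOPPED when DRIVER_OK is not set" as dropping the bit from the written value, which is one possible interpretation.

```c
#include <stdbool.h>
#include <stdint.h>

#define STATUS_DRIVER_OK       4u   /* existing status bit */
#define STATUS_DEVICE_STOPPED 32u   /* value proposed in this patch */

/* Hypothetical device model, not taken from any real implementation. */
struct dev_model {
    uint8_t status;       /* value the driver reads back */
    bool stop_requested;  /* driver wrote DEVICE_STOPPED, not yet acknowledged */
    unsigned inflight;    /* requests already taken from the avail ring */
};

/* Make DEVICE_STOPPED visible only once nothing is in flight. */
static void dev_maybe_ack_stop(struct dev_model *d)
{
    if (d->stop_requested && d->inflight == 0) {
        d->status |= (uint8_t)STATUS_DEVICE_STOPPED;
        d->stop_requested = false;
    }
}

/* The driver writes the status field. */
void dev_write_status(struct dev_model *d, uint8_t val)
{
    uint8_t other = (uint8_t)(val & (uint8_t)~STATUS_DEVICE_STOPPED);

    if (!(d->status & STATUS_DRIVER_OK)) {
        /* Assumed interpretation: ignore DEVICE_STOPPED before DRIVER_OK. */
        d->status = other;
        return;
    }
    if (val & STATUS_DEVICE_STOPPED) {
        /* Stop request: keep the bit hidden until requests are drained. */
        d->stop_requested = true;
        d->status = (uint8_t)(other | (d->status & STATUS_DEVICE_STOPPED));
        dev_maybe_ack_stop(d);
    } else {
        /* Resume: the bit is cleared and processing may continue. */
        d->stop_requested = false;
        d->status = other;
    }
}

/* The device MUST NOT process new avail requests while stopping or stopped. */
bool dev_can_take_new_request(const struct dev_model *d)
{
    return (d->status & STATUS_DRIVER_OK) &&
           !(d->status & STATUS_DEVICE_STOPPED) &&
           !d->stop_requested;
}

/* One in-flight request finished; acknowledge the stop if it was the last. */
void dev_complete_one(struct dev_model *d)
{
    if (d->inflight > 0)
        d->inflight--;
    dev_maybe_ack_stop(d);
}
```

With this model the driver never observes DEVICE_STOPPED while requests are still in flight, so any rule phrased as "when DEVICE_STOPPED is set" refers to the bit as set by the device.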

\label{sec:Basic Facilities of a Virtio Device / Device Status Field / DEVICENEEDSRESET}The device SHOULD set DEVICE_NEEDS_RESET when it enters an error state that a reset is needed. If DRIVER_OK is set, after it sets DEVICE_NEEDS_RESET, the device
MUST send a device configuration change notification to the driver.
@@ -6553,6 +6573,8 @@ \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 \item[VIRTIO_F_NOTIFICATION_DATA(38)] This feature indicates
 that the driver passes extra data (besides identifying the virtqueue)
 in its device notifications.
+ \item[VIRTIO_F_DEVICE_STOP(39)] This feature indicates that the
+ driver can stop and resume the device.
 See \ref{sec:Virtqueues / Driver notifications}~\nameref{sec:Virtqueues / Driver notifications}.

Since the completion of in-flight requests can take time (I'm thinking of block devices), should we consider a notification when the device completes all the pending requests and sets DEVICE_STOPPED?

Something similar to the notification when DEVICE_NEEDS_RESET is set.

I'm not sure, but it looks even more complicated. The driver may poll periodically, or fail the operation (like migration) when the requests are not completed soon.
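
A rough sketch of the polling approach on the driver side, assuming the semantics proposed in this patch. struct status_ops, stop_device and resume_device are hypothetical names, the poll budget stands in for whatever timeout the caller prefers, and struct fake_dev is only a toy device used to exercise the loop. The value 32 is the one proposed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define STATUS_DRIVER_OK       4u   /* existing status bit */
#define STATUS_DEVICE_STOPPED 32u   /* value proposed in this patch */

/* Hypothetical transport accessors for the device status field. */
struct status_ops {
    uint8_t (*read)(void *ctx);
    void (*write)(void *ctx, uint8_t val);
    void *ctx;
};

/* Poll until DEVICE_STOPPED matches `want_set`; -1 if the budget runs out. */
static int poll_stopped(const struct status_ops *ops, bool want_set,
                        unsigned max_polls)
{
    for (unsigned i = 0; i < max_polls; i++) {
        bool set = ops->read(ops->ctx) & STATUS_DEVICE_STOPPED;
        if (set == want_set)
            return 0;
        /* A real driver would sleep or yield between polls. */
    }
    return -1; /* caller may fail the operation, e.g. the migration */
}

/* Set DEVICE_STOPPED, then re-read status until the device sets it. */
int stop_device(const struct status_ops *ops, unsigned max_polls)
{
    assert(ops && ops->read && ops->write);
    uint8_t status = ops->read(ops->ctx);
    if (!(status & STATUS_DRIVER_OK))
        return -1; /* MUST NOT set DEVICE_STOPPED before DRIVER_OK */
    ops->write(ops->ctx, (uint8_t)(status | STATUS_DEVICE_STOPPED));
    return poll_stopped(ops, true, max_polls);
}

/* Clear DEVICE_STOPPED, then re-read status until the device clears it. */
int resume_device(const struct status_ops *ops, unsigned max_polls)
{
    assert(ops && ops->read && ops->write);
    uint8_t status = ops->read(ops->ctx);
    if (!(status & STATUS_DRIVER_OK))
        return -1;
    ops->write(ops->ctx, (uint8_t)(status & (uint8_t)~STATUS_DEVICE_STOPPED));
    return poll_stopped(ops, false, max_polls);
}

/* Toy device: finishes one in-flight request per status read while stopping. */
struct fake_dev {
    uint8_t status;
    bool stop_requested;
    unsigned inflight;
};

uint8_t fake_read(void *ctx)
{
    struct fake_dev *d = ctx;
    if (d->stop_requested) {
        if (d->inflight > 0)
            d->inflight--;
        if (d->inflight == 0) {
            d->status |= (uint8_t)STATUS_DEVICE_STOPPED;
            d->stop_requested = false;
        }
    }
    return d->status;
}

void fake_write(void *ctx, uint8_t val)
{
    struct fake_dev *d = ctx;
    if (val & STATUS_DEVICE_STOPPED) {
        d->stop_requested = true;
        d->status = (uint8_t)(val & (uint8_t)~STATUS_DEVICE_STOPPED);
    } else {
        d->stop_requested = false;
        d->status = val;
    }
}
```

A caller would build a struct status_ops around its transport (here { fake_read, fake_write, &dev }) and treat a -1 from stop_device as "requests did not drain in time". Whether a notification would be better than a poll budget is exactly the open question above; the sketch only shows that polling needs no new notification.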



