OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: [PATCH v2] virtio-blk: restore VIRTIO_BLK_F_FLUSH and VIRTIO_BLK_F_CONFIG_WCE


VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance
(up to 2x, though more realistically +30-40%) in latency-bound workloads.
However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH.

In addition, even removing VIRTIO_BLK_F_FLUSH was probably not a great
idea, because it simplifies simple drivers (e.g. firmware) that are okay
with a writethrough cache but still need data to persist after power loss.
What really should have been removed is just the possibility that devices
not propose VIRTIO_BLK_F_FLUSH, but even that only deserves a "SHOULD" in
the new world of conformance statements.

Restore these, with the following changes:

* clarify and use conformance statements in order to define writeback
and writethrough caching according to what is commonly done by high-end
storage.

* clarify (with conformance statements) the influence of the
VIRTIO_BLK_F_FLUSH feature on caching and how to proceed if only one of
VIRTIO_BLK_F_FLUSH and VIRTIO_BLK_F_CONFIG_WCE is negotiated.

* strengthen the requirement for persisting writes after a
VIRTIO_BLK_T_FLUSH request.  This is now a MUST.

This fixes VIRTIO-144.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 content.tex | 76 ++++++++++++++++++++++++++++++++++++++++++-------------------
 1 file changed, 52 insertions(+), 24 deletions(-)

diff --git a/content.tex b/content.tex
index 8de905f..4e03844 100644
--- a/content.tex
+++ b/content.tex
@@ -3723,26 +3723,31 @@ device except where noted.
 
 \item[VIRTIO_BLK_F_BLK_SIZE (6)] Block size of disk is in \field{blk_size}.
 
+\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support.
+
 \item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O
     alignment.
+
+\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback
+    and writethrough modes.
 \end{description}
 
+Devices SHOULD always propose VIRTIO_BLK_F_FLUSH, and MUST propose it
+if they propose VIRTIO_BLK_F_CONFIG_WCE.  Drivers SHOULD only negotiate
+these features if they are capable of sending VIRTIO_BLK_T_FLUSH commands.
+
 \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits}
 
 \begin{description}
 \item[VIRTIO_BLK_F_BARRIER (0)] Device supports request barriers.
 
 \item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands.
-
-\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support.
-
-\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback
-    and writethrough modes.
 \end{description}
 
-VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers
-MUST only negotiate this feature if they are capable of sending
-VIRTIO_BLK_T_FLUSH commands.
+\begin{note} Naming of features
+  In the legacy interface, VIRTIO_BLK_F_FLUSH was also
+  called VIRTIO_BLK_F_WCE.
+\end{note}
 
 \subsection{Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout}
 
@@ -3771,7 +3776,7 @@ struct virtio_blk_config {
                 // optimal (suggested maximum) I/O size in blocks
                 le32 opt_io_size;
         } topology;
-        u8 reserved;
+        u8 writeback;
 };
 \end{lstlisting}
 
@@ -3801,21 +3806,27 @@ according to the native endian of the guest rather than
   \field{topology} struct can be read to determine the physical block size and optimal
   I/O lengths for the driver to use. This also does not affect the units
   in the protocol, only performance.
-\end{enumerate}
-
-\subsubsection{Legacy Interface: Device Initialization}\label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization}
 
-The \field{reserved} field used to be called \field{writeback}.  If the
-VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be
-read from \field{writeback}; the
-driver can also write to the field in order to toggle the cache
-between writethrough (0) and writeback (1) mode.  If the feature is
-not available, the driver can instead look at the result of
-negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode
-after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated.
+\item If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache
+  mode can be read from \field{writeback}.  0 corresponds to a writethrough
+  cache, 1 to a writeback cache\footnote{Consistent with
+    \ref{sec:Device Types / Block Device / Device Operation},
+    a writethrough cache can be defined broadly as a cache that commits
+    writes to non-volatile storage before reporting their completion.
+    For example, a battery-backed writeback cache actually counts as
+    writethrough according to this definition.}.
+
+\item If the VIRTIO_BLK_F_CONFIG_WCE feature is not available, the driver
+  can deduct the cache mode by looking at the result of negotiating
+  VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode if and only if
+  VIRTIO_BLK_F_FLUSH is negotiated.
+\end{enumerate}
 
-Some older legacy devices did not operate in writethrough mode even
-after a driver announced lack of support for VIRTIO_BLK_F_FLUSH.
+If the VIRTIO_BLK_F_FLUSH feature is not negotiated while
+VIRTIO_BLK_F_CONFIG_WCE is, \field{writeback} MUST read 0 immediately
+after feature negotiation.  Alternatively, a device MAY simply fail
+feature negotiation if the driver proposes VIRTIO_BLK_F_CONFIG_WCE but
+does not propose VIRTIO_BLK_F_FLUSH.
 
 \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Operation}
 
@@ -3871,8 +3882,25 @@ A device MUST set the \field{status} byte to VIRTIO_BLK_S_IOERR
 for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT
 write any data.
 
-Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure
-that any writes which were completed are committed to non-volatile storage.
+If the \field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the
+device SHOULD ensure that all writes are committed to non-volatile
+storage before reporting completion.  It MUST do so if it proposed
+\field{VIRTIO_BLK_F_FLUSH}.  Failure to do so can cause data loss.
+
+If the \field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt
+of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes
+which were completed are committed to non-volatile storage.
+
+If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the
+\field{writeback} field in configuration space is 0, the device MUST
+ensure that all writes are committed to non-volatile storage before
+reporting completion.
+
+If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY
+switch to writethrough or writeback mode by writing respectively 0 and
+1 to the \field{writeback} field.  After writing a 0 to \field{writeback},
+the driver MUST NOT assume that writes submitted while the field was 1 have
+been committed to non-volatile storage.
 
 \subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation}
 When using the legacy interface, transitional devices and drivers
-- 
2.4.3



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]