[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: [PATCH v2] virtio-blk: restore VIRTIO_BLK_F_FLUSH and VIRTIO_BLK_F_CONFIG_WCE
VIRTIO_BLK_F_CONFIG_WCE is important in order to achieve good performance (up to 2x, though more realistically +30-40%) in latency-bound workloads. However, it was removed by mistake together with VIRTIO_BLK_F_FLUSH. In addition, even removing VIRTIO_BLK_F_FLUSH was probably not a great idea, because it simplifies simple drivers (e.g. firmware) that are okay with a writethrough cache but still need data to persist after power loss. What really should have been removed is just the possibility that devices not propose VIRTIO_BLK_F_FLUSH, but even that only deserves a "SHOULD" in the new world of conformance statements. Restore these, with the following changes: * clarify and use conformance statements in order to define writeback and writethrough caching according to what is commonly done by high-end storage. * clarify (with conformance statements) the influence of the VIRTIO_BLK_F_FLUSH feature on caching and how to proceed if only one of VIRTIO_BLK_F_FLUSH and VIRTIO_BLK_F_CONFIG_WCE is negotiated. * strengthen the requirement for persisting writes after a VIRTIO_BLK_T_FLUSH request. This is now a MUST. This fixes VIRTIO-144. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- content.tex | 76 ++++++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 52 insertions(+), 24 deletions(-) diff --git a/content.tex b/content.tex index 8de905f..4e03844 100644 --- a/content.tex +++ b/content.tex @@ -3723,26 +3723,31 @@ device except where noted. \item[VIRTIO_BLK_F_BLK_SIZE (6)] Block size of disk is in \field{blk_size}. +\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. + \item[VIRTIO_BLK_F_TOPOLOGY (10)] Device exports information on optimal I/O alignment. + +\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback + and writethrough modes. \end{description} +Devices SHOULD always propose VIRTIO_BLK_F_FLUSH, and MUST propose it +if they propose VIRTIO_BLK_F_CONFIG_WCE. Drivers SHOULD only negotiate +these features if they are capable of sending VIRTIO_BLK_T_FLUSH commands. + \subsubsection{Legacy Interface: Feature bits}\label{sec:Device Types / Block Device / Feature bits / Legacy Interface: Feature bits} \begin{description} \item[VIRTIO_BLK_F_BARRIER (0)] Device supports request barriers. \item[VIRTIO_BLK_F_SCSI (7)] Device supports scsi packet commands. - -\item[VIRTIO_BLK_F_FLUSH (9)] Cache flush command support. - -\item[VIRTIO_BLK_F_CONFIG_WCE (11)] Device can toggle its cache between writeback - and writethrough modes. \end{description} -VIRTIO_BLK_F_FLUSH was also called VIRTIO_BLK_F_WCE: Legacy drivers -MUST only negotiate this feature if they are capable of sending -VIRTIO_BLK_T_FLUSH commands. +\begin{note} Naming of features + In the legacy interface, VIRTIO_BLK_F_FLUSH was also + called VIRTIO_BLK_F_WCE. +\end{note} \subsection{Device configuration layout}\label{sec:Device Types / Block Device / Device configuration layout} @@ -3771,7 +3776,7 @@ struct virtio_blk_config { // optimal (suggested maximum) I/O size in blocks le32 opt_io_size; } topology; - u8 reserved; + u8 writeback; }; \end{lstlisting} @@ -3801,21 +3806,27 @@ according to the native endian of the guest rather than \field{topology} struct can be read to determine the physical block size and optimal I/O lengths for the driver to use. This also does not affect the units in the protocol, only performance. -\end{enumerate} - -\subsubsection{Legacy Interface: Device Initialization}\label{sec:Device Types / Block Device / Device Initialization / Legacy Interface: Device Initialization} -The \field{reserved} field used to be called \field{writeback}. If the -VIRTIO_BLK_F_CONFIG_WCE feature is offered, the cache mode can be -read from \field{writeback}; the -driver can also write to the field in order to toggle the cache -between writethrough (0) and writeback (1) mode. If the feature is -not available, the driver can instead look at the result of -negotiating VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode -after reset if and only if VIRTIO_BLK_F_FLUSH is negotiated. +\item If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the cache + mode can be read from \field{writeback}. 0 corresponds to a writethrough + cache, 1 to a writeback cache\footnote{Consistent with + \ref{sec:Device Types / Block Device / Device Operation}, + a writethrough cache can be defined broadly as a cache that commits + writes to non-volatile storage before reporting their completion. + For example, a battery-backed writeback cache actually counts as + writethrough according to this definition.}. + +\item If the VIRTIO_BLK_F_CONFIG_WCE feature is not available, the driver + can deduct the cache mode by looking at the result of negotiating + VIRTIO_BLK_F_FLUSH: the cache will be in writeback mode if and only if + VIRTIO_BLK_F_FLUSH is negotiated. +\end{enumerate} -Some older legacy devices did not operate in writethrough mode even -after a driver announced lack of support for VIRTIO_BLK_F_FLUSH. +If the VIRTIO_BLK_F_FLUSH feature is not negotiated while +VIRTIO_BLK_F_CONFIG_WCE is, \field{writeback} MUST read 0 immediately +after feature negotiation. Alternatively, a device MAY simply fail +feature negotiation if the driver proposes VIRTIO_BLK_F_CONFIG_WCE but +does not propose VIRTIO_BLK_F_FLUSH. \subsection{Device Operation}\label{sec:Device Types / Block Device / Device Operation} @@ -3871,8 +3882,25 @@ A device MUST set the \field{status} byte to VIRTIO_BLK_S_IOERR for a write request if the VIRTIO_BLK_F_RO feature if offered, and MUST NOT write any data. -Upon receipt of a VIRTIO_BLK_T_FLUSH request, the device SHOULD ensure -that any writes which were completed are committed to non-volatile storage. +If the \field{VIRTIO_BLK_F_FLUSH} feature is not negotiated, the +device SHOULD ensure that all writes are committed to non-volatile +storage before reporting completion. It MUST do so if it proposed +\field{VIRTIO_BLK_F_FLUSH}. Failure to do so can cause data loss. + +If the \field{VIRTIO_BLK_F_FLUSH} feature is negotiated, upon receipt +of a VIRTIO_BLK_T_FLUSH request, the device MUST ensure that any writes +which were completed are committed to non-volatile storage. + +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated and the +\field{writeback} field in configuration space is 0, the device MUST +ensure that all writes are committed to non-volatile storage before +reporting completion. + +If the VIRTIO_BLK_F_CONFIG_WCE feature is negotiated, the driver MAY +switch to writethrough or writeback mode by writing respectively 0 and +1 to the \field{writeback} field. After writing a 0 to \field{writeback}, +the driver MUST NOT assume that writes submitted while the field was 1 have +been committed to non-volatile storage. \subsubsection{Legacy Interface: Device Operation}\label{sec:Device Types / Block Device / Device Operation / Legacy Interface: Device Operation} When using the legacy interface, transitional devices and drivers -- 2.4.3
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]