OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [PATCH v3] virtio-iommu: Rework the bypass feature


Hi Jean,

On 10/22/21 2:12 PM, Jean-Philippe Brucker wrote:
> The VIRTIO_IOMMU_F_BYPASS feature is awkward to use and incomplete.
> Although it is implemented by QEMU, it is not supported by any driver as
> far as I know. Replace it with a new VIRTIO_IOMMU_F_BYPASS_CONFIG
> feature.
>
> Two features are missing from virtio-iommu:
>
> * The ability for an hypervisor to start the device in bypass mode. The
>   wording for VIRTIO_IOMMU_F_BYPASS is not clear enough to allow it at
>   the moment, because it only specifies the behavior after feature
>   negotiation.
>
> * The ability for a guest to set individual endpoints in bypass mode
>   when bypass is globally disabled. An OS should have the ability to
>   allow only endpoints it trusts to bypass the IOMMU, while keeping DMA
>   disabled for endpoints it isn't even aware of. At the moment this can
>   only be emulated by creating identity mappings.
>
> The VIRTIO_IOMMU_F_BYPASS_CONFIG feature adds a 'bypass' config field
> that allows to enable and disable bypass globally. It also adds a new
> flag for the ATTACH request.
>
> * The hypervisor can start the VM with bypass enabled or, if it knows
>   that the software stack supports it, disabled. The 'bypass' config
>   fields is initialized to 0 or 1. It is sticky and isn't affected by
>   device reset.
>
> * Generally the firmware won't have an IOMMU driver and will need to be
>   started in bypass mode, so the bootloader and kernel can be loaded
>   from storage endpoint.
>
>   For more security, the firmware could implement a minimal virtio-iommu
>   driver that reuses existing virtio support and only touches the config
>   space. It could enable PCI bus mastering in bridges only for the
>   endpoints that need it, enable global IOMMU bypass by flipping a bit,
>   then tear everything down before handing control over to the OS. This
>   prevents vulnerability windows where a malicious endpoint reprograms
>   the IOMMU while the OS is configuring it [1].
>
>   The isolation provided by vIOMMUs has mainly been used for securely
>   assigning endpoints to untrusted applications so far, while kernel DMA
>   bypasses the IOMMU. But we can expect boot security to become as
>   important in virtualization as it presently is on bare-metal systems,
>   where some devices are untrusted and must never be able to access
>   memory that wasn't assigned to them.
>
> * The OS can enable and disable bypass globally. It can then enable
>   bypass for individual endpoints by attaching them to bypass domains,
>   using the new VIRTIO_IOMMU_ATTACH_F_BYPASS flag. It can disable bypass
>   by attaching them to normal domains.
>
> [1] IOMMU protection against I/O attacks: a vulnerability and a proof of concept
>     Morgan, B., Alata, Ã., Nicomette, V. et al.
>     https://link.springer.com/article/10.1186/s13173-017-0066-7
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/119
> Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
> ---
>
> The virtio-iommu spec with colored diff is available at
> https://jpbrucker.net/virtio-iommu/spec-bypass/virtio-iommu-f-bypass-config-v3-diff.pdf
>
> v1: https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg07817.html
> v2: https://www.mail-archive.com/virtio-dev@lists.oasis-open.org/msg07898.html
> v3:
> * Normative statement about device reset vs. system reset - the bypass
>   bit is sticky across device reset to avoid the vulnerability described
>   above, but restored on system reset (a term also used by the virtio
>   memory device).
> * Explain that the field bypass is in effect as long as the new feature
>   is offered, even when not accepted by the driver
> * Another clarification about the state of an endpoint after detach.
> ---
>  conformance.tex  |   1 -
>  virtio-iommu.tex | 111 ++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 90 insertions(+), 22 deletions(-)
>
> diff --git a/conformance.tex b/conformance.tex
> index c2d7959..5c10ab9 100644
> --- a/conformance.tex
> +++ b/conformance.tex
> @@ -504,7 +504,6 @@ \section{Conformance Targets}\label{sec:Conformance / Conformance Targets}
>  \begin{itemize}
>  \item \ref{devicenormative:Device Types / IOMMU Device / Feature bits}
>  \item \ref{devicenormative:Device Types / IOMMU Device / Device configuration layout}
> -\item \ref{devicenormative:Device Types / IOMMU Device / Device Initialization}
>  \item \ref{devicenormative:Device Types / IOMMU Device / Device operations}
>  \item \ref{devicenormative:Device Types / IOMMU Device / Device operations / ATTACH request}
>  \item \ref{devicenormative:Device Types / IOMMU Device / Device operations / DETACH request}
> diff --git a/virtio-iommu.tex b/virtio-iommu.tex
> index 08b358a..f8cbe88 100644
> --- a/virtio-iommu.tex
> +++ b/virtio-iommu.tex
> @@ -59,14 +59,21 @@ \subsection{Feature bits}\label{sec:Device Types / IOMMU Device / Feature bits}
>    VIRTIO_IOMMU_F_MAP_UNMAP is supported.}
>  
>  \item[VIRTIO_IOMMU_F_BYPASS (3)]
> -  When not attached to a domain, endpoints downstream of the IOMMU
> -  can access the guest-physical address space.
> +  Endpoints that are not attached to a domain are in bypass mode.
>  
>  \item[VIRTIO_IOMMU_F_PROBE (4)]
>    The PROBE request is available.
>  
>  \item[VIRTIO_IOMMU_F_MMIO (5)]
>    The VIRTIO_IOMMU_MAP_F_MMIO flag is available.
> +
> +\item[VIRTIO_IOMMU_F_BYPASS_CONFIG (6)]
> +  Field \field{bypass} of struct virtio_iommu_config determines
> +  whether endpoints that are not attached to a domain are in
> +  bypass mode. Flag VIRTIO_IOMMU_ATTACH_F_BYPASS determines
> +  whether endpoints that are attached to a domain are in bypass
> +  mode.
> +
>  \end{description}
>  
>  \drivernormative{\subsubsection}{Feature bits}{Device Types / IOMMU Device / Feature bits}
> @@ -79,6 +86,11 @@ \subsection{Feature bits}\label{sec:Device Types / IOMMU Device / Feature bits}
>  
>  The device SHOULD offer feature bit VIRTIO_IOMMU_F_MAP_UNMAP.
>  
> +The VIRTIO_IOMMU_F_BYPASS_CONFIG feature supersedes
> +VIRTIO_IOMMU_F_BYPASS. New devices SHOULD NOT offer
> +VIRTIO_IOMMU_F_BYPASS. Devices SHOULD NOT offer both
> +VIRTIO_IOMMU_F_BYPASS and VIRTIO_IOMMU_F_BYPASS_CONFIG.
> +
>  \subsection{Device configuration layout}\label{sec:Device Types / IOMMU Device / Device configuration layout}
>  
>  The \field{page_size_mask} field is always present. Availability of the
> @@ -97,12 +109,20 @@ \subsection{Device configuration layout}\label{sec:Device Types / IOMMU Device /
>      le32 end;
>    } domain_range;
>    le32 probe_size;
> +  u8 bypass;
> +  u8 reserved[3];
>  };
>  \end{lstlisting}
>  
>  \drivernormative{\subsubsection}{Device configuration layout}{Device Types / IOMMU Device / Device configuration layout}
>  
> -The driver MUST NOT write to device configuration fields.
> +When the VIRTIO_IOMMU_F_BYPASS_CONFIG feature is negotiated, the
> +driver MAY write to \field{bypass}. The driver MUST NOT write to
> +any other device configuration field.
> +
> +The driver MUST NOT write a value different than 0 or 1 to
> +\field{bypass}. The driver SHOULD ignore bits 1-7 of
> +\field{bypass}.
>  
>  \devicenormative{\subsubsection}{Device configuration layout}{Device Types / IOMMU Device / Device configuration layout}
>  
> @@ -110,17 +130,18 @@ \subsection{Device configuration layout}\label{sec:Device Types / IOMMU Device /
>  the page granularity. The device MAY set more than one bit in
>  \field{page_size_mask}.
>  
> +If the device offers the VIRTIO_IOMMU_F_BYPASS_CONFIG feature, it
> +MAY initialize the \field{bypass} field to 1. Field
> +\field{bypass} SHOULD NOT change on device reset, but SHOULD be
> +restored to its initial value on system reset.
> +
> +The device MUST NOT present a value different than 0 or 1 in
> +\field{bypass}.
> +
>  \subsection{Device initialization}\label{sec:Device Types / IOMMU Device / Device initialization}
>  
>  When the device is reset, endpoints are not attached to any domain.
>  
> -If the VIRTIO_IOMMU_F_BYPASS feature is negotiated, all accesses from
> -unattached endpoints are allowed and translated by the IOMMU using the
> -identity function. If the feature is not negotiated, any memory access
> -from an unattached endpoint fails. Upon attaching an endpoint in
> -bypass mode to a new domain, any memory access from the endpoint fails,
> -since the domain does not contain any mapping.
> -
>  Future devices might support more modes of operation besides MAP/UNMAP.
>  Drivers verify that devices set VIRTIO_IOMMU_F_MAP_UNMAP and fail
>  gracefully if they don't.
> @@ -134,11 +155,6 @@ \subsection{Device initialization}\label{sec:Device Types / IOMMU Device / Devic
>  VIRTIO_IOMMU_T_PROBE request for each endpoint before attaching the
>  endpoint to a domain.
>  
> -\devicenormative{\subsubsection}{Device Initialization}{Device Types / IOMMU Device / Device Initialization}
> -
> -If the driver does not accept the VIRTIO_IOMMU_F_BYPASS feature, the
> -device SHOULD NOT let endpoints access the guest-physical address space.
> -
>  \subsection{Device operations}\label{sec:Device Types / IOMMU Device / Device operations}
>  
>  Driver send requests on the request virtqueue, notifies the device and
> @@ -229,6 +245,27 @@ \subsection{Device operations}\label{sec:Device Types / IOMMU Device / Device op
>    64-bit address space (\texttt{start = 0, end = 0xffffffff ffffffff})
>  \end{itemize}
>  
> +An endpoint is in bypass mode if:
> +\begin{itemize}
> +  \item the VIRTIO_IOMMU_F_BYPASS_CONFIG feature is offered and:
> +    \begin{itemize}
> +      \item config field \field{bypass} is 1 and the endpoint is
> +        not attached to a domain. This applies even if the driver
> +        does not accept the VIRTIO_IOMMU_F_BYPASS_CONFIG feature
> +        and the device initializes field \field{bypass} to 1.
> +
> +        or
> +      \item the endpoint is attached to a domain with
> +        VIRTIO_IOMMU_ATTACH_F_BYPASS.
> +    \end{itemize}
> +  or
> +  \item the VIRTIO_IOMMU_F_BYPASS feature is negotiated and the
> +    endpoint is not attached to a domain.
> +\end{itemize}
> +
> +All accesses from an endpoint in bypass mode are allowed and
> +translated by the IOMMU using the identity function.
> +
>  \drivernormative{\subsubsection}{Device operations}{Device Types / IOMMU Device / Device operations}
>  
>  The driver SHOULD set field \field{reserved} of struct
> @@ -259,6 +296,9 @@ \subsection{Device operations}\label{sec:Device Types / IOMMU Device / Device op
>  virtio_iommu_req_head and SHOULD set field \field{reserved} of struct
>  virtio_iommu_req_tail to zero.
>  
> +The device SHOULD NOT let unattached endpoints that are not in
> +bypass mode access the guest-physical address space.
> +
>  \subsubsection{ATTACH request}\label{sec:Device Types / IOMMU Device / Device operations / ATTACH request}
>  
>  \begin{lstlisting}
> @@ -266,9 +306,12 @@ \subsubsection{ATTACH request}\label{sec:Device Types / IOMMU Device / Device op
>    struct virtio_iommu_req_head head;
>    le32 domain;
>    le32 endpoint;
> -  u8   reserved[8];
> +  le32 flags;
> +  u8   reserved[4];
>    struct virtio_iommu_req_tail tail;
>  };
> +
> +#define VIRTIO_IOMMU_ATTACH_F_BYPASS  (1 << 0)
>  \end{lstlisting}
>  
>  Attach an endpoint to a domain. \field{domain} uniquely identifies a
> @@ -294,6 +337,10 @@ \subsubsection{ATTACH request}\label{sec:Device Types / IOMMU Device / Device op
>  attached to a single domain at a time. Endpoints attached to different
>  domains are isolated from each other.
>  
> +When the VIRTIO_IOMMU_F_BYPASS_CONFIG is negotiated, the driver
> +can set the VIRTIO_IOMMU_ATTACH_F_BYPASS flag to create a bypass
> +domain. Endpoints attached to this domain are in bypass mode.
> +
>  \drivernormative{\paragraph}{ATTACH request}{Device Types / IOMMU Device / Device operations / ATTACH request}
>  
>  The driver SHOULD set \field{reserved} to zero.
> @@ -301,11 +348,20 @@ \subsubsection{ATTACH request}\label{sec:Device Types / IOMMU Device / Device op
>  The driver SHOULD ensure that endpoints that cannot be isolated from each
>  other are attached to the same domain.
>  
> +If the domain already exists and is a bypass domain, the driver
> +SHOULD set the VIRTIO_IOMMU_ATTACH_F_BYPASS flag. If the domain
> +exists and is not a bypass domain, the driver SHOULD NOT set the
> +VIRTIO_IOMMU_ATTACH_F_BYPASS flag.
> +
>  \devicenormative{\paragraph}{ATTACH request}{Device Types / IOMMU Device / Device operations / ATTACH request}
>  
>  If the \field{reserved} field of an ATTACH request is not zero, the device
>  MUST reject the request and set \field{status} to VIRTIO_IOMMU_S_INVAL.
>  
> +If the device does not recognize a \field{flags} bit, it MUST
> +reject the request and set \field{status} to
> +VIRTIO_IOMMU_S_INVAL.
> +
>  If the endpoint identified by \field{endpoint} doesn't exist, the device
>  MUST reject the request and set \field{status} to VIRTIO_IOMMU_S_NOENT.
>  
> @@ -314,6 +370,10 @@ \subsubsection{ATTACH request}\label{sec:Device Types / IOMMU Device / Device op
>  \field{endpoint} to the domain. If it cannot do so, the device MUST reject
>  the request and set \field{status} to VIRTIO_IOMMU_S_UNSUPP.
>  
> +If the domain already exists and the VIRTIO_IOMMU_ATTACH_F_BYPASS
> +flag is not consistent with that domain, the device SHOULD reject
> +the request and set \field{status} to VIRTIO_IOMMU_S_INVAL.
> +
>  If the endpoint identified by \field{endpoint} is already attached to
>  another domain, then the device SHOULD first detach it from that domain
>  and attach it to the one identified by \field{domain}. In that case the
> @@ -342,11 +402,10 @@ \subsubsection{DETACH request}
>  };
>  \end{lstlisting}
>  
> -Detach an endpoint from a domain. When this request completes, the
> -endpoint cannot access any mapping from that domain anymore. If feature
> -VIRTIO_IOMMU_F_BYPASS has been negotiated, then once this request
> -completes all accesses from the endpoint are allowed and translated by the
> -IOMMU using the identity function.
> +Detach an endpoint from a domain. When this request completes,
> +the endpoint cannot access any mapping from that domain anymore.
> +However the endpoint may then be in bypass mode and access the
> +guest-physical address space.
>  
>  After all endpoints have been successfully detached from a domain, it
>  ceases to exist and its ID can be reused by the driver for another domain.
> @@ -433,6 +492,8 @@ \subsubsection{MAP request}\label{sec:Device Types / IOMMU Device / Device opera
>  
>  The driver SHOULD set undefined \field{flags} bits to zero.
>  
> +The driver SHOULD NOT send MAP requests on a bypass domain.
> +
>  \field{virt_end} MUST be strictly greater than \field{virt_start}.
>  
>  The driver SHOULD set the VIRTIO_IOMMU_MAP_F_MMIO flag when the physical
> @@ -461,6 +522,9 @@ \subsubsection{MAP request}\label{sec:Device Types / IOMMU Device / Device opera
>  If \field{domain} does not exist, the device SHOULD reject the request and
>  set \field{status} to VIRTIO_IOMMU_S_NOENT.
>  
> +If the domain is a bypass domain, the device SHOULD reject the
> +request and set \field{status} to VIRTIO_IOMMU_S_INVAL.
> +
>  The device MUST NOT allow writes to a range mapped without the
>  VIRTIO_IOMMU_MAP_F_WRITE flag. However, if the underlying architecture
>  does not support write-only mappings, the device MAY allow reads to a
> @@ -538,6 +602,8 @@ \subsubsection{UNMAP request}\label{sec:Device Types / IOMMU Device / Device ope
>  or be outside any mapping. The last address of a range MUST either be the
>  last address of a mapping or be outside any mapping.
>  
> +The driver SHOULD NOT send UNMAP requests on a bypass domain.
> +
>  \devicenormative{\paragraph}{UNMAP request}{Device Types / IOMMU Device / Device operations / UNMAP request}
>  
>  If the \field{reserved} field of an UNMAP request is not zero, the device
> @@ -547,6 +613,9 @@ \subsubsection{UNMAP request}\label{sec:Device Types / IOMMU Device / Device ope
>  If \field{domain} does not exist, the device SHOULD set the request
>  \field{status} to VIRTIO_IOMMU_S_NOENT.
>  
> +If the domain is a bypass domain, the device SHOULD reject the
> +request and set \field{status} to VIRTIO_IOMMU_S_INVAL.
> +
>  If a mapping affected by the range is not covered in its entirety by the
>  range (the UNMAP request would split the mapping), then the device SHOULD
>  set the request \field{status} to VIRTIO_IOMMU_S_RANGE, and SHOULD NOT

Looks good to me

Reviewed-by: Eric Auger <eric.auger@redhat.com>

Eric




[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]