[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [PATCH v2 5/9] virtio-iommu: Add VT-d IO page table
Resend the reply and CC virtio-comment list. Hi, On 10/7/23 00:09, Jean-Philippe Brucker wrote:
From: Tina Zhang <tina.zhang@intel.com> Allow a driver of a device behind virtio-iommu to detect support for VT-d IO page table. The driver detects support for VT-d IO page table format in a PROBE request and sends an ATTACH_TABLE request with a pointer to the first stage page translation table. The definition of first stage page translation table ATTACH_TABLE request references the fields defined in VT-d scalable-mode PASID table entry. * attributes defined in ATTACH_TABLE request: FSPTPTR got from pgtbl_addr field. SRE/EAFE/WPE defined in pgtbl_flags field. FSPM calculated through addr_width field. PASID got from pasid field. * attributes not defined in ATTACH_TABLE request, as their configurations need to be decided by host: Memory type features, PAT/EMTE/CD/PGSNP/PWSNP. Snoop Behavior, PGSNP/PWSNP. Support for any new attributes could easily be added by adding flags to the PROBE property and fields in the ATTACH_TABLE config. Signed-off-by: Tina Zhang <tina.zhang@intel.com> [JPB: moved to pgtable-intel.tex, rework formatting] Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org> --- device-types/iommu/description.tex | 3 + device-types/iommu/pgtable-intel.tex | 98 ++++++++++++++++++++++++++++ introduction.tex | 3 + 3 files changed, 104 insertions(+) create mode 100644 device-types/iommu/pgtable-intel.tex diff --git a/device-types/iommu/description.tex b/device-types/iommu/description.tex index c057cdf..a4eb492 100644 --- a/device-types/iommu/description.tex +++ b/device-types/iommu/description.tex @@ -407,6 +407,7 @@ \subsubsection{ATTACH_TABLE request}\label{ref:Device Types / IOMMU Device / Dev };#define VIRTIO_IOMMU_ATTACH_TABLE_ARM_SMMU3 1+#define VIRTIO_IOMMU_ATTACH_TABLE_INTEL_PT 2 \end{lstlisting}Attach an endpoint to a domain, in the same way as an ATTACH@@ -1009,6 +1010,7 @@ \subsubsection{PROBE properties}\label{sec:Device Types / IOMMU Device / Device \begin{lstlisting} #define VIRTIO_IOMMU_PROBE_T_RESV_MEM 1 #define VIRTIO_IOMMU_PROBE_T_HW_ARM_SMMU3 2 +#define VIRTIO_IOMMU_PROBE_T_HW_INTEL_VTD 3 \end{lstlisting}\paragraph{Property RESV_MEM}\label{sec:Device Types / IOMMU Device / Device operations / PROBE properties / RESVMEM}@@ -1160,3 +1162,4 @@ \subsubsection{Fault reporting}\label{sec:Device Types / IOMMU Device / Device o \subsection{Acceleration}\label{sec:Device Types / IOMMU Device / Acceleration}\input{device-types/iommu/pgtable-arm}+\input{device-types/iommu/pgtable-intel} diff --git a/device-types/iommu/pgtable-intel.tex b/device-types/iommu/pgtable-intel.tex new file mode 100644 index 0000000..90a3410 --- /dev/null +++ b/device-types/iommu/pgtable-intel.tex @@ -0,0 +1,98 @@ +\subsubsection{Intel VT-d tables}\label{sec:Device Types / IOMMU Device / Acceleration / Intel} + +Attach first-level translation tables in the format described by +the \hyperref[intro:VT-Directed-IO]{Intel Virtualization +Technology for Directed I/O specification}. + +\paragraph{PROBE property for VT-d page tables}\label{sec:Device Types / IOMMU Device / Acceleration / Intel / PROBE} + +The PROBE property VIRTIO_IOMMU_PROBE_T_HW_INTEL_VTD provides +information about an Intel VT-d IOMMU. + +\begin{lstlisting} +struct virtio_iommu_probe_hw_intel_vtd { + struct virtio_iommu_probe_property head; + u8 reserved[4]; + le64 cap_reg; + le64 ecap_reg; +}; + +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_RWBF (1 << 4) +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_SAGAW_SHIFT 8 +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_SAGAW_MASK 0x1f +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_MGAW_SHIFT 16 +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_MGAW_MASK 0x1f
The size of MGAW is 6 bits. So, it should be: #define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_MGAW_MASK 0x3f
I don't think we need VCS defined here any more. In the old design, VCS was used as a channel for guest asking host to allocate a PASID for it. In the current design, we drop this and let host mediate guest PASIDs when necessary (which is agnostic to guest viommu driver). So basically, there are two cases: 1) For device assignment of I/O device, the PASIDs of the device are managed by guest. Host doesn't need to do any mediation. 2) For I/O Device sharing case, one device is used by multiple guests. So we cannot let one guest directly manage the PASIDs of that device. In this case, with the help of virtualization, host can trap the guest's first DMA access requiring a PASID and assign a host-side PASID for it. Guest viommu driver doesn't need to know the host-side PASID value.+#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_ZLR (1 << 22) +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_FL1GP (1 << 56) +#define VIRTIO_IOMMU_HW_INTEL_VTD_CAP_5LP (1 << 60) + +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_SC (1 << 7) +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_MTS (1 << 25) +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_EAFS (1 << 34) +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_PSS_SHIFT 35 +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_PSS_MASK 0x1f +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_SMTS (1 << 43) +#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_VCS (1 << 44)
+#define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_SMPWC (1 << 48) +\end{lstlisting}
I think two more definitions are needed: #define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_NEST (1 << 26) #define VIRTIO_IOMMU_HW_INTEL_VTD_ECAP_FSTS (1 << 47) The ECAP_NEST tells if VT-d hardware supports nested translations.The ECAP_FSTS tells if VT-d hardware supports PASID granular translation type of first-stage.
The cap_reg & ecap_reg expose VT-d hardware capability to guest. And it's expected to have virtio-iommu working with VT-d IO page table when the platform hardware at least has the capability of nested-translation (i.e., hardware must have scalable mode support and first-stage translation support and second-stage translation support.). So, we only pick definitions of the bits from VT-d spec which are related to the first-stage page table and the basic capability of nested-translation.+ +\todo{check that these are necessary and sufficient to support +page table assignement in scalable mode.}
I didn't see any difference between the non-PASID assignment case and the PASID assignment case, in terms of virtio-iommu support VT-d. Any ideas?+\todo{Need to support non-PASID assignment? non-scalable?}
The non-scalable doesn't seem so interesting right now. Any thoughts about it?
No, we don't need guest viommu driver to explicitly ask host's help for guest PASID alloc/free now. In the current design, guest viommu driver doesn't need to know the host-side PASID.+ +\begin{description} + \item[\field{cap_reg}] Capability Register. + \item[\field{ecap_reg}] Extended Capability Register. +\end{description} + +\paragraph{ATTACH_TABLE request for VT-d page table}\label{sec:Device Types / IOMMU Device / Acceleration / Intel / ATTACH_TABLE} + +Attach a single set of page tables to an endpoint. + +\begin{lstlisting} +/* Vt-d I/O Page Table Descriptor */ +struct virtio_iommu_req_attach_table_intel { + struct virtio_iommu_req_head head; + le32 domain; + le32 endpoint; + u8 format; + u8 reserved[7]; + le32 pasid; + le64 pgtbl_addr; + le64 pgtbl_flags; + le32 addr_width; + u8 reserved[80]; + struct virtio_iommu_req_tail tail; +}; + +/* Intel VT-d stage-1 page table entry attributes */ +#define VIRTIO_IOMMU_HW_INTEL_VTD_SRE 0x01 +#define VIRTIO_IOMMU_HW_INTEL_VTD_EAFE 0x02 +#define VIRTIO_IOMMU_HW_INTEL_VTD_WPE 0x40 +#define VIRTIO_IOMMU_HW_INTEL_VTD_LAST 0x80 +\end{lstlisting} + +\begin{description} + \item[\field{format}] VIRTIO_IOMMU_ATTACH_TABLE_INTEL_PT. + \item[\field{pasid}] Process Address Space Identifier. + \todo{don't we need to allocate these with alloc/free + commands to the host?}
Regards, -Tina
+ \item[\field{pgtbl_addr}] Stage-1 page table base address. + \item[\field{pgtbl_flags}] Stage-1 page table entry attributes: + \begin{description} + \item{SRE} Supervisor request, + \item{EAFE} Extended access enable, + \item{WPE} Write protect enable. + \end{description} + \item[\field{pat}] Page attribute table data to compute + effective memory type. + \item[\field{addr_width}] The address width of the untranslated + addresses that are subjected to the stage-1 page table. +\end{description} + +\paragraph{INVALIDATE request for VT-d page table}\label{sec:Device Types / IOMMU Device / Acceleration / Intel / INVALIDATE} + +Supported values for field \field{scope} are +VIRTIO_IOMMU_INVAL_S_PASID and VIRTIO_IOMMU_INVAL_S_ADDRESS. +The only supported value for field \field{caches} is +VIRTIO_IOMMU_INVAL_C_TLB. Field \field{id} is not used. diff --git a/introduction.tex b/introduction.tex index 636fb1b..d0d5086 100644 --- a/introduction.tex +++ b/introduction.tex @@ -104,6 +104,9 @@ \section{Normative References}\label{sec:Normative References} \phantomsection\label{intro:SMMUv3}\textbf{[SMMUv3]} & Arm System Memory Management Unit version 3 \newline\url{https://developer.arm.com/documentation/ihi0070/latest} \\ + \phantomsection\label{intro:VT-Directed-IO}\textbf{[VT-Directed-IO]} & + Intel Virtualization Technology for Directed I/O + \newline\url{https://cdrdv2.intel.com/v1/dl/getContent/671081} \\\phantomsection\label{intro:rfc2784}\textbf{[RFC2784]} &Generic Routing Encapsulation. This protocol is only specified for IPv4 and used as either the payload or delivery protocol.
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]