[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: [PATCH v4 1/1] Introduce MGMT Admin commands
Introduce the concept of a management and a managed device. A management device supports the VIRTIO_ADMIN_DEVICE_MGMT, VIRTIO_ADMIN_DEVICE_MGMT_CAPS, VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and VIRTIO_ADMIN_MANAGED_DEVICE_LIST commands to manage some resources of the managed device. Also introduce a new mechanism to issue commands with dst_type field that is not "self". With the new mechanism, driver can set dst_type to 1 and use the the vdev_id common field to describe the target vdev_id. This mechanism is useful for virtio subsystems with multiple devices with various different capabilities. For example, a virtio subsystem that contains a PCI PF and its VF. For this subsystem, a clever system administrator can use admin commands to map between a vdev_id and a VF number and use it to manipulate the VF resources. As we know, a typical cloud provider SR-IOV use case is to create many VFs for use by guest VMs. The VFs may not be assigned to a VM until a user requests a VM of a certain size, e.g., number of CPUs. A VF may need MSI-X vectors proportional to the number of CPUs in the VM, but there is no standard way today in the spec to change the number of MSI-X vectors supported by a VF, although there are some operating systems that support this. The new admin mechanism manages the MSI-X interrupt vectors assignments of a managed PCI device (i.e. VF) by its management devices (i.e. its parent PF) but can easily extended to any other generic resource management. Reviewed-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com> --- admin.tex | 205 +++++++++++++++++++++++++++++++++++++++++++++-- content.tex | 88 ++++++++++++++++++++ introduction.tex | 25 ++++++ 3 files changed, 310 insertions(+), 8 deletions(-) diff --git a/admin.tex b/admin.tex index ed9f6bd..f069dd5 100644 --- a/admin.tex +++ b/admin.tex @@ -9,11 +9,13 @@ \section{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admi le16 command; /* * 0 - self - * 1 - 65535 are reserved + * 1 - other virtio device in the same subsystem (described by vdev_id) + * 2 - 65535 are reserved */ le16 dst_type; + le64 vdev_id; /* reserved for common cmd fields */ - u8 reserved[20]; + u8 reserved[12]; u8 command_specific_data[]; /* Device-writable part */ @@ -37,9 +39,12 @@ \section{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admi \hline 03h & VIRTIO_ADMIN_STATUS_INVALID_FIELD & invalid field was set \\ \hline +04h & VIRTIO_ADMIN_STATUS_INVALID_VDEV_ID & invalid vdev_id was set \\ +\hline + \end{tabular} -The \field{command}, \field{dst_type} and \field{command_specific_data} are +The \field{command}, \field{dst_type}, \field{vdev_id} and \field{command_specific_data} are set by the driver, and the device sets the \field{status}, the \field{command_specific_error} and the \field{command_specific_result}, if needed. @@ -48,9 +53,12 @@ \section{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admi The mandatory fields to be set by the driver, for all admin commands, are \field{command} and \field{dst_type}. +The optional unused fields to be zeroed by the driver. + The \field{command} defines the opcode for the command. The value for each command can be found in each command section. -The \field{dst_type} defines the target virtio device for the command. This value should be set to 0 (self). +The \field{dst_type} defines the target virtio device for the command. This value can be set to 0 (self) or 1 (other virtio device in the same subsystem). +If \field{dst_type} is set to 1, the \field{vdev_id} is valid and used to describe the vdev_id of the target virtio device. The \field{command_specific_error} should be inspected by the driver only if \field{status} is set to VIRTIO_ADMIN_STATUS_CS_ERR by the device. In this case, the content of \field{command_specific_error} @@ -69,12 +77,25 @@ \section{Admin command set}\label{sec:Basic Facilities of a Virtio Device / Admi \hline 0002h & VIRTIO_ADMIN_DEVICE_INFO & O \\ \hline -0003h - 7FFFh & Generic admin cmds & - \\ +0003h & VIRTIO_ADMIN_DEVICE_MGMT & O \\ +\hline +0004h & VIRTIO_ADMIN_DEVICE_MGMT_CAPS & O \\ +\hline +0005h & VIRTIO_ADMIN_DEVICE_MGMT_ATTRS & O \\ +\hline +0006h & VIRTIO_ADMIN_MANAGED_DEVICE_LIST & O \\ +\hline +0007h - 7FFFh & Generic admin cmds & - \\ \hline 8000h - FFFFh & Reserved & - \\ \hline \end{tabular} +\begin{note} +{The following commands are mandatory for management devices: VIRTIO_ADMIN_DEVICE_IDENTIFY, VIRTIO_ADMIN_DEVICE_INFO, +VIRTIO_ADMIN_DEVICE_MGMT, VIRTIO_ADMIN_DEVICE_MGMT_CAPS, VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and VIRTIO_ADMIN_MANAGED_DEVICE_LIST.} +\end{note} + \subsection{VIRTIO ADMIN DEVICE IDENTIFY command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE IDENTIFY command} The VIRTIO_ADMIN_DEVICE_IDENTIFY command has no command specific data set by the driver. @@ -89,7 +110,9 @@ \subsection{VIRTIO ADMIN DEVICE IDENTIFY command}\label{sec:Basic Facilities of * (1 means that field was returned): * Bit 0 - vdev_id * Bit 1 - optional_caps_support - * Bits 2 - 63 - reserved for future fields + * Bit 2 - num_supported_managed_vdevs + * Bit 3 - max_managed_vdev_id + * Bits 4 - 63 - reserved for future fields */ le64 attrs_mask; le64 vdev_id; @@ -97,10 +120,16 @@ \subsection{VIRTIO ADMIN DEVICE IDENTIFY command}\label{sec:Basic Facilities of * capabilities are supported by the device: * Bit 0 - if set, VIRTIO_ADMIN_SUBSYSTEM_IDENTIFY supported * Bit 1 - if set, VIRTIO_ADMIN_DEVICE_INFO supported - * Bits 2 - 63 - reserved for future capabilities. + * Bit 2 - if set, the device is a direct management device + * Bits 3 - 63 - reserved for future capabilities. */ le64 optional_caps_support; - u8 reserved[4072]; + /* Number of supported managed virtio devices by the target virtio device */ + le64 num_supported_managed_vdevs; + /* Maximum value of a valid managed vdev_id for the target virtio device */ + le64 max_managed_vdev_id; + + u8 reserved[4056]; }; \end{lstlisting} @@ -158,6 +187,166 @@ \subsection{VIRTIO ADMIN DEVICE INFO command}\label{sec:Basic Facilities of a Vi specification and can't be equal to zero. If bit 0 is not set, the driver will ignore \field{vf_number}.} \end{note} +\subsection{VIRTIO ADMIN DEVICE MGMT command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} + +The VIRTIO_ADMIN_DEVICE_MGMT command is used by a management device to manage resources of other virtio devices within +the virtio subsystem. +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT. + +The command specific data set by the driver is of form: +\begin{lstlisting} +struct virtio_admin_device_mgmt_data { + /* + * 0 - reserved + * 1 - assign resource to the subject vdev_id + * 2 - query resource of the subject vdev_id + * 3 - 255 are reserved + */ + u8 operation; + /* + * 0 - MSI-X vector + * 1 - 65535 are reserved + */ + le16 resource; + /* The amount of elements for the operation to the given resource */ + le32 resource_count; + u8 reserved[57]; +}; +\end{lstlisting} + +The following table describes the command specific error codes codes: + +\begin{tabular}{|l|l|l|} +\hline +Opcode & Status & Description \\ +\hline \hline +00h & VIRTIO_ADMIN_CS_ERR_VDEV_IN_USE & target device is in use, operation failed \\ +\hline +01h & VIRTIO_ADMIN_CS_ERR_RSC_COUNT_EXCEED & resource count exceed the max possible value \\ +\hline +02h & VIRTIO_ADMIN_CS_ERR_NO_RSC & Mgmt device don't have requested resource count \\ +\hline +03h & VIRTIO_ADMIN_CS_RSC_UNSUPPORTED & unsupported or invalid resource \\ +\hline +04h & VIRTIO_ADMIN_CS_OP_UNSUPPORTED & unsupported or invalid operation \\ +\hline +05h - FFh & Reserved & - \\ +\hline +\end{tabular} + +This command, upon success, returns a data buffer that describes the information according to the requested operation. +This information is of form: +\begin{lstlisting} +struct virtio_admin_device_mgmt_result { + le32 resource_count; + u8 reserved[12]; +}; +\end{lstlisting} + +If the requested operation was "assign resource to the subject vdev_id", the resource_count will return the count of the assigned +resources to the subject vdev_id. Upon success, this value should be equal to the \field{resource_count} of the virtio_admin_device_mgmt_data +structure. In case of a failure, the value of this field is undefined. + +If the requested operation was "query resource of the subject vdev_id", the resource_count will return the count of the currently assigned +resources to the subject vdev_id upon success. In case of a failure, the value of this field is undefined. + +\begin{note} +{MSI-X vector resource type is valid only for PCI devices. VIRTIO_ADMIN_CS_RSC_UNSUPPORTED error is +returned when the destination device is not a PCI device.} +\end{note} + +\begin{note} +{For this command, if setting \field{resource} to MSI-X vector type, the \field{vdev_id} can't be associated to a Virtual Function with +VF index greater than NumVFs value as defined in the PCI specification or smaller than 1. An error is returned when \field{vdev_id} is out of the range.} +\end{note} + +\subsection{VIRTIO ADMIN DEVICE MGMT CAPS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT CAPS command} + +The VIRTIO_ADMIN_DEVICE_MGMT_CAPS upon success, returns a data buffer that describes the management device capabilities. +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_CAPS. + +Upon success, the returned data buffer is of form: +\begin{lstlisting} +struct virtio_admin_device_mgmt_caps_result { + /* + * caps[0] - Bit 0, if set, MSI-X vector resource mgmt is supported + * Bit 1 - Bit 7 are reserved + * caps[1] - Bit 0 - Bit 7 are reserved + * caps[2] - Bit 0 - Bit 7 are reserved + * .... + * caps[8191] - Bit 0 - Bit 7 are reserved + */ + u8 caps[8191]; +}; +\end{lstlisting} + +\begin{note} +{MSI-X vector resource management is applicable only for PCI devices. Thus, MSI-X vector resource +capability bit can be set to 1 only if the subject vdev_id is a management device that can manage PCI devices (e.g. PCI Physical Function virtio device). +For more details on MSI-X vector management support see section \ref{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management}.} +\end{note} + +\subsection{VIRTIO ADMIN DEVICE MGMT ATTRS command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} + +The VIRTIO_ADMIN_DEVICE_MGMT_ATTRS upon success, returns a data buffer that describes the management device attributes. +The \field{command} is set to VIRTIO_ADMIN_DEVICE_MGMT_ATTRS. + +Upon success, the returned data buffer is of form: +\begin{lstlisting} +struct virtio_admin_device_mgmt_attrs_result { + /* For compatibility - indicates which of the below fields were returned + * (1 means that field was returned): + * Bit 0 - vfs_total_msix_count + * Bit 1 - vfs_assigned_msix_count + * Bit 2 - per_vf_max_msix_count + * Bits 3 - 63 - reserved for future fields + */ + le64 attrs_mask; + + /* Total number of msix vectors for the total number of VFs */ + le32 vfs_total_msix_count; + /* Assigned number of msix vectors for the enabled VFs */ + le32 vfs_assigned_msix_count; + /* Max number of msix vectors that can be assigned for a single VF */ + le16 per_vf_max_msix_count; + + u8 reserved[4078]; +}; +\end{lstlisting} + +\begin{note} +{The \field{vfs_total_msix_count}, \field{vfs_assigned_msix_count} and \field{per_vf_max_msix_count} returned if the +subject vdev_id is a management device that can allocate/deallocate MSI-X resources for PCI VFs devices. +Otherwise, the associated bits in \field{attrs_mask} are zeroed.} +\end{note} + +\subsection{VIRTIO ADMIN MANAGED DEVICE LIST command}\label{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN MANAGED DEVICE LIST command} + +The VIRTIO_ADMIN_MANAGED_DEVICE_LIST upon success, returns a data buffer that describes an ordered list of up to 510 managed +virtio devices, in ascending order, that can be managed by the target management virtio device. +The returned list contain device identifiers with vdev_id greater than or equal to the value of \field{first_vdev_id} in the command specific data. +The \field{command} is set to VIRTIO_ADMIN_MANAGED_DEVICE_LIST. + +The command specific data set by the driver is of form: +\begin{lstlisting} +struct virtio_admin_managed_device_list_data { + le64 first_vdev_id; +}; +\end{lstlisting} + +Upon success, the returned data buffer is of form: +\begin{lstlisting} +struct virtio_admin_managed_device_list_result { + /* Number of entries in the vdev_ids list */ + le16 num_identifiers; + /* Number of currently valid managed virtio devices by the target virtio device */ + le64 num_managed_vdevs; + u8 reserved[6]; + /* List of managed vdev_ids. Only first num_identifiers are valid. */ + le64 vdev_ids[510]; +}; +\end{lstlisting} + \section{Admin Virtqueues}\label{sec:Basic Facilities of a Virtio Device / Admin Virtqueues} An admin virtqueue is a management interface of a device that can be used to send administrative diff --git a/content.tex b/content.tex index bf46192..27353ba 100644 --- a/content.tex +++ b/content.tex @@ -451,6 +451,19 @@ \section{Exporting Objects}\label{sec:Basic Facilities of a Virtio Device / Expo \input{admin.tex} +\section{Device management}\label{sec:Basic Facilities of a Virtio Device / Device management} + +A virtio subsystem might be consist of one or more virtio devices. For example, virtio PCI PF and its VFS compose a virtio subsystem. +A capable PCI PF device might act as the management device in the subsystem, and its PCI VFs are the managed devices. +A management device might have various management capabilities and attributes to manage its managed devices. The capabilities exposed +in the result of VIRTIO_ADMIN_DEVICE_MGMT_CAPS command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT CAPS command} +for more details) and the attributes in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS command (see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} +for more details). The list of the managed devices returned in the result of VIRTIO_ADMIN_MANAGED_DEVICE_LIST command +(see section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN MANAGED DEVICE LIST command}) + +The management device will use the VIRTIO_ADMIN_DEVICE_MGMT admin command to manage its managed devices (see section +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details). + \chapter{General Initialization And Device Operation}\label{sec:General Initialization And Device Operation} We start with an overview of device initialization, then expand on the @@ -1759,6 +1772,81 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options / \end{itemize} \end{itemize} +\subsection{PCI-specific Admin capabilities}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin capabilities} + +This documents the group of Admin capabilities for PCI virtio devices. Each capability is +implemented using one or more Admin queue commands. + +\subsubsection{MSI-X vector management}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / MSI-X vector management} + +This capability enables a virtio management device to control the assignment of MSI-X interrupt vectors +for its managed devices. In PCI, a management device can be the PF device and the managed device can be the VF. +Capable management devices will need to implement VIRTIO_ADMIN_DEVICE_MGMT, VIRTIO_ADMIN_DEVICE_MGMT_CAPS, VIRTIO_ADMIN_DEVICE_MGMT_ATTRS +and VIRTIO_ADMIN_MANAGED_DEVICE_LIST admin commands, report the MSI-X attributes in the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and report that MSI-X +vector resource management is supported by setting bit 1 of caps[0] in the result of VIRTIO_ADMIN_DEVICE_MGMT_CAPS admin command. +See sections \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT CAPS command} and +\ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details. + +In the result of VIRTIO_ADMIN_DEVICE_MGMT_ATTRS admin command, a capable management device will return the total number of +msix vectors for its VFs in \field{vfs_total_msix_count} field, the number of already assigned msix vectors for its VFs in +\field{vfs_assigned_msix_count} field and also the maximal number of msix vectors that can be assigned for a single VF in +\field{per_vf_max_msix_count} field. In addition, bit 0, bit 1 and bit 2 are set to indicate on the validity of the other 3 +fields in the \field{attrs_mask} field of the result buffer. +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT ATTRS command} for more details. + +In the result of VIRTIO_ADMIN_MANAGED_DEVICE_LIST admin command, a capable management device will return a list of associated managed virtio devices. +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN MANAGED DEVICE LIST command} for more details. + +A management device that supports VFs MSI-X vector management capability, will allocate MSI-X resources provided by the parent PF of the SR-IOV VFs. +The default assignment of the MSI-X vectors for managed devices is out of the scope of this specification. +A driver, using VIRTIO_ADMIN_DEVICE_MGMT can update the MSI-X assignment for a specific managed device. +In the data of VIRTIO_ADMIN_DEVICE_MGMT admin command, a driver set the \field{resource} type to be MSI-X vector and the +amount of MSI-X interrupt vectors to configure to the subject managed device in \field{resource_count}. The managed device id is set to \field{vdev_id} field. + +A successful operation guarantees that the requested amount of MSI-X interrupt vectors was assigned to the subject management device. +This value is also returned in the virtio_admin_device_mgmt_result structure. +Also, a successful operation guarantees that the MSI-X capability access by the subject PCI device defined by the PCI specification must reflect +the new configuration in all relevant fields. For example, by default if the PCI VF has been assigned 4 MSI-X vectors, and VIRTIO_ADMIN_DEVICE_MGMT +increases the MSI-X vectors to 8; on this change, reading Table size field of the MSI-X message control register will reflect a value of 7. + +It is beyond the scope of the virtio specification to define necessary synchronization in system software to ensure that a virtio PCI VF device +interrupt configuration modification is reflected in the PCI device. However, it is expected that any modern system software implementing virtio +drivers and PCI subsystem will ensure that any changes occurring in the VF interrupt configuration is either updated in the PCI VF device or +such configuration fails. + +To query amount of MSI-X interrupt vectors that is currently assigned to a managed device, the driver issue VIRTIO_ADMIN_DEVICE_MGMT with \field{operation} set to +"query resource of the subject vdev_id" value (== 2). The driver also set the \field{resource} type to be MSI-X vector and the managed device id is set to \field{vdev_id} +field. In the result of a successful operation, the amount of MSI-X interrupt vectors that is currently assigned to the subject managed device is +returned by the device in \field{resource_count} field of the virtio_admin_device_mgmt_result structure. +See section \ref{sec:Basic Facilities of a Virtio Device / Admin command set / VIRTIO ADMIN DEVICE MGMT command} for more details. + +\paragraph{MSI-X configuration sequence example}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / PCI-specific Admin command set / VF MSI-X control / MSI-X configuration sequence example } + +A typical sequence for configuring MSI-X vectors for PCI VFs using MSI-X vector management mechanism is following: + +\begin{enumerate} +\item Ensure that driver doesn't run and it is safe to change MSI-X (e.g. disable sriov auto probing) + +\item Load the PF driver + +\item Enable SR-IOV by following the PCI specification + +\item Query the management device capabilities using commands VIRTIO_ADMIN_DEVICE_IDENTIFY, VIRTIO_ADMIN_DEVICE_MGMT_CAPS and VIRTIO_ADMIN_DEVICE_MGMT_ATTRS + +\item Find the managed VF vdev_id using commands VIRTIO_ADMIN_MANAGED_DEVICE_LIST and VIRTIO_ADMIN_DEVICE_INFO + +\item Query the VF MSI-X configuration using command VIRTIO_ADMIN_DEVICE_MGMT (query operation) + +\item Assign desired MSI-X configuration for the VF using command VIRTIO_ADMIN_DEVICE_MGMT (assign operation) + +\item After successful completion of the assignment, load the VF driver + +\item Assign the VF to a VM + +\item After VM/VF use is completed, user assigns different MSI-X value and repeats the steps of 7 onwards + +\end{enumerate} + \section{Virtio Over MMIO}\label{sec:Virtio Transport Options / Virtio Over MMIO} Virtual environments without PCI support (a common situation in diff --git a/introduction.tex b/introduction.tex index 94dd7a2..b5906c8 100644 --- a/introduction.tex +++ b/introduction.tex @@ -260,5 +260,30 @@ \subsection{virtio subsystem}\label{sec:Introduction / Definitions / virtio subs VQN and vdev_id are exposed via Admin Command Set. +\subsection{virtio management device}\label{sec:Introduction / Definitions / virtio management device} + +A virtio device that supports VIRTIO_ADMIN_DEVICE_MGMT, VIRTIO_ADMIN_DEVICE_MGMT_CAPS, VIRTIO_ADMIN_DEVICE_MGMT_ATTRS and +VIRTIO_ADMIN_MANAGED_DEVICE_LIST Admin commands and can manage a virtio managed device. +A virtio subsystem may contain zero or more management devices. + +A PCI SR-IOV Physical Function based virtio device is an example of a possible virtio management device. + +\subsection{virtio direct management device}\label{sec:Introduction / Definitions / virtio direct management device} + +A virtio management device that can set \field{dst_type} to 0 (self) and to 1 (other virtio device in the same subsystem), and set \field{vdev_id} to an id that corresponds with +one of its managed virtio devices for the following admin commands (if these fields are applicable): VIRTIO_ADMIN_DEVICE_IDENTIFY, VIRTIO_ADMIN_DEVICE_INFO +and VIRTIO_ADMIN_DEVICE_MGMT. + +The managed device list can be queried using VIRTIO_ADMIN_MANAGED_DEVICE_LIST. + +A virtio subsystem may contain zero or more direct management devices. + +\subsection{virtio managed device}\label{sec:Introduction / Definitions / virtio managed device} + +A virtio device that can be managed by a virtio management device. +A virtio subsystem may contain zero or more managed devices. + +A PCI SR-IOV Virtual Function based virtio device is an example of a possible virtio managed device. + \newpage -- 2.21.0
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]