Subject: Re: [PATCH v1 1/2] transport-pci: Introduce legacy registers access commands
On Wed, May 03, 2023 at 06:26:58AM +0300, Parav Pandit wrote:
> This patch introduces legacy registers access commands for the owner
> group member PCI PF to access the legacy registers of the member VFs.
>
> If in future any SIOV devices to support legacy registers, they
> can be easily supported using same commands by using the group
> member identifiers of the future SIOV devices.
>
> More details as overview, motivation, use case are further described
> below.
>
> Usecase:
> --------
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at scale of thousands,
>    typically, one to eight devices per VM.
>
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.
>
> 3. A hypervisor system prefers to have single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM.
>    (as transitional or otherwise)
>
> Motivation/Background:
> ----------------------
> The existing virtio transitional PCI device is missing support for
> PCI SR-IOV based devices. Currently it does not work beyond
> PCI PF, or as software emulated device in reality. Currently it
> has below cited system level limitations:
>
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not indicate I/O Space.
>
> [b] cpu arch citiation:
> Intel 64 and IA-32 Architectures Software Developer's Manual:
> The processor's I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
>
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range will be
> aligned to a 4 KB boundary.
>
> Above usecase requirements can be solved by PCI PF group owner enabling
> the access to its group member PCI VFs legacy registers using an admin
> virtqueue of the group owner PCI PF.
>
> Software usage example:
> -----------------------
> The most common way to use and map to the guest VM is by
> using vfio driver framework in Linux kernel.
>
>                 +----------------------+
>                 |pci_dev_id = 0x100X   |
> +---------------|pci_rev_id = 0x0      |-----+
> |vfio device    |BAR0 = I/O region     |     |
> |               |Other attributes      |     |
> |               +----------------------+     |
> |                                            |
> +   +--------------+     +-----------------+ |
> |   |I/O BAR to AQ |     | Other vfio      | |
> |   |rd/wr mapper  |     | functionalities | |
> |   +--------------+     +-----------------+ |
> |                                            |
> +------+-------------------------+-----------+
>        |                         |
>   +----+------------+       +----+------------+
>   | +-----+         |       | PCI VF device A |
>   | | AQ  |-------------+---->+-------------+ |
>   | +-----+         |   |   | | legacy regs | |
>   | PCI PF device   |   |   | +-------------+ |
>   +-----------------+   |   +-----------------+
>                         |
>                         |   +----+------------+
>                         |   | PCI VF device N |
>                         +---->+-------------+ |
>                             | | legacy regs | |
>                             | +-------------+ |
>                             +-----------------+
>
> 2. Virtio pci driver to bind to the listed device id and
>    use it as native device in the host.
>
> 3. Use it in a light weight hypervisor to run bare-metal OS.
>
> Please review.
>
> Fixes: https://github.com/oasis-tcs/virtio-spec/issues/167
> Signed-off-by: Parav Pandit <parav@nvidia.com>

A bunch of grammar mistakes below. We have actual interface to figure
out so I didn't bother correcting but pls try to run this through some checker.
The one in microsoft word is actually not bad :)

> ---
> changelog:
> v0->v1:
> - addressed comments, suggesetions and ideas from Michael Tsirkin and Jason Wang
> - far more simpler design than MMR access
> - removed complexities of MMR device ids
> - removed complexities of MMR registers and extended capabilities
> - dropped adding new extended capabilities because if if they are
>   added, a pci device still needs to have existing capabilities
>   in the legacy configuration space and hypervisor driver do not
>   need to access them
> ---
>  admin.tex                 |  5 ++-
>  transport-pci-vf-regs.tex | 84 +++++++++++++++++++++++++++++++++++++++
>  transport-pci.tex         |  2 +
>  3 files changed, 90 insertions(+), 1 deletion(-)
>  create mode 100644 transport-pci-vf-regs.tex
>
> diff --git a/admin.tex b/admin.tex
> index 648253c..852ee04 100644
> --- a/admin.tex
> +++ b/admin.tex
> @@ -115,7 +115,10 @@ \subsection{Group administration commands}\label{sec:Basic Facilities of a Virti
>  \hline \hline
>  0x0000 & VIRTIO_ADMIN_CMD_LIST_QUERY & Provides to driver list of commands supported for this group type \\
>  0x0001 & VIRTIO_ADMIN_CMD_LIST_USE & Provides to device list of commands used for this group type \\
> -0x0002 - 0x7FFF & - & Commands using \field{struct virtio_admin_cmd} \\
> +0x0002 & VIRTIO_ADMIN_CMD_LREG_WRITE & Write legacy registers of a member device \\
> +0x0003 & VIRTIO_ADMIN_CMD_LREG_READ & Read legacy registers of a member device \\
> +0x0004 & VIRTIO_ADMIN_CMD_LQ_NOTIFY_QUERY & Read the queue notification offset for legacy interface \\
> +0x0005 - 0x7FFF & - & Commands using \field{struct virtio_admin_cmd} \\
>  \hline
>  0x8000 - 0xFFFF & - & Reserved for future commands (possibly using a different structure) \\
>  \hline
> diff --git a/transport-pci-vf-regs.tex b/transport-pci-vf-regs.tex
> new file mode 100644
> index 0000000..16ced32
> --- /dev/null
> +++ b/transport-pci-vf-regs.tex

I'd like the name to reflect "legacy". Also I don't think this has to
be SRIOV generally.
It's just legacy PCI over admin commands.

Except for virtio_admin_cmd_lq_notify_query_result
which refers to PCI? But that one I can't say for sure what it does.

> @@ -0,0 +1,84 @@
> +\subsection{SR-IOV VFs Legacy Registers Access}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / SR-IOV Legacy Registers Access}
> +
> +As described in PCIe base specification \hyperref[intro:PCIe]{[PCIe]} PCI VFs
> +do not support IOBAR. A PCI PF device can optionally enable driver to access
> +its member PCI VFs devices legacy common configuration and device configuration
> +registers using an administration virtqueue. A PCI PF group owner device that
> +supports its member VFs legacy registers access via the administration
> +virtqueue should supports following commands.

As above. It actually can work for any group if we want to.

> +
> +\begin{enumerate}
> +\item Legacy Registers Write
> +\item Legacy Registers Read
> +\item Legacy Queue Notify Offset Query
> +\end{enumerate}
> +

Pls add some theory of operation. How can all this be used?

> +\subsubsection{Legacy Registers Write}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / SR-IOV Legacy Registers Access / Legacy Registers Write}
> +
> +Legacy registers write admin command follows \field{struct virtio_admin_cmd}.
> +This command writes legacy registers of a member VF device. Driver should write
> +appropriate register \field{size} depending on the width of the legacy
> +common registers or device specific registers.
> +Driver sets command \field{opcode} to VIRTIO_ADMIN_CMD_LREG_WRITE.
> +Driver sets \field{group_type} to 1 for VFs.
> +Driver sets \field{group_member_id} to a valid VF number.
> +
> +The \field{command_specific_data} has following listed structure format:
> +
> +\begin{lstlisting}
> +struct virtio_admin_cmd_lreg_wr_data {
> +        u8 offset; /* Starting byte offset of the register(s) to write */
> +        u8 size; /* Number of bytes to write into the register. */
> +        u8 register[];
> +};
> +\end{lstlisting}
> +
> +This command does not have any command specific result.
> +
> +\subsubsection{Legacy Registers Read}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / SR-IOV Legacy Registers Access / Legacy Registers Read}
> +
> +Legacy registers read admin command follows \field{struct virtio_admin_cmd}.
> +This command reads legacy registers of a member VF device. Driver should write
> +appropriate register \field{size} depending on the width of the legacy
> +common configuration registers or device specific registers.
> +Driver sets command \field{opcode} to VIRTIO_ADMIN_CMD_LREG_READ.
> +Driver sets \field{group_type} to 1 for VFs.
> +Driver sets \field{group_member_id} to a valid VF number.
> +
> +The \field{command_specific_data} has following listed structure format:
> +
> +\begin{lstlisting}
> +struct virtio_admin_cmd_lreg_rd_data {
> +        u8 offset; /* Starting byte offset of the register to read */
> +        u8 size; /* Number of bytes to read from the registers */
> +};
> +\end{lstlisting}
> +
> +When command completes successfully, command result contains following
> +listed content:
> +
> +\begin{lstlisting}
> +struct virtio_admin_cmd_lreg_rd_result {
> +        u8 registers[];
> +};
> +\end{lstlisting}
> +
> +\subsubsection{Legacy Queue Notify Offset Query}\label{sec:Virtio Transport Options / Virtio Over PCI Bus / SR-IOV Legacy Registers Access / Legacy Queue Notify Offset Query}
> +
> +This command returns the notify offset of the member VF for queue
> +notifications.

What is this notify offset? It's never explained.

> This command follows \field{struct virtio_admin_cmd}.
> +Driver sets command opcode \field{opcode} to VIRTIO_ADMIN_CMD_LQ_NOTIFY_QUERY.
> +There is no command specific data for this command.
> +Driver sets \field{group_type} to 1.
> +Driver sets \field{group_member_id} to a valid VF number.

I think ATM the limitation for this is that the member must be a pci
device, otherwise BAR is not well defined.
We will have to find a way to extend this for SIOV.
But that is all, please do not repeat documentation about
virtio_admin_cmd header, we have that in a central place.

> +
> +When command completes successfully, command result contains the queue
> +notification address in the listed format:
> +
> +\begin{lstlisting}
> +struct virtio_admin_cmd_lq_notify_query_result {
> +        u8 bar; /* PCI BAR number of the member VF */
> +        u8 reserved[7];
> +        le64 offset; /* Byte offset within the BAR */
> +};
> +\end{lstlisting}
> diff --git a/transport-pci.tex b/transport-pci.tex
> index ff889d3..b187576 100644
> --- a/transport-pci.tex
> +++ b/transport-pci.tex
> @@ -1179,3 +1179,5 @@ \subsubsection{Driver Handling Interrupts}\label{sec:Virtio Transport Options /
>  re-examine the configuration space to see what changed.
>  \end{itemize}
>  \end{itemize}
> +
> +\input{transport-pci-vf-regs.tex}

As simple as it is, I feel this falls far short of describing how
a device should operate.
Some issues:
- legacy device config offset changes as msi is enabled/disabled
  suggest separate commands for device/common config
- legacy device endian-ness changes with guest
  suggest commands to enable LE and BE mode
- legacy guests often assume INT#x support
  suggest a way to tunnel that too;
  though supporting ISR is going to be a challenge :(
- I presume admin command is not the way to do kicks? Or is it ok?
- there's some kind of notify thing here?

I expected to see more statements along the lines of
command ABC has the same effect as access to register DEF of the member
through the legacy pci interface

> --
> 2.26.2
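
The first issue in the list above (device config offset moving with MSI-X) can be sketched in C. This is only an illustration and is not taken from the patch. It assumes the legacy virtio PCI layout, where the device-specific configuration starts at offset 20 of the I/O BAR when MSI-X is disabled and at offset 24 when it is enabled. The helpers legacy_devcfg_to_bar_offset() and lreg_wr_build() are hypothetical names, and the struct mirrors the proposed virtio_admin_cmd_lreg_wr_data with the array renamed to regs, because register is a reserved word in C.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Size of the legacy common header in the I/O BAR (legacy virtio PCI layout). */
#define LEGACY_HDR_SIZE_NO_MSIX 20u /* device config starts here without MSI-X */
#define LEGACY_HDR_SIZE_MSIX    24u /* two 16-bit vector fields are appended */

/* Mirrors the proposed command-specific data; "regs" replaces "register",
 * which is a C keyword (assumption made for this sketch only). */
struct lreg_wr_data {
    uint8_t offset; /* starting byte offset of the register(s) to write */
    uint8_t size;   /* number of bytes to write */
    uint8_t regs[]; /* bytes to write */
};

/* Hypothetical helper: map an offset inside the device-specific config to the
 * offset the single "offset" field would need. Returns -1 if it does not fit
 * in the 8-bit field. */
static int legacy_devcfg_to_bar_offset(uint8_t devcfg_off, bool msix_enabled)
{
    unsigned int base = msix_enabled ? LEGACY_HDR_SIZE_MSIX
                                     : LEGACY_HDR_SIZE_NO_MSIX;
    unsigned int off = base + devcfg_off;

    return off > UINT8_MAX ? -1 : (int)off;
}

/* Hypothetical helper: fill a caller-provided buffer with the write payload.
 * Returns the number of bytes used, or 0 if the buffer is too small. */
static size_t lreg_wr_build(void *buf, size_t buf_len, uint8_t offset,
                            const void *val, uint8_t size)
{
    struct lreg_wr_data *d = buf;
    size_t need = sizeof(*d) + size;

    if (buf_len < need)
        return 0;

    d->offset = offset;
    d->size = size;
    memcpy(d->regs, val, size);
    return need;
}
```

With one offset-based command, whoever builds the payload has to track the member's MSI-X state to choose between 20 and 24 for the same device config field. Separate commands for common and device config, as suggested above, would remove that dependency.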