Subject: Re: Re: [virtio-comment] [PATCH v4 3/7] transport-fabrics: introduce Virtio-oF Protocol Data Unit
On 7/30/23 07:25, Raphael Norwitz wrote:
I'd suggest making some tweaks to save memory on the target side for RDMA.

On Jun 26, 2023, at 3:25 AM, zhenwei pi <pizhenwei@bytedance.com> wrote:

Introduce Virtio-oF PDU for Virtio-oF queue, add Stream Data Transfers and
Keyed Data Transfers mechanism.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
---
 transport-fabrics.tex | 68 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 68 insertions(+)

diff --git a/transport-fabrics.tex b/transport-fabrics.tex
index 54d7558..d562b2c 100644
--- a/transport-fabrics.tex
+++ b/transport-fabrics.tex
@@ -45,3 +45,71 @@ \subsection{Virtio-oF Qualified Name}\label{sec:Virtio Transport Options / Virti
 \item The maximum name is 256 bytes in length, including the NUL character.
 \item There is no strict style limitation.
 \end{itemize}
+
+
+\subsection{Protocol Data Unit}\label{sec:Virtio Transport Options / Virtio Over Fabrics / Protocol Data Unit}
+This section defines Virtio-oF Protocol Data Unit (PDU) for both the Virtio-oF control queue and Virtio-oF virtqueue.
+
+Virtio-oF PDU is a unit of information exchanged between a Virtio-oF initiator and a Virtio-oF target:
+\begin{itemize}
+\item A request Virtio-oF PDU from Virtio-oF initiator to Virtio-oF target contains command and associated data, if present.
+\item A response Virtio-oF PDU from Virtio-oF target to Virtio-oF initiator contains completion and associated data, if present.
+\end{itemize}
+
+\subsubsection{Stream Data Transfers}\label{sec:Virtio Transport Options / Virtio Over Fabrics / Protocol Data Unit/ Stream Data Transfers}
+Stream-based (i.e. TCP/IP) Virtio-oF queue transfers Virtio-oF PDUs in a streaming fashion.
+
+The layout in a stream:
+\begin{lstlisting}
+PDUx and PDUz contain a command only, PDUy contains a command and data:
+
+            +----+           +---------+           +----+
+            |PDUx|           |  PDUy   |           |PDUz|
+     ...    +----+    ...    +----+----+    ...    +----+    ...
+            |CMD |           |CMD |Data|           |CMD |
+            +----+           +---------+           +----+
+
+PDUl contains completion only, PDUm and PDUn contain completion and data:
+
+            +----+           +---------+           +---------+
+            |PDUl|           |  PDUm   |           |  PDUn   |
+     ...    +----+    ...    +----+----+    ...    +----+----+    ...
+            |COMP|           |COMP|Data|           |COMP|Data|
+            +----+           +---------+           +---------+
+\end{lstlisting}
+
+\subsubsection{Keyed Data Transfers}\label{sec:Virtio Transport Options / Virtio Over Fabrics / Protocol Data Unit/ Keyed Data Transfers}
+Message-based (i.e. RDMA) Virtio-oF queue transfers Virtio-oF PDUs in a message fashion, and uses the following structure to describe the remote data:
+
+\begin{lstlisting}
+struct virtio_of_keyed_desc {
+        /* the remote address of data */
+        le64 addr;
+        /* the length of data */
+        le32 length;
+        /* the key to access the remote data */
+        le32 key;
+};
+\end{lstlisting}
+
+A Virtio-oF control queue supports 1 keyed descriptor, and a Virtio-oF virtqueue supports 1 or more keyed descriptors.
+
+The PDUs of messages:
+\begin{lstlisting}
+PDUx contains a command only, PDUy contains a command and 1 descriptor,
+and PDUz contains a command and k - j + 1 descriptors.
+
+            +----+           +---------+           +--------------------+
+            |PDUx|           |  PDUy   |           |        PDUz        |
+     ...    +----+    ...    +----+----+    ...    +----+-----+---+-----+    ...
+            |CMD |           |CMD |DESC|           |CMD |DESCj|...|DESCk|
+            +----+           +---------+           +----------+---+-----+

On the target side you have to prepost (MAX_SG_LEN * sizeof(virtio_of_keyed_desc) + sizeof(virtio_command_vq)) * max outstanding for each connection. Assuming a 128 max outstanding and MAX_SG_LEN=1024 (QEMU's IOV_INTERNAL) that puts you at (1024 * 8 + 13) * 128 ~= 1MB per connection. If a target is servicing thousands of connections that adds up.

I'd suggest adding a protocol message to cap the MAX_SG_LEN and communicate the value to the target. Would also be great to save space in virtio_of_keyed_desc, but I don't think there is a way.
Hi,

I should point out that virtio_of_keyed_desc does not map 1:1 to a VQ descriptor. MAX_SG_LEN is expected to be at least 1 (the default), in which case a single memory region (MR) is used for both the WRITE and READ directions. For example, for a virtio-blk write of 128K, 1 virtio_of_keyed_desc describes an MR of 16 (OUT: virtio-blk header) + 128K (OUT: blk data) + 1 (IN: u8 status) bytes.
Alternatively, if MAX_SG_LEN is 2, the OUT vq buffers and IN vq buffers can be separated into 2 MRs, described by 2 virtio_of_keyed_desc.
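The two layouts can be sketched in C as follows. This is only an illustration: the helper names, base addresses, and keys are hypothetical, fixed-width host integers stand in for le64/le32 (byte-order conversion is omitted), and the buffers are assumed to be contiguous within each MR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct virtio_of_keyed_desc from the patch; host-endian stand-ins
 * for le64/le32 (assumption: no byte-order conversion shown here). */
struct virtio_of_keyed_desc {
    uint64_t addr;   /* the remote address of data */
    uint32_t length; /* the length of data */
    uint32_t key;    /* the key to access the remote data */
};

static_assert(sizeof(struct virtio_of_keyed_desc) == 16,
              "keyed descriptor is expected to be 16 bytes");

enum {
    BLK_HDR_LEN    = 16,         /* OUT: virtio-blk header */
    BLK_DATA_LEN   = 128 * 1024, /* OUT: blk data */
    BLK_STATUS_LEN = 1           /* IN: u8 status */
};

/* Hypothetical helper: one MR covers header + data + status.
 * Returns the number of descriptors filled in. */
static size_t describe_write_one_mr(struct virtio_of_keyed_desc *d,
                                    uint64_t base, uint32_t key)
{
    d[0].addr = base;
    d[0].length = BLK_HDR_LEN + BLK_DATA_LEN + BLK_STATUS_LEN;
    d[0].key = key;
    return 1;
}

/* Hypothetical helper: OUT buffers and IN buffers live in separate MRs.
 * Returns the number of descriptors filled in. */
static size_t describe_write_two_mr(struct virtio_of_keyed_desc *d,
                                    uint64_t out_base, uint32_t out_key,
                                    uint64_t in_base, uint32_t in_key)
{
    d[0].addr = out_base;
    d[0].length = BLK_HDR_LEN + BLK_DATA_LEN;
    d[0].key = out_key;

    d[1].addr = in_base;
    d[1].length = BLK_STATUS_LEN;
    d[1].key = in_key;
    return 2;
}
```

With these sizes, the single-MR descriptor has a length of 131089 bytes; the two-MR variant splits that into 131088 bytes (OUT) and 1 byte (IN).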
The 'Get Keyed Number Descriptors Command' (from [PATCH 4/7] transport-fabrics: introduce command set) is used to get 'MAX_SG_LEN'.
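To see how strongly MAX_SG_LEN affects the target-side prepost size, the formula from the quoted estimate can be written out as a small sketch. The function name is hypothetical, and the 13-byte command size and 128 outstanding requests are taken from that estimate rather than from the spec. Note that the struct as defined in the patch (le64 + le32 + le32) comes to 16 bytes, not 8, assuming no padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors struct virtio_of_keyed_desc from the patch; host-endian stand-ins
 * for le64/le32. */
struct virtio_of_keyed_desc {
    uint64_t addr;   /* the remote address of data */
    uint32_t length; /* the length of data */
    uint32_t key;    /* the key to access the remote data */
};

static_assert(sizeof(struct virtio_of_keyed_desc) == 16,
              "keyed descriptor is expected to be 16 bytes");

/* Hypothetical helper. Bytes a target preposts per connection:
 * (max_sg_len * sizeof(desc) + cmd_size) * max_outstanding */
static size_t prepost_bytes(size_t max_sg_len, size_t cmd_size,
                            size_t max_outstanding)
{
    return (max_sg_len * sizeof(struct virtio_of_keyed_desc) + cmd_size)
           * max_outstanding;
}
```

Under these assumptions, prepost_bytes(1024, 13, 128) gives 2098816 bytes (about 2 MB), while prepost_bytes(1, 13, 128) gives 3712 bytes and prepost_bytes(2, 13, 128) gives 5760 bytes per connection.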
+
+PDUl, PDUm, and PDUn contain completion only.
+
+            +----+           +----+           +----+
+            |PDUl|           |PDUm|           |PDUn|
+     ...    +----+    ...    +----+    ...    +----+    ...
+            |COMP|           |COMP|           |COMP|
+            +----+           +----+           +----+
+\end{lstlisting}
--
2.25.1
--
zhenwei pi