Subject: Re: Re: [virtio-comment] [PATCH v4 3/7] transport-fabrics: introduce Virtio-oF Protocol Data Unit




On 7/30/23 07:25, Raphael Norwitz wrote:
I'd suggest making some tweaks to save memory on the target side for RDMA.

On Jun 26, 2023, at 3:25 AM, zhenwei pi <pizhenwei@bytedance.com> wrote:

Introduce Virtio-oF PDU for Virtio-oF queue, add Stream Data Transfers
and Keyed Data Transfers mechanism.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
---
transport-fabrics.tex | 68 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 68 insertions(+)

diff --git a/transport-fabrics.tex b/transport-fabrics.tex
index 54d7558..d562b2c 100644
--- a/transport-fabrics.tex
+++ b/transport-fabrics.tex
@@ -45,3 +45,71 @@ \subsection{Virtio-oF Qualified Name}\label{sec:Virtio Transport Options / Virti
\item The maximum name is 256 bytes in length, including the NUL character.
\item There is no strict style limitation.
\end{itemize}
+
+
+\subsection{Protocol Data Unit}\label{sec:Virtio Transport Options / Virtio Over Fabrics / Protocol Data Unit}
+This section defines Virtio-oF Protocol Data Unit (PDU) for both the Virtio-oF control queue and Virtio-oF virtqueue.
+
+Virtio-oF PDU is a unit of information exchanged between a Virtio-oF initiator and a Virtio-oF target:
+\begin{itemize}
+\item A request Virtio-oF PDU from Virtio-oF initiator to Virtio-oF target contains command and associated data, if present.
+\item A response Virtio-oF PDU from Virtio-oF target to Virtio-oF initiator contains completion and associated data, if present.
+\end{itemize}
+
+\subsubsection{Stream Data Transfers}\label{sec:Virtio Transport Options / Virtio Over Fabrics / Protocol Data Unit/ Stream Data Transfers}
+Stream-based (i.e. TCP/IP) Virtio-oF queue transfers Virtio-oF PDUs in a streaming fashion.
+
+The layout in a stream:
+\begin{lstlisting}
+PDUx and PDUz contain a command only, PDUy contains a command and data:
+
+     +----+     +---------+     +----+
+     |PDUx|     |   PDUy  |     |PDUz|
+ ... +----+ ... +----+----+ ... +----+ ...
+     |CMD |     |CMD |Data|     |CMD |
+     +----+     +---------+     +----+
+
+PDUl contains completion only, PDUm and PDUn contain completion and data:
+
+     +----+     +---------+     +---------+
+     |PDUl|     |   PDUm  |     |   PDUn  |
+ ... +----+ ... +----+----+ ... +----+----+ ...
+     |COMP|     |COMP|Data|     |COMP|Data|
+     +----+     +---------+     +---------+
+\end{lstlisting}
+
+\subsubsection{Keyed Data Transfers}\label{sec:Virtio Transport Options / Virtio Over Fabrics / Protocol Data Unit/ Keyed Data Transfers}
+Message-based (i.e. RDMA) Virtio-oF queue transfers Virtio-oF PDUs in a message fashion, and uses the following structure to describe the remote data:
+
+\begin{lstlisting}
+struct virtio_of_keyed_desc {
+        /* the remote address of data */
+        le64 addr;
+        /* the length of data */
+        le32 length;
+        /* the key to access the remote data */
+        le32 key;
+};
+\end{lstlisting}
+
+A Virtio-oF control queue supports 1 keyed descriptor, and a Virtio-oF virtqueue supports 1 or more keyed descriptors.
+
+The PDUs of messages:
+\begin{lstlisting}
+PDUx contains a command only, PDUy contains a command and 1 descriptor,
+and PDUz contains a command and k - j + 1 descriptors.
+
+     +----+     +---------+     +--------------------+
+     |PDUx|     |   PDUy  |     |        PDUz        |
+ ... +----+ ... +----+----+ ... +----+-----+---+-----+ ...
+     |CMD |     |CMD |DESC|     |CMD |DESCj|...|DESCk|
+     +----+     +---------+     +----------+---+-----+

On the target side you have to prepost (MAX_SG_LEN * sizeof(virtio_of_keyed_desc) + sizeof(virtio_command_vq)) * max outstanding for each connection.

Assuming a 128 max outstanding and MAX_SG_LEN=1024 (QEMU's IOV_INTERNAL) that puts you at (1024 * 8 + 13) * 128 ~= 1MB per connection. If a target is servicing thousands of connections that adds up.

I'd suggest adding a protocol message to cap the MAX_SG_LEN and communicate the value to the target.

Would also be great to save space in virtio_of_keyed_desc, but I don't think there is a way.
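
As a rough check of the estimate above, here is a minimal C sketch of the prepost formula. The helper name prepost_bytes is hypothetical, and the 13-byte command size is simply the figure quoted above, not something verified against the command definitions. Note that virtio_of_keyed_desc as defined in the patch is 16 bytes (le64 + le32 + le32), so with that size the same parameters come to roughly 2MB per connection rather than 1MB. The le64/le32 fields are modelled with host-order fixed-width integers for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout taken from the patch; le64/le32 are modelled as plain
 * fixed-width integers (an assumption made for this sketch). */
struct virtio_of_keyed_desc {
        uint64_t addr;   /* the remote address of data */
        uint32_t length; /* the length of data */
        uint32_t key;    /* the key to access the remote data */
};

/* 8 + 4 + 4 bytes, no padding needed */
static_assert(sizeof(struct virtio_of_keyed_desc) == 16,
              "keyed descriptor is expected to be 16 bytes");

/* Hypothetical helper: bytes a target has to prepost per connection,
 * following the formula in the message above:
 * (MAX_SG_LEN * descriptor size + command size) * max outstanding. */
static size_t prepost_bytes(size_t max_sg_len, size_t desc_size,
                            size_t cmd_size, size_t max_outstanding)
{
        return (max_sg_len * desc_size + cmd_size) * max_outstanding;
}
```

With these assumptions, prepost_bytes(1024, 8, 13, 128) gives 1050240 bytes (the ~1MB quoted above), prepost_bytes(1024, sizeof(struct virtio_of_keyed_desc), 13, 128) gives 2098816 bytes, and capping MAX_SG_LEN at 2 brings the figure down to 5760 bytes.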


Hi,

I should point out that virtio_of_keyed_desc does not map 1:1 to a VQ descriptor.

MAX_SG_LEN is supposed to be at least 1 (the default), in which case 1 memory region (MR) is used for both the WRITE and READ directions. For example, for a virtio-blk write of 128K, 1 virtio_of_keyed_desc describes an MR of 16 (OUT: virtio-blk header) + 128K (OUT: blk data) + 1 (IN: u8 status) bytes.

Or, if MAX_SG_LEN is 2, we can separate the OUT vq buffers and the IN vq buffers into 2 MRs described by 2 virtio_of_keyed_desc.

'Get Keyed Number Descriptors Command' (from [PATCH 4/7] transport-fabrics: introduce command set) is used to get the 'MAX_SG_LEN'.
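
To make the two layouts above concrete, here is a minimal C sketch for the virtio-blk write example. The helper names, the constants, and the ordering of the two descriptors (OUT first, then IN) are assumptions made for illustration and are not taken from the patch. Fields are kept in host byte order, so the little-endian conversion is omitted.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout taken from the patch; le64/le32 are modelled as plain
 * fixed-width integers (an assumption made for this sketch). */
struct virtio_of_keyed_desc {
        uint64_t addr;   /* the remote address of data */
        uint32_t length; /* the length of data */
        uint32_t key;    /* the key to access the remote data */
};

/* Sizes used in the example above (hypothetical constant names). */
enum {
        BLK_HDR_LEN    = 16, /* OUT: virtio-blk header */
        BLK_STATUS_LEN = 1   /* IN: u8 status */
};

/* MAX_SG_LEN == 1: one MR covers header, data and status. */
static struct virtio_of_keyed_desc
single_mr_desc(uint64_t addr, uint32_t key, uint32_t data_len)
{
        struct virtio_of_keyed_desc d = {
                .addr = addr,
                .length = BLK_HDR_LEN + data_len + BLK_STATUS_LEN,
                .key = key,
        };
        return d;
}

/* MAX_SG_LEN == 2: descs[0] covers the OUT buffers (header + data),
 * descs[1] covers the IN buffer (status). */
static void split_mr_descs(struct virtio_of_keyed_desc descs[2],
                           uint64_t out_addr, uint32_t out_key,
                           uint64_t in_addr, uint32_t in_key,
                           uint32_t data_len)
{
        descs[0].addr = out_addr;
        descs[0].length = BLK_HDR_LEN + data_len;
        descs[0].key = out_key;

        descs[1].addr = in_addr;
        descs[1].length = BLK_STATUS_LEN;
        descs[1].key = in_key;
}
```

For a 128K write (data_len = 128 * 1024), the single descriptor has a length of 131089 bytes, while the split layout yields 131088 bytes for OUT and 1 byte for IN.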

+
+PDUl, PDUm, and PDUn contain completion only.
+
+     +----+     +----+     +----+
+     |PDUl|     |PDUm|     |PDUn|
+ ... +----+ ... +----+ ... +----+ ...
+     |COMP|     |COMP|     |COMP|
+     +----+     +----+     +----+
+\end{lstlisting}
--
2.25.1



--
zhenwei pi

