virtio-comment message



Subject: Re: [PATCH v2 05/11] transport-fabrics: introduce Keyed Transmission




On 5/4/2023 4:19 AM, zhenwei pi wrote:
Keyed transmission is used for message oriented communication(Ex RDMA),
also add virtio-blk read/write 8K example.

Signed-off-by: zhenwei pi <pizhenwei@bytedance.com>
---

+An example of a virtio-blk write 8K request(message size: sizeof(Command) +
+4 * sizeof(Descriptor)):
+\begin{lstlisting}
+ COMMAND            +------+
+                    |opcode|  ->  virtio_of_op_vring
+                    +------+
+                    |cmd id|  ->  10
+                    +------+
+                    |length|  ->  0
+                    +------+
+                    |ndesc |  ->  4
+                    +------+
+                    |rsvd  |
+                    +------+
+
+ DESC0              +------+
+                    |addr  |  -> 0xffff012345670000
+                    +------+
+                    |length|  -> 16 (virtio blk write command)
+                    +------+
+                    |id    |  -> 0
+                    +------+

For RDMA this id is not useful; it can be omitted.
I am still parsing the rest.

Taking blk as an example, the above command descriptor can be 32 bytes,
such as:
struct virtio_of_cmd {
	u8 opcode;
	u8 rsvd0;
	le16 cmd_id;
	u8 inline_desc_cnt;
	u8 rsvd1[3];	/* renamed: two members cannot both be called rsvd */
	/* some padding/metadata for long desc list if any */
};

struct virtio_of_rdma_desc {
	le64 addr;
	le32 length;
	le32 rdma_key;
};

struct virtio_rdma_op {
	struct virtio_of_cmd cmd;
	struct virtio_of_rdma_desc desc[]; /* 1 or 3 entries; count can be negotiated */
};
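
To make the layout concrete, here is a rough C sketch of the structs above with fixed-width types, plus a helper that fills one op for the 8K write case using a single descriptor. The names VIRTIO_OF_MAX_INLINE_DESC and virtio_of_fill_op are made up for illustration, a little-endian host is assumed, and the sizes checked cover only the fields listed (any padding/metadata would add to them).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fixed-width stand-ins for u8/le16/le32/le64.
 * Assumption: little-endian host, so no byte swapping is shown. */
struct virtio_of_cmd {
	uint8_t  opcode;
	uint8_t  rsvd0;
	uint16_t cmd_id;           /* le16 */
	uint8_t  inline_desc_cnt;
	uint8_t  rsvd1[3];
};

struct virtio_of_rdma_desc {
	uint64_t addr;             /* le64 */
	uint32_t length;           /* le32 */
	uint32_t rdma_key;         /* le32 */
};

#define VIRTIO_OF_MAX_INLINE_DESC 3   /* hypothetical negotiated count */

struct virtio_rdma_op {
	struct virtio_of_cmd cmd;
	struct virtio_of_rdma_desc desc[VIRTIO_OF_MAX_INLINE_DESC];
};

/* Sizes of the listed fields only (C11 static_assert). */
static_assert(sizeof(struct virtio_of_cmd) == 8, "cmd fields are 8 bytes");
static_assert(sizeof(struct virtio_of_rdma_desc) == 16, "desc is 16 bytes");

/* Hypothetical helper: describe one data buffer with one descriptor.
 * Returns the number of bytes that actually need to be sent. */
static size_t virtio_of_fill_op(struct virtio_rdma_op *op, uint8_t opcode,
				uint16_t cmd_id, uint64_t addr,
				uint32_t length, uint32_t rdma_key)
{
	memset(op, 0, sizeof(*op));
	op->cmd.opcode = opcode;
	op->cmd.cmd_id = cmd_id;
	op->cmd.inline_desc_cnt = 1;
	op->desc[0].addr = addr;
	op->desc[0].length = length;
	op->desc[0].rdma_key = rdma_key;
	/* header plus only the descriptors in use */
	return sizeof(op->cmd) +
	       op->cmd.inline_desc_cnt * sizeof(op->desc[0]);
}
```

Calling virtio_of_fill_op(&op, opcode, 10, 0xffff012345670000, 8192, key) would describe the whole 8K buffer with one descriptor instead of several.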

With this, a send queue and a receive queue on the initiator and the target can exchange command descriptors for reads and writes.

RDMA allows mapping memory and also chaining that mapping with the next send.
This way, memory from 1B up to 4GB (the limit of the le32 length) can be represented using a single RDMA key for data DMA (read or write).

The completion is similarly a constant-size 8B message carrying status + cmd_id, which can be received in an RQ.
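
As an illustration of the completion side, a possible 8-byte layout and the initiator-side matching by cmd_id could look like the sketch below. Only "8B, status + cmd_id" comes from the text above; the field order, the reserved bytes, the queue depth and the names virtio_of_comp, virtio_of_pending and virtio_of_complete are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 8-byte completion; only status + cmd_id are given. */
struct virtio_of_comp {
	uint16_t cmd_id;    /* le16, echoes cmd_id of the command */
	uint8_t  status;    /* 0 means success in this sketch */
	uint8_t  rsvd[5];
};

static_assert(sizeof(struct virtio_of_comp) == 8, "completion stays 8 bytes");

#define VIRTIO_OF_QUEUE_DEPTH 16   /* hypothetical outstanding-command limit */

/* Initiator-side bookkeeping, one slot per cmd_id. */
struct virtio_of_pending {
	int     in_flight;
	int     done;
	uint8_t status;
};

/* Match a received completion to its command.
 * Returns 0 on success, -1 for an unknown or stale cmd_id. */
static int virtio_of_complete(struct virtio_of_pending *tbl,
			      const struct virtio_of_comp *c)
{
	if (c->cmd_id >= VIRTIO_OF_QUEUE_DEPTH || !tbl[c->cmd_id].in_flight)
		return -1;
	tbl[c->cmd_id].in_flight = 0;
	tbl[c->cmd_id].done = 1;
	tbl[c->cmd_id].status = c->status;
	return 0;
}
```

Because the completion has a constant size, the RQ buffers can all be posted with the same length regardless of how much data the command moved.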

This takes 1 RTT from the initiator to the target for the cmd and its response, even for a whole 4GB data transfer. Depending on the data size, memory pressure, sharing, outstanding commands, etc., the target can read/write data from the initiator's memory using RDMA read/write on the given addresses.
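
One way the target might use that freedom is to split a large descriptor into smaller RDMA reads/writes sized to its own buffers. The sketch below only computes the chunk list and does not post any verbs work requests; virtio_of_chunk and virtio_of_split are hypothetical names, and the chunk size is whatever the target chooses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One piece of a larger transfer, as the target would issue it. */
struct virtio_of_chunk {
	uint64_t remote_addr;   /* address in initiator memory */
	uint32_t length;
};

/* Split [addr, addr + length) into pieces of at most chunk_size bytes.
 * Returns the number of chunks written to out, or 0 if length or
 * chunk_size is 0 or the pieces do not fit in max_chunks. */
static size_t virtio_of_split(uint64_t addr, uint32_t length,
			      uint32_t chunk_size,
			      struct virtio_of_chunk *out, size_t max_chunks)
{
	size_t n = 0;

	if (chunk_size == 0)
		return 0;
	while (length > 0) {
		uint32_t len = length < chunk_size ? length : chunk_size;

		if (n == max_chunks)
			return 0;
		out[n].remote_addr = addr;
		out[n].length = len;
		n++;
		addr += len;
		length -= len;
	}
	return n;
}
```

Each chunk would reuse the rdma_key from the descriptor, so the initiator still sends only one command and receives one completion.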

With this, the target can also implement polling or events.

RDMA writes do not guarantee that data placement becomes visible on the responder side in the same order in which it was sent on the requester side.

I ran out of time. Will review more later in the week.

