virtio-comment message


Subject: RE: [virtio-comment] RE: RE: Re: Re: Re: [PATCH v2 06/11] transport-fabrics: introduce command set


> From: zhenwei pi <pizhenwei@bytedance.com>
> Sent: Thursday, June 8, 2023 11:55 PM

> I tried to understand your proposal, please correct me if I misunderstand...
> 
> Define data structure like:
> 
> struct virtio_of_keyed_desc {
>          le64 addr;
>          le32 length;
>          le32 key;
> };
> 
> struct virtio_of_command_vq {
>          le16 opcode;
>          le16 command_id;
>          le32 out_length;
>          le32 in_length;
>          union {
>                  struct virtio_of_keyed {
>                          le32 out_offset;
>                  };
> 
>                  struct virtio_of_stream {
>                          u8 rsvd[4];
>                  };
>          };
> };
> 
> struct virtio_of_completion {
>          le16 status;
>          le16 command_id;
>          u8 rsvd[4];
>          union {
>                  le64 value;
>                  struct virtio_of_vq_completion {
>                          le32 in_length;
>                          le32 len;
>                  };
>          };
> };
> 
> 
> For stream (e.g. TCP/IP), the request PDU includes [struct virtio_of_command_vq
> + data], the response PDU includes [struct virtio_of_completion + data].
> 
> For keyed (e.g. RDMA), the request PDU includes [struct virtio_of_command_vq +
> struct virtio_of_keyed_desc]. There are 2 opcodes for keyed transmission:
> 1, opcode virtio_of_op_vq: (basic and required command) the initiator prepares
> a buffer of [out_length + in_length], the target receives a 32B command, reads
> the remote memory [addr, addr+out_length) by RDMA READ, then writes the
> remote memory [addr+out_length, addr+out_length+in_length) by RDMA WRITE,
> and finally sends the completion by RDMA SEND.
> 
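For reference, the address arithmetic in the virtio_of_op_vq flow quoted above could be sketched as below. This is only an illustration, not proposed spec text: the struct names are host-order mirrors of the quoted definitions, and split_keyed_buffer() and struct remote_range are hypothetical helpers. It assumes natural alignment with no padding, so that the 16B command plus the 16B keyed desc make up the 32B command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-order mirror of the quoted keyed descriptor (le64/le32 on the wire). */
struct keyed_desc {
    uint64_t addr;
    uint32_t length;
    uint32_t key;
};

/* Host-order mirror of the quoted command, keyed variant of the union. */
struct command_vq {
    uint16_t opcode;
    uint16_t command_id;
    uint32_t out_length;
    uint32_t in_length;
    uint32_t out_offset;
};

/* 16B command + 16B descriptor gives the 32B command mentioned above. */
static_assert(sizeof(struct command_vq) == 16, "command is 16 bytes");
static_assert(sizeof(struct keyed_desc) == 16, "descriptor is 16 bytes");

/* Half-open remote memory range [start, end). Hypothetical helper type. */
struct remote_range {
    uint64_t start;
    uint64_t end;
};

/*
 * Hypothetical helper: split the single keyed buffer into the range the
 * target fetches by RDMA READ (out) and the range it fills by RDMA WRITE (in).
 * Returns false if the buffer is too small or the address range would wrap.
 */
static bool split_keyed_buffer(const struct command_vq *cmd,
                               const struct keyed_desc *desc,
                               struct remote_range *out,
                               struct remote_range *in)
{
    uint64_t total = (uint64_t)cmd->out_length + cmd->in_length;

    if (total > desc->length)
        return false;
    if (desc->addr > UINT64_MAX - total)
        return false;

    out->start = desc->addr;
    out->end = desc->addr + cmd->out_length;
    in->start = out->end;
    in->end = out->end + cmd->in_length;
    return true;
}
```

With addr = 0x10000, out_length = 4096 and in_length = 512, the target would read [0x10000, 0x11000) and write [0x11000, 0x11200).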
Maybe we can switch to the 64B format, which has two benefits:
1. A separate RDMA buffer for each of the in and out transfers, as each can have different DMA attributes.
2. The ability to have one or more inline descs.

A good way is to negotiate max_cmd_size, with a minimum of 32 and a maximum that is a finite, reasonable number such as 64 or 128.
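A rough sketch of how such a negotiation could work, to make the sizes concrete. Everything here is an assumption for illustration: the constant names, the rule of taking the smaller of the two sides' values, the 128 cap, and rounding down to a whole number of 16B descs are not defined anywhere yet.

```c
#include <assert.h>
#include <stdint.h>

/* All names and values below are hypothetical, for illustration only. */
enum {
    CMD_HDR_SIZE     = 16,  /* quoted struct virtio_of_command_vq */
    KEYED_DESC_SIZE  = 16,  /* quoted struct virtio_of_keyed_desc */
    MAX_CMD_SIZE_MIN = 32,  /* header + one keyed desc */
    MAX_CMD_SIZE_CAP = 128, /* assumed upper bound */
};

/*
 * Assumed rule: use the smaller of what both sides support, clamp to the
 * cap, and round down so the command holds a whole number of descs.
 * Returns 0 if the result would fall below the 32B minimum.
 */
static uint32_t negotiate_max_cmd_size(uint32_t initiator_max,
                                       uint32_t target_max)
{
    uint32_t size = initiator_max < target_max ? initiator_max : target_max;

    if (size > MAX_CMD_SIZE_CAP)
        size = MAX_CMD_SIZE_CAP;
    size -= size % KEYED_DESC_SIZE;
    if (size < MAX_CMD_SIZE_MIN)
        return 0;
    return size;
}

/* Number of inline keyed descs that fit after the 16B command header. */
static uint32_t descs_per_command(uint32_t max_cmd_size)
{
    if (max_cmd_size < CMD_HDR_SIZE)
        return 0;
    return (max_cmd_size - CMD_HDR_SIZE) / KEYED_DESC_SIZE;
}
```

Under these assumptions a 32B command carries one desc (today's shared in/out buffer), a 64B command carries three (enough for separate in and out buffers), and a 128B command carries seven.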

> 2, opcode virtio_of_op_vq_write_inline: (optional command)
> the initiator gets a remote buffer of the target (e.g. 128K) after feature
> negotiation.
> 
> The initiator selects a region of the target's remote memory (e.g. 4K - 12K),
> writes the payload by RDMA WRITE, then sends a 32B command by RDMA
> SEND (out_offset is 4K, ).
> The target handles the command, writes the remote memory [addr,
> addr+in_length), and finally sends the completion by RDMA SEND.
Yes.
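On the target side, the inline write case would presumably need a bounds check of out_offset and out_length against the buffer advertised at feature negotiation. A minimal sketch, where inline_region_valid() is a hypothetical name and the buffer size is passed in directly:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: does [out_offset, out_offset + out_length) fall
 * inside the target buffer of buf_size bytes? Written with a subtraction
 * so that offset + length cannot overflow.
 */
static bool inline_region_valid(uint32_t buf_size,
                                uint32_t out_offset,
                                uint32_t out_length)
{
    return out_offset <= buf_size && out_length <= buf_size - out_offset;
}
```

For the example above (128K buffer, region 4K - 12K) the command would carry out_offset = 4096 and out_length = 8192, which passes this check.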

