OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [External] Re: [RFC v4] virtio-vsock: add description for datagram type


On Wed, Jun 9, 2021 at 12:17 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> On Tue, Jun 08, 2021 at 09:22:26PM -0700, Jiang Wang . wrote:
> >On Tue, Jun 8, 2021 at 6:46 AM Stefano Garzarella <sgarzare@redhat.com> wrote:
> >>
> >> On Fri, May 28, 2021 at 04:01:18AM +0000, Jiang Wang wrote:
> >> >From: "jiang.wang" <jiang.wang@bytedance.com>
> >> >
> >> >Add supports for datagram type for virtio-vsock. Datagram
> >> >sockets are connectionless and unreliable. To avoid contention
> >> >with stream and other sockets, add two more virtqueues and
> >> >a new feature bit to identify if those two new queues exist or not.
> >> >
> >> >Also add descriptions for resource management of datagram, which
> >> >does not use the existing credit update mechanism associated with
> >> >stream sockets.
> >> >
> >> >Signed-off-by: Jiang Wang <jiang.wang@bytedance.com>
> >> >---
> >> >
> >> >V2: addressed the comments for the previous version.
> >> >V3: add description for the mergeable receive buffer.
> >> >V4: add a feature bit for stream and reserver a bit for seqpacket.
> >> >Fix mrg_rxbuf related sentences.
> >> >
> >> > virtio-vsock.tex | 155 ++++++++++++++++++++++++++++++++++++++++++++++++++-----
> >> > 1 file changed, 142 insertions(+), 13 deletions(-)
> >> >
> >> >diff --git a/virtio-vsock.tex b/virtio-vsock.tex
> >> >index da7e641..bacac3c 100644
> >> >--- a/virtio-vsock.tex
> >> >+++ b/virtio-vsock.tex
> >> >@@ -9,14 +9,41 @@ \subsection{Device ID}\label{sec:Device Types / Socket Device / Device ID}
> >> >
> >> > \subsection{Virtqueues}\label{sec:Device Types / Socket Device / Virtqueues}
> >> > \begin{description}
> >> >-\item[0] rx
> >> >-\item[1] tx
> >> >+\item[0] stream rx
> >> >+\item[1] stream tx
> >> >+\item[2] datagram rx
> >> >+\item[3] datagram tx
> >> >+\item[4] event
> >>
> >> Is there a particular reason to always have the event queue as the last
> >> one?
> >>
> >> Maybe it's better to add the datagram queues at the bottom, so the first
> >> 3 queues are always the same.
> >>
> >I am not sure. I think Linux kernel should be fine with what you described.
> >But I am not sure about QEMU. From the code, I see virtqueue is allocated
> >as an array, like following,
> >
> >+ #ifdef CONFIG_VHOST_VSOCK_DGRAM
> >+    struct vhost_virtqueue vhost_vqs[4];
> >+ #else
> >    struct vhost_virtqueue vhost_vqs[2];
> >+ #endi
>
> I see, also vhost_dev_init() requires an array, so I agree that this is
> the best approach, sorry for the noise.
>
> Just to be sure to check that anything is working if
> CONFIG_VHOST_VSOCK_DGRAM is defined, but the guest has an old driver
> that doesn't support DGRAM, and viceversa.

Sure.  I just want to mention that the QEMU should be consistent
with the device (host). If QEMU enabled CONFIG_VHOST_VSOCK_DGRAM,
the device also needs to enable a similar option. Than the driver can
be old or new versions.

> >
> >so I assume the virtqueues for tx/rx should be
> >continuous? I can try to put the new queues at the end and see if it
> >works or not.
> >
> >btw, my qemu change is here:
> >https://github.com/Jiang1155/qemu/commit/6307aa7a0c347905a31f3ca6577923e2f6dd9d84
> >
> >> >+\end{description}
> >> >+The virtio socket device uses 5 queues if feature bit VIRTIO_VSOCK_F_DRGAM is set. Otherwise, it
> >> >+only uses 3 queues, as the following.
> >> >+
> >> >+\begin{description}
> >> >+\item[0] stream rx
> >> >+\item[1] stream tx
> >> > \item[2] event
> >> > \end{description}
> >> >
> >> >+When behavior differs between stream and datagram rx/tx virtqueues
> >> >+their full names are used. Common behavior is simply described in
> >> >+terms of rx/tx virtqueues and applies to both stream and datagram
> >> >+virtqueues.
> >> >+
> >> > \subsection{Feature bits}\label{sec:Device Types / Socket Device / Feature bits}
> >> >
> >> >-There are currently no feature bits defined for this device.
> >> >+\begin{description}
> >> >+\item[VIRTIO_VSOCK_F_STREAM (0)] Device has support for stream socket type.
> >> >+\end{description}
> >> >+
> >> >+\begin{description}
> >> >+\item[VIRTIO_VSOCK_F_DGRAM (2)] Device has support for datagram socket
> >> >type.
> >> >+\end{description}
> >> >+
> >> >+\begin{description}
> >> >+\item[VIRTIO_VSOCK_F_MRG_RXBUF (3)] Driver can merge receive buffers.
> >> >+\end{description}
> >> >+
> >> >+If no feature bits are defined, then assume only VIRTIO_VSOCK_F_STREAM
> >> >is set.
> >>
> >> I'd say more like socket streams are supported, without reference to the
> >> feature bit, something like: "If no feature bits are defined, then
> >> assume device only supports stream socket type."
> >>
> >OK.
> >
> >> >
> >> > \subsection{Device configuration layout}\label{sec:Device Types / Socket Device / Device configuration layout}
> >> >
> >> >@@ -64,6 +91,8 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> >> >
> >> > Packets transmitted or received contain a header before the payload:
> >> >
> >> >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, use the following header.
> >> >+
> >> > \begin{lstlisting}
> >> > struct virtio_vsock_hdr {
> >> >       le64 src_cid;
> >> >@@ -79,6 +108,15 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> >> > };
> >> > \end{lstlisting}
> >> >
> >> >+If feature VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, use the following header.
> >> >+\begin{lstlisting}
> >> >+struct virtio_vsock_hdr_mrg_rxbuf {
> >> >+      struct virtio_vsock_hdr hdr;
> >> >+      le16 num_buffers;
> >> >+};
> >> >+\end{lstlisting}
> >> >+
> >> >+
> >> > The upper 32 bits of src_cid and dst_cid are reserved and zeroed.
> >> >
> >> > Most packets simply transfer data but control packets are also used for
> >> >@@ -107,6 +145,9 @@ \subsection{Device Operation}\label{sec:Device Types / Socket Device / Device Op
> >> >
> >> > \subsubsection{Virtqueue Flow Control}\label{sec:Device Types / Socket Device / Device Operation / Virtqueue Flow Control}
> >> >
> >> >+Flow control applies to stream sockets; datagram sockets do not have
> >> >+flow control.
> >> >+
> >> > The tx virtqueue carries packets initiated by applications and replies to
> >> > received packets.  The rx virtqueue carries packets initiated by the device and
> >> > replies to previously transmitted packets.
> >> >@@ -140,12 +181,15 @@ \subsubsection{Addressing}\label{sec:Device Types / Socket Device / Device Opera
> >> > consists of a (cid, port number) tuple. The header fields used for this are
> >> > \field{src_cid}, \field{src_port}, \field{dst_cid}, and \field{dst_port}.
> >> >
> >> >-Currently only stream sockets are supported. \field{type} is 1 for
> >> >stream
> >> >-socket types.
> >> >+Currently stream and datagram (dgram) sockets are supported. \field{type} is 1 for stream
> >> >+socket types. \field{type} is 3 for dgram socket types.
> >>
> >> When Arseny's change will merged, we can define and use
> >> VIRTIO_VSOCK_TYPE_DGRAM,.
> >
> >Sure.
> >> >
> >> > Stream sockets provide in-order, guaranteed, connection-oriented
> >> > delivery
> >> > without message boundaries.
> >> >
> >> >+Datagram sockets provide unordered, unreliable, connectionless
> >> >messages
> >> >+with message boundaries and a maximum length.
> >> >+
> >> > \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device / Device Operation / Buffer Space Management}
> >> > \field{buf_alloc} and \field{fwd_cnt} are used for buffer space
> >> > management of
> >> > stream sockets. The guest and the device publish how much buffer space is
> >> >@@ -162,7 +206,7 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device /
> >> > u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt);
> >> > \end{lstlisting}
> >> >
> >> >-If there is insufficient buffer space, the sender waits until virtqueue buffers
> >> >+For stream sockets, if there is insufficient buffer space, the sender waits until virtqueue buffers
> >> > are returned and checks \field{buf_alloc} and \field{fwd_cnt} again. Sending
> >> > the VIRTIO_VSOCK_OP_CREDIT_REQUEST packet queries how much buffer space is
> >> > available. The reply to this query is a VIRTIO_VSOCK_OP_CREDIT_UPDATE packet.
> >> >@@ -170,24 +214,55 @@ \subsubsection{Buffer Space Management}\label{sec:Device Types / Socket Device /
> >> > previously receiving a VIRTIO_VSOCK_OP_CREDIT_REQUEST packet. This allows
> >> > communicating updates any time a change in buffer space occurs.
> >> >
> >> >+Unlike stream sockets, dgram sockets do not use VIRTIO_VSOCK_OP_CREDIT_UPDATE or
> >> >+VIRTIO_VSOCK_OP_CREDIT_REQUEST packets. The dgram buffer management
> >> >+is split to two parts: tx side and rx side. For the tx side, if the
> >>
> >> Maybe better to use sender and receiver, since we use tx and rx to
> >> identify the queues.
> >
> >OK.
> >
> >> >+virtqueue is full, the packet will be dropped.
> >> >+For the rx side, dgram also uses the \field{buf_alloc}. If it is full, the packet
> >> >+is dropped by the receiver.
> >>
> >> This sentence is a bit unclear.
> >> `buf_alloc` for stream socket is used to inform the other peer about the
> >> receive buffer space. In this case we are using the local information,
> >> so there is no need to refer to `buf_alloc`. We can write something
> >> like: "The packet is dropped by the receiver if there is no space in the
> >> receive buffer".
> >
> >OK.
> >
> >> >+
> >> >+\drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Setting Up Receive Buffers}
> >> >+\begin{itemize}
> >> >+\item If VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, the driver SHOULD populate the receive queue(s)
> >> >+      with buffers of at least 1526 bytes for stream sockets and 4096
> >> >bytes for datagram sockets.
> >>
> >> Where does 1526 come from?
> >
> >No specific reason. Any recommendations?
> >
> >> We're adding a requirement for socket streams that wasn't there until
> >> now.
> >
> >This is only when mergeable rxbuf bit is used. I think before this, the stream
> >rx buf should be at least bigger than the pkt header. We just did not put that
> >into the spec.
>
> Mmm, I'm confused now. The statement says "If VIRTIO_VSOCK_F_MRG_RXBUF
> is not negotiated"
>     ^
My bad. You are right, this is a new requirement for stream sockets.
Not a typo.  Maybe remove the part for stream sockets? Something
like following:

If VIRTIO_VSOCK_F_MRG_RXBUF is not negotiated, the driver SHOULD
populate the receive queue(s)
with buffers of  at least 4096 bytes for datagram sockets.

> Is it a typo?
>
>
> IIUC when VIRTIO_VSOCK_F_MRG_RXBUF is negotiated the minimum buffer size
> is virtio_vsock_hdr_mgr_rxbuf...
>
> >
> >> >+\item If VIRTIO_VSOCK_F_MRG_RXBUF is negotiated, each buffer MUST be at
> >> >+least the size of the struct virtio_vsock_hdr_mgr_rxbuf.
>
> ... from this ^ statement.
>
> And I agree that rx buf should be at least the virtio_vsock_hdr size.
>
> >> >+\end{itemize}
> >> >+
> >> >+\begin{note}
> >> >+Obviously each buffer can be split across multiple descriptor
> >> >elements.
> >> >+\end{note}
> >> >+
> >> >+\devicenormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Setting Up Receive Buffers}
> >> >+The device MUST set \field{num_buffers} to the number of descriptors used when
> >> >+transmitting the  packet.
> >> >+
> >> >+The device MUST use only a single descriptor if VIRTIO_VSOCK_F_MRG_RXBUF
> >> >+is not negotiated.
> >> >+
> >> > \drivernormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management}
> >> >-VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> >> >-sufficient free buffer space for the payload.
> >> >+For stream sockets, VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> >> >+sufficient free buffer space for the payload. For dgram sockets, VIRTIO_VSOCK_OP_RW data packets
> >> >+MAY be transmitted when the peer rx buffer is full. Then the packet will be dropped by the peer,
> >> >+and driver will not get any notification.
> >> >
> >> > All packets associated with a stream flow MUST contain valid
> >> > information in
> >> > \field{buf_alloc} and \field{fwd_cnt} fields.
> >> >
> >> > \devicenormative{\paragraph}{Device Operation: Buffer Space Management}{Device Types / Socket Device / Device Operation / Buffer Space Management}
> >> >-VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> >> >-sufficient free buffer space for the payload.
> >> >+For stream sockets, VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> >> >+sufficient free buffer space for the payload. For dgram sockets,
> >> >VIRTIO_VSOCK_OP_RW data packets
> >> >+MAY be transmitted when the peer rx buffer is full. Then the packet will be dropped by the peer,
> >> >+and the device will not get any notification.
> >> >
> >> > All packets associated with a stream flow MUST contain valid information in
> >> > \field{buf_alloc} and \field{fwd_cnt} fields.
> >> >
> >> > \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / Device Operation / Receive and Transmit}
> >> >-The driver queues outgoing packets on the tx virtqueue and incoming
> >> >packet
> >> >+The driver queues outgoing packets on the tx virtqueue and allocates incoming packet
> >> > receive buffers on the rx virtqueue. Packets are of the following form:
> >> >
> >> >+If VIRTIO_VSOCK_F_MRG_RXBUF was not negotiated, use the following.
> >>
> >> Please use present as in the rest of the document,
> >
> >Sure. I see both past tense and present tense in the spec, so I was a
> >little bit confused.
>
> Sorry about that. I generally always saw the present tense and seemed to
> use it everywhere.
>
Sure. No problem.

> >> > \begin{lstlisting}
> >> > struct virtio_vsock_packet {
> >> >     struct virtio_vsock_hdr hdr;
> >> >@@ -195,24 +270,70 @@ \subsubsection{Receive and Transmit}\label{sec:Device Types / Socket Device / De
> >> > };
> >> > \end{lstlisting}
> >> >
> >> >+Otherwise, use the following form:
> >>
> >> Maybe better to repeat:
> >>
> >> If VIRTIO_VSOCK_F_MRG_RXBUF is negotiated...
> >>
> >OK.
> >
> >> >+\begin{lstlisting}
> >> >+struct virtio_vsock_packet_mrg_rxbuf {
> >> >+    struct virtio_vsock_hdr_mrg_rxbuf hdr;
> >> >+    u8 data[];
> >> >+};
> >> >+\end{lstlisting}
> >> >+
> >> >+
> >> > Virtqueue buffers for outgoing packets are read-only. Virtqueue buffers for
> >> > incoming packets are write-only.
> >> >
> >> >+When transmitting packets to the device, \field{num_buffers} is not
> >> >used.
> >> >+
> >> >+\begin{enumerate}
> >> >+\item \field{num_buffers} indicates how many descriptors
> >> >+  this packet is spread over (including this one).
> >> >+  This is valid only if VIRTIO_VSOCK_F_MRG_RXBUF was negotiated.
> >> >+  This allows receipt of large packets without having to allocate large
> >> >+  buffers: a packet that does not fit in a single buffer can flow
> >> >+  over to the next buffer, and so on. In this case, there will be
> >> >+  at least \field{num_buffers} used buffers in the virtqueue, and the device
> >> >+  chains them together to form a single packet in a way similar to
> >> >+  how it would store it in a single buffer spread over multiple
> >> >+  descriptors.
> >> >+  The other buffers will not begin with a struct virtio_vsock_hdr.
> >> >+
> >> >+  If VIRTIO_VSOCK_F_MRG_RXBUF was not negotiated, then only one
> >> >+  descriptor is used.
> >> >+
> >> >+\item If
> >> >+  \field{num_buffers} is one, then the entire packet will be
> >> >+  contained within this buffer, immediately following the struct
> >> >+  virtio_vsock_hdr.
> >> >+\end{enumerate}
> >> >+
> >> > \drivernormative{\paragraph}{Device Operation: Receive and
> >> > Transmit}{Device Types / Socket Device / Device Operation / Receive
> >> > and Transmit}
> >> >
> >> > The \field{guest_cid} configuration field MUST be used as the source CID when
> >> > sending outgoing packets.
> >> >
> >> >-A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an
> >> >+For stream and datagram sockets, A VIRTIO_VSOCK_OP_RST reply MUST be
> >>
> >> Perhaps we can leave it as it was before, since the `type` field
> >> identifies the socket type itself.
> >
> >OK.
> >
> >> >sent if a packet is received with an
> >> > unknown \field{type} value.
> >> >
> >> > \devicenormative{\paragraph}{Device Operation: Receive and Transmit}{Device Types / Socket Device / Device Operation / Receive and Transmit}
> >> >
> >> > The \field{guest_cid} configuration field MUST NOT contain a reserved CID as listed in \ref{sec:Device Types / Socket Device / Device configuration layout}.
> >> >
> >> >-A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received with an
> >> >+For stream and datagram sockets, A VIRTIO_VSOCK_OP_RST reply MUST
> >> >be sent if a packet is received with an
> >> > unknown \field{type} value.
> >>
> >> Ditto.
> >
> >OK.
> >
> >> >
> >> >+If VIRTIO_VSOCK_F_MRG_RXBUF has been negotiated, the device MUST set
> >> >+\field{num_buffers} to indicate the number of buffers
> >> >+the packet (including the header) is spread over.
> >> >+
> >> >+If a receive packet is spread over multiple buffers, the device
> >> >+MUST use all buffers but the last (i.e. the first $\field{num_buffers} -
> >> >+1$ buffers) completely up to the full length of each buffer
> >> >+supplied by the driver.
> >> >+
> >> >+The device MUST use all buffers used by a single receive
> >> >+packet together, such that at least \field{num_buffers} are
> >> >+observed by driver as used.
> >> >+
> >> > \subsubsection{Stream Sockets}\label{sec:Device Types / Socket
> >> > Device / Device Operation / Stream Sockets}
> >> >
> >> > Connections are established by sending a VIRTIO_VSOCK_OP_REQUEST packet. If a
> >> >@@ -240,6 +361,14 @@ \subsubsection{Stream Sockets}\label{sec:Device Types / Socket Device / Device O
> >> > destination) address tuple for a new connection while the other peer is still
> >> > processing the old connection.
> >> >
> >> >+\subsubsection{Datagram Sockets}\label{sec:Device Types / Socket Device / Device Operation / Datagram Sockets}
> >> >+
> >> >+Datagram (dgram) sockets are connectionless and unreliable. The sender just sends
> >> >+a message to the peer and hopes it will be delivered. A VIRTIO_VSOCK_OP_RST reply is sent if
> >> >+a receiving socket does not exist on the destination.
> >> >+If the transmission or receiving buffers are full, the packets
> >> >+are dropped.
> >> >+
> >> > \subsubsection{Device Events}\label{sec:Device Types / Socket Device / Device Operation / Device Events}
> >> >
> >> > Certain events are communicated by the device to the driver using the event
> >> >--
> >> >2.11.0
> >> >
> >>
> >> I don't know if maybe it's better to break this patch in two, one
> >> where
> >> we add datagram and one for mergeable buffer.
> >>
> >> But let's see what others think.
> >
> >OK. I can definitely break it to two patches if necessary. Thanks for
> >all the comments.
>
> Thanks,
> Stefano
>


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]