virtio-dev message


Subject: Re: [virtio-dev] [PATCH RESEND v2] vsock: add vsock device


On Thu, Jan 28, 2016 at 03:40:55PM +0000, Stefan Hajnoczi wrote:
> The virtio vsock device is a zero-configuration socket communications
> device.  It is designed as a guest<->host management channel suitable
> for communicating with guest agents.
> 
> vsock is designed with the sockets API in mind and the driver is
> typically implemented as an address family (at the same level as
> AF_INET).  Applications written for the sockets API can be ported with
> minimal changes (similar amount of effort as adding IPv6 support to an
> IPv4 application).
> 
> Unlike the existing console device, which is also used for guest<->host
> communication, multiple clients can connect to a server at the same time
> over vsock.  This limitation requires console-based users to arbitrate
> access through a single client.  In vsock they can connect directly and
> do not have to synchronize with each other.
> 
> Unlike network devices, no configuration is necessary because the device
> comes with its address in the configuration space.
> 
> The vsock device was prototyped by Gerd Hoffmann and Asias He.  I picked
> the code and design up from them.
> 
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: Asias He <asias.hejun@gmail.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> v2:
>  * Document guest_cid field
>  * Use MAY/MUST/CAN according to RFC 2119
>  * Remove datagram socket type for the time being.  This can be added in
>    the future but there are currently no applications.
>  * Drop 3-way handshake for stream sockets.  It is not needed since
>    virtio-vsock is reliable, in-order delivery and spoofing source
>    addresses is impossible.
>  * Drop max_virtqueue_pairs configuration space field.  This field was
>    never defined and Linux code does not support multiqueue.  It can be
>    added back later, if necessary.


The protocol part looks good to me.

One general comment: RFC 2119 keywords should appear only in conformance
statements, and these should appear in conformance clauses, separately
for device and driver. Sometimes this causes a bit of
duplication. Text outside conformance clauses should not
use RFC 2119 words. The point of this rule is to make
it easier to notice something that should be a conformance statement.

For example, the network device has:
\item \field{num_buffers} is set to zero.  This field is unused on
transmitted packets.

and then later we have
\drivernormative{\paragraph}{Packet Transmission}{Device Types / Network
Device / Device Operation / Packet Transmission}

The driver MUST set \field{num_buffers} to zero.


and this paragraph is linked from the correct normative statements section
(in conformance.tex).

There should also be a subsection listing normative statements
for your device.

> ---
>  trunk/content.tex | 152 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 152 insertions(+)
> 
> diff --git a/trunk/content.tex b/trunk/content.tex
> index d989d98..8b5b520 100644
> --- a/trunk/content.tex
> +++ b/trunk/content.tex
> @@ -5641,6 +5641,158 @@ descriptor for the \field{sense_len}, \field{residual},
>  \field{status_qualifier}, \field{status}, \field{response} and
>  \field{sense} fields.
>  
> +\section{VSock Device}\label{sec:Device Types / VSock Device}

I think we should call it "Socket Device" in free text. Avoid
abbreviations. While "v" in vsock seems somewhat redundant, you can keep
virtio_vsock in code snippets if you prefer.

> +
> +The virtio vsock device is a zero-configuration socket communications device.

and then this sentence becomes:

The virtio socket device supports zero-configuration socket communication.


> +It facilitates data transfer between the guest and device without using the
> +Ethernet or IP protocols.
> +
> +\subsection{Device ID}\label{sec:Device Types / VSock Device / Device ID}
> +  13
> +
> +\subsection{Virtqueues}\label{sec:Device Types / VSock Device / Virtqueues}
> +\begin{description}
> +\item[0] ctrl
> +\item[1] rx
> +\item[2] tx
> +\end{description}
> +
> +\subsection{Feature bits}\label{sec:Device Types / VSock Device / Feature bits}
> +
> +\begin{description}
> +There are currently no feature bits defined for this device.
> +\end{description}
> +
> +\subsection{Device configuration layout}\label{sec:Device Types / VSock Device / Device configuration layout}
> +
> +\begin{lstlisting}
> +struct virtio_vsock_config {
> +	__le32 guest_cid;
> +};
> +\end{lstlisting}
> +
> +The \field{guest_cid} field contains the guest's context ID, which uniquely
> +identifies the guest for the lifetime of the device.  The value MUST be used as
> +the source CID when sending outgoing packets.
> +
> +\subsection{Device Initialization}\label{sec:Device Types / VSock Device / Device Initialization}
> +
> +\begin{enumerate}
> +\item The guest's cid is read from \field{guest_cid}.
> +
> +\item Buffers are added to the rx virtqueue to start receiving packets.
> +\end{enumerate}
> +
> +\subsection{Device Operation}\label{sec:Device Types / VSock Device / Device Operation}
> +
> +Packets transmitted or received contain a header before the payload:
> +
> +\begin{lstlisting}
> +struct virtio_vsock_hdr {
> +	__le32 src_cid;
> +	__le32 src_port;
> +	__le32 dst_cid;
> +	__le32 dst_port;
> +	__le32 len;
> +	__le16 type;
> +	__le16 op;
> +	__le32 flags;
> +	__le32 buf_alloc;
> +	__le32 fwd_cnt;
> +};
> +\end{lstlisting}
> +
> +Most packets simply transfer data but control packets are also used for
> +connection and buffer space management.  \field{op} is one of the following
> +operation constants:
> +
> +\begin{lstlisting}
> +enum {
> +	VIRTIO_VSOCK_OP_INVALID = 0,
> +
> +	/* Connect operations */
> +	VIRTIO_VSOCK_OP_REQUEST = 1,
> +	VIRTIO_VSOCK_OP_RESPONSE = 2,
> +	VIRTIO_VSOCK_OP_RST = 3,
> +	VIRTIO_VSOCK_OP_SHUTDOWN = 4,
> +
> +	/* To send payload */
> +	VIRTIO_VSOCK_OP_RW = 5,
> +
> +	/* Tell the peer our credit info */
> +	VIRTIO_VSOCK_OP_CREDIT_UPDATE = 6,
> +	/* Request the peer to send the credit info to us */
> +	VIRTIO_VSOCK_OP_CREDIT_REQUEST = 7,
> +};
> +\end{lstlisting}
> +
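
To make the header layout concrete, here is a rough sketch of how a
driver might serialize a VIRTIO_VSOCK_OP_REQUEST header. It assumes the
field order and widths exactly as in the listing above (36 bytes, no
padding) and writes each field byte by byte so the result is
little-endian on any host. The names vsock_build_request, put_le16,
put_le32 and VSOCK_HDR_LEN are made up for illustration and are not part
of the patch.

```c
#include <stdint.h>
#include <string.h>

/* Eight 32-bit fields plus two 16-bit fields, as laid out in the patch. */
#define VSOCK_HDR_LEN 36u

/* Values taken from the patch text; the short names are illustrative. */
enum { VSOCK_TYPE_STREAM = 1 };
enum { VSOCK_OP_REQUEST = 1 };

/* Store a 16-bit value in little-endian byte order. */
static void put_le16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xff);
    p[1] = (uint8_t)(v >> 8);
}

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++)
        p[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Hypothetical helper: fill buf (at least VSOCK_HDR_LEN bytes) with a
 * connection request. guest_cid is the value read from the device
 * configuration space and is used as the source CID.
 */
static void vsock_build_request(uint8_t *buf,
                                uint32_t guest_cid, uint32_t src_port,
                                uint32_t dst_cid, uint32_t dst_port,
                                uint32_t buf_alloc, uint32_t fwd_cnt)
{
    memset(buf, 0, VSOCK_HDR_LEN);
    put_le32(buf + 0, guest_cid);          /* src_cid */
    put_le32(buf + 4, src_port);           /* src_port */
    put_le32(buf + 8, dst_cid);            /* dst_cid */
    put_le32(buf + 12, dst_port);          /* dst_port */
    put_le32(buf + 16, 0);                 /* len: no payload on a request */
    put_le16(buf + 20, VSOCK_TYPE_STREAM); /* type */
    put_le16(buf + 22, VSOCK_OP_REQUEST);  /* op */
    put_le32(buf + 24, 0);                 /* flags */
    put_le32(buf + 28, buf_alloc);         /* buf_alloc */
    put_le32(buf + 32, fwd_cnt);           /* fwd_cnt */
}
```

Writing bytes explicitly avoids depending on host endianness or on how
the compiler lays out the struct.
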
> +\subsubsection{Addressing}\label{sec:Device Types / VSock Device / Device Operation / Addressing}
> +
> +VSock flows are identified by a (source, destination) address tuple.

Accordingly, this becomes "Virtio socket flows".

> Address
> +information consists of a (cid, port number) tuple. The header fields used for
> +this are \field{src_cid}, \field{src_port}, \field{dst_cid}, and
> +\field{dst_port}.
> +
> +Currently only stream sockets are supported. \field{type} is 1 for stream
> +socket types.  A VIRTIO_VSOCK_OP_RST reply MUST be sent if a packet is received
> +with an unknown \field{type} value.
> +
> +Stream sockets provide in-order, guaranteed, connection-oriented delivery
> +without message boundaries.
> +
> +\subsubsection{Buffer Space Management}\label{sec:Device Types / VSock Device / Device Operation / Buffer Space Management}
> +\field{buf_alloc} and \field{fwd_cnt} are used for buffer space management of
> +stream sockets.  The guest and the device MUST publish how much buffer space is
> +available per socket. This facilitates flow control so packets are never
> +dropped.
> +
> +\field{buf_alloc} is the total receive buffer space, in bytes, for this socket.
> +This includes both free and in-use buffers. \field{fwd_cnt} is the free-running
> +bytes received counter. The sender calculates the amount of free receive buffer
> +space as follows:
> +
> +\begin{lstlisting}
> +/* tx_cnt is the sender's free-running bytes transmitted counter */
> +u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt);
> +\end{lstlisting}
> +
> +If there is insufficient buffer space, the sender MUST wait until virtqueue
> +buffers are returned and check \field{buf_alloc} and \field{fwd_cnt} again. The
> +VIRTIO_VSOCK_OP_CREDIT_REQUEST packet MAY be sent to force buffer space
> +management information exchange. VIRTIO_VSOCK_OP_CREDIT_UPDATE MUST be sent in
> +response and when buffer space is freed.
> +
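
A small sketch of the credit check described above, in case it helps to
see it spelled out. It assumes the sender keeps the last buf_alloc and
fwd_cnt it saw from the peer next to its own tx_cnt; the struct and
function names are hypothetical. The counters are free-running, so the
subtraction is done in unsigned 32-bit arithmetic and stays correct
across wraparound, provided the bytes in flight never exceed
peer_buf_alloc.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-socket credit state kept by the sender. */
struct vsock_credit {
    uint32_t tx_cnt;         /* free-running count of bytes we have sent */
    uint32_t peer_buf_alloc; /* last buf_alloc received from the peer */
    uint32_t peer_fwd_cnt;   /* last fwd_cnt received from the peer */
};

/* Free receive buffer space at the peer, using the formula from the patch. */
static uint32_t vsock_peer_free(const struct vsock_credit *c)
{
    return c->peer_buf_alloc - (c->tx_cnt - c->peer_fwd_cnt);
}

/*
 * Account for sending len payload bytes if the peer has room for them.
 * Returns false when the sender has to wait for a credit update instead.
 */
static bool vsock_try_send(struct vsock_credit *c, uint32_t len)
{
    if (len > vsock_peer_free(c))
        return false;
    c->tx_cnt += len;
    return true;
}
```

When vsock_try_send() returns false the sender would wait for a
VIRTIO_VSOCK_OP_CREDIT_UPDATE, possibly after sending
VIRTIO_VSOCK_OP_CREDIT_REQUEST, and then retry.
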




> +\subsubsection{Receive and Transmit}\label{sec:Device Types / VSock Device / Device Operation / Receive and Transmit}
> +The driver queues outgoing packets on the tx virtqueue and incoming packet
> +receive buffers on the rx virtqueue. Packets are of the following form:
> +
> +\begin{lstlisting}
> +struct virtio_vsock_packet {
> +    struct virtio_vsock_hdr hdr;
> +    u8 data[];
> +};
> +\end{lstlisting}
> +
> +Virtqueue buffers for outgoing packets are read-only. Virtqueue buffers for
> +incoming packets are write-only.
> +
> +\subsubsection{Stream Sockets}\label{sec:Device Types / VSock Device / Device Operation / Stream Sockets}
> +
> +Connections are established by sending a VIRTIO_VSOCK_OP_REQUEST packet. If a
> +listening socket exists on the destination a VIRTIO_VSOCK_OP_RESPONSE reply is
> +sent and the connection is established.  A VIRTIO_VSOCK_OP_RST reply is sent if
> +a listening socket does not exist on the destination or the destination has
> +insufficient resources to establish the connection.
> +
> +When a connected socket receives VIRTIO_VSOCK_OP_SHUTDOWN the header
> +\field{flags} field bit 0 indicates that the peer will not receive any more
> +data and bit 1 indicates that the peer will not send any more data. If these
> +bits are set and there are no more virtqueue buffers pending the socket is
> +disconnected.
> +
> +VIRTIO_VSOCK_OP_RST can be sent at any time to abort the connection process or
> +forcibly disconnect.
> +
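
The stream socket rules above could be sketched roughly as follows. This
is only an illustration: the function names, the flag macro names and
the vsock_conn struct are invented here, and I am reading "these bits
are set" as meaning both bit 0 and bit 1 have been seen, possibly in
separate SHUTDOWN packets.

```c
#include <stdbool.h>
#include <stdint.h>

/* Op values from the patch; the short names are illustrative. */
enum { VSOCK_OP_RESPONSE = 2, VSOCK_OP_RST = 3 };

/* Hypothetical names for the two flags bits used with OP_SHUTDOWN. */
#define VSOCK_SHUTDOWN_RCV  (1u << 0) /* peer will not receive more data */
#define VSOCK_SHUTDOWN_SEND (1u << 1) /* peer will not send more data */

/* Hypothetical per-connection record of what the peer has shut down. */
struct vsock_conn {
    bool peer_rcv_closed;
    bool peer_send_closed;
};

/* Pick the op to send in reply to an incoming OP_REQUEST. */
static uint16_t vsock_reply_to_request(bool has_listener, bool has_resources)
{
    return (has_listener && has_resources) ? VSOCK_OP_RESPONSE
                                           : VSOCK_OP_RST;
}

/*
 * Record the flags of an incoming OP_SHUTDOWN. Returns true when the
 * socket should now be disconnected: both directions are closed and no
 * virtqueue buffers are still pending for it.
 */
static bool vsock_handle_shutdown(struct vsock_conn *conn, uint32_t flags,
                                  unsigned int pending_buffers)
{
    if (flags & VSOCK_SHUTDOWN_RCV)
        conn->peer_rcv_closed = true;
    if (flags & VSOCK_SHUTDOWN_SEND)
        conn->peer_send_closed = true;
    return conn->peer_rcv_closed && conn->peer_send_closed &&
           pending_buffers == 0;
}
```

If the intended meaning is that either bit alone is enough to
disconnect, the final condition changes, which is one reason it would be
good to spell this out in a conformance statement.
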
>  \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
>  
>  Currently there are three device-independent feature bits defined:
> -- 
> 2.5.0

