OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] [RFC] vsock: add vsock device


On Tue, Jun 02, 2015 at 04:29:06PM +0200, Michael S. Tsirkin wrote:
> On Tue, Jun 02, 2015 at 03:17:39PM +0100, Stefan Hajnoczi wrote:
> > On Mon, Jun 01, 2015 at 05:39:15PM +0200, Michael S. Tsirkin wrote:
> > > On Thu, May 21, 2015 at 05:40:41PM +0100, Stefan Hajnoczi wrote:
> > > > +\subsubsection{Buffer Space Management}\label{sec:Device Types / VSock Device / Device Operation / Buffer Space Management}
> > > > +\field{buf_alloc} and \field{fwd_cnt} are used for buffer space management.
> > > > +They allow the guest and the device to publish how much buffer space is
> > > > +available. This facilitates flow control so packets are never dropped and
> > > > +buffer space can potentially be reserved for high-priority flows.
> > > > +
> > > > +\field{buf_alloc} is the sender's total receive buffer space, in bytes. This
> > > > +includes both free and in-use buffers. \field{fwd_cnt} is the sender's
> > > > +free-running bytes received counter. The receiver uses this information to
> > > > +calculate the total amount of free buffer space:
> > > > +
> > > > +\begin{lstlisting}
> > > > +/* tx_cnt is a free-running bytes transmitted counter */
> > > > +u32 peer_free = peer_buf_alloc - (tx_cnt - peer_fwd_cnt);
> > > > +\end{lstlisting}
> > > > +
> > > > +Both the driver and the device MUST track buffer space. If there is
> > > > +insufficient buffer space, they must wait until virtqueue buffers are returned
> > > > +and check \field{buf_alloc} and \field{fwd_cnt} again. The
> > > > +VIRTIO_VSOCK_OP_CREDIT_UPDATE packet MAY be sent to force buffer space
> > > > +management information exchange. VIRTIO_VSOCK_OP_CREDIT_REQUEST MUST be sent in
> > > > +response.
> > 
> > Typo: CREDIT_UPDATE <-> CREDIT_REQUEST are swapped.
> > 
> > > Is it ok to initiate connection with 0 buf space allocated?
> > > If so, how does remote know it should add bufs?
> > 
> > Yes, a new SOCK_STREAM connection can be started with 0 buffer space
> > available.  Only data bytes count towards buffer space, not packet
> > headers used for control packets.
> > 
> > If the peer has 0 buffer space available then a CREDIT_REQUEST query is
> > sent, just to make sure that buffer space is still empty.  Also, when
> > data is dequeued (i.e. recvmsg() from userspace) a gratuitous
> > CREDIT_UPDATE is sent so the peer knows it may now be able to transmit.
> > This wakes the connection back up so packets can be transmitted.
> > 
> 
> OK this gratuitous CREDIT_UPDATE then becomes a requirement:
> MUST or SHOULD, not MAY.

ok

> > 
> > After reviewing this mechanism more closely, I noticed a key point I
> > failed to document: fwd_cnt is only updated on recvmsg() from userspace.
> > The idea is that peer_buf_alloc is the total amount of socket buffer
> > space available.
> > 
> > There is a denial-of-service case: if there are two flows (dgram or
> > stream doesn't matter) and an application never calls recvmsg() for one
> > of the flows, then all buffer space can be tied up.  This prevents the
> > other flow from progressing.
> 
> I'm confused. fwd_cnt is per-flow, isn't it?

You are right.  I double-checked the implementation.  fwd_cnt is already
per-flow.  Phew!

> How is not incrementing it preventing other flows from progressing?

We're fine.  I thought it was global but it is already per-flow.

> > Perhaps we should drop the buffer space mechanism for dgram flows and
> > make the counters per-flow for streams.  That way the denial-of-service
> > issue is avoided.  Does that sound good?
> 
> As I mentioned separately, I'm not sure what does buffer space mechanism
> mean for dgram flows, since a flow must track msg boundaries which
> has memory overhead (for stream flows, small messages can be packed
> into a single buffer if necessary).

The receiver has a finite buffer size.  If the receiving userspace
application doesn't read from the buffer fast enough it can become full.
I guess the idea of buffer space management for dgram was to avoid
dropping packets by letting the sender know they must wait.

Attachment: pgpOFQugreESd.pgp
Description: PGP signature



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]