
virtio-dev message


Subject: Re: [virtio-dev] [PATCH v4] vsock: add vsock device

On Wed, 2016-03-23 at 16:47 +0000, Stefan Hajnoczi wrote:
> On Mon, Mar 21, 2016 at 12:53:23PM +0000, Ian Campbell wrote:
> > 
> > > 
> > > +The \field{guest_cid} field contains the guest's context ID, which uniquely
> > > +identifies the device for its lifetime.
> > Are there any CID values which are reserved or invalid, perhaps 0 or
> > ~0U? Likewise for port numbers. I think there is no broadcast nor
> > routing so perhaps no need for any reserved space?
> The following reserved CIDs are defined:
> [...]
> I will document reserved CIDs in the next spec revision.

> Will document in the next revision.

> > > +\drivernormative{\paragraph}{Device Operation: Buffer Space
> > > Management}{Device Types / Socket Device / Device Operation / Buffer
> > > Space Management}
> > > +VIRTIO_VSOCK_OP_RW data packets MUST only be transmitted when the peer has
> > > +sufficient free buffer space for the payload.
> > This implies that the implicit buffering space within the virtio ring
> > itself is being ignored, is that correct?
> > 
> > That rules out a simple zero-buffer approach that simply pumps data
> > from the ring into an AF_UNIX socket whenever the socket is available
> > for writing, returning the virtio chains as they are fully consumed.
> You are right, the zero-copy implementation you are describing is
> missing the buffer space management layer and relies only on the rx
> virtqueue.  That doesn't work if you need to support multiple
> simultaneous connections.
> virtio-vsock is by design not zero-copy.  The reason for this is that
> multiple connections are multiplexed over a single pair of rx/tx
> virtqueues.  A malicious or buggy connection could hog the vring and
> prevent other connections from making progress.
> Guaranteed delivery is possible because the rx virtqueue is emptied as
> soon as possible by copying data into the socket receive buffer (socket
> option SO_RCVBUF).  Since each connection has a receive buffer and the
> other side is aware of available space we can achieve guaranteed
> delivery.

Understood, thanks. I'll rethink the approach I'm taking here.
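[To make the buffer-space rule concrete, here is a minimal sketch of the
credit-style accounting Stefan describes: the sender only transmits when
the peer's receive buffer has room.  The field names buf_alloc/fwd_cnt
follow the credit fields carried in vsock packet headers; the class and
method names are illustrative, not the Linux implementation.]

```python
# Illustrative credit accounting for one vsock connection (not the
# driver's actual code).  Every packet from the peer advertises its
# receive buffer size (buf_alloc) and how many bytes it has consumed
# (fwd_cnt); we track how many payload bytes we have sent (tx_cnt).
class CreditState:
    def __init__(self):
        self.tx_cnt = 0          # total payload bytes we have sent
        self.peer_buf_alloc = 0  # peer's receive buffer size
        self.peer_fwd_cnt = 0    # bytes the peer has consumed so far

    def update_from_header(self, buf_alloc, fwd_cnt):
        # Called for every received packet: refresh the peer's credit info.
        self.peer_buf_alloc = buf_alloc
        self.peer_fwd_cnt = fwd_cnt

    def peer_free(self):
        # Bytes in flight = sent - consumed; free space = alloc - in flight.
        return self.peer_buf_alloc - (self.tx_cnt - self.peer_fwd_cnt)

    def can_send(self, nbytes):
        # The MUST in the spec: only send OP_RW when the payload fits.
        return nbytes <= self.peer_free()

    def record_send(self, nbytes):
        self.tx_cnt += nbytes
```

Because each connection keeps its own CreditState, one connection
exhausting its credit cannot starve the others sharing the virtqueues.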

> > The above says various replies MUST be sent, but what if the other end
> > has not provided any buffers to allow a reply to be sent, or they have
> > all been consumed with RX traffic?
> The Linux implementation will wait until vring buffers become available.
> I think this is the correct approach to take advantage of virtqueue's
> guaranteed delivery - otherwise we'd have to add TCP-esque retries.
> Currently task state UNINTERRUPTIBLE is used in Linux without a timeout
> so this could tie up the thread indefinitely.  But it's possible to
> improve the code so the timeout is honored and the thread is
> interruptible.

Thanks, I think I'll end up taking the same approach (i.e. spinning in
my TX thread waiting for RX buffers to become available).
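[The "wait until vring buffers become available, but honor a timeout and
stay interruptible" behaviour discussed above can be sketched with a
condition variable.  This is an illustrative stand-in, not the Linux
driver's code; VirtqueueBuffers is a hypothetical name.]

```python
# Sketch of a TX path that blocks until the peer returns buffers,
# with an optional timeout instead of waiting indefinitely.
import threading

class VirtqueueBuffers:
    def __init__(self, count=0):
        self._cond = threading.Condition()
        self._free = count

    def put(self, n=1):
        # Called when the peer hands buffers back to us.
        with self._cond:
            self._free += n
            self._cond.notify_all()

    def take(self, timeout=None):
        # Block until a buffer is free; return False on timeout rather
        # than tying up the thread forever (the improvement the thread
        # discusses over an uninterruptible wait).
        with self._cond:
            ok = self._cond.wait_for(lambda: self._free > 0, timeout)
            if ok:
                self._free -= 1
            return ok
```

A TX thread would loop on take(timeout=...) and back off or report an
error when it returns False, instead of sleeping uninterruptibly.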

> > Under what circumstances would a peer send a SHUTDOWN without both bits
> > set? I.e., what is the intended use of a "shutdown, but I'm happy to
> > receive more data or even send you some more myself" operation?
> Yes, the half-duplex semantics are inherited from TCP and are supported
> by the Sockets API.  For example, you can shutdown write but continue
> receiving.

I had no idea such things were possible with TCP!
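[A minimal demonstration of the half-duplex shutdown Stefan mentions:
after shutting down only the write side, a socket can still receive.
This uses an AF_UNIX socketpair for brevity; TCP's shutdown(SHUT_WR)
behaves the same way, which is what the vsock SHUTDOWN bits mirror.]

```python
import socket

a, b = socket.socketpair()

a.shutdown(socket.SHUT_WR)   # a: "I will send no more data..."
b.sendall(b"still talking")  # ...but b may keep sending to a,
print(a.recv(64))            # and a still receives it: b'still talking'
print(b.recv(64))            # b sees EOF (b'') from a's write shutdown
```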

