OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] packed ring layout proposal


On Mon, Sep 19, 2016 at 02:01:57PM +0200, Paolo Bonzini wrote:
> 
> 
> On 19/09/2016 09:37, Pierre Pfister (ppfister) wrote:
> > Paolo proposed in a later mail to drop VRING_DESC_F_NEXT altogether,
> > and I think he is right at least in the sense that having two solutions
> > for doing almost the same thing is not necessary (IMHO, even harmful).
> > 
> > But, VRING_DESC_F_INDIRECT adds a layer of indirection and makes
> > prefetching a bit more complicated. The reason why VRING_DESC_F_INDIRECT
> > was useful in the first place is that virtio 1.0 didn't have enough
> > descriptors in the main queue (it is equal to the queue size).
> > 
> > Therefore, ideally, I'd rather have a longer queue and use
> > VRING_DESC_F_NEXT only and drop VRING_DESC_F_INDIRECT altogether.
> 
> VRING_DESC_F_INDIRECT is necessary for other devices.  For storage it is
> not rare to have s/g lists composed of 10-15 descriptors.
> 
> The main issue with VRING_DESC_F_INDIRECT, at least as it is implemented
> in Linux, is the cost of allocating and freeing the indirect descriptor.
>  But if you know that the number of descriptor is fixed (e.g. 2 for
> request header + data) you can pre-allocate the descriptors with a 1:1
> relationship with the entries of the ring buffer.  This might be useful
> for the virtio-net tx ring.

Yes, I've been thinking the same thing. Is there any advantage
to have the indirect addresses pre-programmed as
opposed to stored in the descriptors?

In theory
	if (desc.flags & indirect)
		use(*address[i])

might be speculated as compared to
	if (desc.flags & indirect)
		use(*desc.addr)
which can't.

But OTOH this might add to cache pressure as desc is already in cache
anyway.

If all descriptors are known ahead of the time to be indirect we could
pack more of them per cache line by dropping addr (and maybe len?).
This might work well for virtio blk.

Another consideration is numa locality though. It might be benefitial
to move the indirect buffers to the correct numa node,
if these are always preallocated you can't.


> > With this flag, every-time you look at a new descriptor, before doing
> > anything you will have to look at the flags. Meaning that you will
> > have to wait for the result of a test on a very newly fetched piece
> > of memory.
> 
> Yeah, it trades less cache traffic with higher latency on the remaining
> traffic.  But I like the idea.  For L2 the processor's speculative
> execution should still hide most of the latency (the branch should be
> well predicted in the direction of "another descriptor"---and if there's
> no other descriptor you don't care too much about the speculation
> penalty because you have no work to do).

Yes, it seems to work well.

> You could also try a blind prefetch of desc.addr before looking at the
> flags, in order to help with indirect descriptors.

Tried and this does not seem to help at least in the micro-benchmark.

> >> struct event {
> >> 	__le16 idx;
> >> 	__le16 flags;
> >> }
> >>
> >> Flags can be used like in virtio 1.0, by storing a special
> >> value there:
> >>
> >> #define VRING_F_EVENT_ENABLE  0x0
> >>
> >> #define VRING_F_EVENT_DISABLE 0x1
> >>
> >> or it can be used to switch to event idx:
> >>
> >> #define VRING_F_EVENT_IDX     0x2
> >>
> >> in that case, interrupt triggers when descriptor with
> >> a given index value is used.
> > 
> > Is there a reason why we would keep both ways and not just use the IDX one ?
> 
> Probably a good idea, too, to keep it simple.  Pre-1.0 can go away.
> 
> Paolo

It's not there for compatibility.  I did put some thought into this, and
I feel different workloads need different mechanisms.  I'll add some
motivation in the next revision I post.

-- 
MST


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]