virtio-dev message


Subject: RE: [virtio-dev] [PATCH RFC] packed ring layout spec v5


> From: Michael S. Tsirkin [mailto:mst@redhat.com]
> Sent: 28. november 2017 15:37

> > If the device always returns buffers in order, then couldn't the driver skip the
> > step of reading the used.ring for read-only buffers (e.g. TX for net devices)? The
> > used.idx tells how many buffers were returned, and since they are returned in
> > the same order as the driver sent them, it knows what their indices are. This
> > would then save one cache-miss in the old structure too.
> 
> True. That would be another variant to support though.
> 
> I doubt it'll outperform this one but I didn't test it specifically. Care trying to
> implement it?

Agreed, and I don't see it as competing with the packed ring. However, if there
is low-hanging fruit that improves performance of the non-ring structure
(in at least some significant use cases), it could be worth considering as part
of a rev 1.1 specification too. I'll see what can be done for a prototype.
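
For illustration, the in-order shortcut discussed above might look roughly like
the sketch below. It is not taken from any existing driver: the struct
inorder_txq, its fields, QUEUE_SIZE, and both function names are hypothetical.
It assumes the device returns buffers strictly in submission order, so the
driver can reclaim read-only (TX) buffers from used.idx alone, using a private
record of what it submitted, without reading the used ring entries.

```c
#include <assert.h>
#include <stdint.h>

#define QUEUE_SIZE 8u /* hypothetical queue size (power of two) */

/* Hypothetical driver-private state for an in-order TX queue. */
struct inorder_txq {
    uint16_t heads[QUEUE_SIZE]; /* head descriptor of each buffer, in submission order */
    uint16_t avail_idx;         /* free-running count of buffers submitted */
    uint16_t last_used_idx;     /* free-running count of buffers reclaimed */
};

/* Record a buffer's head descriptor index at submission time. */
static void txq_submit(struct inorder_txq *q, uint16_t head)
{
    /* The queue must not already be full. */
    assert((uint16_t)(q->avail_idx - q->last_used_idx) < QUEUE_SIZE);
    q->heads[q->avail_idx % QUEUE_SIZE] = head;
    q->avail_idx++;
}

/*
 * Reclaim completed buffers given only the device's used.idx value.
 * Because completion order equals submission order, the heads come from
 * the driver's own record; the used ring entries are never read.
 * Writes the reclaimed head indices to out[] and returns how many.
 */
static unsigned txq_reclaim(struct inorder_txq *q, uint16_t device_used_idx,
                            uint16_t *out)
{
    unsigned n = 0;

    while (q->last_used_idx != device_used_idx) {
        out[n++] = q->heads[q->last_used_idx % QUEUE_SIZE];
        q->last_used_idx++;
    }
    return n;
}
```

The used ring's len field is not needed here, since the device writes nothing
into read-only buffers. For device-writable (RX) buffers the driver would still
have to read the used entries to learn the written length.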

> 
> > And a follow-up question would then be: if a device always returns buffers in
> > order, does the v1.0 specification not require drivers to reuse descriptors in the
> > same order as they are returned? I think 3.2.1.1 implies that at least. If so,
> > wouldn't new descriptors always be placed back-to-back in the descriptor table
> > (contiguous memory)?
> 
> You probably mean this:
> 1. Get the next free descriptor table entry, d
> 
> and you interpret "next" here as "next in ring order".
> 
> I'm not sure everyone follows this interpretation though.
> 
> E.g. Linux does:
> static void detach_buf(struct vring_virtqueue *vq, unsigned int head,
>                        void **ctx)
> {
> ...
>         vq->vring.desc[i].next = cpu_to_virtio16(vq->vq.vdev, vq->free_head);
>         vq->free_head = head;
> 
> So descriptors are added at head of the free list.  Next is interpreted as next on
> this list.  E.g. with a single request in flight, it looks like a single descriptor will
> keep getting reused.
> 
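
The free-list behaviour described in the quoted text can be reproduced with a
small stand-alone model. This is only a sketch, not the Linux code: struct
desc_table, NUM_DESC, and the function names are hypothetical, and only the
"push freed descriptors onto the head of the list" policy is taken from the
snippet above.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_DESC 4u /* hypothetical descriptor table size */

/* Minimal model of a descriptor table with a LIFO free list. */
struct desc_table {
    uint16_t next[NUM_DESC]; /* stands in for desc[i].next */
    uint16_t free_head;      /* first free descriptor */
    uint16_t num_free;       /* number of free descriptors */
};

/* Chain all descriptors 0 -> 1 -> 2 -> ... as the initial free list. */
static void table_init(struct desc_table *t)
{
    for (uint16_t i = 0; i < NUM_DESC; i++)
        t->next[i] = (uint16_t)(i + 1);
    t->free_head = 0;
    t->num_free = NUM_DESC;
}

/* "Get the next free descriptor table entry": pop the list head. */
static uint16_t desc_alloc(struct desc_table *t)
{
    assert(t->num_free > 0);
    uint16_t d = t->free_head;
    t->free_head = t->next[d];
    t->num_free--;
    return d;
}

/* Return a descriptor by pushing it onto the head of the free list. */
static void desc_free(struct desc_table *t, uint16_t d)
{
    t->next[d] = t->free_head;
    t->free_head = d;
    t->num_free++;
}
```

With one request in flight, alloc/free/alloc hands out descriptor 0 every time.
With two in flight (0 and 1) freed in that order, the next allocation returns 1
and then 0, so new descriptors are not placed in ring order.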

I guess there isn't an explicit enough requirement in v1.0 to claim right or wrong
with regard to this. Enforcing it could, however, be made a driver
requirement imposed by the new IN_ORDER feature bit. The IN_ORDER
feature bit for the non-ring would then be defined to enforce that descriptor indices
are always processed in order by both the device and the driver.

My reason for this is to ensure that new descriptors are placed in a contiguous
range of the descriptor table, which should improve the L1$ prefetcher hit rate
for batching, and also provide a means for efficient DMA in case of HW offload.
With knowledge of the number of elements in each buffer, it might also be
possible to calculate the descriptor index range to DMA.
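
As a rough sketch of that last point: if descriptors are allocated contiguously
in ring order, the descriptor range for a batch follows from the first index and
the per-buffer element counts. Everything named here (struct dma_range,
batch_desc_total, desc_dma_ranges) is hypothetical, and the code assumes the
only discontinuity is the wrap at the end of the descriptor table.

```c
#include <assert.h>
#include <stdint.h>

/* One contiguous run of descriptor table entries. */
struct dma_range {
    uint16_t start; /* first descriptor index */
    uint16_t count; /* number of descriptors */
};

/* Sum the descriptor (element) counts of all buffers in a batch. */
static uint32_t batch_desc_total(const uint16_t *elems, unsigned nbufs)
{
    uint32_t total = 0;

    for (unsigned i = 0; i < nbufs; i++)
        total += elems[i];
    return total;
}

/*
 * Compute the descriptor index range(s) covering `total` descriptors that
 * start at index `first` in a table of `qsize` entries. Returns the number
 * of ranges written to out[]: 0 for an empty batch, 1 if the run is
 * contiguous, 2 if it wraps past the end of the table.
 */
static unsigned desc_dma_ranges(uint16_t first, uint32_t total, uint16_t qsize,
                                struct dma_range out[2])
{
    assert(qsize > 0 && first < qsize && total <= qsize);

    if (total == 0)
        return 0;

    uint32_t until_wrap = (uint32_t)qsize - first;

    if (total <= until_wrap) {
        out[0].start = first;
        out[0].count = (uint16_t)total;
        return 1;
    }
    out[0].start = first;
    out[0].count = (uint16_t)until_wrap;
    out[1].start = 0;
    out[1].count = (uint16_t)(total - until_wrap);
    return 2;
}
```

For example, a batch of three buffers with 2, 3 and 1 elements starting at
descriptor 5 in an 8-entry table needs two transfers: entries 5..7 and 0..2.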

BR,

-Lars

