Subject: Re: [virtio-dev] packed ring layout proposal
On Mon, Sep 19, 2016 at 07:00:10PM +0200, Paolo Bonzini wrote:
>
>
> On 19/09/2016 18:48, Michael S. Tsirkin wrote:
> > > >
> > > > #define BATCH_NOT_FIRST 0x0010
> > > > #define BATCH_NOT_LAST  0x0020
> > > >
> > > > In other words, the only descriptor in a batch is 0000.
> > > >
> > > > Batch does not have to be processed together, so
> > > > !first/!last flags can be changed when descriptor is used.
> > > >
> > >
> > > I don't exactly understand why this is necessary.
> > > Is there any added value by knowing that a bunch of messages have been
> > > written at the same time?
> >
> > virtio net mergeable buffers pretty much rely on this.
>
> How so?  num_buffers is specified in the virtio-net header, which is
> going to be in the first received descriptor.

BTW this access causes an extra cache miss for some short packets when
said packets don't need to be examined immediately. Andrew's testing
shows an immediate performance gain just by removing this lookup.

I also feel this is a kind of transport thing that really belongs in
the descriptor, e.g. if you know there are more descriptors coming,
prefetch might be more justified. I didn't test this idea yet.

> You don't need explicit
> marking of which descriptor is the first or the last.

No, but you do need to make sure that when get_buf returns
the first buffer, all buffers are available.

> If you want to do parallel polling of the ring, perhaps we can define
> extra device-specific flags, and define one (e.g.,
> VIRTIO_NET_F_RXBUF_HAS_VIRTIO_NET_HDR) or two (adding also
> VIRTIO_NET_F_RXBUF_LAST) for MRG_RXBUF.

From a performance POV, there's a bunch of unused bits in the descriptor
that can be put to this use; this is better than a buffer access
which might not need to be in cache for a while.

And from the architecture POV, I feel this is a transport thing,
not a device specific thing.
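
To make the BATCH_NOT_FIRST / BATCH_NOT_LAST encoding quoted above concrete, here is a minimal sketch of how a producer might tag a run of descriptors and how a consumer might test a descriptor's position in its batch. Only the two flag values come from the thread; `struct sketch_desc` and the `sketch_*` helpers are hypothetical names invented for illustration and are not part of any proposed layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag values as given in the quoted proposal. */
#define BATCH_NOT_FIRST 0x0010
#define BATCH_NOT_LAST  0x0020
#define BATCH_MASK      (BATCH_NOT_FIRST | BATCH_NOT_LAST)

/* Hypothetical descriptor: only the flags field matters for this sketch. */
struct sketch_desc {
    uint16_t flags;
};

/* Tag n consecutive descriptors as one batch, leaving other flag bits alone. */
static inline void sketch_mark_batch(struct sketch_desc *d, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        uint16_t f = 0;

        if (i != 0)
            f |= BATCH_NOT_FIRST;   /* something precedes this descriptor */
        if (i != n - 1)
            f |= BATCH_NOT_LAST;    /* more descriptors follow in the batch */
        d[i].flags = (uint16_t)((d[i].flags & ~BATCH_MASK) | f);
    }
}

/* A descriptor starts a batch when BATCH_NOT_FIRST is clear. */
static inline bool sketch_is_first(const struct sketch_desc *d)
{
    return (d->flags & BATCH_NOT_FIRST) == 0;
}

/* A descriptor ends a batch when BATCH_NOT_LAST is clear. */
static inline bool sketch_is_last(const struct sketch_desc *d)
{
    return (d->flags & BATCH_NOT_LAST) == 0;
}
```

With this encoding a single-descriptor batch carries neither bit, which matches the "0000" case in the quoted text, and a consumer can tell from the descriptor alone whether more of the batch follows, without touching the buffer.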

My KVM Forum talk also mentions possible optimization opportunities
if the transport can pass a batch of packets around, such as
implementing xmit_more.

> > Which is optimal depends on the workload:
> > - index requires an update each time we enable interrupts
> > - flag requires an update each time we switch to enabled/disabled
> >   interrupts
> >
> > Accordingly, for workloads that enable interrupts often,
> > flags are better. For workloads that keep interrupts
> > disabled most of the time, index is better.
>
> Yeah, this makes sense.  So I guess both indexes and flags should stay.
>
> Paolo
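
The index-versus-flag trade-off quoted above can be illustrated with a toy model that only counts writes to the shared suppression field. This is a sketch under stated assumptions, not the proposed mechanism: `struct sketch_suppress` and its helpers are hypothetical, and the model assumes that in index mode a stale index simply means no further interrupt, so disabling costs no write.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the driver-side interrupt suppression state. */
struct sketch_suppress {
    bool     use_index;     /* true: event index mode, false: flag mode */
    bool     flag_enabled;  /* flag mode: current enable state */
    uint16_t event_idx;     /* index mode: index at which to interrupt */
    unsigned writes;        /* number of writes to the shared field */
};

/* Ask for an interrupt once the device reaches next_idx. */
static inline void sketch_enable(struct sketch_suppress *s, uint16_t next_idx)
{
    if (s->use_index) {
        /* Index mode: every enable must publish a fresh index. */
        s->event_idx = next_idx;
        s->writes++;
    } else if (!s->flag_enabled) {
        /* Flag mode: write only when the state actually changes. */
        s->flag_enabled = true;
        s->writes++;
    }
}

/* Stop asking for interrupts. */
static inline void sketch_disable(struct sketch_suppress *s)
{
    if (s->use_index)
        return;             /* assumed: a stale index needs no write */
    if (s->flag_enabled) {
        s->flag_enabled = false;
        s->writes++;
    }
}
```

In this model, a driver that enables interrupts ten times in a row performs ten writes in index mode but one in flag mode, while a driver that enables once and then goes back to polling performs one write in index mode and two in flag mode. That is consistent with the conclusion in the thread that neither scheme wins for every workload.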