virtio-comment message



Subject: Re: [virtio-comment] Hardware friendly proposals from Intel for packed-ring-layout


On Fri, Aug 25, 2017 at 04:32:26PM +0100, Stefan Hajnoczi wrote:
> On Thu, Aug 24, 2017 at 10:11:33PM +0800, Tiwei Bie wrote:
> > On Thu, Aug 24, 2017 at 03:32:15PM +0200, Paolo Bonzini wrote:
> > > On 24/08/2017 15:11, Tiwei Bie wrote:
> > > > On Thu, Aug 24, 2017 at 02:10:34PM +0200, Paolo Bonzini wrote:
> > > >> On 24/08/2017 13:53, Tiwei Bie wrote:
> > > >>>
> > > >>> * In addition to the DESC_HW flag, each virtio queue has a tail pointer
> > > >>>     - Driver creates suitable (i.e. multiple of cacheline) descriptors,
> > > >>>       then performs MMIO write to tail pointer.
> > > >>
> > > >> If I understand correctly, the tail pointer is the value that is written
> > > >> to the MMIO register.  If that is the case, this is unfortunately bad
> > > >> for virtualization.  Virt prefers a doorbell register where the value
> > > >> doesn't matter.  This is because:
> > > >>
> > > >> 1) the value is not available directly and computing it requires
> > > >> instruction decoding, which in turn requires walking page tables
> > > >>
> > > >> 2) if the value doesn't matter, the hypervisor can simply wake up a
> > > >> userspace thread that processes the virtio queue without bothering to
> > > >> pass the value.
> > > >>
> > > >> On the other hand, writing a tail pointer _before_ the MMIO write may
> > > >> cost a cache miss.  Hence the packed ring layout proposal replaced the
> > > >> tail pointer write with lookahead on the ring buffer's DESC_HW flags.
> > > >> The idea is that lookahead is cheaper, because hopefully the first
> > > >> non-DESC_HW buffer will be in the same cache line as the last DESC_HW
> > > >> buffer.
> > > > 
> > > > > Thank you so much for such a quick and detailed reply!
> > > > 
> > > > Yeah, we know it's a bit tricky to support the tail pointer in
> > > > software. But it's really helpful for the hardware implementation.
> > > > So we want more discussions on this.
> > > > 
> > > > > How about making this feature switchable at runtime, so that it
> > > > > can be enabled after migrating to a hardware backend, or disabled
> > > > > after migrating to a software backend? The software backend could
> > > > > then still use the DESC_HW based mechanism.
> > > > 
> > > > > These are just rough thoughts, and we haven't worked out the
> > > > > implementation details. What are your thoughts on this?
> > > 
> > > > Why is lookahead bad for hardware?  Can a PCIe device use burst reads to
> > > > retrieve many 2-byte descriptors in a single TLP transaction?
> > > 
> > 
> > > I'm not a hardware engineer, so what I said may not be accurate.
> > > Kully (Cc'ed in this thread) can provide more details if necessary.
> > 
> > > From my understanding, in the current design the hardware will need to
> > > keep issuing TLP transactions at a certain pace to check whether a
> > > descriptor is available or not. And it's possible that there is no
> > > available descriptor. In this case, a lot of TLP transactions will be
> > > wasted. That is to say, the doorbell (MMIO write) is helpful for the
> > > hardware.
> > 
> > > So maybe we can combine both of them: similar to the notification
> > > support in virtio 1.0, the tail pointer can be optional, and the
> > > backend is free to enable (and use) it or leave it disabled.
> > > The software backend doesn't need to support it and just needs
> > > to keep it disabled. Any thoughts?
> 
> Isn't the VIRTIO 1.0 "4.1.2.3 Notification structure layout" still going
> to be available in the new ring layout?
> 
> That means the device already has a doorbell and does not need to keep
> issuing bus transactions to poll the DESC_HW bit.

I think I misunderstood the discussion:

The VIRTIO 1.0 "4.1.2.3 Notification structure layout" doorbell does not
indicate how many descriptors are available for the device.

This means the device needs to read the DESC_HW bit for each descriptor
until it reaches the first descriptor without DESC_HW set.

If the doorbell contained the number of descriptors then the device
could fetch exactly N descriptors instead of N + 1.  Maybe it could also
use fewer bus transactions (I'm not familiar with low-level PCIe).

So there does seem to be a performance advantage if the VIRTIO 1.0
"4.1.2.3 Notification structure layout" is modified to include the
number of descriptors available.  Seems like a good idea.
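
To make the N versus N + 1 point concrete, here is a rough sketch in C of
the two device-side fetch strategies.  Everything in it is an assumption
for illustration only: the struct desc layout, the DESC_HW bit value,
RING_SIZE, and the function names scan_lookahead() and fetch_counted()
are made up and not taken from the spec or from the proposal.  Each loop
iteration stands in for one descriptor read by the device.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_HW   0x0080u /* hypothetical bit value, illustration only */
#define RING_SIZE 8u      /* hypothetical ring size */

/* Hypothetical descriptor layout; only the flags field matters here. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t index;
    uint16_t flags;
};

/*
 * Doorbell carries no count: the device reads descriptors until it
 * finds the first one without DESC_HW set (or has walked the whole
 * ring).  Stores the number of available descriptors in *avail and
 * returns how many descriptors were read.
 */
static unsigned scan_lookahead(const struct desc *ring, unsigned head,
                               unsigned *avail)
{
    unsigned reads = 0, n = 0;

    while (n < RING_SIZE) {
        reads++; /* one descriptor read */
        if (!(ring[(head + n) % RING_SIZE].flags & DESC_HW))
            break; /* this read only told us where to stop */
        n++;
    }
    *avail = n;
    return reads;
}

/*
 * Doorbell carries the number of available descriptors: the device
 * reads exactly that many.  Returns how many descriptors were read.
 */
static unsigned fetch_counted(const struct desc *ring, unsigned head,
                              unsigned count)
{
    unsigned reads = 0;

    for (unsigned n = 0; n < count && n < RING_SIZE; n++) {
        reads++; /* one descriptor read */
        /* Sanity check: the driver promised these are available. */
        assert(ring[(head + n) % RING_SIZE].flags & DESC_HW);
    }
    return reads;
}
```

With three descriptors marked DESC_HW starting at head, scan_lookahead()
reads four descriptors while fetch_counted() with count 3 reads three.
On an empty ring the lookahead variant still costs one read.  Whether
this maps to fewer bus transactions depends on how the device batches
its reads, which the sketch does not model.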

Stefan

