virtio-comment message

Subject: Re: [virtio-comment] Hardware friendly proposals from Intel for packed-ring-layout

On Thu, Aug 24, 2017 at 03:32:15PM +0200, Paolo Bonzini wrote:
> On 24/08/2017 15:11, Tiwei Bie wrote:
> > On Thu, Aug 24, 2017 at 02:10:34PM +0200, Paolo Bonzini wrote:
> >> On 24/08/2017 13:53, Tiwei Bie wrote:
> >>>
> >>> * In addition to the DESC_HW flag, each virtio queue has a tail pointer
> >>>     - Driver creates suitable (i.e. multiple of cacheline) descriptors,
> >>>       then performs MMIO write to tail pointer.
> >>
> >> If I understand correctly, the tail pointer is the value that is written
> >> to the MMIO register.  If that is the case, this is unfortunately bad
> >> for virtualization.  Virt prefers a doorbell register where the value
> >> doesn't matter.  This is because:
> >>
> >> 1) the value is not available directly and computing it requires
> >> instruction decoding, which in turn requires walking page tables
> >>
> >> 2) if the value doesn't matter, the hypervisor can simply wake up a
> >> userspace thread that processes the virtio queue without bothering to
> >> pass the value.
> >>
> >> On the other hand, writing a tail pointer _before_ the MMIO write may
> >> cost a cache miss.  Hence the packed ring layout proposal replaced the
> >> tail pointer write with lookahead on the ring buffer's DESC_HW flags.
> >> The idea is that lookahead is cheaper, because hopefully the first
> >> non-DESC_HW buffer will be in the same cache line as the last DESC_HW
> >> buffer.
> > 
> > Thank you so much for such quick and detailed reply!
> > 
> > Yeah, we know it's a bit tricky to support the tail pointer in
> > software. But it's really helpful for the hardware implementation.
> > So we want more discussions on this.
> > 
> > How about having this feature be switchable at runtime, so it's
> > possible to be enabled after migrating to a hardware backend, or
> > disabled after migrating to a software backend. So for the software
> > backend, it can still use the DESC_HW based mechanism.
> > 
> > It's just some rough thoughts, and we haven't thought about the
> > implementation details. What's your thoughts on this?
>
> Why is lookahead bad for hardware?  Can a PCIe device use burst reads to
> retrieve many 2-byte descriptors in a single TLP transaction?

I'm not a hardware engineer, so what I say here may not be accurate.
Kully (Cc'ed in this thread) can provide more details if necessary.

From my understanding, in the current design the hardware needs to keep
issuing TLP transactions at a certain pace to check whether a
descriptor is available. It's possible that no descriptor is
available, in which case many TLP transactions are wasted. That is
why the doorbell (MMIO write) is helpful for the hardware.
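
As a rough illustration of the point above (a toy model, not real PCIe
behaviour), the sketch below counts how many ring reads a device issues
when it polls once per tick versus when it reads only after a doorbell.
The function names, the one-read-per-tick pacing, and the tick values are
all assumptions made for the example.

```python
# Toy model: counts device reads of the descriptor ring. Nothing here is
# taken from the virtio spec; pacing and names are illustrative assumptions.


def polling_reads(ready_at, ticks):
    """Device polls the next descriptor's flags once per tick.

    ready_at: sorted ticks at which the driver makes one more descriptor
              available (i.e. sets DESC_HW on it).
    Returns (useful_reads, wasted_reads).
    """
    useful = wasted = 0
    pending = 0
    arrivals = list(ready_at)
    for tick in range(ticks):
        # Driver side: descriptors that become available at this tick.
        while arrivals and arrivals[0] == tick:
            arrivals.pop(0)
            pending += 1
        # Device side: one read per tick, whether or not anything is there.
        if pending:
            pending -= 1
            useful += 1
        else:
            wasted += 1
    return useful, wasted


def doorbell_reads(ready_at):
    """Device reads only after an MMIO write announces a descriptor."""
    return len(ready_at), 0
```

With two descriptors arriving over ten ticks, the polling device in this
model spends most of its reads finding nothing, while the doorbell-driven
device issues only the two reads it needs.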

So maybe we can combine the two. Similar to the notification
support in virtio 1.0, the tail pointer could be optional: the
backend is free to enable (and use) it or to disable it. A software
backend doesn't need to support it and just needs to keep it
disabled. Any thoughts?
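
A minimal sketch of how such an optional tail pointer might be negotiated,
assuming it is exposed as a feature bit. The name F_TAIL_PTR, its bit
number, the ring size, and the Backend/Driver classes are all hypothetical;
none of them come from the spec or from this proposal.

```python
from dataclasses import dataclass, field

# Hypothetical feature bit: neither the name nor the number is from the spec.
F_TAIL_PTR = 1 << 40


@dataclass
class Backend:
    offered: int                      # feature bits this backend offers
    log: list = field(default_factory=list)

    def mmio_write(self, value):
        # Record the value written to the notification register.
        self.log.append(value)


@dataclass
class Driver:
    backend: Backend
    wanted: int = F_TAIL_PTR
    negotiated: int = 0
    tail: int = 0                     # one past the last available descriptor

    def negotiate(self):
        # Only features supported by both sides end up enabled.
        self.negotiated = self.wanted & self.backend.offered

    def publish(self, count, ring_size=256):
        # Assumes the descriptors were already filled in and flagged DESC_HW.
        self.tail = (self.tail + count) % ring_size
        if self.negotiated & F_TAIL_PTR:
            # Hardware backend: the written value is the tail pointer.
            self.backend.mmio_write(self.tail)
        else:
            # Software backend: plain doorbell, the value is ignored.
            self.backend.mmio_write(0)
```

In this model a backend that does not offer the bit never sees a tail
value, so it can keep relying on the DESC_HW flags alone.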

Best regards,
Tiwei Bie
