OASIS Mailing List Archives

 



virtio-comment message



Subject: Re: [virtio-comment] Hardware friendly proposals from Intel for packed-ring-layout


On Thu, Aug 24, 2017 at 10:11:33PM +0800, Tiwei Bie wrote:
> On Thu, Aug 24, 2017 at 03:32:15PM +0200, Paolo Bonzini wrote:
> > On 24/08/2017 15:11, Tiwei Bie wrote:
> > > On Thu, Aug 24, 2017 at 02:10:34PM +0200, Paolo Bonzini wrote:
> > >> On 24/08/2017 13:53, Tiwei Bie wrote:
> > >>>
> > >>> * In addition to the DESC_HW flag, each virtio queue has a tail pointer
> > >>>     - Driver creates suitable (i.e. multiple of cacheline) descriptors,
> > >>>       then performs MMIO write to tail pointer.
> > >>
> > >> If I understand correctly, the tail pointer is the value that is written
> > >> to the MMIO register.  If that is the case, this is unfortunately bad
> > >> for virtualization.  Virt prefers a doorbell register where the value
> > >> doesn't matter.  This is because:
> > >>
> > >> 1) the value is not available directly and computing it requires
> > >> instruction decoding, which in turn requires walking page tables
> > >>
> > >> 2) if the value doesn't matter, the hypervisor can simply wake up a
> > >> userspace thread that processes the virtio queue without bothering to
> > >> pass the value.
> > >>
> > >> On the other hand, writing a tail pointer _before_ the MMIO write may
> > >> cost a cache miss.  Hence the packed ring layout proposal replaced the
> > >> tail pointer write with lookahead on the ring buffer's DESC_HW flags.
> > >> The idea is that lookahead is cheaper, because hopefully the first
> > >> non-DESC_HW buffer will be in the same cache line as the last DESC_HW
> > >> buffer.
> > > 
> > > Thank you so much for such a quick and detailed reply!
> > > 
> > > Yeah, we know it's a bit tricky to support the tail pointer in
> > > software. But it's really helpful for the hardware implementation.
> > > So we want more discussions on this.
> > > 
> > > How about making this feature switchable at runtime, so that it
> > > can be enabled after migrating to a hardware backend or disabled
> > > after migrating to a software backend? The software backend could
> > > then still use the DESC_HW based mechanism.
> > > 
> > > These are just some rough thoughts, and we haven't thought about
> > > the implementation details. What are your thoughts on this?
> > 
> > Why is lookahead bad for hardware?  Can a PCIe device use burst reads to
> > retrieve many 2-byte descriptors in a single TLP transaction?
> > 
> 
> I'm not a hardware engineer, so what I said may not be accurate.
> Kully (Cc'ed in this thread) can provide more details if necessary.
> 
> From my understanding, in the current design the hardware will need
> to keep issuing TLP transactions at a certain pace to check whether
> a descriptor is available. It's possible that no descriptor is
> available, in which case a lot of TLP transactions are wasted.
> That is to say, the doorbell (MMIO write) is helpful for the
> hardware.
> 
> So maybe we can combine both of them. Similar to the notification
> support in virtio 1.0, we can make the tail pointer optional, so
> the backend is free to choose whether to enable (and use) it or
> disable it. The software backend doesn't need to support it and
> just needs to keep it disabled. Any thoughts?

Isn't the VIRTIO 1.0 "4.1.2.3 Notification structure layout" still going
to be available in the new ring layout?

That means the device already has a doorbell and does not need to keep
issuing bus transactions to poll the DESC_HW bit.
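
To make that concrete, here is a minimal sketch of a device that stays
idle until it is notified and only then looks ahead on the DESC_HW
flags.  It is not taken from any draft: the descriptor layout, the
DESC_HW bit value, RING_SIZE, and the names driver_post() and
device_service() are all assumptions made up for illustration, and the
ring_reads counter merely stands in for bus transactions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE 8u          /* hypothetical ring size (power of two) */
#define DESC_HW   0x0080u     /* hypothetical bit: descriptor owned by the device */

/* Hypothetical descriptor layout, for illustration only. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

struct ring {
    struct desc desc[RING_SIZE];
    uint16_t dev_next;        /* next slot the device will inspect */
    bool doorbell;            /* set by the driver's notification; carries no value */
    unsigned ring_reads;      /* descriptor reads, standing in for bus transactions */
};

/* Driver side: publish one buffer, then ring the doorbell. */
static void driver_post(struct ring *r, uint16_t slot, uint64_t addr, uint32_t len)
{
    struct desc *d = &r->desc[slot % RING_SIZE];

    d->addr = addr;
    d->len = len;
    d->flags |= DESC_HW;      /* real code needs a write barrier before this store */
    r->doorbell = true;       /* no tail pointer is passed to the device */
}

/* Device side: do nothing until notified, then look ahead on DESC_HW. */
static unsigned device_service(struct ring *r)
{
    unsigned done = 0;

    if (!r->doorbell)
        return 0;             /* idle: the ring is not read at all */
    r->doorbell = false;

    while (done < RING_SIZE) {
        struct desc *d = &r->desc[r->dev_next % RING_SIZE];

        r->ring_reads++;
        if (!(d->flags & DESC_HW))
            break;            /* first descriptor not owned by the device */
        d->flags &= (uint16_t)~DESC_HW;   /* hand the descriptor back to the driver */
        r->dev_next++;
        done++;
    }
    return done;
}
```

In this sketch the device reads the ring only after a notification, and
each service pass costs one read per available descriptor plus one for
the first descriptor that is not DESC_HW.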

Michael: It would help to see a full draft VIRTIO 1.1 spec with the
proposed packed ring layout.  That way everyone can get on the same
page.

Stefan

