Subject: Re: [virtio-comment] Hardware friendly proposals from Intel for packed-ring-layout
On Thu, Aug 24, 2017 at 10:11:33PM +0800, Tiwei Bie wrote:
> On Thu, Aug 24, 2017 at 03:32:15PM +0200, Paolo Bonzini wrote:
> > On 24/08/2017 15:11, Tiwei Bie wrote:
> > > On Thu, Aug 24, 2017 at 02:10:34PM +0200, Paolo Bonzini wrote:
> > >> On 24/08/2017 13:53, Tiwei Bie wrote:
> > >>>
> > >>> * In addition to the DESC_HW flag, each virtio queue has a tail pointer
> > >>>   - Driver creates suitable (i.e. multiple of cacheline) descriptors,
> > >>>     then performs MMIO write to tail pointer.
> > >>
> > >> If I understand correctly, the tail pointer is the value that is written
> > >> to the MMIO register. If that is the case, this is unfortunately bad
> > >> for virtualization. Virt prefers a doorbell register where the value
> > >> doesn't matter. This is because:
> > >>
> > >> 1) the value is not available directly and computing it requires
> > >> instruction decoding, which in turn requires walking page tables
> > >>
> > >> 2) if the value doesn't matter, the hypervisor can simply wake up a
> > >> userspace thread that processes the virtio queue without bothering to
> > >> pass the value.
> > >>
> > >> On the other hand, writing a tail pointer _before_ the MMIO write may
> > >> cost a cache miss. Hence the packed ring layout proposal replaced the
> > >> tail pointer write with lookahead on the ring buffer's DESC_HW flags.
> > >> The idea is that lookahead is cheaper, because hopefully the first
> > >> non-DESC_HW buffer will be in the same cache line as the last DESC_HW
> > >> buffer.
> > >
> > > Thank you so much for such quick and detailed reply!
> > >
> > > Yeah, we know it's a bit tricky to support the tail pointer in
> > > software. But it's really helpful for the hardware implementation.
> > > So we want more discussions on this.
> > >
> > > How about having this feature be switchable at runtime, so it's
> > > possible to be enabled after migrating to a hardware backend, or
> > > disabled after migrating to a software backend. So for the software
> > > backend, it can still use the DESC_HW based mechanism.
> > >
> > > It's just some rough thoughts, and we haven't thought about the
> > > implementation details. What's your thoughts on this?
> >
> > Why is lookahead bad for hardware? Can a PCIe device use burst reads to
> > retrieve many 2-byte descriptor in a single TLP transaction?
> >
>
> I'm not a hardware engineer, so what I said may be not accurate.
> Kully (Cc'ed in this thread) can provide more details if necessary.
>
> From my understanding, in current design hardware will need to keep
> issuing TLP transactions at a certain pace to check whether the
> descriptor is available or not. And it's possible that there is no
> available descriptor. In this case, there will be a lot of TLP
> transactions wasted. That is to say, the doorbell (MMIO write) is
> helpful for the hardware.
>
> So maybe we can combine both of them, similar to the notification
> support in virtio1.0, we can have the tail pointer be optional,
> the backend is free to choose to enable (and use) it or disable it.
> And the software backend doesn't need to support it and just need
> to keep it disabled. Any thoughts?

Isn't the VIRTIO 1.0 "4.1.2.3 Notification structure layout" still going
to be available in the new ring layout? That means the device already
has a doorbell and does not need to keep issuing bus transactions to
poll the DESC_HW bit.

Michael: It would help to see a full draft VIRTIO 1.1 spec with the
proposed packed ring layout. That way everyone can get on the same
page.

Stefan
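
For readers following the thread, here is a rough sketch in C of the
combination discussed above: a doorbell whose written value is ignored,
followed by lookahead on the DESC_HW flags. This is only an illustration
under stated assumptions, not the proposed layout. The descriptor
fields, the DESC_HW bit value, RING_SIZE and all function names are
hypothetical, and a real implementation would also need memory barriers
and DMA reads rather than plain loads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; the real flag bit and descriptor
 * layout are whatever the packed ring proposal ends up specifying. */
#define DESC_HW   0x0080u
#define RING_SIZE 8u

struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t id;
    uint16_t flags;
};

/* Driver side: fill in a descriptor, then mark it as owned by the device.
 * A real driver would issue a write barrier before setting the flag. */
static void driver_post(struct desc *ring, unsigned *tail,
                        uint64_t addr, uint32_t len)
{
    assert(!(ring[*tail].flags & DESC_HW)); /* slot must be free */
    ring[*tail].addr = addr;
    ring[*tail].len = len;
    ring[*tail].flags |= DESC_HW;
    *tail = (*tail + 1) % RING_SIZE;
}

/* Device side: starting at *next, consume every descriptor marked DESC_HW,
 * clear the flag to hand it back, and stop at the first descriptor without
 * the flag. Returns the number of descriptors consumed. */
static unsigned device_lookahead(struct desc *ring, unsigned *next)
{
    unsigned n = 0;

    while (n < RING_SIZE && (ring[*next].flags & DESC_HW)) {
        ring[*next].flags &= (uint16_t)~DESC_HW;
        *next = (*next + 1) % RING_SIZE;
        n++;
    }
    return n;
}

/* Doorbell handler: the written value is deliberately ignored, so a
 * hypervisor never has to decode the guest's store to learn it. The
 * device only scans the ring once it has been kicked. */
static unsigned on_doorbell(struct desc *ring, unsigned *next,
                            uint32_t value_ignored)
{
    (void)value_ignored;
    return device_lookahead(ring, next);
}
```

In this sketch the device reads the ring only after a kick, so it does
not need to poll, and the kick carries no tail pointer that software
would have to recover from the MMIO write.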