Subject: Re: [virtio-comment] Hardware friendly proposals from Intel for packed-ring-layout
On Fri, Aug 25, 2017 at 04:32:26PM +0100, Stefan Hajnoczi wrote:
> On Thu, Aug 24, 2017 at 10:11:33PM +0800, Tiwei Bie wrote:
> > On Thu, Aug 24, 2017 at 03:32:15PM +0200, Paolo Bonzini wrote:
> > > On 24/08/2017 15:11, Tiwei Bie wrote:
> > > > On Thu, Aug 24, 2017 at 02:10:34PM +0200, Paolo Bonzini wrote:
> > > >> On 24/08/2017 13:53, Tiwei Bie wrote:
> > > >>>
> > > >>> * In addition to the DESC_HW flag, each virtio queue has a tail pointer
> > > >>>   - Driver creates suitable (i.e. multiple of cacheline) descriptors,
> > > >>>     then performs MMIO write to tail pointer.
> > > >>
> > > >> If I understand correctly, the tail pointer is the value that is written
> > > >> to the MMIO register.  If that is the case, this is unfortunately bad
> > > >> for virtualization.  Virt prefers a doorbell register where the value
> > > >> doesn't matter.  This is because:
> > > >>
> > > >> 1) the value is not available directly and computing it requires
> > > >> instruction decoding, which in turn requires walking page tables
> > > >>
> > > >> 2) if the value doesn't matter, the hypervisor can simply wake up a
> > > >> userspace thread that processes the virtio queue without bothering to
> > > >> pass the value.
> > > >>
> > > >> On the other hand, writing a tail pointer _before_ the MMIO write may
> > > >> cost a cache miss.  Hence the packed ring layout proposal replaced the
> > > >> tail pointer write with lookahead on the ring buffer's DESC_HW flags.
> > > >> The idea is that lookahead is cheaper, because hopefully the first
> > > >> non-DESC_HW buffer will be in the same cache line as the last DESC_HW
> > > >> buffer.
> > > >
> > > > Thank you so much for such quick and detailed reply!
> > > >
> > > > Yeah, we know it's a bit tricky to support the tail pointer in
> > > > software. But it's really helpful for the hardware implementation.
> > > > So we want more discussions on this.
> > > >
> > > > How about having this feature be switchable at runtime, so it's
> > > > possible to be enabled after migrating to a hardware backend, or
> > > > disabled after migrating to a software backend. So for the software
> > > > backend, it can still use the DESC_HW based mechanism.
> > > >
> > > > It's just some rough thoughts, and we haven't thought about the
> > > > implementation details. What's your thoughts on this?
> > >
> > > Why is lookahead bad for hardware?  Can a PCIe device use burst reads to
> > > retrieve many 2-byte descriptor in a single TLP transaction?
> > >
> >
> > I'm not a hardware engineer, so what I said may be not accurate.
> > Kully (Cc'ed in this thread) can provide more details if necessary.
> >
> > From my understanding, in current design hardware will need to keep
> > issuing TLP transactions at a certain pace to check whether the
> > descriptor is available or not. And it's possible that there is no
> > available descriptor. In this case, there will be a lot of TLP
> > transactions wasted. That is to say, the doorbell (MMIO write) is
> > helpful for the hardware.
> >
> > So maybe we can combine both of them, similar to the notification
> > support in virtio1.0, we can have the tail pointer be optional,
> > the backend is free to choose to enable (and use) it or disable it.
> > And the software backend doesn't need to support it and just need
> > to keep it disabled. Any thoughts?
>
> Isn't the VIRTIO 1.0 "4.1.2.3 Notification structure layout" still going
> to be available in the new ring layout?
>
> That means the device already has a doorbell and does not need to keep
> issuing bus transactions to poll the DESC_HW bit.

I think I misunderstood the discussion:

The VIRTIO 1.0 "4.1.2.3 Notification structure layout" doorbell does not
indicate how many descriptors are available for the device.  This means
the device needs to read the DESC_HW bit for each descriptor until it
reaches the first descriptor without DESC_HW set.
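
To make the cost of that scan concrete, here is a rough sketch of the
lookahead as I understand it.  It is an illustration only and is not taken
from the spec or from any proposal in this thread: the DESC_HW bit value,
the Desc class and scan_lookahead() are all made up for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical bit value, chosen only for this sketch; the thread does not
# say where DESC_HW lives in the descriptor flags.
DESC_HW = 0x0080


@dataclass
class Desc:
    """Toy descriptor that carries only a flags field."""
    flags: int = 0


def scan_lookahead(ring: List[Desc], start: int) -> Tuple[int, int]:
    """Scan from `start` until the first descriptor without DESC_HW.

    Returns (available, reads): how many descriptors had DESC_HW set and
    how many descriptor reads were needed to find that out.
    """
    available = 0
    reads = 0
    idx = start
    # Stop after a full lap so a completely full ring does not loop forever.
    while available < len(ring):
        reads += 1  # one read of this descriptor's flags
        if not ring[idx].flags & DESC_HW:
            break  # first descriptor without DESC_HW ends the scan
        available += 1
        idx = (idx + 1) % len(ring)  # the ring wraps around
    return available, reads


# Usage: three descriptors handed to the device in an 8-entry ring.
ring = [Desc() for _ in range(8)]
for d in ring[:3]:
    d.flags |= DESC_HW
print(scan_lookahead(ring, 0))
```

With three descriptors marked, the scan returns (3, 4): the fourth read
exists only to discover that there is nothing more to fetch.
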

If the doorbell contained the number of descriptors then the device could
fetch exactly N descriptors instead of N + 1.  Maybe it could also use
fewer bus transactions (I'm not familiar with low-level PCIe).

So there does seem to be a performance advantage if the VIRTIO 1.0
"4.1.2.3 Notification structure layout" is modified to include the
number of descriptors available.  Seems like a good idea.

Stefan