OASIS Mailing List Archives

virtio-comment message


Subject: Hardware friendly proposals from Intel for packed-ring-layout


Hi all,

Based on the packed-ring-layout proposal posted here:

https://lists.oasis-open.org/archives/virtio-dev/201702/msg00010.html

We have the following proposals to make it more hardware friendly.

Driver Signaling Available Descriptors
======================================

## Current proposal

* Each descriptor has a 1-bit flag, DESC_HW
* Driver creates descriptors and then sets the DESC_HW flag
* Device reads descriptors and can use a descriptor if its DESC_HW
  flag is set

## New proposal

* In addition to the DESC_HW flag, each virtio queue has a tail pointer
    - Driver creates a suitable number of descriptors (i.e. a multiple
      of a cacheline), then performs an MMIO write to the tail pointer.
* For each virtio queue, there is a head pointer that lives in the
  device and is not used by the driver
    - Device compares the tail pointer with the head pointer to determine
      exactly how many new descriptors have been added to a specific queue
* The descriptors in [head, tail) are available to the device
* The DESC_HW flag will be kept for the device to signal used descriptors
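
The head/tail scheme above can be sketched in C as follows. This is only
a minimal illustration, not the prototype code: the descriptor layout,
the struct and function names, the DESC_HW bit position and the use of
free-running 16-bit indices (so that a full ring and an empty ring can
be told apart) are all assumptions made for this example, and a plain
volatile field stands in for the MMIO tail register.

```c
#include <assert.h>
#include <stdint.h>

#define DESC_HW 0x1u /* flag from the proposal; the bit position is assumed */

/* Hypothetical descriptor layout, for illustration only. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t id;
};

/* Hypothetical per-queue state. */
struct vq {
    struct desc *ring;
    uint16_t size;          /* number of descriptors, assumed a power of two */
    uint16_t head;          /* device-private, never read by the driver */
    volatile uint16_t tail; /* stands in for the MMIO tail register */
};

/* Driver: hand over n already-filled descriptors with a single tail write.
 * The driver is assumed to track free slots itself. */
static void driver_publish(struct vq *q, uint16_t n)
{
    for (uint16_t i = 0; i < n; i++)
        q->ring[(uint16_t)(q->tail + i) & (q->size - 1)].flags |= DESC_HW;
    /* A real driver needs a write barrier here, before the MMIO write. */
    q->tail = (uint16_t)(q->tail + n);
}

/* Device: number of descriptors in [head, tail). */
static uint16_t device_avail(const struct vq *q)
{
    return (uint16_t)(q->tail - q->head);
}

/* Device: consume n descriptors starting at head. */
static void device_consume(struct vq *q, uint16_t n)
{
    assert(n <= device_avail(q));
    q->head = (uint16_t)(q->head + n);
}
```

With this layout the device learns the exact number of new descriptors
from one subtraction, instead of reading descriptors one by one to check
their DESC_HW flags.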

Device Signaling Used Descriptors
=================================

## Current proposal

* Device clears each descriptor's DESC_HW flag (1 bit) after it has
  finished with the descriptor

## New proposal

* Device does not need to clear the DESC_HW flag for every descriptor
* Driver controls which descriptors need to have their DESC_HW flag
  cleared:
    - Each descriptor has an extra 1-bit flag, DESC_WB (Write-Back):
        * w/ DESC_WB set  => Device must write back this descriptor
                             after use. At a minimum, it must clear
                             the DESC_HW flag.
        * w/o DESC_WB set => Device doesn't need to write back the
                             descriptor.

This proposal saves PCIe bandwidth:

In many scenarios, descriptor data doesn't need to be written back,
e.g. for network devices, where the packet metadata is prepended to
the packet data.
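
To make the saving concrete, here is a minimal C sketch of the DESC_WB
handling. It is not taken from the prototype: the descriptor layout, the
function names and the flag bit positions are assumptions, and the
policy of requesting a write-back only on the last descriptor of a batch
is just one possible driver choice, which additionally assumes that the
device completes descriptors in order.

```c
#include <assert.h>
#include <stdint.h>

/* Flag names come from the proposal; the bit positions are assumed. */
#define DESC_HW 0x1u
#define DESC_WB 0x2u

/* Hypothetical descriptor layout, for illustration only. */
struct desc {
    uint64_t addr;
    uint32_t len;
    uint16_t flags;
    uint16_t id;
};

/* Driver: make n descriptors available starting at slot `start`,
 * asking for a write-back only on the last one (assumed policy). */
static void driver_fill_batch(struct desc *ring, uint16_t size,
                              uint16_t start, uint16_t n)
{
    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */
    for (uint16_t i = 0; i < n; i++) {
        uint16_t flags = DESC_HW;
        if (i == n - 1)
            flags |= DESC_WB;
        ring[(uint16_t)(start + i) & (size - 1)].flags = flags;
    }
}

/* Device: finish n descriptors starting at slot `start`. Only descriptors
 * with DESC_WB set are written back (here: DESC_HW is cleared).
 * Returns the number of descriptor write-backs performed. */
static unsigned device_complete(struct desc *ring, uint16_t size,
                                uint16_t start, uint16_t n)
{
    unsigned writes = 0;

    assert(size != 0 && (size & (size - 1)) == 0); /* power of two */
    for (uint16_t i = 0; i < n; i++) {
        struct desc *d = &ring[(uint16_t)(start + i) & (size - 1)];
        if (d->flags & DESC_WB) {
            d->flags &= (uint16_t)~DESC_HW;
            writes++;
        }
    }
    return writes;
}
```

Under these assumptions a batch of four descriptors costs the device one
descriptor write instead of four.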

An alternative would be to add a field holding the number of used
descriptors. This would give the same benefit but would use more bits
in the descriptor.

Indirect Chaining
=================

## Current proposal

* Indirect chaining is an optional feature

## New proposal

* Remove this feature from this new ring layout

It's very unlikely that hardware implementations would support this,
due to the extra latency of fetching the actual descriptors.

This is a totally new ring layout, and we don't need to worry about
compatibility issues with the old one. So it's better not to include this
feature in the new ring layout unless we find that it is necessary now.

Rx Fixed Buffer Sizes
=====================

## Current proposal

* Driver is free to choose whatever buffer sizes it wishes for Tx and
  Rx buffers
* In theory, a driver could use different buffer sizes within a single
  ring

## New proposal

* Driver negotiates with the device the size of an Rx buffer for a ring
    - Each descriptor in that ring will have a buffer of the same size
    - Different rings can have buffers of different sizes
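
As an illustration of what a fixed per-ring size makes simple: the
number of Rx buffers a packet occupies follows from the packet length
alone, without looking at any per-descriptor length. The sketch below
assumes that a packet may span several buffers; the struct and function
names are hypothetical and not part of the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-ring configuration holding the negotiated size. */
struct rx_ring_cfg {
    uint32_t buf_size; /* size in bytes of every Rx buffer in this ring */
};

/* Number of Rx buffers (descriptors) needed for a packet of pkt_len bytes. */
static uint32_t rx_descs_needed(const struct rx_ring_cfg *cfg, uint32_t pkt_len)
{
    assert(cfg->buf_size != 0);
    /* Round up without risking overflow in pkt_len + buf_size - 1. */
    return pkt_len / cfg->buf_size + (pkt_len % cfg->buf_size != 0);
}
```

For example, with a 2048-byte buffer size a 9000-byte packet needs five
buffers, while a ring negotiated with 4096-byte buffers needs three.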

Data Alignment Boundaries
=========================

## Current proposal

* Driver is free to choose data buffer alignment to any byte boundary

## New proposal

* Stipulate a fixed alignment for the data buffer
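
A fixed alignment is cheap to check or enforce on the driver side. The
helpers below are a minimal sketch with hypothetical names; the proposal
does not name an alignment value, so it is passed in as a parameter and
only assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr is a multiple of align (align must be a power of two). */
static bool buf_is_aligned(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align (a power of two). */
static uint64_t buf_align_up(uint64_t addr, uint64_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}
```

A driver could use buf_align_up() when carving data buffers out of a
memory pool, and buf_is_aligned() as a sanity check before posting a
descriptor.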

----------------------------------------------------------------

We have done a basic prototype for the packed-ring-layout in DPDK
based on the v2 packed-ring-layout proposal [1].

The prototype has been sent to the DPDK mailing list as an RFC [2][3].
I have also collected those public patches in my GitHub repo [4]
so that others can try them easily.

Besides the v2 packed-ring-layout proposal posted on the mailing list,
this prototype also includes the proposal that introduces the DESC_WB
flag, which lets the driver tell the device to update only the
specified descriptors. You can find more details in this patch [5].
We don't see any performance regression in the software implementation:

64-byte iofwd loopback:

                   5c'virtio-1c'vhost     1c'virtio-5c'vhost
virtio1.0          7.655Mpps              11.48Mpps
virtio1.1 A        8.757Mpps              11.70Mpps
virtio1.1 B        8.910Mpps              11.66Mpps

The columns:
5c'virtio-1c'vhost - use 5 cores to run testpmd/virtio-user and
                     use 1 core to run testpmd/vhost-pmd (shows
                     vhost performance)
1c'virtio-5c'vhost - use 1 core to run testpmd/virtio-user and
                     use 5 cores to run testpmd/vhost-pmd (shows
                     virtio performance)

The rows:
virtio1.0    - The current (simplified) virtio/vhost implementation in DPDK
virtio1.1 A  - The prototype based on the v2 packed-ring-layout proposal
virtio1.1 B  - Introduce DESC_WB, and adopt it on the Tx path

[1] https://lists.oasis-open.org/archives/virtio-dev/201702/msg00010.html
[2] http://dpdk.org/ml/archives/dev/2017-June/068315.html
[3] http://dpdk.org/ml/archives/dev/2017-July/071562.html
[4] https://github.com/btw616/dpdk-virtio1.1
[5] http://dpdk.org/ml/archives/dev/2017-July/071568.html

Best regards,
Tiwei Bie

