[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: [PATCH REQUIREMENTS v1 3/7] net-features: Add low latency receive queue requirements
Add requirements for the low latency receive queue. Signed-off-by: Parav Pandit <parav@nvidia.com> --- changelog: v0->v1: - clarified the requirements further - added line for the gro case - added design goals as the motivation for the requirements --- net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md index 0c3202c..3e8b5a4 100644 --- a/net-workstream/features-1.4.md +++ b/net-workstream/features-1.4.md @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface. # 2. Summary 1. Device counters visible to the driver -2. Low latency tx virtqueue for PCI transport +2. Low latency tx and rx virtqueues for PCI transport # 3. Requirements ## 3.1 Device counters @@ -114,3 +114,46 @@ struct vnet_data_desc desc[2]; 7. Ability to place all transmit completion together with it per packet stream transmit timestamp using single PCIe transcation. + +### 3.2.2 Low latency rx virtqueue +0. Design goal: + a. Keep packet metadata and buffer data together which is consumed by driver + layer and make it available in a single cache line of cpu + b. Instead of having per packet descriptors which is complex to scale for + the device, supply the page directly to the device to consume it based + on packet size +1. The device should be able to write a packet receive completion that consists + of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write + PCIe TLP. +2. The device should be able to perform DMA writes of multiple packets + completions in a single DMA transaction up to the PCIe maximum write limit + in a transaction. +3. The device should be able to zero pad packet write completion to align it to + 64B or CPU cache line size whenever possible. +4. An example of the above DMA completion structure: + +``` +/* Constant size receive packet completion */ +struct vnet_rx_completion { + u16 flags; + u16 id; /* buffer id */ + u8 gso_type; + u8 reserved[3]; + le16 gso_hdr_len; + le16 gso_size; + le16 csum_start; + le16 csum_offset; + u16 reserved2; + u64 timestamp; /* explained later */ + u8 padding[]; +}; +``` +5. The driver should be able to post constant-size buffer pages on a receive + queue which can be consumed by the device for an incoming packet of any size + from 64B to 9K bytes. +6. The device should be able to know the constant buffer size at receive + virtqueue level instead of per buffer level. +7. The device should be able to indicate when a full page buffer is consumed, + which can be recycled by the driver when the packets from the completed + page is fully consumed. +8. The device should be able to consume multiple pages for a receive GSO stream. -- 2.26.2
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]