Subject: [PATCH requirements v4 2/7] net-features: Add low latency transmit queue requirements
Add requirements for the low latency transmit queue.

Signed-off-by: Parav Pandit <parav@nvidia.com>
---
changelog:
v3->v4:
- Addressed comments from David
- rewrote timestamp and completions pcie transaction requirement
v1->v2:
- added generic requirement to inline the request content
  along with the descriptor for non virtio-net devices
- added requirement to inline the header content along with
  the descriptor for virtio flow filter queue as two features
  are similar
v0->v1:
- added design goals for which requirements are added
---
 net-workstream/features-1.4.md | 88 ++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
index c2b1cc8..40fa07f 100644
--- a/net-workstream/features-1.4.md
+++ b/net-workstream/features-1.4.md
@@ -7,6 +7,7 @@ together is desired while updating the virtio net interface.
 
 # 2. Summary
 1. Device counters visible to the driver
+2. Low latency tx virtqueue for PCI transport
 
 # 3. Requirements
 ## 3.1 Device counters
@@ -41,3 +42,90 @@ together is desired while updating the virtio net interface.
 More counters discussed in [1].
 
 [1] https://lists.oasis-open.org/archives/virtio-comment/202308/msg00176.html
+
+## 3.2 Low PCI latency virtqueues
+### 3.2.1 Low PCI latency tx virtqueue
+0. Design goal
+   a. Reduce PCI access latency in packet transmit flow
+   b. Avoid O(N) descriptor parser to detect a packet stream to simplify device
+      logic
+   c. Reduce number of PCI transmit completion transactions and have unified
+      completion flow with/without transmit timestamping
+   d. Avoid partial cache line writes on transmit completions
+
+1. Packet transmit descriptor should contain data descriptors count without any
+   indirection and without any O(N) search to find the end of a packet stream.
+   For example, a packet transmit descriptor (called vnet_tx_hdr_desc
+   subsequently) to contain a field num_next_desc for the packet stream
+   indicating that a packet is located in N data descriptors.
+
+2. Packet transmit descriptor should contain segmentation offload-related fields
+   without any indirection. For example, packet transmit descriptor to contain
+   gso_type, gso_size/mss, header length, csum placement byte offset, and
+   csum start.
+
+3. Packet transmit descriptor should be able to place a small size packet that
+   does not have any L4 data after the vnet_tx_hdr_desc in the virtqueue memory.
+   For example, a TCP ack only packet can fit in descriptor memory, which
+   otherwise consumes more than 25% of metadata to describe the packet.
+
+4. Packet transmit descriptor should be able to place a full GSO header (L2 to
+   L4) after header descriptor and before data descriptors. For example, the
+   GSO header is placed after struct vnet_tx_hdr_desc in the virtqueue memory.
+   When such a GSO header is positioned adjacent to the packet transmit
+   descriptor, and when the GSO header is not aligned to 16B, the following
+   data descriptor to start on the 8B aligned boundary.
+
+5. An example of the above requirements at high level is:
+
+```
+struct virtio_packed_q_desc {
+    /* current desc for reference */
+    u64 address;
+    u32 len;
+    u16 id;
+    u16 flags;
+};
+
+/* Constant size header descriptor for tx packets */
+struct vnet_tx_hdr_desc {
+    u16 flags; /* indicate how to parse next fields */
+    u16 id; /* desc id to come back in completion */
+    u8 num_next_desc; /* indicates the number of the next 16B data desc for this
+                       * buffer.
+                       */
+    u8 gso_type;
+    le16 gso_hdr_len;
+    le16 gso_size;
+    le16 csum_start;
+    le16 csum_offset;
+    u8 inline_pkt_len; /* indicates the length of the inline packet after this
+                        * desc
+                        */
+    u8 reserved;
+    u8 padding[];
+};
+
+/* Example of a short packet or GSO header placed in the desc section of the vq
+ */
+struct vnet_tx_small_pkt_desc {
+    u8 raw_pkt[128];
+};
+
+/* Example of header followed by data descriptor */
+struct vnet_tx_hdr_desc hdr_desc;
+struct vnet_data_desc desc[2];
+
+```
+
+6. Ability to zero pad the transmit completion when the transmit completion is
+   shorter than the CPU cache line size.
+
+7. Ability to write per packet timestamp and also write multiple
+   transmit completions using single PCIe transaction.
+
+8. A generic feature of the virtqueue, to contain such header data inline for virtio
+   devices other than virtio-net.
+
+9. A flow filter virtqueue also similarly needs the ability to inline the short flow
+   command header.
--
2.26.2
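
As a reading aid for requirements 1, 3 and 4 in the patch above, here is a minimal C sketch of how a device could locate the first data descriptor and the end of a packet in O(1) from the header descriptor alone. It is not part of the patch. The names `tx_hdr_desc_sketch`, `first_data_desc_offset` and `tx_entry_bytes` are hypothetical, the little-endian fields are shown as plain integers, the flexible `padding[]` member is omitted, and the header descriptor is assumed to start on an 8B-aligned address so that rounding the offset is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-width mirror of vnet_tx_hdr_desc from the patch.
 * Byte-order handling (le16) is left out to keep the sketch short. */
struct tx_hdr_desc_sketch {
    uint16_t flags;          /* how to parse the following fields */
    uint16_t id;             /* descriptor id returned in the completion */
    uint8_t  num_next_desc;  /* number of 16B data descriptors that follow */
    uint8_t  gso_type;
    uint16_t gso_hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint8_t  inline_pkt_len; /* bytes of inline packet or GSO header */
    uint8_t  reserved;
};

/* The fields above add up to 16 bytes with no compiler padding. */
static_assert(sizeof(struct tx_hdr_desc_sketch) == 16,
              "header descriptor is assumed to be 16 bytes");

#define DATA_DESC_SIZE  16u /* "16B data desc" per the num_next_desc comment */
#define DATA_DESC_ALIGN 8u  /* requirement 4: data desc starts on an 8B boundary */

/* Round v up to the next multiple of a (a must be a power of two). */
static inline size_t align_up(size_t v, size_t a)
{
    return (v + a - 1) & ~(a - 1);
}

/* Byte offset, from the start of the header descriptor, of the first
 * data descriptor: skip the header and any inline bytes, then align. */
static inline size_t first_data_desc_offset(const struct tx_hdr_desc_sketch *h)
{
    return align_up(sizeof(*h) + h->inline_pkt_len, DATA_DESC_ALIGN);
}

/* Total bytes the packet occupies in virtqueue memory, computed without
 * walking the descriptors (no O(N) search for the end of the packet). */
static inline size_t tx_entry_bytes(const struct tx_hdr_desc_sketch *h)
{
    return first_data_desc_offset(h) + (size_t)h->num_next_desc * DATA_DESC_SIZE;
}
```

Under these assumptions, a 54-byte inline header followed by two data descriptors puts the first data descriptor at offset 72 and ends the packet at offset 104.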