Subject: Re: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements
On Monday, 2023-07-24 at 06:34:17 +03, Parav Pandit wrote:
> Add requirements for the low latency receive queue.
>
> Signed-off-by: Parav Pandit <parav@nvidia.com>
> ---
> changelog:
> v0->v1:
> - clarified the requirements further
> - added line for the gro case
> - added design goals as the motivation for the requirements
> ---
>  net-workstream/features-1.4.md | 45 +++++++++++++++++++++++++++++++++-
>  1 file changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> index eb95592..e04727a 100644
> --- a/net-workstream/features-1.4.md
> +++ b/net-workstream/features-1.4.md
> @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
>
>  # 2. Summary
>  1. Device counters visible to the driver
> -2. Low latency tx virtqueue for PCI transport
> +2. Low latency tx and rx virtqueues for PCI transport
>
>  # 3. Requirements
>  ## 3.1 Device counters
> @@ -121,3 +121,46 @@ struct vnet_data_desc desc[2];
>
>  9. A flow filter virtqueue also similarly need the ability to inline the short flow
>     command header.
> +
> +### 3.2.2 Low latency rx virtqueue
> +0. Design goal:
> +   a. Keep packet metadata and buffer data together which is consumed by driver
> +      layer and make it available in a single cache line of cpu
> +   b. Instead of having per packet descriptors which is complex to scale for
> +      the device, supply the page directly to the device to consume it based
> +      on packet size

Really "per packet descriptor buffers"?

> +1. The device should be able to write a packet receive completion that consists
> +   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> +   PCIe TLP.
> +2. The device should be able to perform DMA writes of multiple packets
> +   completions in a single DMA transaction up to the PCIe maximum write limit
> +   in a transaction.
> +3. The device should be able to zero pad packet write completion to align it to
> +   64B or CPU cache line size whenever possible.
> +4. An example of the above DMA completion structure:
> +
> +```
> +/* Constant size receive packet completion */
> +struct vnet_rx_completion {
> +    u16 flags;
> +    u16 id; /* buffer id */
> +    u8 gso_type;
> +    u8 reserved[3];
> +    le16 gso_hdr_len;
> +    le16 gso_size;
> +    le16 csum_start;
> +    le16 csum_offset;
> +    u16 reserved2;
> +    u64 timestamp; /* explained later */
> +    u8 padding[];
> +};
> +```
> +5. The driver should be able to post constant-size buffer pages on a receive
> +   queue which can be consumed by the device for an incoming packet of any size
> +   from 64B to 9K bytes.
> +6. The device should be able to know the constant buffer size at receive
> +   virtqueue level instead of per buffer level.
> +7. The device should be able to indicate when a full page buffer is consumed,
> +   which can be recycled by the driver when the packets from the completed
> +   page is fully consumed.

s/is fully consumed/are fully consumed/

> +8. The device should be able to consume multiple pages for a receive GSO stream.

-- 
So tap at my window, maybe I might let you in.
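
For illustration only, not part of the patch: a minimal C sketch of how the zero padding in point 3 could work out, assuming the field layout of the quoted vnet_rx_completion and a 64B cache line. The macro and helper names here (CACHE_LINE_SIZE, vnet_rx_compl_bytes) are made up for the sketch and are not defined anywhere in the proposal.

```c
/*
 * Hedged sketch, not part of the proposal: the fixed fields of the quoted
 * vnet_rx_completion add up to 26 bytes, so the device would append zero
 * padding to land each completion on a 64B / cache-line boundary (point 3),
 * keeping one packet's metadata within a single cache line (design goal a).
 */
#include <stdint.h>

#define CACHE_LINE_SIZE 64U

/* 2+2+1+3+2+2+2+2+2+8 bytes of fixed fields in the quoted structure. */
#define VNET_RX_COMPL_FIXED_BYTES 26U

/* Total bytes the device would write per completion, padding included. */
static inline uint32_t vnet_rx_compl_bytes(void)
{
	/* Round the fixed portion up to the next cache-line multiple. */
	return (VNET_RX_COMPL_FIXED_BYTES + CACHE_LINE_SIZE - 1) &
	       ~(CACHE_LINE_SIZE - 1);
}
```

With these numbers the helper returns 64, i.e. 38 bytes of zero padding per completion, which also lets the device coalesce several back-to-back completions into one aligned DMA write (point 2).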
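
Likewise a hedged driver-side sketch of points 5-7: constant-size pages posted per receive virtqueue, recycled only once the device has marked the page consumed and the stack has released every packet carved out of it. All names (rxq_page, rxq_page_get, rxq_page_put) and the 4K page size are assumptions for illustration, not something the requirements define.

```c
/*
 * Not part of the patch -- one possible accounting scheme for points 5-7,
 * assuming a single constant page size per receive virtqueue and a buffer
 * id in each completion that refers back to a posted page.
 */
#include <stdint.h>
#include <stdbool.h>

#define RXQ_PAGE_SIZE 4096U	/* constant, known at virtqueue level (point 6) */

struct rxq_page {
	uint64_t dma_addr;	/* address the device writes packets into */
	uint32_t pkts_inflight;	/* packets from this page still held by the stack */
	bool     device_done;	/* device signalled the page is fully consumed */
};

/* Called when a completion names this page as the target buffer. */
static void rxq_page_get(struct rxq_page *p)
{
	p->pkts_inflight++;
}

/*
 * Called when the stack frees a packet, and when the device indicates the
 * page is consumed (point 7). The page may be recycled only once both
 * conditions hold, i.e. every packet from it has been released.
 */
static bool rxq_page_put(struct rxq_page *p, bool from_device)
{
	if (from_device)
		p->device_done = true;
	else if (p->pkts_inflight)
		p->pkts_inflight--;

	return p->device_done && p->pkts_inflight == 0;	/* ready to repost */
}
```

A driver built along these lines would repost the page's dma_addr as a fresh constant-size buffer whenever rxq_page_put() returns true; for a GSO stream spanning several pages (point 8), each page is tracked and recycled independently.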