Subject: RE: [PATCH requirements v5 3/7] net-features: Add low latency receive queue requirements
> From: David Edmondson <david.edmondson@oracle.com>
> Sent: Monday, August 21, 2023 4:17 PM
>
> > +### 3.2.2 Low latency rx virtqueue
> > +0. Design goal:
> > +   a. Keep packet metadata and buffer data together which is consumed by driver
> > +      layer and make it available in a single cache line of cpu
>
> Phrased like this, it seems to run counter to the "header data split"
> requirement.
>
Mostly not.
Currently, the packet metadata consumed by the driver is spread across two different DMAs at two different addresses:
For split q: virtio_net_hdr + used ring
For packed q: virtio_net_hdr + desc.
Instead, both should complete in a single PCIe DMA and be read from a single cache line by the CPU while processing it.

> Is there an implicit guard that this only applies for very small payloads?
>
No. All packet sizes benefit from it.

> > +   b. Instead of having per packet descriptors which is complex to scale for
> > +      the device, supply the page directly to the device to consume it based
> > +      on packet size
> > +1. The device should be able to write a packet receive completion that consists
> > +   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> > +   PCIe TLP.
> > +2. The device should be able to perform DMA writes of multiple packets
> > +   completions in a single DMA transaction up to the PCIe maximum write limit
> > +   in a transaction.
> > +3. The device should be able to zero pad packet write completion to align it to
> > +   64B or CPU cache line size whenever possible.
> > +4. An example of the above DMA completion structure:
> > +
> > +```
> > +/* Constant size receive packet completion */
> > +struct vnet_rx_completion {
> > +   u16 flags;
> > +   u16 id; /* buffer id */
> > +   u8 gso_type;
> > +   u8 reserved[3];
> > +   le16 gso_hdr_len;
> > +   le16 gso_size;
> > +   le16 csum_start;
> > +   le16 csum_offset;
> > +   u16 reserved2;
> > +   u64 timestamp; /* explained later */
> > +   u8 padding[];
> > +};
> > +```
> > +5. The driver should be able to post constant-size buffer pages on a receive
> > +   queue which can be consumed by the device for an incoming packet of any size
> > +   from 64B to 9K bytes.
> > +6. The device should be able to know the constant buffer size at receive
> > +   virtqueue level instead of per buffer level.
> > +7. The device should be able to indicate when a full page buffer is consumed,
> > +   which can be recycled by the driver when the packets from the completed
> > +   page is fully consumed.
> > +8. The device should be able to consume multiple pages for a receive GSO stream.
> --
> Modern people tend to dance.
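
To make the single-cache-line point in items 1 to 4 concrete, here is a minimal C sketch of the example completion padded to 64B. It is only an illustration under stated assumptions, not the proposed layout: the 64B cache line size, the `reserved3` field, the fixed 32-byte `padding`, and the two helper functions are hypothetical and are not part of the patch.

```c
#include <assert.h>   /* static_assert (C11), assert */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

#define CACHE_LINE_SIZE 64u /* assumption: 64B CPU cache line */

/* Sketch of the example completion, zero padded to one cache line.
 * Field order follows the patch; reserved3 and the padding size are
 * assumptions made only for this illustration. */
struct vnet_rx_completion {
	uint16_t flags;
	uint16_t id;           /* buffer id */
	uint8_t  gso_type;
	uint8_t  reserved[3];
	uint16_t gso_hdr_len;  /* le16 in the patch */
	uint16_t gso_size;     /* le16 in the patch */
	uint16_t csum_start;   /* le16 in the patch */
	uint16_t csum_offset;  /* le16 in the patch */
	uint16_t reserved2;
	uint8_t  reserved3[6]; /* hypothetical: makes the 8B alignment gap explicit */
	uint64_t timestamp;
	uint8_t  padding[32];  /* hypothetical: zero pad up to CACHE_LINE_SIZE */
};

/* The whole completion is read by the CPU from exactly one cache line. */
static_assert(sizeof(struct vnet_rx_completion) == CACHE_LINE_SIZE,
	      "completion must fill exactly one cache line");
static_assert(offsetof(struct vnet_rx_completion, timestamp) == 24,
	      "timestamp must be naturally aligned");

/* Whole completions that fit in one DMA write of max_write_bytes
 * (for example, the PCIe maximum payload size negotiated for the link). */
static inline uint32_t completions_per_write(uint32_t max_write_bytes)
{
	return max_write_bytes / (uint32_t)sizeof(struct vnet_rx_completion);
}

/* Constant-size buffers touched by a packet of pkt_len bytes that starts
 * at the beginning of a buffer. buf_size must be non-zero. */
static inline uint32_t buffers_for_packet(uint32_t pkt_len, uint32_t buf_size)
{
	return (pkt_len + buf_size - 1) / buf_size;
}
```

With these assumptions, a 256-byte write carries four completions, and a 9000-byte packet spans three 4096-byte buffers, which is the multi-page case item 8 refers to.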