Subject: RE: [virtio-comment] [PATCH requirements 3/7] net-features: Add low latency receive queue requirements



> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Tuesday, June 6, 2023 6:33 PM
> 
> On Fri, Jun 02, 2023 at 01:03:01AM +0300, Parav Pandit wrote:
> > Add requirements for the low latency receive queue.
> >
> > Signed-off-by: Parav Pandit <parav@nvidia.com>
> > ---
> >  net-workstream/features-1.4.md | 38 +++++++++++++++++++++++++++++++++-
> >  1 file changed, 37 insertions(+), 1 deletion(-)
> >
> > diff --git a/net-workstream/features-1.4.md b/net-workstream/features-1.4.md
> > index 55f1b1f..054f951 100644
> > --- a/net-workstream/features-1.4.md
> > +++ b/net-workstream/features-1.4.md
> > @@ -7,7 +7,7 @@ together is desired while updating the virtio net interface.
> >
> >  # 2. Summary
> >  1. Device counters visible to the driver
> > -2. Low latency tx virtqueue for PCI transport
> > +2. Low latency tx and rx virtqueues for PCI transport
> >
> >  # 3. Requirements
> >  ## 3.1 Device counters
> > @@ -107,3 +107,39 @@ struct vnet_data_desc desc[2];
> >
> >  7. Ability to place all transmit completion together with it per packet stream
> >    transmit timestamp using single PCIe transaction.
> > +
> > +### 3.2.2 Low latency rx virtqueue
> > +1. The device should be able to write a packet receive completion that consists
> > +   of struct virtio_net_hdr (or similar) and a buffer id using a single DMA write
> > +   PCIe TLP.
> 
> why? what is wrong with it being linear with packet instead?
>
A layout linear with the packet prohibits header-data split and requires multiple DMAs for metadata that is consumed by a single driver layer.
Data processed by one layer ends up in two different locations, which hurts performance.
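
For reference, these are roughly the two locations the rx metadata lands in today (layouts as in the current spec; the sketch below is only to illustrate the split described above): the used descriptor is written into the packed virtqueue ring, while struct virtio_net_hdr is written at the start of the receive buffer, so the device performs two DMA writes and the driver reads two places per packet.

```
/* Illustration of the split: two structures, two DMA targets. */

/* Written by the device into the packed virtqueue ring. */
struct pvirtq_desc {
   le64 addr;
   le32 len;   /* bytes written */
   le16 id;    /* buffer id */
   le16 flags;
};

/* Written by the device at the start of the receive buffer. */
struct virtio_net_hdr {
   u8 flags;
   u8 gso_type;
   le16 hdr_len;
   le16 gso_size;
   le16 csum_start;
   le16 csum_offset;
   le16 num_buffers; /* when VIRTIO_NET_F_MRG_RXBUF is negotiated */
};
```

With header-data split the packet header would land in yet another buffer, adding one more location the driver has to touch per packet.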
 
> > +2. The device should be able to perform DMA writes of multiple packets
> > +   completions in a single DMA transaction up to the PCIe maximum write limit
> > +   in a transaction.
> > +3. The device should be able to zero pad packet write completion to align it to
> > +   64B or CPU cache line size whenever possible.
> 
> assuming completion is used buffer, these are exactly 64 bytes with packed vq,
> and they are linear so can be written in one transaction.
When a packet spans multiple buffers, their completions cannot be written in a contiguous manner.

> if so why list requirements which are already met?
> if you want them for completeness mention this.
> 
> > +4. An example of the above DMA completion structure:
> > +
> > +```
> > +/* Constant size receive packet completion */
> > +struct vnet_rx_completion {
> > +   u16 flags;
> > +   u16 id; /* buffer id */
> > +   u8 gso_type;
> > +   u8 reserved[3];
> > +   le16 gso_hdr_len;
> > +   le16 gso_size;
> > +   le16 csum_start;
> > +   le16 csum_offset;
> > +   u16 reserved2;
> > +   u64 timestamp; /* explained later */
> > +   u8 padding[];
> > +};
> > +```
> > +5. The driver should be able to post constant-size buffer pages on a receive
> > +   queue which can be consumed by the device for an incoming packet of any size
> > +   from 64B to 9K bytes.
> 
> possible with mrg buffers
> 
Posting constant-size 64B buffers does not scale.
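
A rough back-of-envelope illustration (the numbers are mine, purely illustrative): with a fixed 64B posted buffer size, a large packet consumes on the order of a hundred buffers and used entries.

```
/* Illustration only: per-packet descriptor cost when the driver posts
 * fixed-size buffers on a mergeable-buffer receive queue. */
#include <stdio.h>

int main(void)
{
   const unsigned buf_size = 64;                     /* constant posted buffer size */
   const unsigned pkt_sizes[] = { 64, 1514, 9000 };  /* example packet sizes */

   for (unsigned i = 0; i < sizeof(pkt_sizes) / sizeof(pkt_sizes[0]); i++) {
      unsigned descs = (pkt_sizes[i] + buf_size - 1) / buf_size;
      printf("%5u byte packet -> %3u buffers posted, %3u used entries written\n",
             pkt_sizes[i], descs, descs);
   }
   /* A 9K packet needs ~141 buffers; sizing the posted buffer larger instead
    * wastes memory on small packets, which is the trade-off discussed next. */
   return 0;
}
```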

> > +6. The device should be able to know the constant buffer size at receive
> > +   virtqueue level instead of per buffer level.
> 
> the bigger question is not communicating to device. that is trivial.
> the bigger question is that linux IP stack seems to benefit from variable sized
> packets because buffers waste precious kernel memory.
We want to avoid that wastage and still achieve constant-size posting.

> is this for non IP stack such as xdp? non-linux guests? dpdk perhaps?
> 
For Linux kernel guests.

> > +7. The device should be able to indicate when a full page buffer is consumed,
> > +   which can be recycled by the driver when the packets from the completed
> > +   page is fully consumed.
> 
> no idea what this means.
>
Instead of the driver allocating a page and splitting it into buffers of nearly the same size, the approach is to post the whole page and let the device consume it based on packet size.
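
A minimal driver-side sketch of that idea, with hypothetical names and flag values (the actual completion layout would be whatever the spec ends up defining): the driver posts whole pages, the device packs received packets into a page at packet-size granularity, and a completion flag tells the driver when the device is done with a page so it can be recycled once every packet carved out of it has been freed.

```
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag; not part of the current spec or this patch. */
#define VNET_RX_F_PAGE_DONE  (1u << 0)

struct rx_page {
   void *va;            /* full page posted to the device */
   uint32_t inflight;   /* packets carved from this page, not yet freed by the stack */
   bool device_done;    /* device reported it is done placing packets in this page */
};

/* Per received packet that the device placed into @page. */
static void rx_packet_complete(struct rx_page *page, uint16_t completion_flags)
{
   page->inflight++;
   if (completion_flags & VNET_RX_F_PAGE_DONE)
      page->device_done = true;
}

/* When the stack frees a packet that lived in @page; returns true when the
 * page may be recycled, i.e. re-posted to the device. */
static bool rx_packet_freed(struct rx_page *page)
{
   page->inflight--;
   return page->device_done && page->inflight == 0;
}
```

The key point is that the granularity the device consumes is the packet size, while the unit the driver posts and recycles is the page.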


