virtio-dev message

Subject: Re: [PATCH v7] virtio_net: support split header

From: Heng Qi <hengqi@linux.alibaba.com>
To: "Michael S. Tsirkin" <mst@redhat.com>
Date: Fri, 9 Sep 2022 15:41:54 +0800



On 2022/9/5 at 4:27, Michael S. Tsirkin wrote:

On Fri, Sep 02, 2022 at 03:36:25PM +0800, Heng Qi wrote:

We need to clarify that the purpose of header splitting is to let each payload
sit on its own in a page, which is beneficial for the zerocopy
implemented by the upper layer.

absolutely, pls add motivation.

If the driver does not enforce that the buffers submitted to the receiveq MUST
be composed of at least two descriptors, then header splitting will become meaningless,
or the VIRTIO_NET_F_SPLIT_TRANSPORT_HEADER feature should not be negotiated at this time.


Thanks.


This seems very narrow and unnecessarily wasteful of descriptors.
What is wrong in this:

<header>...<padding>... <beginning of page><data>

seems to achieve the goal of data in a separate page without
using extra descriptors.

thus my proposal to replace the requirement of a separate
descriptor with an offset of data from beginning of
buffer that driver sets.

We have carefully considered your suggestion.

We refer to spec v7 and earlier as scheme A for short. Scheme A is reviewed below:


| receive buffer |

| 0th descriptor | 1st descriptor |

| virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->| payload |

We use a buffer plus a separate page when allocating the receive
buffer. In this way, we can ensure that each payload sits on its own
in a page, which is very beneficial for the zerocopy implemented by
the upper layer.

Scheme A handles headroom, tailroom and memory waste better, but as you said, it relies on a descriptor chain.


The approaches we reconsidered no longer depend on or use a descriptor chain.

We refer to your proposed offset-based scheme as scheme B:

As you suggested, scheme B gives the device a single buffer and uses an offset to indicate where to place the payload, like this:


<header>...<padding>... <beginning of page><data>

But how should this buffer be allocated? Since we want the payload to be placed on a separate page, the method we consider is for the driver to allocate two pages of contiguous memory directly.

The beginning of this contiguous memory is then used to store the headroom, and the contiguous memory after the headroom is handed directly to the device, similar to the following:

<------------------------------------------ receive buffer(2 pages)----------------------------------------->

<<---------------------------------- first page-----------------------------------><---- second page ------>>

<<Driver reserved, the later part is filled><vheader><transportheader>..<padding>..<beginning of page><data>>
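To make the scheme B layout concrete, here is a minimal sketch of how a driver might derive the region handed to the device and the payload offset from the size of its reserved headroom. The names (scheme_b_layout, scheme_b_compute) and the 4 KiB page size are assumptions for illustration only; they come from neither the spec nor any driver.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Hypothetical description of one scheme B receive buffer (2 contiguous pages). */
struct scheme_b_layout {
    size_t dev_start;   /* offset in the allocation where the device region begins */
    size_t dev_len;     /* number of bytes handed to the device */
    size_t data_offset; /* payload offset, relative to dev_start */
};

/*
 * The driver keeps `headroom` bytes at the start of the first page and hands
 * the rest of the two pages to the device. The payload must start at the
 * beginning of the second page, so the offset the driver sets is the
 * distance from the start of the device region to that page boundary.
 */
static struct scheme_b_layout scheme_b_compute(size_t headroom)
{
    struct scheme_b_layout l;

    assert(headroom < PAGE_SIZE);
    l.dev_start = headroom;
    l.dev_len = 2 * PAGE_SIZE - headroom;
    l.data_offset = PAGE_SIZE - headroom;
    return l;
}
```

With a headroom of 256 bytes, for instance, the device would receive 7936 bytes and be asked to place the payload 3840 bytes into them. The virtio-net header, the transport headers and the padding all have to fit in those 3840 bytes.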


Based on your previous suggestion, we also considered another new scheme C.

This scheme is based on mergeable buffers, with each receive buffer filled as a separate page.

If the split header is negotiated and the device can successfully split the packet, the device needs to find at least two buffers, namely two pages: one for the virtio-net header and transport header, and the other for the data payload, like the following:


| receive buffer1(page) | receive buffer2 (page) |

| virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->| payload |

At the same time, if XDP is considered, the device needs to leave headroom at the beginning of receive buffer1 when receiving packets, so that the driver can run programs such as XDP. To solve this problem, could scheme C introduce an offset that requires the device to write data into receive buffer1 starting at that offset, like the following:


| receive buffer (page) | receive buffer (page) |
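As a rough illustration of scheme C, the sketch below counts how many page-sized mergeable buffers one split packet would consume when the device starts writing the headers at a driver-set offset in receive buffer1. The function name is hypothetical, the 4 KiB page size is an assumption, and so is the idea that a payload larger than one page simply spills into further page buffers.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages, one page per receive buffer */

/*
 * offset:      headroom the device skips at the start of receive buffer1
 * hdr_len:     virtio-net header plus mac/ip/tcp headers
 * payload_len: bytes of payload following the transport header
 *
 * Returns the number of page-sized buffers used for a successfully split
 * packet: one for the headers, plus enough pages to hold the payload.
 */
static size_t scheme_c_buffers_needed(size_t offset, size_t hdr_len,
                                      size_t payload_len)
{
    /* The headers must fit in the first page after the skipped headroom. */
    assert(offset + hdr_len <= PAGE_SIZE);

    if (payload_len == 0)
        return 1;
    return 1 + (payload_len + PAGE_SIZE - 1) / PAGE_SIZE;
}
```

Under these assumptions a 1448-byte TCP payload takes two buffers, which matches the "at least two buffers" case described above.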

We then briefly compare the advantages and disadvantages of scheme A (spec v7), scheme B (offset buffer (2 pages)) and scheme C (based on mergeable buffers):


1. desc chain:

- A depends on a descriptor chain;
- B and C do not depend on a descriptor chain.

2. page alloc

- B fills two consecutive pages, which wastes a great deal of memory for small packets such as ARP;
- C fills a single page, which is slightly better than B.


3. Memory waste:

- The memory waste of scheme A is mainly the 0th descriptor that is skipped by the device;
- When scheme B and scheme C successfully split the header, most of the first page is wasted, but the first page can be released quickly by copying.


4. headroom

- Schemes A and B reserve the headroom themselves;
- Scheme C requires the driver to set an offset so that the device skips that many bytes when using receive buffer1.


5. tailroom

- When splitting the header, the skb usually needs to store each independent page in the non-linear data area, based on shinfo.
- Scheme A reserves the tailroom itself;
- Scheme B requires the driver to set a reserved padding area in the first receive buffer (2 pages) so that shinfo can be used when the header split does not succeed;
- Scheme C requires the driver to set max_len for the first receive buffer (page).
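The small-packet waste mentioned in points 2 and 3 can be estimated with a short calculation. This is only a sketch: the helper names are hypothetical, the page size is assumed to be 4 KiB, and bytes reserved as headroom or offset are counted as used.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/* Bytes left unused when a small, unsplit packet lands in a scheme B
 * buffer (two contiguous pages with driver-reserved headroom). */
static size_t scheme_b_unused(size_t headroom, size_t vhdr_len, size_t pkt_len)
{
    size_t used = headroom + vhdr_len + pkt_len;

    assert(used <= 2 * PAGE_SIZE);
    return 2 * PAGE_SIZE - used;
}

/* Bytes left unused when the same packet lands in a single scheme C
 * page, with the device skipping `offset` bytes at the start. */
static size_t scheme_c_unused(size_t offset, size_t vhdr_len, size_t pkt_len)
{
    size_t used = offset + vhdr_len + pkt_len;

    assert(used <= PAGE_SIZE);
    return PAGE_SIZE - used;
}
```

For a 42-byte ARP frame with a 12-byte virtio-net header and 256 bytes of headroom (illustrative numbers), scheme B leaves 7882 bytes unused and scheme C leaves 3786, a difference of exactly one page.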



Which plan do you prefer?

---

Thanks.
