OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] Re: [PATCH v7] virtio_net: support split header


On Tue, 27 Sep 2022 17:35:09 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Sep 14, 2022 at 11:34:43AM +0800, Jason Wang wrote:
> >
> > å 2022/9/9 20:38, Xuan Zhuo åé:
> > > On Fri, 9 Sep 2022 07:15:02 -0400, "Michael S. Tsirkin" <mst@redhat.com> wrote:
> > > > On Fri, Sep 09, 2022 at 03:41:54PM +0800, Heng Qi wrote:
> > > > >
> > > > > å 2022/9/5 äå4:27, Michael S. Tsirkin åé:
> > > > > > On Fri, Sep 02, 2022 at 03:36:25PM +0800, Heng Qi wrote:
> > > > > > > We need to clarify that the purpose of header splitting is to make all payloads
> > > > > > > can be independently in a page, which is beneficial for the zerocopy
> > > > > > > implemented by the upper layer.
> > > > > > absolutely, pls add motivation.
> > > > > >
> > > > > > > If the driver does not enforce that the buffers submitted to the receiveq MUST
> > > > > > > be composed of at least two descriptors, then header splitting will become meaningless,
> > > > > > > or the VIRTIO_NET_F_SPLIT_TRANSPORT_HEADER feature should not be negotiated at this time.
> > > > > > >
> > > > > > >
> > > > > > > Thanks.
> > > > > > >
> > > > > > >
> > > > > > This seems very narrow and unecessarily wasteful of descriptors.
> > > > > > What is wrong in this:
> > > > > >
> > > > > > <header>...<padding>... <beginning of page><data>
> > > > > >
> > > > > > seems to achieve the goal of data in a separate page without
> > > > > > using extra descriptors.
> > > > > >
> > > > > > thus my proposal to replace the requirement of a separate
> > > > > > descriptor with an offset of data from beginning of
> > > > > > buffer that driver sets.
> > > > > >
> > > > > >
> > > > > We have carefully considered your suggestion.
> > > > >
> > > > > We refer to spec v7 and earlier as scheme A for short. Review scheme A
> > > > > below:
> > > > >
> > > > > | receive buffer |
> > > > >
> > > > > | 0th descriptor | 1th descriptor |
> > > > >
> > > > > | virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->| payload |
> > > > >
> > > > > We use a buffer plus a separate page when allocating the receive
> > > > >
> > > > > buffer. In this way, we can ensure that all payloads can be
> > > > >
> > > > > independently in a page, which is very beneficial for the zerocopy
> > > > >
> > > > > implemented by the upper layer.
> > > > >
> > > > > scheme A better solves the problem of headroom, tailroom and memory waste,
> > > > > but as you said, this solution relies on descriptor chain.
> > > > >
> > > > > Our rethinking approach is no longer based on or using descriptor chain.
> > > > >
> > > > > We refer to your proposed offset-based scheme as scheme B:
> > > > >
> > > > > As you suggested, scheme B gives the device a buffer, using offset to
> > > > > indicate where to place the payload like this:
> > > > >
> > > > > <header>...<padding>... <beginning of page><data>
> > > > >
> > > > > But how to apply for this buffer? Since we want the payload to be placed on
> > > > > a separate page, the method we consider is to directly apply to the driver
> > > > > for two pages of contiguous memory.
> > > > >
> > > > > Then the beginning of this contiguous memory is used to store the headroom,
> > > > > and the contiguous memory after the headroom is directly handed over to the
> > > > > device. similar to the following:
> > > > >
> > > > > <------------------------------------------ receive buffer(2 pages)
> > > > > ----------------------------------------->
> > > > >
> > > > > <<---------------------------------- first page
> > > > > -----------------------------------><---- second page ------>>
> > > > >
> > > > > <<Driver reserved, the later part is filled><vheader><transport
> > > > > header>..<padding>..<beginning of page><data>>
> > > > >
> > > > > Based on your previous suggestion, we also considered another new scheme C.
> > > > >
> > > > > This scheme is implemented based on mergeable buffer, filling a separate
> > > > > page each time.
> > > > >
> > > > > If the split header is negotiated and the packet can be successfully split
> > > > > by the device, the device needs to find at least two buffers, namely two
> > > > > pages, one for the virtio-net header and transport header, and the other for
> > > > > the data payload. Like the following:
> > > > >
> > > > > | receive buffer1(page) | receive buffer2 (page) |
> > > > >
> > > > > | virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->| payload |
> > > > >
> > > > > At the same time, if XDP is considered, then the device needs to add
> > > > > headroom at the beginning of receive buffer1 when receiving packets, so that
> > > > > the driver can process programs similar to XDP. In order to solve this
> > > > > problem, can scheme C introduce an offset, which requires the device to
> > > > > write data from the offset position to receive buffer1, like the following:
> > > > >
> > > > > | receive buffer (page) | receive buffer (page) |
> > > > >
> > > > > | <-- offset(hold) --> | virtnet hdr | mac | ip hdr | tcp hdr|<-- hold -->|
> > > > > payload |
> > > > And in fact, B and C both use an offset now, right?
> > >
> > > B: offset is used to get the position to place the payload.
> > > C: The offset is used to reserve some space for the device, which the driver can
> > >     use as headroom.
> > >
> > >     In order to make the payload page-aligned, we can only hand over the entire
> > >     page to the device, so we cannot reserve some headroom in advance.
> >
> >
> > For C, it might be better to do some tweak since mergeable buffer doesn't
> > forbid using a descriptor chain as a single buffer.
> >
> > So if it's a descriptor chain we got back the method A by placing the
> > payload in a dedicated buffer. If it's not placing the payload in an
> > adjacent buffer.
> >
> > Thanks
>
> Let's find a way so devices do not care how descriptors are laid out.

Can we try to make a desc describe a similar but unconnected piece of memory?

struct vring_desc {
	__virtio64 addr;
	__virtio32 len;
	__virtio16 flags;
	__virtio16 next;
};

Note that len in desc is currently 32 bits, but generally 16 bits is enough, so
we have 16bits free space, and because we do not use desc chain, next is also
free. We can use this two 16-bit spaces describe a hold.

|<-- buf1 --><-- hold --><--- page --->|
|<- offset-->|
             |<- size ->|

offset and size are less 65535.

Thanks.


>
> >
> > >
> > > > > Then we simply compare the advantages and disadvantages of scheme A(spec
> > > > > v7), scheme B (offset buffer(2 pages)) and scheme C (based on mergeable
> > > > > buffer):
> > > > >
> > > > > 1. desc chain:
> > > > >
> > > > > - A depends on desciptor chain; - B, C do not depend on desciptor chain.
> > > > >
> > > > > 2. page alloc
> > > > >
> > > > > - B fills two consecutive pages, which causes a great waste of memory for
> > > > > small packages such as arp; - C fills a single page, slightly better than B.
> > > > >
> > > > > 3. Memory waste:
> > > > >
> > > > > - The memory waste of scheme A is mainly the 0th descriptor that is skipped
> > > > > by the device;
> > > > there's also the cost of the indirect buffer since that is used when
> > > > there is a chain.
> > > Yes
> > >
> > >
> > > > > - When scheme B and scheme C successfully split the header,
> > > > > there is a huge waste of the first page, but the first page can be quickly
> > > > > released by copying.
> > > > >
> > > > > 4. headroom
> > > > >
> > > > > - The headrooms of plan A and plan B are reserved; - Scheme C requires the
> > > > > driver to set off to let the device skip off when using receive buffer1.
> > > > >
> > > > > 5. tailroom
> > > > >
> > > > > - When splitting the header, skb usually needs to store each independent
> > > > > page in the non-linear data area based on shinfo. - The tailroom of scheme A
> > > > > is reserved by itself; - Scheme B requires the driver to set the reserved
> > > > > padding area for the first receive buffer(2 pages) to use shinfo when the
> > > > > split header is not successfully executed; - Scheme C requires the driver to
> > > > > set max_len for the first receive buffer(page).
> > > > >
> > > > >
> > > > > Which plan do you prefer?
> > > > I think either both B and C depending on the mergeable buffers flag,
> > > > or just one of these two.
> > > If I understand correctly, B does not depend on mergeable, while C must depend
> > > on mergeable.
> > >
> > > Thanks.
> > >
> > >
> > > > > ---
> > > > >
> > > > > Thanks.
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> > > For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> > >
>


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]