virtio-comment message



Subject: Re: [virtio-dev] RE: [virtio-comment] Re: [virtio-dev] [PATCH v8] virtio_net: support for split transport header




On 2023/2/2 12:20, Parav Pandit wrote:

From: Heng Qi <hengqi@linux.alibaba.com>
Sent: Wednesday, February 1, 2023 10:45 PM

On 2023/2/2 9:45, Parav Pandit wrote:
From: Heng Qi <hengqi@linux.alibaba.com>
Sent: Wednesday, February 1, 2023 8:18 AM
[..]

Response....

Page alignment requirements should not come from the virtio spec.
There are a variety of cases which may use non page aligned data buffers.
a. A kernel-only consumer, which doesn't have an mmap requirement, can use it.
b. A VQ accessible directly in user space may also use it without page alignment.
c. On a system with a 64k page size, page-aligned memory has a fair amount of wastage.
d. The io_uring example you pointed to also has non page aligned use.

So let the driver deal with the alignment restriction, outside of the virtio spec.

In header data split cases, data buffer utilization is more important than the tiny header buffers' utilization.
How about if the headers do not interfere with the data buffers?

In other words, say a given RQ is optionally linked to a circular queue of header buffers.
All header buffers are of the same size, supplied one time.
This header size and circular queue address are configured one time at RQ creation time.
With this, the device doesn't need to process the header buffer size for every single incoming packet.
Data buffers can continue as chains, or merged mode can be supported.
When the received packet's header cannot fit, it continues as-is in the data buffer.
The virtio net hdr, as suggested, indicates usage of the header buffer offset/index.
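For illustration only, a rough sketch (hypothetical structure and field names, not from the spec or the patch) of what such a one-time, per-RQ header buffer ring configuration could look like:

struct virtio_net_rq_hdr_ring_cfg {   /* hypothetical, for illustration */
	le64 hdr_ring_addr;   /* address of the circular header buffer ring */
	le32 hdr_ring_num;    /* number of header buffer slots, e.g. the RQ depth */
	le16 hdr_buf_size;    /* fixed size of each header buffer, set at RQ creation */
	le16 padding;
};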
This is a good direction, thanks a lot for your comments. The new split header method might look like this:
When allocating the receive queue, allocate a circular queue storing the header buffers at the driver layer, which is shared with the device, and set the length of the header buffer. The length of the circular queue is the same as the length of the descriptor table.

1. When a packet is successfully split, virtio-net-hdr + the transport header are stored in the header buffer, and VIRTIO_NET_HDR_F_SPLIT_HEADER is set in the flags of virtio-net-hdr. If the header buffer is not large enough to store virtio-net-hdr + the transport header, then roll back, that is, do not split the packet.

Right.

2. When a packet is not split by the device for some reason, virtio-net-hdr is placed in the header buffer, and the transport header and payload are placed in the data buffer. In this case, XDP is not supported, because XDP requires the transport header to be placed in the first buffer, the header buffer.
I do not understand why XDP is not supported in this case.
Isn't it the case today, where xdp_prepare_buff() gets the buffer with an offset after the net hdr?
In the above case you mentioned, XDP gets to use the packet buffer where pkt hdr + data are located.

The header buffer and data buffer are not contiguous, but XDP requires the data memory in the linear area to be contiguous. When virtio-net-hdr is stored in the header buffer, and the transport header and payload are in the data buffer, the requirements of XDP are not met. Many XDP kern programs and the XDP core layer also require the transport header to be placed in the linear region.

Let's keep aside the virtio net hdr for a moment.
Pkt HDR (L2 to L4) is in buffer_type_1.
Pkt Data (after L4) is in buffer_type_2.
They are not contiguous.
In this case XDP doesn't work.
Did I understand you correctly?

It seems not. The case you mentioned is currently supported in virtio-net:
https://lore.kernel.org/all/20230114082229.62143-1-hengqi@linux.alibaba.com/ .

I just mean:
Virtio-net-hdr is in buffer_type_1.
Pkt hdr (L2 to L4) + pkt data (after L4) are in buffer_type_2.

They are not contiguous and xdp doesn't work.

If so, there is ongoing work to support multi-buffer XDP.
We should look forward to having that support in the OS in the future.
Then why not start storing packets from the header buffer? If this is the case, imagine a situation: if all small packets are put into the header buffers, and the virtio net driver receives packets according to the descriptor table and used ring, but the data buffers are empty at this time, then it can't seem to continue?
In the future, when such a mode is extended, the vq mechanism will be extended to handle the descriptor table and used ring anyway.

Yes. For example, we can extend the descriptor table to allow placing a buffer with a length of 0 as a placeholder to deal with the above problem.

It seems that the operation of the circular queue has become a producer-consumer problem again.

I didn't follow.

I tend to imagine that driver-consumed metadata (vnet_hdr and used ring) should be placed together in one buffer and
Sorry, I didn't understand what you said, can you explain more clearly? Do you mean that virtio-net-hdr and some other metadata are placed in a buffer? How are the used ring and virtio-net-hdr placed together in one buffer?

We can extend the virtio net used ring, which will consist of:
struct virtio_net_used_ring_entry {
	struct virtio_net_hdr hdr;   /* with padding bytes */
	struct used_elem elem[];     /* the used descriptor element(s) for this packet */
};
struct virtio_net_used_ring {
	struct virtio_net_used_ring_entry ring[];
};

With the above, the layering is neat.
Driver metadata stays at the driver level, in something like the above struct.
The HDR buffer can have constant-size headroom too, which is cache line aligned, without any kind of padding.
No need to mix driver-internal metadata (vnet_hdr) and the packet hdr together.

This may find its use even without HDS, where for received packets the metadata is found adjacent in a single cache line.

Yes, this seems to be a good solution, but we should proceed step by step: push the split header first, and then try to change the virtio core later.

Thanks.

Thanks.

The actual packet header and data belong to the other buffer.
This way there is clear layering of ownership. But before we explore this option, let's understand your above point.
3. The meaning of hdr_len in virtio-net-hdr:
    i. When splitting successfully: hdr_len represents the effective length of the header in the header buffer.
    ii. On unsuccessful split: hdr_len is 0.
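A minimal driver-side sketch of these proposed semantics (VIRTIO_NET_HDR_F_SPLIT_HEADER and hdr_len as described above; the helper names are hypothetical):

	/* Sketch only: consume one received packet under the proposed split-header rules. */
	if (hdr->flags & VIRTIO_NET_HDR_F_SPLIT_HEADER) {
		/* Split succeeded: hdr_len bytes of transport header are in the header buffer. */
		consume_split_packet(hdr_buf, hdr->hdr_len, data_buf);
	} else {
		/* Not split: hdr_len is 0; the transport header and payload are in the data buffer. */
		consume_unsplit_packet(data_buf);
	}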

What do you think of this plan? Or is there a better one?

Thanks.

This method has a few benefits on perf and buffer efficiency, as below.
1. Data buffers can be directly mapped at best utilization.
2. The device doesn't need to match up per-packet header sizes and descriptor sizes; efficient for the device to implement.
3. No need to keep reposting the header buffers, only the tail index needs to be updated. This directly gives a 50% cycle reduction in buffer traversal on the driver side on the rx path.
4. Ability to share this header buffer queue among multiple RQs if needed.
5. In the future there may be an extension to place tiny whole packets that can fit in the header buffer, so that the header buffer also contains the rest of the data.
6. The device can always fall back to placing the packet header in the data buffer when the header buffer is not available or smaller than a newer protocol's header.
7. Because the header buffer comes from virtually contiguous memory and is not intermixed with data buffers, there are no small per-header allocations.
8. Also works in both chained and merged mode.
9. Memory utilization for an RQ of depth 256, with 4K page size for data buffers = 1M, and a hdr buffer per packet of 128 bytes = only 3% of the data buffer memory.
So, in the worst case, when no packet uses the header buffers, the wastage is only 3%.
When a high number of packets larger than 4K use the header buffer, say 8K packets, header buffer utilization is at 50%. So the wastage is only 1.5%.
At 1500 MTU merged-mode data buffer sizes, the hdr buffer memory is also < 10% of the data buffer memory.
All 3 cases are in a very manageable range of buffer utilization.
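For reference, the arithmetic behind the utilization figures above (assuming an RQ depth of 256, 4K data buffers, and 128-byte header buffers, as in point 9):

	data buffers:   256 * 4 KB  = 1 MB
	header buffers: 256 * 128 B = 32 KB, i.e. ~3% of the data buffer memory (worst-case wastage)
	8K packets:     each packet consumes 2 data buffers but only 1 header slot,
	                so ~50% of the header slots are used and the wastage drops to ~1.5%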

Crafting and modifying the feature bits and the virtio net header from your v7 version to get there is not difficult, if we like this approach.


