[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [PATCH v7 2/2] virtio-net: device does not deliver partially checksummed packet and may validate the checksum
On Thu, Dec 28, 2023 at 3:46âPM Heng Qi <hengqi@linux.alibaba.com> wrote: > > virtio-net works in a virtualized system and is somewhat different from > physical nics. One of the differences is that to save virtio device > resources, rx may receive partially checksummed packets. However, XDP may > cause partially checksummed packets to be dropped. > So XDP loading currently conflicts with the feature VIRTIO_NET_F_GUEST_CSUM. > When XDP is loaded, the offload of VIRTIO_NET_F_GUEST_CSUM is disabled > (to disable the tx checksum offload). > > In addition, implementation of some performant devices always do not generate > partially checksummed packets, but the standard driver still cannot load XDP > when VIRTIO_NET_F_GUEST_CSUM is there. > > This patch lets the device to supply fully checksummed packets to the driver. > > A new feature VIRTIO_NET_F_GUEST_FULLY_CSUM is added to solve the above > situation, which provides the driver with configurable offload. > If the offload is enabled, then the device must deliver fully For offload, did you mean rx csum offload? > checksummed packets to the driver and may validate the checksum. > > Use case example: > If VIRTIO_NET_F_GUEST_FULLY_CSUM is negotiated and the offload is enabled, > after XDP processes a fully checksummed packet, the VIRTIO_NET_HDR_F_DATA_VALID bit > is retained if the device has validated its checksum, resulting in the guest > not needing to validate the checksum again. This is useful for guests: > 1. Bring the driver advantages such as cpu savings. > 2. For devices that do not generate partially checksummed packets themselves, > XDP can be loaded in the driver without modifying the hardware behavior. > > Several solutions have been discussed in the previous proposal[1]. > After historical discussion, we have tried the method proposed by Jason[2], > but some complex scenarios and challenges are difficult to deal with. > We now return to the method suggested in [1]. > > [1] https://lists.oasis-open.org/archives/virtio-dev/202305/msg00291.html > [2] https://lore.kernel.org/all/20230628030506.2213-1-hengqi@linux.alibaba.com/ > > Signed-off-by: Heng Qi <hengqi@linux.alibaba.com> > Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> > --- > v6->v7: > - FULLY_CSUM no longer depends on GUEST_CSUM. > - The description related to DATA_VALID has nothing to do with > whether GUEST_CSUM is negotiated or not. @Jason > > v5->v6: > - Rewrite and clarify patch description. @Michael > > v4->v5: > - Remove the modification to the GUEST_CSUM. @Jason > - The description of this feature has been reorganized for greater clarity. @Michael > > v3->v4: > - Streamline some repetitive descriptions. @Jason > - Add how features should work, when to be enabled, and overhead. @Jason @Michael > > v2->v3: > - Add a section named "Driver Handles Fully Checksummed Packets" > and more descriptions. @Michael > > v1->v2: > - Modify full checksum functionality as a configurable offload > that is initially turned off. @Jason > > device-types/net/description.tex | 68 ++++++++++++++++++++++++++++++--- > device-types/net/device-conformance.tex | 1 + > 2 files changed, 64 insertions(+), 5 deletions(-) > > diff --git a/device-types/net/description.tex b/device-types/net/description.tex > index a5210f2..d4a7329 100644 > --- a/device-types/net/description.tex > +++ b/device-types/net/description.tex > @@ -122,6 +122,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits > device with the same MAC address. > > \item[VIRTIO_NET_F_SPEED_DUPLEX(63)] Device reports speed and duplex. > + > +\item[VIRTIO_NET_F_GUEST_FULLY_CSUM (64)] Device delivers fully checksummed packets > + to the driver and may validate the checksum. > \end{description} > > \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device / Feature bits / Feature bit requirements} > @@ -136,6 +139,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device > \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM. > \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM. > \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM. > +\item[VIRTIO_NET_F_GUEST_FULLY_CSUM] Requires VIRTIO_NET_F_CTRL_GUEST_OFFLOADS. > > \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM. > \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM. > @@ -383,7 +387,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev > the same system might not need checksumming at all, nor segmentation, > if both guests are amenable.} > The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially > - checksummed packets can be received, and if it can do that then > + checksummed packets can be received (provided that the offload of > + VIRTIO_NET_F_GUEST_FULLY_CSUM is disabled), and if it can do that then > the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6, > VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4 > and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above. > @@ -398,6 +403,55 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev > A truly minimal driver would only accept VIRTIO_NET_F_MAC and ignore > everything else. > > +\subsubsection{Device Delivers Fully Checksummed Packets}\label{sec:Device Types / Network Device / Device Initialization / Device Delivers Fully Checksummed Packets} > + > +If the feature VIRTIO_NET_F_GUEST_FULLY_CSUM is negotiated, the driver can > +benefit from the device's ability to calculate and validate the checksum. > + > +If the feature VIRTIO_NET_F_GUEST_FULLY_CSUM is negotiated > +and its offload is enabled, the device behaves as follows: > +\begin{itemize} > + \item The device delivers a fully checksummed packet to the driver rather than a partially checksummed packet. > +Partially checksummed packets come from TCP/UDP protocols \ref{devicenormative:Device Types / Network Device / Device Operation / Processing of Packets}. > + \item The device may validate the packet checksum before delivering it. > +If the packet checksum has been verified, the VIRTIO_NET_HDR_F_DATA_VALID bit > +in \field{flags} is set: in case of multiple encapsulated protocols, one > +level of checksums has been validated. > + \item The device can not set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags}. > +\end{itemize} I may miss something, but this seems the existing behaviour of the devices regardless of FULLY_CSUM? > + > +The packet types that the device recognizes and can verify are independent of > +whether the offload of VIRTIO_NET_F_GUEST_FULLY_CSUM is enabled. Right, so I don't see why this became part of this patch. > + > +Specific transport protocols that may have VIRTIO_NET_HDR_F_DATA_VALID set > +in \field{flags} include TCP, UDP, GRE (Generic Routing Encapsulation), > +and SCTP (Stream Control Transmission Protocol). > +A fully checksummed packet's checksum field for each of the above protocols > +is set to a calculated value that covers the transport header and payload > +(TCP or UDP involves the additional pseudo header) of the packet. > + > +Delivering fully checksummed packets rather than partially > +checksummed packets incurs additional overhead for the device. > +The overhead varies from device to device, for example the overhead of > +calculating and validating the packet checksum is a few microseconds > +for some hardware devices. > + > +The feature VIRTIO_NET_F_GUEST_FULLY_CSUM has a corresponding offload \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration}, > +which when enabled means that the device delivers fully checksummed packets > +to the driver and may validate the checksum. > +\begin{note} This is the functional equivalent to disable GUEST_CSUM. So I guess what I'm missing is the difference between: 1) enable FULLY_CUSM and 2) disable GUEST_CSUM It looks to me 1) try to keep e.g HW_GRO (GUEST_TSOX) work? Note that GRO produce partial csum usually, not sure it can work correctly. Thanks
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]