OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] Re: [PATCH v7 2/2] virtio-net: device does not deliver partially checksummed packet and may validate the checksum


On Tue, Jan 2, 2024 at 2:33âPM Heng Qi <hengqi@linux.alibaba.com> wrote:
>
>
>
> å 2024/1/2 äå1:56, Jason Wang åé:
> > On Tue, Jan 2, 2024 at 11:33âAM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>
> >>
> >> å 2024/1/2 äå9:58, Jason Wang åé:
> >>> On Thu, Dec 28, 2023 at 3:46âPM Heng Qi <hengqi@linux.alibaba.com> wrote:
> >>>> virtio-net works in a virtualized system and is somewhat different from
> >>>> physical nics. One of the differences is that to save virtio device
> >>>> resources, rx may receive partially checksummed packets. However, XDP may
> >>>> cause partially checksummed packets to be dropped.
> >>>> So XDP loading currently conflicts with the feature VIRTIO_NET_F_GUEST_CSUM.
> >>>> When XDP is loaded, the offload of VIRTIO_NET_F_GUEST_CSUM is disabled
> >>>> (to disable the tx checksum offload).
> >>>>
> >>>> In addition, implementation of some performant devices always do not generate
> >>>> partially checksummed packets, but the standard driver still cannot load XDP
> >>>> when VIRTIO_NET_F_GUEST_CSUM is there.
> >>>>
> >>>> This patch lets the device to supply fully checksummed packets to the driver.
> >>>>
> >>>> A new feature VIRTIO_NET_F_GUEST_FULLY_CSUM is added to solve the above
> >>>> situation, which provides the driver with configurable offload.
> >>>> If the offload is enabled, then the device must deliver fully
> >>> For offload, did you mean rx csum offload?
> >> I mean FULLY_CSUM's offload.
> >>
> >>>> checksummed packets to the driver and may validate the checksum.
> >>>>
> >>>> Use case example:
> >>>> If VIRTIO_NET_F_GUEST_FULLY_CSUM is negotiated and the offload is enabled,
> >>>> after XDP processes a fully checksummed packet, the VIRTIO_NET_HDR_F_DATA_VALID bit
> >>>> is retained if the device has validated its checksum, resulting in the guest
> >>>> not needing to validate the checksum again. This is useful for guests:
> >>>>     1. Bring the driver advantages such as cpu savings.
> >>>>     2. For devices that do not generate partially checksummed packets themselves,
> >>>>        XDP can be loaded in the driver without modifying the hardware behavior.
> >>>>
> >>>> Several solutions have been discussed in the previous proposal[1].
> >>>> After historical discussion, we have tried the method proposed by Jason[2],
> >>>> but some complex scenarios and challenges are difficult to deal with.
> >>>> We now return to the method suggested in [1].
> >>>>
> >>>> [1] https://lists.oasis-open.org/archives/virtio-dev/202305/msg00291.html
> >>>> [2] https://lore.kernel.org/all/20230628030506.2213-1-hengqi@linux.alibaba.com/
> >>>>
> >>>> Signed-off-by: Heng Qi <hengqi@linux.alibaba.com>
> >>>> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
> >>>> ---
> >>>> v6->v7:
> >>>> - FULLY_CSUM no longer depends on GUEST_CSUM.
> >>>> - The description related to DATA_VALID has nothing to do with
> >>>>     whether GUEST_CSUM is negotiated or not. @Jason
> >>>>
> >>>> v5->v6:
> >>>> - Rewrite and clarify patch description. @Michael
> >>>>
> >>>> v4->v5:
> >>>> - Remove the modification to the GUEST_CSUM. @Jason
> >>>> - The description of this feature has been reorganized for greater clarity. @Michael
> >>>>
> >>>> v3->v4:
> >>>> - Streamline some repetitive descriptions. @Jason
> >>>> - Add how features should work, when to be enabled, and overhead. @Jason @Michael
> >>>>
> >>>> v2->v3:
> >>>> - Add a section named "Driver Handles Fully Checksummed Packets"
> >>>>     and more descriptions. @Michael
> >>>>
> >>>> v1->v2:
> >>>> - Modify full checksum functionality as a configurable offload
> >>>>     that is initially turned off. @Jason
> >>>>
> >>>>    device-types/net/description.tex        | 68 ++++++++++++++++++++++++++++++---
> >>>>    device-types/net/device-conformance.tex |  1 +
> >>>>    2 files changed, 64 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/device-types/net/description.tex b/device-types/net/description.tex
> >>>> index a5210f2..d4a7329 100644
> >>>> --- a/device-types/net/description.tex
> >>>> +++ b/device-types/net/description.tex
> >>>> @@ -122,6 +122,9 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> >>>>        device with the same MAC address.
> >>>>
> >>>>    \item[VIRTIO_NET_F_SPEED_DUPLEX(63)] Device reports speed and duplex.
> >>>> +
> >>>> +\item[VIRTIO_NET_F_GUEST_FULLY_CSUM (64)] Device delivers fully checksummed packets
> >>>> +    to the driver and may validate the checksum.
> >>>>    \end{description}
> >>>>
> >>>>    \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device / Feature bits / Feature bit requirements}
> >>>> @@ -136,6 +139,7 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> >>>>    \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
> >>>>    \item[VIRTIO_NET_F_GUEST_USO4] Requires VIRTIO_NET_F_GUEST_CSUM.
> >>>>    \item[VIRTIO_NET_F_GUEST_USO6] Requires VIRTIO_NET_F_GUEST_CSUM.
> >>>> +\item[VIRTIO_NET_F_GUEST_FULLY_CSUM] Requires VIRTIO_NET_F_CTRL_GUEST_OFFLOADS.
> >>>>
> >>>>    \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
> >>>>    \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
> >>>> @@ -383,7 +387,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
> >>>>    the same system might not need checksumming at all, nor segmentation,
> >>>>    if both guests are amenable.}
> >>>>       The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
> >>>> -  checksummed packets can be received, and if it can do that then
> >>>> +  checksummed packets can be received (provided that the offload of
> >>>> +  VIRTIO_NET_F_GUEST_FULLY_CSUM is disabled), and if it can do that then
> >>>>      the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
> >>>>      VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_USO4
> >>>>      and VIRTIO_NET_F_GUEST_USO6 are the input equivalents of the features described above.
> >>>> @@ -398,6 +403,55 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
> >>>>    A truly minimal driver would only accept VIRTIO_NET_F_MAC and ignore
> >>>>    everything else.
> >>>>
> >>>> +\subsubsection{Device Delivers Fully Checksummed Packets}\label{sec:Device Types / Network Device / Device Initialization / Device Delivers Fully Checksummed Packets}
> >>>> +
> >>>> +If the feature VIRTIO_NET_F_GUEST_FULLY_CSUM is negotiated, the driver can
> >>>> +benefit from the device's ability to calculate and validate the checksum.
> >>>> +
> >>>> +If the feature VIRTIO_NET_F_GUEST_FULLY_CSUM is negotiated
> >>>> +and its offload is enabled, the device behaves as follows:
> >>>> +\begin{itemize}
> >>>> +  \item The device delivers a fully checksummed packet to the driver rather than a partially checksummed packet.
> >>>> +Partially checksummed packets come from TCP/UDP protocols \ref{devicenormative:Device Types / Network Device / Device Operation / Processing of Packets}.
> >>>> +  \item The device may validate the packet checksum before delivering it.
> >>>> +If the packet checksum has been verified, the VIRTIO_NET_HDR_F_DATA_VALID bit
> >>>> +in \field{flags} is set: in case of multiple encapsulated protocols, one
> >>>> +level of checksums has been validated.
> >>>> +  \item The device can not set the VIRTIO_NET_HDR_F_NEEDS_CSUM bit in \field{flags}.
> >>>> +\end{itemize}
> >>> I may miss something, but this seems the existing behaviour of the
> >>> devices regardless of FULLY_CSUM?
> >> Yes Jason.
> >> I think that we have the fix from the first patch, our requirements have
> >> been met.
> >> But back to the second patch itself, on some devices, turning off
> >> GUEST_CSUM offload
> >> may mean turning off tx checksum offload,
> > For tx did you mean guest? If yes, it should be HOST_CSUM I think.
> >
> > Or you actually mean the tx csum on the host?
> >
> >> but GUEST_FULLY_CSUM does not
> >> have this requirement.
> >> GRO_HW will also be turned off.
> >>
> >>>> +
> >>>> +The packet types that the device recognizes and can verify are independent of
> >>>> +whether the offload of VIRTIO_NET_F_GUEST_FULLY_CSUM is enabled.
> >>> Right, so I don't see why this became part of this patch.
> >>>
> >>>> +
> >>>> +Specific transport protocols that may have VIRTIO_NET_HDR_F_DATA_VALID set
> >>>> +in \field{flags} include TCP, UDP, GRE (Generic Routing Encapsulation),
> >>>> +and SCTP (Stream Control Transmission Protocol).
> >>>> +A fully checksummed packet's checksum field for each of the above protocols
> >>>> +is set to a calculated value that covers the transport header and payload
> >>>> +(TCP or UDP involves the additional pseudo header) of the packet.
> >>>> +
> >>>> +Delivering fully checksummed packets rather than partially
> >>>> +checksummed packets incurs additional overhead for the device.
> >>>> +The overhead varies from device to device, for example the overhead of
> >>>> +calculating and validating the packet checksum is a few microseconds
> >>>> +for some hardware devices.
> >>>> +
> >>>> +The feature VIRTIO_NET_F_GUEST_FULLY_CSUM has a corresponding offload \ref{sec:Device Types / Network Device / Device Operation / Control Virtqueue / Offloads State Configuration},
> >>>> +which when enabled means that the device delivers fully checksummed packets
> >>>> +to the driver and may validate the checksum.
> >>>> +\begin{note}
> >>> This is the functional equivalent to disable GUEST_CSUM. So I guess
> >>> what I'm missing is the difference between:
> >>>
> >>> 1) enable FULLY_CUSM
> >>>
> >>> and
> >>>
> >>> 2) disable GUEST_CSUM
> >>>
> >>> It looks to me 1) try to keep e.g HW_GRO (GUEST_TSOX) work? Note that
> >>> GRO produce partial csum usually, not sure it can work correctly.
> >> When GUEST_CSUM is turned off, GRO_HW offload will be turned off.
> > It's the GUEST_TSO4/6 that is actually turned off. So this needs more thought:
> >
> > 1) Reuse GUEST_TSO4/6 but have a mode to have full checksum
> > 2) New feature like GUEST_GRO, and claims that the packet boundary
> > needs to be reserved there
> >
> > My understanding is, 1) may end up being somewhat ambiguous since TSO
> > usually have partial checksum.
> >
> > What's your thought?
>
>
> I think the reason why GUEST_TSO4/6 etc. rely on GUEST_CSUM is that the
> overhead of segmenting and re-merging
> packets can be saved between two vms on the same host.
>
> Currently, disabling GUEST_CSUM happens when XDP is loading, and XDP
> does conflict with GUEST_TSO4/6.
> Therefore, in this scenario, when we disable GUEST_CSUM, GUEST_TSO4/6
> must also be disabled.
>
> In non-XDP scenarios, we don't need to disable GUEST_CSUM (and the
> current Linux virtio driver does not provide a way to turn it off),
> and they work well.
>
> Designing solutions such as (1)/(2) can indeed be one of the future
> directions, but there does not seem to be a strong motivation(at least
> in the Alibaba cloud scenario).
>
> Therefore, I plan to push only the first fix patch in the next version,
> and fix the bug in Linux that \field{flags} with the DATA_VALID set is
> ignored when XDP is loaded.
>
> Thanks a lot!

That's fine.

Thanks

>
> >
> >> Meaning the hardware will not merge the packets.
> > Right.
> >
> >> IIRC, after the packets have been merged, the packets should have full
> >> checksums and have been verified:
> >>
> >>          if (!(features & NETIF_F_RXCSUM)) {
> >>                   /* NETIF_F_GRO_HW implies doing RXCSUM since every packet
> >>                    * successfully merged by hardware must also have the
> >>                    * checksum verified by hardware.  If the user does not
> >>                    * want to enable RXCSUM, logically, we should disable
> >> GRO_HW.
> >>                    */
> >>                   if (features & NETIF_F_GRO_HW) {
> >>                           netdev_dbg(dev, "Dropping NETIF_F_GRO_HW since
> >> no RXCSUM feature.\n");
> >>                           features &= ~NETIF_F_GRO_HW;
> >>                   }
> >>           }
> >>
> > Thanks
>



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]