OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] [PATCH] virtio-net: define UDP tunnel offload feature


On Tue, 2022-05-31 at 14:34 +0800, Jason Wang wrote:
> On Wed, May 18, 2022 at 6:16 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > 
> > The VIRTIO_NET_HDR_GSO_UDP_TUNNEL is a gso_type flags allowing
> > GSO over UDP tunnel. It can be negotiated on both the host and
> > guest side.
> > 
> > UDP tunnel usage is ubiquitous in container deployment, and the
> > ability to offload UDP encapsulated GSO traffic impacts greatly
> > the performances and the CPU utilization of such use-case
> > 
> > To keep the design relatively simple, this copes only with tunneling
> > scenarios including an inner ethernet MAC header - e.g. inner network
> > on top of the UDP encapsulation is not allowed.
> > 
> > Other than that the new flag is designed to be independent from the
> > specific UDP tunnel encapsulation - geneve, vxlan, etc.
> > 
> > The outer UDP header may carry a second checksum, which can be
> > offloaded independently from the inner one. We don't need to specify
> > the location of such checksum, as both the host and the guest can infer
> > its location the same way they currently do for the transport header for
> > plain GSO packets.
> 
> I'm not sure I get this, the transport header is set based on
> csum_start currently.Â

As a reference I looked at the Linux implementation and I missed that
the skb_partial_csum_set() in virtio_net_hdr_to_skb() actually sets the
transport offset, sorry.

Still is possible - and quite cheap - to infer the transport header
offset from the network header.

> More below.


> 
> > 
> > One constraint addressed here is that the virtio side (either device or
> > driver) receiving an UDP tunneled GSO packet must be able to reconstruct
> > completely the inner and outer headers offset - to allow for later GSO.
> > 
> > To accommodate such need the hdr_len field is [re-]used: when
> > the UDP tunnel feature is negotiated and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> > is set, such field is used to identify the inner network header.
> > 
> > All the other inner/outer headers location can be computed from the
> > above info.
> > 
> > Note that when the VIRTIO_NET_HDR_GSO_UDP_TUNNEL flag is not set,
> > the current semantic for the hdr_hen field does not change.
> > 
> > Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> > ---
> >  content.tex | 110 +++++++++++++++++++++++++++++++++++++++++++++-------
> >  1 file changed, 96 insertions(+), 14 deletions(-)
> > 
> > diff --git a/content.tex b/content.tex
> > index c6f116c..8825d81 100644
> > --- a/content.tex
> > +++ b/content.tex
> > @@ -3092,6 +3092,16 @@ \subsection{Feature bits}\label{sec:Device Types / Network Device / Feature bits
> >  \item[VIRTIO_NET_F_CTRL_MAC_ADDR(23)] Set MAC address through control
> >      channel.
> > 
> > +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL (54)] Driver can receive GSO packets
> > +  carried by an UDP tunnel. The tunnel encapsulation is constrained to have
> > +  constant size and carring an inner ethernet header. e.g. alike vxlan and
> 
> I think you mean "carrying" here.
> 
> > +  geneve defaults.
> > +
> > +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL (55)] Device can receive GSO packets
> > +  carried by an UDP tunnel. The tunnel encapsulation is constrained to have
> > +  constant size and carring an inner ethernet header. e.g. alike vxlan and
> 
> And here.
> 
> > +  geneve defaults.
> > +
> >  \item[VIRTIO_NET_F_HOST_USO (56)] Device can receive USO packets. Unlike UFO
> >   (fragmenting the packet) the USO splits large UDP packet
> >   to several segments when each of these smaller packets has UDP header.
> > @@ -3125,12 +3135,14 @@ \subsubsection{Feature bit requirements}\label{sec:Device Types / Network Device
> >  \item[VIRTIO_NET_F_GUEST_TSO6] Requires VIRTIO_NET_F_GUEST_CSUM.
> >  \item[VIRTIO_NET_F_GUEST_ECN] Requires VIRTIO_NET_F_GUEST_TSO4 or VIRTIO_NET_F_GUEST_TSO6.
> >  \item[VIRTIO_NET_F_GUEST_UFO] Requires VIRTIO_NET_F_GUEST_CSUM.
> > +\item[VIRTIO_NET_F_GUEST_UDP_TUNNEL] Requires VIRTIO_NET_F_GUEST_TSO4 or VIRTIO_NET_F_GUEST_TSO6
> > 
> >  \item[VIRTIO_NET_F_HOST_TSO4] Requires VIRTIO_NET_F_CSUM.
> >  \item[VIRTIO_NET_F_HOST_TSO6] Requires VIRTIO_NET_F_CSUM.
> >  \item[VIRTIO_NET_F_HOST_ECN] Requires VIRTIO_NET_F_HOST_TSO4 or VIRTIO_NET_F_HOST_TSO6.
> >  \item[VIRTIO_NET_F_HOST_UFO] Requires VIRTIO_NET_F_CSUM.
> >  \item[VIRTIO_NET_F_HOST_USO] Requires VIRTIO_NET_F_CSUM.
> > +\item[VIRTIO_NET_F_HOST_UDP_TUNNEL] Requires VIRTIO_NET_F_HOST_TSO4, VIRTIO_NET_F_HOST_TSO6 or VIRTIO_NET_F_HOST_USO
> > 
> >  \item[VIRTIO_NET_F_CTRL_RX] Requires VIRTIO_NET_F_CTRL_VQ.
> >  \item[VIRTIO_NET_F_CTRL_VLAN] Requires VIRTIO_NET_F_CTRL_VQ.
> > @@ -3337,6 +3349,9 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
> >    segmentation/fragmentation offload by negotiating the VIRTIO_NET_F_HOST_TSO4 (IPv4
> >    TCP), VIRTIO_NET_F_HOST_TSO6 (IPv6 TCP), VIRTIO_NET_F_HOST_UFO
> >    (UDP fragmentation) and VIRTIO_NET_F_HOST_USO (UDP segmentation) features.
> > +  Additionally, it can negotiate the VIRTIO_NET_F_HOST_UDP_TUNNEL feature to
> > +  use TCP segmentation or UDP segmentation on top of UDP encapsulation,
> > +  respecting the other negotiated features.
> > 
> >  \item The converse features are also available: a driver can save
> >    the virtual device some work by negotiating these features.\note{For example, a network packet transported between two guests on
> > @@ -3345,8 +3360,8 @@ \subsection{Device Initialization}\label{sec:Device Types / Network Device / Dev
> >     The VIRTIO_NET_F_GUEST_CSUM feature indicates that partially
> >    checksummed packets can be received, and if it can do that then
> >    the VIRTIO_NET_F_GUEST_TSO4, VIRTIO_NET_F_GUEST_TSO6,
> > -  VIRTIO_NET_F_GUEST_UFO and VIRTIO_NET_F_GUEST_ECN are the input
> > -  equivalents of the features described above.
> > +  VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_GUEST_ECN and VIRTIO_NET_F_GUEST_UDP_TUNNEL
> > +  are the input equivalents of the features described above.
> >    See \ref{sec:Device Types / Network Device / Device Operation /
> >  Setting Up Receive Buffers}~\nameref{sec:Device Types / Network
> >  Device / Device Operation / Setting Up Receive Buffers} and
> > @@ -3370,12 +3385,14 @@ \subsection{Device Operation}\label{sec:Device Types / Network Device / Device O
> >  #define VIRTIO_NET_HDR_F_NEEDS_CSUM    1
> >  #define VIRTIO_NET_HDR_F_DATA_VALID    2
> >  #define VIRTIO_NET_HDR_F_RSC_INFO      4
> > +#define VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM 8
> >          u8 flags;
> >  #define VIRTIO_NET_HDR_GSO_NONE        0
> >  #define VIRTIO_NET_HDR_GSO_TCPV4       1
> >  #define VIRTIO_NET_HDR_GSO_UDP         3
> >  #define VIRTIO_NET_HDR_GSO_TCPV6       4
> >  #define VIRTIO_NET_HDR_GSO_UDP_L4      5
> > +#define VIRTIO_NET_HDR_GSO_UDP_TUNNEL 0x40
> >  #define VIRTIO_NET_HDR_GSO_ECN      0x80
> >          u8 gso_type;
> >          le16 hdr_len;
> > @@ -3443,6 +3460,8 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
> >  followed by the TCP header (with the TCP checksum field 16 bytes
> >  into that header). \field{csum_start} will be 14+20 = 34 (the TCP
> >  checksum includes the header), and \field{csum_offset} will be 16.
> > +If the given packets has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit set,
> > +the above checksum fields refer to the inner header checksum.
> >  \end{note}
> 
> I think we need to explain how the outer checksum is expected to work.
> It looks to me it requires the device to examine/probe the outer
> packet header.Â

I will do in the next revision.

> If it's true, I wonder what happen if the device fail
> to do such probing.

In that case the packet should be discarded - the same way the device
discards packets when it fails to access the transport offset for the
plain GSO case.
> 
> > 
> >  \item If the driver negotiated
> > @@ -3455,11 +3474,16 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
> >    into smaller packets. The other gso fields are set:
> > 
> >    \begin{itemize}
> > -  \item If the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated,
> > +  \item If the VIRTIO_NET_F_HOST_UDP_TUNNEL feature has been negotiated and
> > +    the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is set,
> > +    \field{hdr_len} indicates the location of the inner network header.
> 
> It's probably better to explain why a network header instead of
> transport header is needed here. (Anyhow we're doing segmentation at
> L4 instead of L3).

yes, well do in the next revision.

> 
> > +    Otherwise, if the VIRTIO_NET_F_GUEST_HDRLEN feature has been negotiated
> > +    and the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is not set,
> 
> I wonder if this just suppress the semantic of GUEST_HDR_LEN.Â

It should not. This specification gives VIRTIO_NET_F_HOST_UDP_TUNNEL/
VIRTIO_NET_HDR_GSO_UDP_TUNNEL an 'higher priority' over GUEST_HDR_LEN.
If the VIRTIO_NET_F_HOST_UDP_TUNNEL feature is not negotiated or the
VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit is not set, hdr_len and GUEST_HDR_LEN
retain their current semantic.

> If yes,
> is this better to have dedicated inner hdr len?
> 
> >      \field{hdr_len} indicates the header length that needs to be replicated
> >      for each packet. It's the number of bytes from the beginning of the packet
> >      to the beginning of the transport payload.
> > -    Otherwise, if the VIRTIO_NET_F_GUEST_HDRLEN feature has not been negotiated,
> > +    Finally, if the VIRTIO_NET_F_GUEST_HDRLEN and VIRTIO_NET_F_HOST_UDP_TUNNEL
> > +    the features has not been negotiated,
> >      \field{hdr_len} is a hint to the device as to how much of the header
> >      needs to be kept to copy into each packet, usually set to the
> >      length of the headers, including the transport header\footnote{Due to various bugs in implementations, this field is not useful
> > @@ -3477,6 +3501,13 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
> >      the VIRTIO_NET_HDR_GSO_ECN bit in \field{gso_type}
> >      indicates that the TCP packet has the ECN bit set\footnote{This case is not handled by some older hardware, so is called out
> >  specifically in the protocol.}.
> > +
> > +  \item If the driver negotiated the VIRTIO_NET_F_HOST_UDP_TUNNEL feature,
> > +    \field{gso_type} has the VIRTIO_NET_HDR_GSO_UDP_TUNNEL bit set,
> > +    \field{flags} has the VIRTIO_NET_HDR_F_UDP_TUNNEL_CSUM set, the outer
> > +    UDP checksum field carries the checksum for the UDP pseudo header,
> > +    the complete UDP checksum can be computed in a similar way to the
> > +    inner TCP.
> 
> This looks like a partial offload of the outer checksum, if yes, we
> probably need a better name of the flag (maybe
> UDP_TUNNEL_PARTIAL_CSUM)?Â

Still looking at the linux implementation, if I read the code
correctly, it looks like VIRTIO_NET_HDR_F_NEEDS_CSUM expect to find the
L4 pseudo header csum in place (e.g. in tcp_header->csum I mean)?!? if
so this would be exactly the same.

> This is because VIRTIO_NET_HDR_F_CSUM is
> used for complete offloading.
> 
> >     \end{itemize}
> > 
> >  \item \field{num_buffers} is set to zero.  This field is unused on transmitted packets.
> > @@ -3520,6 +3551,11 @@ \subsubsection{Packet Transmission}\label{sec:Device Types / Network Device / De
> >  driver MUST set the VIRTIO_NET_HDR_GSO_ECN bit in
> >  \field{gso_type}.
> > 
> > +The driver MUST NOT sent to the device TCP or UDP packets over UDP tunnel
> > +requiring segmentation offload, unless the VIRTIO_NET_F_HOST_UDP_TUNNEL is
> > +negotiated, in which case the driver MUST set the VIRTIO_NET_HDR_GSO_UDP_TUNNEL
> > +bit in the \field{gso_type}.
> 
> Do we need to clarify that inner UFO is forbidden even with UDP_TUNNEL?

Yes, UFO is forbitten on top of UDP_TUNNEL (and USO is/will be
allowed). I'll try to clarify that in the next revision.


Thanks!

Paolo



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]