OASIS Mailing List Archives

 



virtio-comment message



Subject: Re: [PATCH v9] virtio-net: support inner header hash


On Thu, Mar 2, 2023 at 3:42 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Thu, Mar 02, 2023 at 10:57:12AM +0800, Jason Wang wrote:
> > On Wed, Mar 1, 2023 at 6:36 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Wed, Mar 01, 2023 at 10:36:41AM +0800, Jason Wang wrote:
> > > > On Tue, Feb 28, 2023 at 7:05 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >
> > > > > On Tue, Feb 28, 2023 at 11:04:26AM +0800, Jason Wang wrote:
> > > > > > On Tue, Feb 28, 2023 at 1:49 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > >
> > > > > > > On Mon, Feb 27, 2023 at 04:35:09PM +0800, Jason Wang wrote:
> > > > > > > > On Mon, Feb 27, 2023 at 3:39 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Feb 27, 2023 at 12:07:17PM +0800, Jason Wang wrote:
> > > > > > > > > > Btw, this kind of 1:1 hash feature does not seem scalable or flexible.
> > > > > > > > > > It requires an endless extension of bits/fields. Modern NICs allow the
> > > > > > > > > > user to customize the hash calculation; for virtio-net we can allow
> > > > > > > > > > an eBPF program to classify the packets. It seems more flexible
> > > > > > > > > > and scalable, and there's almost no maintenance burden in the spec (only
> > > > > > > > > > bytecode is required, with no need for any fancy features/interactions
> > > > > > > > > > like maps), it is easy to migrate, etc.
> > > > > > > > > >
> > > > > > > > > > Prototyping is also easy: tun/tap has had an eBPF classifier for years.
> > > > > > > > > >
> > > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > Yea BPF offload would be great to have. We have been discussing it for
> > > > > > > > > years though - security issues keep blocking it. *Maybe* it's finally
> > > > > > > > > going to be there but I'm not going to block this work waiting for BPF
> > > > > > > > > offload.  And easily migrated is what BPF is not.
> > > > > > > >
> > > > > > > > Just to make sure we're on the same page: I meant to find a way to
> > > > > > > > allow the driver/user to fully customize what it wants to
> > > > > > > > hash/classify. Similar technologies based on private solutions
> > > > > > > > have been used by some vendors, which allow the user to customize the
> > > > > > > > classifier[1]
> > > > > > > >
> > > > > > > > eBPF looks like a good open-source candidate solution for this (there
> > > > > > > > could be others). But there are many kinds of eBPF programs that
> > > > > > > > could be offloaded. One famous one is XDP, which requires many features
> > > > > > > > other than the bytecode/VM, like map access and tail calls. Starting from
> > > > > > > > such a complicated type is hard. Instead, we can start from a simple
> > > > > > > > type, that is the eBPF classifier. All it needs is to pass the
> > > > > > > > bytecode to the device, the device can choose to run it or compile it
> > > > > > > > to what it can understand for classifying. We don't need maps, tail
> > > > > > > > calls and other features.
> > > > > > >
> > > > > > > Until people start asking exactly for maps because they want
> > > > > > > state for their classifier?
> > > > > >
> > > > > > Yes, but let's compare the eBPF without maps with the static feature
> > > > > > proposed here. It is much more scalable and flexible.
> > > > >
> > > > > I looked for some examples of RSS using BPF and only found this:
> > > > > https://github.com/Netronome/bpf-samples/blob/master/programmable_rss/rss_user.c
> > > > > seems to use maps.
> > > >
> > > > Yes and this is also the way we emulate RSS with TUN/TAP via steering
> > > > eBPF support for TUN/TAP. The reason is that it needs to emulate not
> > > > only the hash but also the indirection. If we only replace the hash
> > > > function with the eBPF program but reuse the RSS indirection table, we
> > > > don't need maps.
> > >
> > > How? Add a special helper?
> >
> > We can let the eBPF program return the hash:
> >
> > [eBPF hashing] -> hash value -> [indirection table lookup]
> >
> > Note that if we don't consider future full eBPF offloading, we can
> > start with classical BPF.
> >
> > Thanks
>
> So again this is a custom thing not a standard use of BPF.
> Normally value returned is pass/drop.

AFAIK there's no standard here. The semantics of the return value are
determined by the context of the (e)BPF program.

The kernel already uses eBPF programs for hashing and classifying, in
various eBPF program types other than XDP/socket filter
(pass/drop).
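
To make the point about context concrete, here is a minimal sketch in plain
C (not real eBPF, and not proposed spec text). The same program return value
is read as pass/drop in a filter-style context and as a hash feeding the
indirection table in a steering context. All names here (prog_fn, fnv1a_prog,
filter_context_pass, steering_context_queue) are hypothetical, FNV-1a is only
a stand-in for whatever hash the program would compute, and the power-of-two
table length is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a loaded (e)BPF program: it sees the packet
 * bytes and returns a 32-bit value. What that value means is decided by
 * the caller (the "context"), not by the program itself. */
typedef uint32_t (*prog_fn)(const uint8_t *pkt, size_t len);

/* Illustrative program body: FNV-1a over the packet bytes.
 * Assumption for this sketch only; a real classifier would hash
 * selected header fields instead. */
static uint32_t fnv1a_prog(const uint8_t *pkt, size_t len)
{
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= pkt[i];
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Filter-style context: non-zero means pass, zero means drop. */
static int filter_context_pass(prog_fn prog, const uint8_t *pkt, size_t len)
{
    return prog(pkt, len) != 0;
}

/* Steering context: the return value is treated as the hash and used to
 * index the existing indirection table, so the program needs no maps.
 * This sketch assumes table_len is a power of two. */
static uint16_t steering_context_queue(prog_fn prog,
                                       const uint8_t *pkt, size_t len,
                                       const uint16_t *table,
                                       size_t table_len)
{
    assert(table_len > 0 && (table_len & (table_len - 1)) == 0);
    uint32_t hash = prog(pkt, len);
    return table[hash & (table_len - 1)];
}
```

Calling steering_context_queue() twice with the same packet and table gives
the same queue, which is all the steering path needs from the program.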

Thanks

>
> > >
> > > > >
> > > > >
> > > > > > > And it makes sense - if you want
> > > > > > > e.g. load balancing you need stats which needs maps.
> > > > > >
> > > > > > Yes, but we know it's possible to have that (through the XDP offload).
> > > > >
> > > > > Not without a lot more work to make xdp offload happen.
> > > > >
> > > >
> > > > Yes, that's why a simple eBPF RSS hashing program looks much easier.
> > > >
> > > > Thanks
> > >
> > > Notice that at this point this is no longer a generic BPF - you
> > > are using a special helper. For tunnels I would imagine two tables
> > > could easily turn out to be useful. Then what? Another table?
> > > If yes then I can't say I like where this is going ...
> > >
> > > > > > This is impossible with the approach proposed here.
> > > > > >
> > > > > > >
> > > > > > > > We don't need to worry about the security
> > > > > > > > because of its simplicity: the eBPF program is only in charge of doing
> > > > > > > > classification, no other interactions with the driver and packet
> > > > > > > > modification is prohibited. The feature is limited only to the
> > > > > > > > VM/bytecode abstraction itself.
> > > > > > > >
> > > > > > > > What's more, it's a good first step to achieve full eBPF offloading in
> > > > > > > > the future.
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > [1] https://www.intel.com/content/www/us/en/architecture-and-technology/ethernet/dynamic-device-personalization-brief.html
> > > > > > >
> > > > > > > Dave seems to have nacked this approach, no?
> > > > > >
> > > > > > I may be missing something, but looking at the kernel commits, there
> > > > > > are a few patches to support that:
> > > > > >
> > > > > > E.g
> > > > > >
> > > > > > commit c7648810961682b9388be2dd041df06915647445
> > > > > > Author: Tony Nguyen <anthony.l.nguyen@intel.com>
> > > > > > Date:   Mon Sep 9 06:47:44 2019 -0700
> > > > > >
> > > > > >     ice: Implement Dynamic Device Personalization (DDP) download
> > > > > >
> > > > > > And it has been used by DPDK drivers.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > MST
> > > > > > > > >
> > > > > > >
> > > > >
> > >
>


