Subject: RE: [virtio-comment] Re: [PATCH v6 2/5] virtio-net: Add flow filter capabilities read commands


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Friday, November 24, 2023 3:58 PM
> 
> On Fri, Nov 24, 2023 at 06:27:46AM +0000, Parav Pandit wrote:
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Friday, November 24, 2023 11:37 AM
> > >
> > > On Fri, Nov 24, 2023 at 05:53:02AM +0000, Parav Pandit wrote:
> > > >
> > > > > From: Michael S. Tsirkin <mst@redhat.com>
> > > > > Sent: Friday, November 24, 2023 11:03 AM
> > > > >
> > > > > On Fri, Nov 24, 2023 at 12:02:23PM +0800, Jason Wang wrote:
> > > > > > > > I won't be able to absorb this comment about the DMA
> > > > > > > > interface. If I discuss further, I will repeat the whole
> > > > > > > > document [1] and I will avoid that now.
> > > > > > > >
> > > > > > > > [1]
> > > > > > > > https://docs.google.com/document/d/1Iyn-l3Nm0yls3pZaul4lZiVj8x1s73Ed6rOsmn6LfXc/edit#heading=h.qexbtyc2jpwr
> > > > > > >
> > > > > > >
> > > > > > > I really worry about how provisioning will work. And I do
> > > > > > > not at all cherish replicating all of these query capability
> > > > > > > commands for provisioning.
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > There's nothing that prevents the config space from being
> > > > > > implemented in a way other than registers.
> > > > >
> > > > > Care to do it finally? Let's see if what Parav is worrying about
> > > > > is then addressed.
> > > >
> > > > The whole concept that everything must be in one giant config
> > > > space is just simply bad.
> > > > It does not exist in the virtio spec today, either.
> > >
> > > But it does; this is what transports are doing.
> > I don't understand what "it" is.
> 
> "it" here is passing device init time configuration to drivers.
> 
> 
> > Transport like pci transport bits.
> 
> in a simple, logical and functional manner.
> 
> > CVQ is a functional object that helps to arrange the bits in a logical,
> > functional manner instead of trying to place them in a bit array.
> 
> In a device specific way.
> 
Yes, and in several of these examples they are device-specific structures, organized by the functionality they extend.
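
For illustration only, a device-specific capability structure read over the
cvq might look roughly like this (a sketch in plain C; the field names are
hypothetical, not the ones from this patch series):

    #include <stdint.h>

    /* Hypothetical flow filter capability reply carried over cvq.
     * Field names are illustrative only; multi-byte fields are
     * little-endian on the wire per virtio convention. */
    struct flow_filter_caps {
            uint32_t max_flow_filters;      /* filters the device supports */
            uint32_t max_per_vq_filters;    /* filters per receive vq */
            uint16_t supported_match_types; /* bitmap: eth/ipv4/ipv6/... */
            uint16_t reserved;              /* pad for future additions */
    };

The point being that the structure is grouped by function and fetched on
demand, rather than occupying a fixed offset in config space.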

> > >
> > > > One can see that what is presented in the commands cannot be
> > > > placed in config space at a dynamic location.
> > > > Same was the case with statistics too.
> > > > Same was the case with VQ coalescing knobs.
> > > > Same with hash knobs.
> > > > With flow filters,
> > > > With rss contexts,
> > > > With rtc,
> > > > With new queue creation APIs.
> > > >
> > > > The endless list continues...
> > > >
> > > > And reserving bits (other than pad bytes) for future additions in
> > > > config space is an equally inelegant design.
> > > > Bits will get spread out at random locations, making things even
> > > > harder to maintain.
> > > >
> > > > The device is no longer a simple mac_addr + N queues device with
> > > > some static rss config.
> > > >
> > > > With all the modern work, every capability query and runtime
> > > > configuration is done over the cvq interface today.
> > > > A single get/set channel from driver to device, all using existing
> > > > resources.
> > > >
> > > > A real hw device also does not need to refer to two places (config
> > > > space and cvq) when serving cvq commands.
> > > > Oh, the list of advantages just continues with what the 1.3 spec has
> > > > done.
> > >
> > > I don't see the problem, sorry. We've been doing this for many years
> > > with many ways to access config space.  It scaled well.
> > >
> > I don't see it, sorry.
> > The configuration of the device is done using cvq.
> 
> runtime configuration, absolutely. We found out writeable config space fields
> are painful in a variety of ways, the main one being the device can't report
> errors. So we generally avoid writeable config space.  But read-only - no good
> reason to avoid them for init-time things.
> 
The good reason, as explained in the doc, is to not place them based on a
read-only vs. read-write policy, but instead based on the access pattern:
whether a field is needed early during driver initialization or can be read
at a later point.

Placing fields by access pattern has given more flexibility to
implementations across sw- and hw-based devices without any loss.

From the device side, there is a consistent view of get/set via a single
channel (the cvq) when dealing with the guest driver.

Two non-Nvidia examples of such devices are the Microsoft MANA NIC and the Amazon ENA NIC.
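
For reference, every cvq command in the spec already shares one layout: a
small driver-written header and payload followed by a device-written ack,
which is what gives that single channel uniform error reporting (something a
config space write cannot do). A minimal sketch in C:

    #include <stdint.h>

    #define VIRTIO_NET_OK  0
    #define VIRTIO_NET_ERR 1

    /* Common layout of a control virtqueue command: the driver writes
     * the header plus command-specific data, and the device writes one
     * ack byte, so every get/set can fail visibly. */
    struct virtio_net_ctrl_hdr {
            uint8_t class; /* command class, e.g. VIRTIO_NET_CTRL_NOTF_COAL */
            uint8_t cmd;   /* command within that class */
    };
    /* ... command-specific payload (driver-written) ... */
    /* ... uint8_t ack: VIRTIO_NET_OK or VIRTIO_NET_ERR (device-written) */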

> 
> > >
> > > Then hardware offload guys come and say that in the PCI spec the current
> > > transport is forcing the use of on-device memory, and they want to build
> > > cheap offload PCI-based devices. Fine, let's build a transport variant that
> > > does not force this.
> >
> > All new capabilities and control are over the cvq. What was baked in until
> > 1.2 is sort of legacy.
> 
> Just repeating this will not make everyone agree.
> 

I see that Satananda is also fine with Marvell using CVQ.

> > > And we want optional compatibility, so let's also find a way to do that.
> > > This makes much more sense than forcing transport-specific issues on
> > > everyone.
> > >
> > Trying to attribute this to some transport-specific issue is just not
> > aligned with the spec as written today.
> >
> > > To add to that, what did not historically scale well is transport-specific
> > > registers.
> > Then you should have put the VQ notification coalescing functionality in a
> > horrible virtio_net_config register, like how queue reset is in the common
> > config space.
> 
> No, because none of the 4 commands (VIRTIO_NET_CTRL_NOTF_COAL_TX_SET,
> VIRTIO_NET_CTRL_NOTF_COAL_RX_SET, VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET,
> VIRTIO_NET_CTRL_NOTF_COAL_VQ_GET) is initialization time.
> They are not capabilities: they control and inspect device runtime state.
> 
> If we had VIRTIO_NET_CTRL_NOTF_COAL_CAP_GET, that would belong in
> config space.
> 
I understood you.
As we propose above, reading those caps is not a must during the
initialization phase.
Hence, the cvq delivers better scale (as with stats) and a consistent get/set.
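
For context, the per-VQ coalescing commands named above carry a small
structure along the lines of the following (sketched after the 1.3 spec
structures; fields are little-endian on the wire, written here as plain C):

    #include <stdint.h>

    /* Parameters carried by VIRTIO_NET_CTRL_NOTF_COAL_VQ_SET/GET:
     * runtime knobs, read and written over the cvq rather than from a
     * fixed config space register. */
    struct virtio_net_ctrl_coal {
            uint32_t max_packets; /* coalesce up to this many packets */
            uint32_t max_usecs;   /* or until this many microseconds */
    };

    struct virtio_net_ctrl_coal_vq {
            uint16_t vqn;         /* virtqueue to configure or query */
            uint16_t reserved;
            struct virtio_net_ctrl_coal coal;
    };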

> > Thank God that mistake was not made and many similar mistakes were
> > avoided.
> 
> Let's not get emotional here please.
> 
> > > That was a bad design, with all transports doing exactly the same
> > > thing in slightly different ways.  And what you are advocating for
> > > with CVQ is exactly replicating the bad design, not the good one.
> >
> > The CVQ design is what extends the current spec in a good way. It is
> > followed by many other non-Nvidia nics listed in the doc for reference.
> 
> I don't know what you are referring to here.  Register maps are all over the
> place. It's a simple, standard, well-understood practice.
> 
I provided various examples of virtio and non-Nvidia devices which prefer to
avoid registers.
Satananda from Marvell also expressed their view.

> We have some niche uses due to the need for extreme VF# counts; this forces
> DMA for them, which is not a good reason to force it on everyone.
PFs benefit equally from this. Some of our hw devices have a large PF count.

> The sooner you just stop forcing this down everyone's throat, the faster we
> can make progress on things that matter.

CVQ is an established, good configuration channel, so we all want to utilize
it instead of being forced to grow the config space, which is very hard to
maintain and implement.

I will address your comments in v7 regarding honoring the max limits in the
driver and device.

