OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v3 0/3] transport-pci: Introduce legacy registers access using AQ


On Mon, Jun 26, 2023 at 11:52âAM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Sunday, June 25, 2023 11:32 PM
>
> > > Hypervisor flow without involving guest; first sanity round to figure out things
> > can work:
> >
> > Why is this sanity round a must?
> >
> Because there is no point of failing it later when actual rw occurs.

Even with admin virtqueue, the hypervisor should be ready for any
admin command failures somehow.

And with modern devices, the reset is not a must, the hypervisor can
just read device_features to check if _MAC/_HDR is there. And what's
more important, the hypervisor can know if the feature negotiation
succeeds or not. With legacy, there's no such way.

>
> > > 1. reset the device
> > > 2. set ACK and DRIVER bit
> > > 3. read features and make sure _MAC, _HDR are supported.
> >
> > Similar step is required even for AQ since the hypervisor needs to know if legacy
> > commands are supported there.
> >
> No. similar steps are not required on AQ because AQ support discovery is done once per owner device. Not for each member device.

This seems a topic beyond legacy. If both modern and transitional
devices are supported, a per device probing or provisioning is still
required. And if provisioning is supported, there's no need to probe
MAC/HDR.

>
> > > 4. if not abort.
> > > 5. reset the device 2nd time so that guest can have right view of things.
> > >
> > > On guest access of legacy area:
> > > 6. reset the device
> > > 7. set ACK and DRIVER bit on guest request 8. read features and make
> > > sure _MAC, _HDR are offered
> > > 8.1 Hope for the best that on two completely unrelated device reset on #1
> > and #6, the device offers exact same feature bits.
> >
> > Explained and discussed in another thread, this is a hard requirement otherwise
> > a lot of things would break.
> >
> That limitation is for the past or current.

You need at least a new feature bit.

> As we move forward there is no need to continue with such limitation.
> There may be more work needed to overcome that limitation, but we donât want to build on top of that limitation.

Let's open a dedicated thread for this if necessary.

>
> > > And mention this in the spec 1.x for rest of the future.
> > >
> > > We shouldn't be adding such a limitation to the spec.
> > > We have seen this with mlx5 device where 5 years ago one thought this
> > cannot happen.
> > > And now with modern use case now features changes across two resets for
> > mlx5 device.
> > > Baking such limitation in current spec for past 1.x is sub-optimal.
> > >
> > > ok, so to avoid baking such reset flow nasty things in spec, lets avoid flow of
> > #1 to #5 in hypervisor.
> > > So, provision the device to support these new feature bits via AQ.
> > > So AQ is required for feature provisioning anyway.
> > >
> > > So, your point #4 is required in both methods and so it scores same as this
> > proposal.
> > > Hence feature bit is not of an advantage.
> > >
> > > With _MAC now we need writable mac on 1.x config space.
> > > for 1.x writable mac has two requirements.
> > > 1. It must be atomic (not for 0.9.5, but must for 1.x like CVQ)
> >
> > For LEGACY_MAC, we don't need to be atomic. We manage to survive for
> > several years.
> >
> It was cited as issue in past discussed when you and Michael both asked to consider tvq.
> What changed now?

Nothing changed, what I meant is: with admin virtqueue it could be
atomic or not.

>
> The issue is that device hw needs implement special RW registers now for the mac for legacy.
> This is the unwanted burden not needed, where AQ is better.

How can you solve the atomic issue in this case? The issue is the
driver where legacy guests write to device configuration space byte by
byte. Modern guests will use cvq.

>
> > > 2. should be synchronous to know success/failure to know when it is
> > > effective
> >
> > It's as simple as allowing mac in the device configuration space writable.
> >
> Not simple for hw at scale.

I don't understand, it uses modern device configuration space. How
does it matter with the scale of admin virtqueue?

> AQ is superior that communicates rarely used information.
>
> > >
> > > Both are present on the CVQ, so yet add another duplicate 1.x scheme that
> > fulfill above requirements.
> >
> > I'm not sure how to define duplication, the legacy device configuration space
> > access via admin virtqueue is kind of a duplication in this sense as well.
> >
> It only exists on AQ for legacy purpose.
> The feature bit defined is not for legacy, hence its duplicated.

It's for legacy mediation and what's more, it allows mac to be
configured without cvq.

>
> > >
> > > ok. So may be let's do AQ command for just mac setting.
> > >
> > > This scores down now for two reasons.
> > > a. Duplicate of existing 1.x feature
> > > b. requires AQ.
> > >
> > > So, from RW mac we moved from 1.x cfg space to AQ. This mediation for 1.x is
> > not good.
> >
> > Modern drivers just won't negotiate LEGACY_MAC, so there's no mediation at
> > all.
> As feature, there is no need to prefix it with LEGACY because itâs a new feature.
>
> >
> > > And now it's not trivial either as it is not just simple *p_mac = X.
> > > Hence, #1 of non-trivial starts to looks less appealing.
> > >
> > > Your point #3 about vendor to implement legacy behavior.
> > > If vendor needs to support legacy, vendor anyway needs to implement _HDR
> > anyway in data path.
> > > and above MAC change, and feature provisioning.
> >
> > I think there are some misunderstandings here:
> >
> > For legacy behaviour, I mainly meant the behaviour in the corner cases that
> > can't be documented (easily) by the spec since it is defined by the code (e.g
> > Linux drivers and Qemu). LEGACY_MAC/HDR doesn't belong to those since the
> > behaviour is well understood and defined in the spec.
> >
> > It's very hard to audit each corner case and it would easily result in endless
> > errata of the hardware. Modern devices have clearly defined behaviour and it's
> > easier to implement. It's never too late to fix spec if we found a bug.
> >
> I donât see how that is avoided when one side has undefined corner case that is translated to well defined behavior.

Let me give you an example. The legacy device behaviour is basically
what Qemu codes did, by reusing the existing Qemu virtio device model
codes, most of the issues can be avoided.

> It only means that piece will be in endless errata in sw.

Errata in software is much more simpler/cheaper than hardware, isn't it?

> It still scores same or less.
>
> > >
> > > For legacy there is no extra/special documentation.
> > > All the behavior listed in Legacy interfaces section for configuration register
> > present applies here.
> >
> > That's not true, the legacy section is the best effort since it's too late to define
> > (codes came before the spec).
> >
> One can added the missing documentation...

It will be too late for the legacy, and it would be too hard since it
basically tries to describe the exact code behaviour. And there could
be a lot of very hard questions that need to be answered. E.g what
happens if a driver tries to use both modern and legacy interfaces?

>
>
> > >
> > > So, I don't see mac support is trivial by any means compared to proposed
> > scheme for 1.x.
> > > Hence, comparing trivialness of two solutions seems same for your point #1
> > once you do mac plumbing.
> > >
> > > Regarding #2 on nesting. I won't claim I understand this as your level of
> > knowledge.
> > > If you are sauing only the VF is in VM as virtio PCI device to supporting nested
> > guest, that doesn't have AQ, hence it doesn't work due above issues.
> >
> > This is the most common setup, no?
> >
> Depends on who see it.
> We donât see common setup for legacy or otherwise.
>
> > The major issues are the transport function is not self contained in a device
> > (function). This will cause a lot of trouble when trying to implement nesting. E.g
> > you need to assign or emulate SR-IOV from L1 to
> > L(N-1) in order to let L(N) to work.
> >
> Those who do nesting usually do assign the PF as well.

This is not what I was told and it's not flexible.

Thanks



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]