OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-comment message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-comment] Re: [virtio-dev] Re: [PATCH v3 0/3] transport-pci: Introduce legacy registers access using AQ


On Tue, Jun 27, 2023 at 11:17âAM Parav Pandit <parav@nvidia.com> wrote:
>
>
>
> > From: Jason Wang <jasowang@redhat.com>
> > Sent: Monday, June 26, 2023 10:38 PM
>
> > > >
> > > Because there is no point of failing it later when actual rw occurs.
> >
> > Even with admin virtqueue, the hypervisor should be ready for any admin
> > command failures somehow.
> >
> Yes, device failures can happen regardless.
>
> > And with modern devices, the reset is not a must, the hypervisor can just read
> > device_features to check if _MAC/_HDR is there.
>
> >
> I read the spec differently as,
> The driver MUST follow this sequence to initialize a device:
> 1. Reset the device.

It's not the initialization, it's a probing of supported features.

>
> > And what's more important,
> > the hypervisor can know if the feature negotiation succeeds or not. With
> > legacy, there's no such way.
> Again, we are not fixing the legacy here.

I mean, it's not worse than failing silently in the legacy case so we
don't need to care.

>
> > >
> > > > > 1. reset the device
> > > > > 2. set ACK and DRIVER bit
> > > > > 3. read features and make sure _MAC, _HDR are supported.
> > > >
> > > > Similar step is required even for AQ since the hypervisor needs to
> > > > know if legacy commands are supported there.
> > > >
> > > No. similar steps are not required on AQ because AQ support discovery is
> > done once per owner device. Not for each member device.
> >
> > This seems a topic beyond legacy. If both modern and transitional devices are
> > supported, a per device probing or provisioning is still required. And if
> > provisioning is supported, there's no need to probe MAC/HDR.
> >
> That is possible to rely on the AQ.
>  > >
> > > > > 4. if not abort.
> > > > > 5. reset the device 2nd time so that guest can have right view of things.
> > > > >
> > > > > On guest access of legacy area:
> > > > > 6. reset the device
> > > > > 7. set ACK and DRIVER bit on guest request 8. read features and
> > > > > make sure _MAC, _HDR are offered
> > > > > 8.1 Hope for the best that on two completely unrelated device
> > > > > reset on #1
> > > > and #6, the device offers exact same feature bits.
> > > >
> > > > Explained and discussed in another thread, this is a hard
> > > > requirement otherwise a lot of things would break.
> > > >
> > > That limitation is for the past or current.
> >
> > You need at least a new feature bit.
> >
> Not needed, current mechanism is just fine as there is member device driver to owner communication.
>
> > > As we move forward there is no need to continue with such limitation.
> > > There may be more work needed to overcome that limitation, but we donât
> > want to build on top of that limitation.
> >
> > Let's open a dedicated thread for this if necessary.
> >
> Ok. We donât want to have that assumption and build interface.
>
> > >
> > > > > And mention this in the spec 1.x for rest of the future.
> > > > >
> > > > > We shouldn't be adding such a limitation to the spec.
> > > > > We have seen this with mlx5 device where 5 years ago one thought
> > > > > this
> > > > cannot happen.
> > > > > And now with modern use case now features changes across two
> > > > > resets for
> > > > mlx5 device.
> > > > > Baking such limitation in current spec for past 1.x is sub-optimal.
> > > > >
> > > > > ok, so to avoid baking such reset flow nasty things in spec, lets
> > > > > avoid flow of
> > > > #1 to #5 in hypervisor.
> > > > > So, provision the device to support these new feature bits via AQ.
> > > > > So AQ is required for feature provisioning anyway.
> > > > >
> > > > > So, your point #4 is required in both methods and so it scores
> > > > > same as this
> > > > proposal.
> > > > > Hence feature bit is not of an advantage.
> > > > >
> > > > > With _MAC now we need writable mac on 1.x config space.
> > > > > for 1.x writable mac has two requirements.
> > > > > 1. It must be atomic (not for 0.9.5, but must for 1.x like CVQ)
> > > >
> > > > For LEGACY_MAC, we don't need to be atomic. We manage to survive for
> > > > several years.
> > > >
> > > It was cited as issue in past discussed when you and Michael both asked to
> > consider tvq.
> > > What changed now?
> >
> > Nothing changed, what I meant is: with admin virtqueue it could be atomic or
> > not.
> >
> I donât know why you keep circling back on it.
>
> Why do you ask below "how atomicity is achieved if it is not important?"
>
> > >
> > > The issue is that device hw needs implement special RW registers now for the
> > mac for legacy.
> > > This is the unwanted burden not needed, where AQ is better.
> >
> > How can you solve the atomic issue in this case? The issue is the driver where
> > legacy guests write to device configuration space byte by byte.
> You contradict yourself, above you say that we managed, here you say its issue.
>
> It does write byte by byte, a hypervisor can collect 6 bytes and write all together in one go.
>
> It seems that now the past issue of atomicity was raised and solved with AQ, you start to oppose that it...

Let me clarify:

Let's first look at the ABI solely:

+struct virtio_admin_cmd_ld_reg_wr_data {
+       u8 offset; /* Starting byte offset of the device register(s) to write */
+       u8 register[];
+};

For accessing mac, it could be atomic or not.

Hypervisor can try to collect 6 bytes for atomicy but again it's best
effort and may (for Linux) or may not work for all kinds of guests.

>
> > Modern guests will use cvq.
> >
> Sure.
>
> > >
> > > > > 2. should be synchronous to know success/failure to know when it
> > > > > is effective
> > > >
> > > > It's as simple as allowing mac in the device configuration space writable.
> > > >
> > > Not simple for hw at scale.
> >
> > I don't understand, it uses modern device configuration space. How does it
> > matter with the scale of admin virtqueue?
> >
> The new feature bit demands mac address in config space to be read write. Modern device doesnt have it as RW.
> This burden is unnecessary to carry forward.

How much burden if we just make 6 byte RW per device?

> An AQ eliminates such heavy device side implementation.
>
> > > AQ is superior that communicates rarely used information.
> > >
> > > > >
> > > > > Both are present on the CVQ, so yet add another duplicate 1.x
> > > > > scheme that
> > > > fulfill above requirements.
> > > >
> > > > I'm not sure how to define duplication, the legacy device
> > > > configuration space access via admin virtqueue is kind of a duplication in
> > this sense as well.
> > > >
> > > It only exists on AQ for legacy purpose.
> > > The feature bit defined is not for legacy, hence its duplicated.
> >
> > It's for legacy mediation and what's more, it allows mac to be configured
> > without cvq.
> >
> This fundamental thing is not a gain, but a negative aspect to complicate the device PCI.

Is this minor stuff more complicated than the entire admin virtqueue stuff?

The point is methodology, mediation on top of modern devices is
simpler, and a lot of other methods could be used if LEGACY_MAC
doesn't suffice.

> It even contradicts for current spec definition " Device configuration space is generally used for rarely-changing or initialization-time parameters".

I don't see how, mac is rarely changed, and for blocks we have
writeable stuffs like wce as well.

>
> > > I donât see how that is avoided when one side has undefined corner case that
> > is translated to well defined behavior.
> >
> > Let me give you an example. The legacy device behaviour is basically what
> > Qemu codes did, by reusing the existing Qemu virtio device model codes, most
> > of the issues can be avoided.
> >
> You are welcome to document them in legacy interface section.

Well, you can't describe the behaviour of several kilo lines of code precisely.

>
> > > It only means that piece will be in endless errata in sw.
> >
> > Errata in software is much more simpler/cheaper than hardware, isn't it?
> >
> AQ has ability to implement things in less hw centric way.
> And sw when necessary can always decide which AQ commands to use.
> So it is more flexible.

A lot of examples have been demonstrated, software erratas is much
more flexible. It's really common to workaround hardware errata in
kernel or BIOS.

>
> > > It still scores same or less.
> > >
> > > > >
> > > > > For legacy there is no extra/special documentation.
> > > > > All the behavior listed in Legacy interfaces section for
> > > > > configuration register
> > > > present applies here.
> > > >
> > > > That's not true, the legacy section is the best effort since it's
> > > > too late to define (codes came before the spec).
> > > >
> > > One can added the missing documentation...
> >
> > It will be too late for the legacy, and it would be too hard since it basically tries
> > to describe the exact code behaviour. And there could be a lot of very hard
> > questions that need to be answered. E.g what happens if a driver tries to use
> > both modern and legacy interfaces?
> >
> It is no different than what is defined already in transitional device.

Can you show me where the spec answers this question?

Thanks

> No point repeating same questions again, please.
>
>
> > >
> > >
> > > > >
> > > > > So, I don't see mac support is trivial by any means compared to
> > > > > proposed
> > > > scheme for 1.x.
> > > > > Hence, comparing trivialness of two solutions seems same for your
> > > > > point #1
> > > > once you do mac plumbing.
> > > > >
> > > > > Regarding #2 on nesting. I won't claim I understand this as your
> > > > > level of
> > > > knowledge.
> > > > > If you are sauing only the VF is in VM as virtio PCI device to
> > > > > supporting nested
> > > > guest, that doesn't have AQ, hence it doesn't work due above issues.
> > > >
> > > > This is the most common setup, no?
> > > >
> > > Depends on who see it.
> > > We donât see common setup for legacy or otherwise.
> > >
> > > > The major issues are the transport function is not self contained in
> > > > a device (function). This will cause a lot of trouble when trying to
> > > > implement nesting. E.g you need to assign or emulate SR-IOV from L1
> > > > to
> > > > L(N-1) in order to let L(N) to work.
> > > >
> > > Those who do nesting usually do assign the PF as well.
> >
> > This is not what I was told and it's not flexible.
> One food type does not solve hunger for whole world.
> So one can come up with more elaborate solution as time progresses, we are hoping that legacy doesnt reach to that point.



[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]