

Subject: RE: [virtio-comment] RE: [PATCH v2 1/2] transport-pci: Introduce legacy registers access commands


> From: Michael S. Tsirkin <mst@redhat.com>
> Sent: Monday, May 22, 2023 4:07 PM
> 
> On Sun, May 21, 2023 at 02:44:03PM +0000, Parav Pandit wrote:
> >
> >
> > > From: Michael S. Tsirkin <mst@redhat.com>
> > > Sent: Sunday, May 21, 2023 10:33 AM
> >
> > > > Yet you initiate same discussion point that we already discussed
> > > > again after
> > > summarizing.
> > > > A driver is not attached to two devices.
> > > > A driver is attached to a single device.
> > >
> > > And that device is the owner no? to send commands?
> > >
> > Not for legacy registers access as discussed before.
> 
> Now I am lost. legacy register access is on top of aq which sends commands
> through pf.  

> how are you going to send commands without attaching a driver, I
> don't know.

A VF driver can send commands to its parent PF driver using a well-defined kernel API.
There is no need for it to attach to the PF.
Upper and lower devices are a common concept.
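
A minimal sketch of that flow, assuming a Linux-style kernel (pci_physfn() is a real helper; virtio_pf_admin_cmd_exec() is a hypothetical API the PF driver would export to member devices, not an existing symbol):

    #include <linux/pci.h>

    /* Called from the VF driver; nothing here binds a driver to the PF. */
    int vf_send_legacy_access(struct pci_dev *vf_dev, void *cmd)
    {
            /* Resolve the parent (owner) PF of this VF. */
            struct pci_dev *pf_dev = pci_physfn(vf_dev);

            /* Hand the admin command to the PF driver via its kernel API. */
            return virtio_pf_admin_cmd_exec(pf_dev, cmd);
    }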

> > Not limited to QEMU.
> > A driver will be able to do this as well.
> 
> No, on some OS-es (windows, iokit, more) a single driver can't bind to many
> devices without a lot of pain.  
There is no need to bind. A well-defined kernel API is enough.

It's been a while since I worked on iokit, but iokit has similar kernel APIs.

It is very exciting to think that someone has built an iokit-based hypervisor that hosts PCI accelerated devices and VMs.
Last time I checked, I was told that a virtio driver is not present in the latest OS; I am not sure how true that is.

> If you are mapping VF BAR, this has to be done by
> VF driver, and please just avoid pain for everyone by exposing the necessary
> info through the VF itself.
> A capability seems perfect for this.
> 
Why not the MMIO region as proposed in v0?
Capability-based read/write requires an extra level of indirection, so it is less optimal than MMIO.
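
To illustrate the extra indirection (a sketch; LEGACY_REG_OFF, window_sel, and window_data are assumed placeholders for the tunneling capability's fields):

    /* Direct MMIO: one access per legacy register. */
    writew(val, vf_bar + LEGACY_REG_OFF);

    /* Capability tunneling: each access becomes two config-space
     * operations -- select the target offset, then move the data. */
    pci_write_config_dword(vf_dev, cap + window_sel, LEGACY_REG_OFF);
    pci_write_config_word(vf_dev, cap + window_data, val);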

> An alternative is not to map VF BAR at all, steal space from PF BAR instead.
How come iokit will be able to access this PF BAR without binding to it?

> Guest is not accessing this anyway. So I really do not see why not from software
> point of view. There's a hardware reason? Could you talk to hardware guys
> about this? You objected to AQ too then hardware guys told you it is not a big
> deal.
I didn't object to the AQ for a hardware reason.
I explained that the AQ is not the best option compared to MMIO: by the laws of physics it is slower, because it must move more bits than MMIO does.
For the same reason, capability-based tunneling is also slower than MMIO.

> 
> > > So we have legacy emulation send commands to VF or to PF.  Okay. But
> > > let us avoid the need for VF driver to send commands to PF to initialize.
> > > Just get all information it needs from VF itself.
> > >
> > >
> > > Maybe it's a good idea to reuse existing notification capability, or
> > > maybe a new one, but let's avoid making VF driver depend on PF
> commands.
> > >
> > We agreed in v1 on Jason's suggestion to have the AQ command and yet
> > you and Jason hinder this in v2 with this exact repeated question.
> > Lets please avoid this and move forward.
> 
> How was I supposed to raise this issue on v1 if it was not there?
> Stop arguing in circles and things will move forward.
The v1 version clearly had the AQ commands.

Please revisit your comments.

On v0 you suggested:
- "why not AQ, like transport vq, it can coalesce partial mac writes".

On v1, over the AQ, you suggested:
- to "drop the size"
- and, in your summary, to split the 2 commands into 4 commands, separating the device and config areas.

After that summary, you now propose: why not make it self-contained in the VF itself? That circles back to the v0 method
(suggesting a capability instead of MMIO, but without explaining why MMIO, which is better than a capability, is not fine).

And now you propose to "have a BAR in the PF which is owned by the other PF driver", with little reasoning.
It also gets convoluted with your suggestion of PF driver access...

So, I am a bit lost on your circular proposal; or perhaps it was just a probing question...
Not sure.
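
For reference, the 2-to-4 split amounts to something like the sketch below (a hypothetical rendering of the v1 commands; names and layout are assumptions, not the exact wire format):

    /* Separate read/write commands for the legacy common config and the
     * legacy device config areas -- 4 opcodes instead of 2. */
    enum {
            VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_WRITE,
            VIRTIO_ADMIN_CMD_LEGACY_COMMON_CFG_READ,
            VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_WRITE,
            VIRTIO_ADMIN_CMD_LEGACY_DEV_CFG_READ,
    };

    struct virtio_admin_cmd_legacy_wr_data {
            u8 offset;      /* byte offset within the selected register area */
            u8 data[];      /* with the size dropped, the length is implied
                             * by the command's data descriptor */
    };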

> 
> If we have trouble converging on the notification thing, how about we make
> progress without it for now?  Merge a slow version that sends kicks through aq,
> then work on the best way to make notifications faster, separately.
> 
Sure, we can defer that, but the VF already has the notification BAR to utilize.
So why not use it?

> 
> > > > We have already these design choices and tradeoff in v0 and v1, it
> > > > doesn't fit
> > > the requirements.
> > >
> >
> > > So, I am saying one model is small driver for VF and a big one for PF.
> > > And to keep the VF driver simple, it should get information simply
> > > from config space capability.
> >
> > VF driver is small that does usual vfio passthrough work.
> > PF driver implement AQ for variety of use cases that we listed in the AQ
> cover letter.
> > VF driver implements 5 AQ commands that you suggested to split from 2 to 4.
> 
> VF(member) driver can't implement AQ commands at all. They are sent to
> PF(owner) thus by PF(owner) driver.  In virt configs, VMM can trap legacy access
> and talk to owner driver to send them.  It can also talk to member driver to
> send notifications.  You can I guess have VMM get member BAR offsets from
> owner and pass them to member for mapping. This raises all kind of questions
> of trust.
Only the member driver has access to the owner driver, via a kernel API.
Part of the VF BAR information at the PCI level also resides on the PF.
There is no break of trust here. The kernel exposes a single self-contained object to the user-space VMM.

> If you can map BAR from PF that would be simplest, VMM then only pokes at PF
> and not VF. If not then we really should expose this info n the VF if at all
> possible.
> 
There is clearly no need to use the PF for notifications when the notification method already exists on the VF.
In fact, such a mapping would cost the hardware yet another steering entry.
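
Concretely, the VF's existing mechanism is the standard virtio-pci notify capability: the driver derives the doorbell address entirely from the VF's own resources, with no PF involvement (this follows the spec's notification structure layout; the helper below is just a sketch of that computation):

    /* notify address = BAR + cap.offset
     *                 + queue_notify_off * notify_off_multiplier */
    void __iomem *vf_notify_addr(void __iomem *bar,
                                 struct virtio_pci_notify_cap *ncap,
                                 u16 queue_notify_off)
    {
            return bar + le32_to_cpu(ncap->cap.offset) +
                   queue_notify_off * le32_to_cpu(ncap->notify_off_multiplier);
    }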

To reiterate what I presented yesterday in [1] to Jason:

[1] https://lists.oasis-open.org/archives/virtio-comment/202305/msg00298.html

1. Legacy register access via AQ (v1)
Pros:
a. Lightweight for the hypervisor and devices (mainly PCI) to implement.
b. Enables sw to coalesce some device-specific registers, if needed.
Cons:
a. Not self-contained; requires the PF's AQ, which is anyway designed for such purposes.

2. Legacy register access via a new MMIO region (v0 + sane reset)
Pros:
a. Smaller code for slow register access than the AQ.
b. Self-contained in the VF.

Cons:
a. Relatively burdensome for the device, as it requires more RW registers at scale, though it can be an indirect register.
b. Sw cannot coalesce some device-specific registers.

3. Legacy register tunneling with additional indirection via a PCI capability
Cons:
a. Transport specific (but not a big concern to me).
b. Twice as slow, as it requires a RW capability access for each register access.
c. Inferior to #2; still requires the same sane reset as #2.
d. MMIO indirect registers are better than capabilities, because PCI capabilities are largely intended for read-only purposes.

