virtio-comment message



Subject: Re: [PATCH 00/11] Introduce transitional mmr pci device


On Fri, Mar 31, 2023 at 01:58:23AM +0300, Parav Pandit wrote:
> Overview:
> ---------
> The Transitional MMR device is a variant of the transitional PCI device.
> It has its own small Device ID range. It does not have I/O
> region BAR; instead it exposes legacy configuration and device
> specific registers at an offset in the memory region BAR.
> 
> Such transitional MMR devices will be used at the scale of
> thousands of devices using PCI SR-IOV and/or future scalable
> virtualization technology to provide backward
> compatibility (for legacy devices) and also future
> compatibility with new features.
> 
> Usecase:
> --------
> 1. A hypervisor/system needs to provide transitional
>    virtio devices to the guest VM at scale of thousands,
>    typically, one to eight devices per VM.
> 
> 2. A hypervisor/system needs to provide such devices using a
>    vendor agnostic driver in the hypervisor system.
> 
> 3. A hypervisor system prefers to have single stack regardless of
>    virtio device type (net/blk) and be future compatible with a
>    single vfio stack using SR-IOV or other scalable device
>    virtualization technology to map PCI devices to the guest VM.
>    (as transitional or otherwise)
> 
> Motivation/Background:
> ----------------------
> The existing transitional PCI device is missing support for
> PCI SR-IOV based devices. In practice it currently works only as a
> PCI PF or as a software emulated device. It has the system level
> limitations cited below:
> 
> [a] PCIe spec citation:
> VFs do not support I/O Space and thus VF BARs shall not
> indicate I/O Space.
> 
> [b] cpu arch citation:
> Intel 64 and IA-32 Architectures Software Developer's Manual:
> The processor's I/O address space is separate and distinct from
> the physical-memory address space. The I/O address space consists
> of 64K individually addressable 8-bit I/O ports, numbered 0 through FFFFH.
> 
> [c] PCIe spec citation:
> If a bridge implements an I/O address range,...I/O address range
> will be aligned to a 4 KB boundary.
> 
> [d] I/O region accesses at the PCI system level are slow, as they are
> non-posted operations in the PCIe fabric.
> 
> The usecase requirements and limitations above can be solved by
> extending the transitional device, mapping legacy and device
> specific configuration registers in a memory PCI BAR instead
> of using a non-composable I/O region.
> 
> Please review.

So as you explain in a lot of detail above, IO support is going away,
so the transitional device can no longer be used through the
legacy interface.

OK but this does not answer the following question:
since a legacy driver can not bind to this type of MMR device,
a new driver is needed anyway so
why not implement a modern driver?


I think we discussed this at some call and it made some kind of sense.
Unfortunately it has been a while and I am not sure I remember the
detail, so I can no longer say for sure whether this proposal is fit for
the purpose.  Here is what I vaguely remember:

A valid use-case is an emulation layer (e.g. a hypervisor) translating
a legacy driver's I/O accesses to MMIO. Ideally layering this emulation
on top of a modern device would work ok
but there are several things making this approach problematic.
One is a different virtio net header size between legacy and modern
driver. Another is use of control VQ by modern where legacy used
IO writes. In both cases the difference would require the
emulation getting involved on the DMA path, in particular
somehow finding private addresses for communication between
emulation and modern device.


Does the above summarize it reasonably?


And if yes, would an alternative approach of adding legacy config
support to transport vq work well?  I can not say I thought about this
deeply so maybe there's some problem, or maybe it's a worse approach -
could you comment on this? It looks like this could be a smaller change,
but maybe it isn't? Did you consider this option?


More review later.



> Patch summary:
> --------------
> patches 1 to 5 prepare the spec
> patches 6 to 11 define the transitional mmr device
> 
> patch-1 uses lower case alphabets to name device id
> patch-2 moves transitional device id to the legacy section along with
>         revision id
> patch-3 splits legacy feature bits description from device id
> patch-4 renames and moves virtio config registers next to 1.x
>         registers section
> patch-5 adds missing helper verb in terminology definitions
> patch-6 introduces transitional mmr device
> patch-7 introduces transitional mmr device pci device ids
> patch-8 introduces virtio extended pci capability
> patch-9 describes new pci capability to locate legacy mmr
>         registers
> patch-10 extended usage of driver notification capability for
>          the transitional mmr device
> patch-11 adds conformance section of the transitional mmr device
> 
> The design and its details are described further below.
> 
> Design:
> -------
> The picture below captures the main difference between the current
> transitional PCI SR-IOV VF and the transitional MMR SR-IOV VF.
> 
> +------------------+ +--------------------+ +--------------------+
> |virtio 1.x        | |Transitional        | |Transitional        |
> |SRIOV VF          | |SRIOV VF            | |MMR SRIOV VF        |
> |                  | |                    | |                    |
> ++---------------+ | ++---------------+   | ++---------------+   |
> ||dev_id =       | | ||dev_id =       |   | ||dev_id =       |   |
> ||{0x1040-0x106C}| | ||{0x1000-0x103f}|   | ||{0x10f9-0x10ff}|   |
> |+---------------+ | |+---------------+   | |+---------------+   |
> |                  | |                    | |                    |
> |+------------+    | |+------------+      | |+-----------------+ |
> ||Memory BAR  |    | ||Memory BAR  |      | ||Memory BAR       | |
> |+------------+    | |+------------+      | ||                 | |
> |                  | |                    | || +--------------+| |
> |                  | |+-----------------+ | || |legacy virtio || |
> |                  | ||IOBAR impossible | | || |+ dev cfg     || |
> |                  | |+-----------------+ | || |registers     || |
> |                  | |                    | || +--------------+| |
> |                  | |                    | |+-----------------+ |
> +------------------+ +--------------------+ +--------------------+
> 
> Here transitional MMR SR-IOV VF has legacy configuration and
> legacy device specific registers located at an offset in the memory
> region BAR.
> 
> A memory region can be dedicated at BAR0 or it can be in an
> existing BAR, allowing flexibility when implementing support
> in a hardware device.
> 
> Transitional MMR SR-IOV VFs use a distinct device ID range to that
> of existing virtio SR-IOV VFs to allow flexibility in driver
> binding.
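
As a rough illustration of how a driver could bind on the device ID ranges
drawn above, here is a small sketch. The ranges are copied from the picture
as drawn; the function name classify and the returned strings are made up
for this example.

```python
# Device ID ranges as drawn in the picture above (upper bounds inclusive
# in the picture, so the Python ranges end one past them).
TRANSITIONAL = range(0x1000, 0x1040)      # 0x1000-0x103f
MODERN = range(0x1040, 0x106D)            # 0x1040-0x106C, as drawn
TRANSITIONAL_MMR = range(0x10F9, 0x1100)  # 0x10f9-0x10ff, proposed here


def classify(dev_id: int) -> str:
    """Return which kind of virtio PCI device a device ID belongs to."""
    if dev_id in TRANSITIONAL:
        return "transitional"
    if dev_id in MODERN:
        return "modern"
    if dev_id in TRANSITIONAL_MMR:
        return "transitional-mmr"
    return "unknown"
```

Since the three ranges do not overlap, a driver matching only the last
range would never claim an existing transitional or 1.x device.
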
> 
> A zoomed-in view of the transitional MMR SR-IOV device shows
> that the location of the legacy registers is discovered by the
> driver using a new capability.
> 
> +------------------------------+
> |Transitional                  |
> |MMR SRIOV VF                  |
> |                              |
> ++---------------+             |
> ||dev_id =       |             |
> ||{0x10f9-0x10ff}|             |
> |+---------------+             |
> |                              |
> ++--------------------+        |
> || PCIe ext cap = 0xB |        |
> || cfg_type = 10      |        |
> || offset   = 0x1000  |        |
> || bar      = N {0..5}|        |
> |+--|-----------------+        |
> |   |                          |
> |   |                          |
> |   |    +-------------------+ |
> |   |    | Memory BAR = A    | |
> |   |    |                   | |
> |   +------>+--------------+ | |
> |        |  |legacy virtio | | |
> |        |  |+ dev cfg     | | |
> |        |  |registers     | | |
> |        |  +--------------+ | |
> |        +-------------------+ |
> +------------------------------+
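
The lookup drawn above could be sketched as follows. This is only an
illustration: cfg_type = 10 and the bar/offset fields come from the picture,
the register offsets are those of the legacy virtio PCI layout with MSI-X
disabled, and the names LegacyMmrCap, LEGACY_REGS and legacy_reg_address
are made up. Parsing the capability out of config space is left out.

```python
from dataclasses import dataclass

LEGACY_MMR_CFG_TYPE = 10  # value shown in the picture above


@dataclass(frozen=True)
class LegacyMmrCap:
    """Hypothetical decoded form of the new capability."""
    cfg_type: int
    bar: int      # BAR index, 0..5
    offset: int   # offset of the legacy registers inside that BAR


# Legacy virtio PCI registers: name -> (offset, width in bytes), MSI-X disabled.
LEGACY_REGS = {
    "device_features": (0x00, 4),
    "guest_features": (0x04, 4),
    "queue_address": (0x08, 4),
    "queue_size": (0x0C, 2),
    "queue_select": (0x0E, 2),
    "queue_notify": (0x10, 2),
    "device_status": (0x12, 1),
    "isr_status": (0x13, 1),
}


def legacy_reg_address(cap: LegacyMmrCap, bar_bases: dict, reg: str) -> int:
    """Return the memory address of a legacy register behind the capability."""
    if cap.cfg_type != LEGACY_MMR_CFG_TYPE:
        raise ValueError("not a legacy MMR capability")
    if not 0 <= cap.bar <= 5:
        raise ValueError("BAR index out of range")
    reg_off, _width = LEGACY_REGS[reg]
    return bar_bases[cap.bar] + cap.offset + reg_off
```

For example, with the registers at offset 0x1000 of a BAR mapped at
0xfe000000, the device status register would be at 0xfe001012.
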
> 
> Software usage:
> ---------------
> The transitional MMR device can be used in multiple ways.
> 
> 1. The most common way to use and map to the guest VM is by
> using vfio driver framework in Linux kernel.
> 
>                 +----------------------+
>                 |pci_dev_id = 0x100X   |
> +---------------|pci_rev_id = 0x0      |-----+
> |vfio device    |BAR0 = I/O region     |     |
> |               |Other attributes      |     |
> |               +----------------------+     |
> |                                            |
> +   +--------------+     +-----------------+ |
> |   |I/O to memory |     | Other vfio      | |
> |   |rd/wr mapper  |     | functionalities | |
> |   +--------------+     +-----------------+ |
> |                                            |
> +-------------------+------------------------+
>                     |
>        +------------+-----------------+
>        |         Transitional         |
>        |         MMR SRIOV VF         |
>        +------------------------------+
> 
> 2. Virtio pci driver to bind to the listed device id and
>    use it as native device in the host.
> 
> 3. Use it in a light weight hypervisor to run bare-metal OS.
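
The "I/O to memory rd/wr mapper" box in usage 1 above could behave roughly
like the sketch below. It is not the vfio implementation: a bytearray stands
in for the mapped memory BAR, little-endian register access is assumed, and
the class name IoToMmrMapper and its methods are made up for illustration.

```python
class IoToMmrMapper:
    """Forward trapped I/O BAR accesses to a window inside a memory BAR."""

    def __init__(self, mem_bar: bytearray, window_offset: int, window_size: int):
        if window_offset + window_size > len(mem_bar):
            raise ValueError("legacy register window does not fit in the BAR")
        self.mem_bar = mem_bar              # stand-in for the mapped memory BAR
        self.window_offset = window_offset  # offset reported by the capability
        self.window_size = window_size

    def _check(self, port_off: int, width: int) -> int:
        # Only 1, 2 and 4 byte accesses that stay inside the window are allowed.
        if width not in (1, 2, 4):
            raise ValueError("unsupported access width")
        if port_off < 0 or port_off + width > self.window_size:
            raise ValueError("access outside the legacy register window")
        return self.window_offset + port_off

    def io_read(self, port_off: int, width: int) -> int:
        """Handle a guest read at port_off of the emulated I/O BAR0."""
        start = self._check(port_off, width)
        return int.from_bytes(self.mem_bar[start:start + width], "little")

    def io_write(self, port_off: int, width: int, value: int) -> None:
        """Handle a guest write at port_off of the emulated I/O BAR0."""
        start = self._check(port_off, width)
        self.mem_bar[start:start + width] = value.to_bytes(width, "little")
```

The guest keeps seeing an I/O region at BAR0, while every access lands at
the same register offset inside the memory BAR window.
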
> 
> Parav Pandit (11):
>   transport-pci: Use lowercase alphabets
>   transport-pci: Move transitional device id to legacy section
>   transport-pci: Split notes of PCI Device Layout
>   transport-pci: Rename and move legacy PCI Device layout section
>   introduction: Add missing helping verb
>   introduction: Introduce transitional MMR interface
>   transport-pci: Introduce transitional MMR device id
>   transport-pci: Introduce virtio extended capability
>   transport-pci: Describe PCI MMR dev config registers
>   transport-pci: Use driver notification PCI capability
>   conformance: Add transitional MMR interface conformance
> 
>  conformance.tex      |  11 +-
>  introduction.tex     |  34 +++-
>  tmmr-conformance.tex |  27 +++
>  transport-pci.tex    | 405 ++++++++++++++++++++++++++++++-------------
>  4 files changed, 354 insertions(+), 123 deletions(-)
>  create mode 100644 tmmr-conformance.tex
> 
> -- 
> 2.26.2


