virtio-comment message


Subject: Re: [PATCH v2 1/9] virtio-iommu: Add ATTACH_TABLE request


Hi Jean,

Sorry for the late response. I have been trying to build a POC of virtio-iommu support for VT-d page tables, based on your branch "virtio-iommu/pgtables" and following this proposal. I may still need several days. I'll publish the POC: I'll send the virtio-iommu front-end change to the public Linux mailing lists and put the QEMU change in a public repo, since the implementation has some dependencies that are still WIP in the QEMU and Linux kernel communities. Do you think that would help the review of this proposal?

On 10/18/23 22:41, Jean-Philippe Brucker wrote:
Hi Tina,

On Wed, Oct 11, 2023 at 07:39:02PM +0800, Tina Zhang wrote:
+For some formats, the PASID table is managed by the device rather
+than driver. The ATTACH_TABLE request contains a \field{pasid}
This is what VT-d hardware expects when working with a viommu (whether it's a
virtual VT-d IOMMU or virtio-iommu) with VT-d IO page table support. If a
device is directly assigned to a guest, the guest viommu driver fully
manages the PASIDs of that device. If a device provides an I/O device
sharing service for multiple guests, the PASID values are assigned by the host.
But in either case, the PASID table is filled by the host VT-d driver, not the
guest viommu driver. So, from the viommu front-end's point of view, the PASID
table of that device is managed by the viommu device (i.e., the viommu back-end).

Ok, so there are two cases of device assignment:

* whole devices (eg PCIe VF) assigned to one guest. In that case,
   PASIDs used by the guest (vPASID) could be identical to PASIDs assigned
   by the host (hPASID)

* device is shared between multiple guests, sub-devices are assigned to
   guests. In that case vPASID is different from hPASID.
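
The two assignment cases above can be sketched as a guest-to-host PASID lookup. This is only an illustration, not code from QEMU or Linux: the `PasidMap` class, its `identity` flag, and the example values are hypothetical, and the identity case assumes the host chooses to keep vPASID equal to hPASID.

```python
class PasidMap:
    """Hypothetical per-assigned-device vPASID -> hPASID translation."""

    def __init__(self, identity):
        # identity=True models a whole device (e.g. a PCIe VF) owned by one guest,
        # assuming the host keeps hPASID identical to vPASID.
        self.identity = identity
        self.table = {}  # only used for shared devices

    def bind(self, vpasid, hpasid):
        if self.identity:
            raise ValueError("identity-mapped device needs no explicit binding")
        self.table[vpasid] = hpasid

    def to_host(self, vpasid):
        if self.identity:
            return vpasid
        return self.table[vpasid]  # KeyError if the host never assigned one


whole = PasidMap(identity=True)    # whole device assigned to one guest
shared = PasidMap(identity=False)  # sub-device of a shared device
shared.bind(vpasid=1, hpasid=12)
```

With a shared device, every guest-visible PASID has to go through the table before it reaches the hardware PASID table.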

+field and a page directory. Similarly to when the driver manages
+the PASID table, \field{domain} corresponds to one PASID table.
+The driver uses the same \field{domain} for each ATTACH_TABLE
+request with PASID. Issuing a DETACH request without PASID

Here are some thoughts about domain id interception with VT-d IO page
format:

The virtual VT-d driver uses the same domain id for first-stage translation
because it's required by the VT-d specification, though the guest domain id
never gets to hardware (i.e., the host doesn't populate the PASID table with
the guest domain id value).

(To make things clearer I'll use "DID" for the HW VT-d identifier and
"domain" for the virtio-iommu one)

Do you know if the host is more likely to assign a unique DID for each VM,
for each virtual device, or some other scheme?  There was a discussion
recently about the Linux host, but I don't know where it landed. I'm
trying to understand what hosts and guests could implement, so that
virtio-iommu can be supported without too much modification to host and
guest.

We should design virtio-iommu such that it's possible to share an address
space between multiple devices. For example the guest could use a single
PASID space, assign a unique PASID to each process address space, and if
the host assigns the same DID to devices in a VM, then DMA from devices
bound to the same guest address space will share TLB entries.

TLB entries are tagged with DID+PASID. So in theory, in order to share TLB
entries between multiple virtual devices, a guest that uses virtual VT-d
would write the same vDID into each virtual device's virtual PASID table,
into entries indexed by the same vPASID. That would allow the host to
understand that they are the same address space, and assign the same hDID
to entries in the HW PASID table indexed by the same hPASID, and they
would share TLB entries.

          Dev1 PASID table                     vDev1 PASID table
             +-----------+                     +-----------+
             :           :                     :           :
  hPASID=12->+-----------+                     +-----------+<-vPASID=1
             |   hDID=24 |-+-> First stage <-+-|    vDID=8 |
             +-----------+ |   page table    | +-----------+
             :           : |                 | :           :
                           |                 |
                           |                 |
          Dev2 PASID table |                 | vDev2 PASID table
             +-----------+ |                 | +-----------+
             :           : |                 | :           :
  hPASID=12->+-----------+ |                 | +-----------+<-vPASID=1
             |   hDID=24 |-+                 +-|    vDID=8 |
             +-----------+                     +-----------+
             :           :                     :           :
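
The sharing shown in the diagram can be illustrated with a toy cache keyed by the DID+PASID tag. This is a sketch under simplifying assumptions, not a model of real VT-d hardware: `TaggedTlb` and `walk` are hypothetical names, and the page-table walk is replaced by a stand-in function.

```python
class TaggedTlb:
    """Toy IOTLB whose entries are tagged with (DID, PASID)."""

    def __init__(self):
        self.entries = {}
        self.misses = 0

    def translate(self, did, pasid, iova, walk):
        key = (did, pasid, iova)
        if key not in self.entries:
            # Miss: fall back to the (stand-in) page table walk and cache it.
            self.misses += 1
            self.entries[key] = walk(iova)
        return self.entries[key]


def walk(iova):
    # Stand-in for a first-stage page table walk (hypothetical fixed offset).
    return iova + 0x1000


tlb = TaggedTlb()
# Dev1 and Dev2 both use hDID=24, hPASID=12, as in the diagram.
dev1 = tlb.translate(24, 12, 0x4000, walk)  # miss, fills the entry
dev2 = tlb.translate(24, 12, 0x4000, walk)  # hit, reuses Dev1's entry
# A device given a different hDID cannot reuse the entry.
other = tlb.translate(25, 12, 0x4000, walk)  # miss
```

Only devices whose PASID table entries carry the same hDID at the same hPASID end up with the same tag.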

Conversely, a guest can use different vPASIDs to isolate address spaces,
but it could also rely on using different vDIDs, right?  It could program
the virtual VT-d PASID tables for two devices with overlapping vPASID
spaces, and different vDIDs for isolation. Which implies that the host
can't assign a unique DID for each VM, it needs to be able to use a unique
DID for each virtual device.

That's also needed for requests-without-PASID: the guest most likely wants
to isolate no-PASID address spaces of different devices, which requires
separate DIDs.
You're right. The host assigns a unique DID for each virtual device. I had misunderstood this previously. Thanks for the elaboration.



According to VT-d specification (section 6.2.3.1):
"Scalable-mode PASID table entries programmed for first-
stage translation or pass-through (PGTT equal to 001b or 100b) must be
programmed with a DID value that is different from those used in any PASID
table entries that are programmed for second-stage or nested translation
(PGTT equal to 010b or 011b). This is required since hardware
implementations tag various caches with domain-id as described in Section
6.2.1. Scalable-mode PASID-table entries with PGTT value of 001b or 100b are
recommended to use the same domain-id value for best hardware efficiency."
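
The constraint in the quoted paragraph can be expressed as a small check over PASID table entries. This is a sketch only: the PGTT encodings come from the quote, while `did_conflicts` and the `(pgtt, did)` tuple representation are hypothetical.

```python
# PGTT encodings as given in the quoted specification text.
FIRST_STAGE = 0b001
SECOND_STAGE = 0b010
NESTED = 0b011
PASS_THROUGH = 0b100


def did_conflicts(entries):
    """Return the DIDs used by both groups of entries.

    `entries` is an iterable of (pgtt, did) pairs. A non-empty result means
    a DID is shared between a first-stage/pass-through entry and a
    second-stage/nested entry, which the quoted rule forbids.
    """
    entries = list(entries)
    fs_or_pt = {did for pgtt, did in entries if pgtt in (FIRST_STAGE, PASS_THROUGH)}
    ss_or_nested = {did for pgtt, did in entries if pgtt in (SECOND_STAGE, NESTED)}
    return fs_or_pt & ss_or_nested


ok = did_conflicts([(FIRST_STAGE, 24), (PASS_THROUGH, 24), (SECOND_STAGE, 7)])
bad = did_conflicts([(FIRST_STAGE, 24), (NESTED, 24)])
```

Sharing DID 24 between first-stage and pass-through entries is accepted here, which matches the recommendation in the last sentence of the quote.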

When the VT-d IO page table format is used with virtio-iommu, the definition of
domain id doesn't need to be exactly the same as the one defined in the VT-d
specification, since the viommu providing the service is virtio-iommu, not
virtual VT-d, and only the VT-d IO page table is used (the domain id is defined
in the PASID table, which isn't used).

Yes, even if they have the same name they don't have to mean the same
thing, we can pick for virtio-iommu the usage that's most convenient.

However, hosts have to support IOMMU emulations such as virtual VT-d as
well, so in order to reuse those host interfaces for virtio-iommu, the
safest is to specify it close to the hardware design.

Staying close to the hardware also allows supporting hardware acceleration
later, like AMD's invalidation and fault queues that can be assigned
directly to a guest.

So, it seems more reasonable to define the domain id as an identifier for a
unique domain for each address space, for virtio-iommu back-end
interception. Any ideas?

Do you mean that domain would represent an address space uniquely within a
guest?

One concern I have is about reducing the number of address spaces.
DID+PASID allows for 2^36 unique address spaces, but domain alone only
allows for 2^32. Both are more than enough for current use-cases, but it
would be better not to impose such limitations.
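
The arithmetic behind these two figures, as a quick check. The bit widths are assumptions not stated in this thread: a 16-bit VT-d domain-id, a 20-bit PASID, and a 32-bit virtio-iommu domain field.

```python
DID_BITS = 16     # assumption: VT-d domain-id width
PASID_BITS = 20   # assumption: PASID width
DOMAIN_BITS = 32  # assumption: virtio-iommu domain field width

did_pasid_spaces = 1 << (DID_BITS + PASID_BITS)  # address spaces via DID+PASID
domain_spaces = 1 << DOMAIN_BITS                 # address spaces via domain alone
ratio = did_pasid_spaces // domain_spaces        # how many times fewer
```

Under these widths, identifying an address space by domain alone leaves 16 times fewer identifiers than DID+PASID.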

To be compatible with DID+PASID support of hosts, maybe we should use
domain+PASID to identify an address space?  domain wouldn't represent a
PASID table as in this current draft, but it would be an arbitrary ID like
DID, that the guest can use as it wants. That would keep the design close
to virtual VT-d, which would be good for compatibility. If the host
I like this idea. It makes sense for VT-d. I have adopted it in my POC code.

Thanks,
-Tina

expects a (vDID, vPASID) pair in attach and invalidate calls, then
supporting virtio-iommu would be easier with a similar interface.

If domain uniquely represents a guest address space, then the INVALIDATE
request wouldn't take a PASID field. For a host that is implemented to
support vDID+vPASID, it would then be difficult to translate
virtio-iommu's domain into hDID+hPASID in order to invalidate TLBs.
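
To illustrate why a PASID field helps here, the following sketch shows a host-side table keyed by the (domain, vPASID) pair. It is hypothetical throughout: `AddressSpaceMap`, `attach`, and `invalidate_target` are not existing QEMU, Linux, or virtio-iommu interfaces, and the values reuse the ones from the diagram earlier in this message.

```python
class AddressSpaceMap:
    """Hypothetical host-side table: (domain, vPASID) -> (hDID, hPASID)."""

    def __init__(self):
        self.map = {}

    def attach(self, domain, vpasid, hdid, hpasid):
        # Recorded when the guest attaches a table for (domain, vPASID).
        self.map[(domain, vpasid)] = (hdid, hpasid)

    def invalidate_target(self, domain, vpasid):
        # With both fields in the request, finding the hardware tag to
        # invalidate is a single lookup.
        return self.map[(domain, vpasid)]


spaces = AddressSpaceMap()
spaces.attach(domain=8, vpasid=1, hdid=24, hpasid=12)
target = spaces.invalidate_target(domain=8, vpasid=1)
```

If the request carried only a domain, a host built around vDID+vPASID would need some additional structure to recover the pair before it could do this lookup.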

Thanks,
Jean


