virtio-dev message



Subject: Re: virtio-gpu dedicated heap




On Fri, Mar 4, 2022 at 4:53 AM Robin Murphy <robin.murphy@arm.com> wrote:
On 2022-03-04 04:05, Gurchetan Singh wrote:
> Hi everyone,
>
> With the current virtio setup, all of guest memory is shared with host
> devices. There has been interest in changing this, to improve isolation of
> guest memory and increase confidentiality.
>
> The recently introduced restricted DMA mechanism makes excellent progress
> in this area:
>
> https://patchwork.kernel.org/project/xen-devel/cover/20210624155526.2775863-1-tientzu@chromium.org/
>
>
> Devices without an IOMMU (traditional virtio devices for example) would
> allocate from a specially designated region. Swiotlb bouncing is done for
> all DMA transfers. This is controlled by the VIRTIO_F_ACCESS_PLATFORM
> feature bit.
>
> https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/3064198
>
> This mechanism works great for the devices it was designed for, such as
> virtio-net. However, when trying to adapt to it for other devices, there
> are some limitations.
>
> It would be great to have a dedicated heap for virtio-gpu rather than
> allocating from guest memory.
>
> We would like to use dma_alloc_noncontiguous on the restricted dma pool,
> ideally with page-level granularity somehow. Contiguous buffers are
> definitely going out of fashion.
>
> There are two considerations when using it with the restricted DMA approach:
>
> 1) No bouncing (aka memcpy)
>
> Expensive with graphics buffers, since guest user space would designate
> graphics buffers as shareable with the host. We plan to use
> DMA_ATTR_SKIP_CPU_SYNC when doing any DMA transactions with GPU buffers.
>
> Bounce buffering will be utilized with virtio-cmds, like the other virtio
> devices that use the restricted DMA mechanism.
>
> 2) IO_TLB_SEGSIZE is too small for graphics buffers
>
> This issue was hit before here too:
>
> https://www.spinics.net/lists/kernel/msg4154086.html
>
> The suggestion was to use shared-dma-pool rather than restricted DMA. But
> we're not sure a single device can have restricted DMA (for
> VIRTIO_F_ACCESS_PLATFORM) and shared-dma-pool (for larger buffers) at the
> same time. Does anyone know?

Yes, it is absolutely intended that a device can have both a
"restricted-dma-pool" for bouncing data from e.g. user pages, and a
"shared-dma-pool" from which to allocate dedicated buffers. The
"restricted-dma-pool" binding even explicitly calls this case out.

As long as the "shared-dma-pool" is suitably sized, it shouldn't make
much difference to Linux whether it's a simple system memory carveout or
a special hardware/hypervisor-isolated region. The only real
considerations for the latter case are firstly that you probably want to
make sure it is used as a coherent pool rather than a CMA pool (i.e.
omit the "reusable" property), since if the guest exposes private data
in shared pages that aren't currently in use for DMA then it rather
defeats the point; and secondly that if allocating from the pool fails,
Linux will currently fall back to allocating from regular protected
memory, which is liable to end badly. There's certainly potential to
improve on the latter point, but the short-term easy dodge is just to
make the pool big enough that normal operation won't exhaust it.
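
In driver terms, claiming the dedicated pool might look something like
the sketch below. The memory-region index and the probe plumbing are
assumptions about how the reserved regions end up described for the
device; the restricted-dma-pool side should already be wired up by the
core DMA code when the device is configured.

#include <linux/of_reserved_mem.h>
#include <linux/platform_device.h>

static int example_probe(struct platform_device *pdev)
{
        struct device *dev = &pdev->dev;
        int ret;

        /*
         * Attach the "shared-dma-pool" entry from this device's
         * memory-region list as its coherent allocation pool, so that
         * dma_alloc_*() draws dedicated buffers from it. Index 1 is
         * purely an assumption about the DT layout.
         */
        ret = of_reserved_mem_device_init_by_idx(dev, dev->of_node, 1);
        if (ret)
                return ret;

        /* ... normal device setup continues here ... */
        return 0;
}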

Thanks for the feedback! Will experiment with the mixed "shared-dma-pool" + "restricted-dma" approach for virtio-gpu.
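
For concreteness, the driver-side path we have in mind is roughly the
sketch below (illustrative only: error paths are trimmed and the helper
names are made up). It assumes dma_alloc_noncontiguous() ends up drawing
from whichever pool the device has been given, and that shareable GPU
buffers are mapped with DMA_ATTR_SKIP_CPU_SYNC so swiotlb never copies
their contents:

#include <linux/dma-mapping.h>

/* Allocate a (possibly non-contiguous) buffer for the dedicated heap;
 * the returned sg_table comes back already DMA-mapped for the given
 * direction. */
static struct sg_table *heap_buf_alloc(struct device *dev, size_t size,
                                       void **vaddr)
{
        struct sg_table *sgt;

        sgt = dma_alloc_noncontiguous(dev, size, DMA_BIDIRECTIONAL,
                                      GFP_KERNEL, 0);
        if (!sgt)
                return NULL;

        /* Optional kernel mapping for CPU access to the pages. */
        *vaddr = dma_vmap_noncontiguous(dev, size, sgt);
        if (!*vaddr) {
                dma_free_noncontiguous(dev, size, sgt, DMA_BIDIRECTIONAL);
                return NULL;
        }
        return sgt;
}

/* Map the pages of a shareable GPU buffer without any CPU sync, so its
 * contents are never memcpy'd through the bounce buffer. */
static int gpu_buf_map(struct device *dev, struct sg_table *sgt)
{
        return dma_map_sgtable(dev, sgt, DMA_BIDIRECTIONAL,
                               DMA_ATTR_SKIP_CPU_SYNC);
}

Virtio command buffers would keep going through the normal map/sync path
and get bounced as usual.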

> If not, it sounds like "splitting the allocation into
> dma_max_mapping_size() chunks" for restricted-dma is also possible. What
> is the preferred method?
>
> More generally, we would love more feedback on the proposed design or
> consider alternatives!
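
For reference, the "dma_max_mapping_size() chunks" idea above would
amount to something like the following sketch (illustrative only; the
failure unwind is left out):

#include <linux/dma-mapping.h>
#include <linux/kernel.h>
#include <linux/slab.h>

/* Build one large object out of several smaller DMA allocations, each no
 * bigger than what the DMA layer can map (and hence bounce) in one go. */
struct chunked_obj {
        unsigned int nr;
        size_t chunk_size;
        void **cpu;
        dma_addr_t *dma;
};

static int chunked_alloc(struct device *dev, struct chunked_obj *obj,
                         size_t total)
{
        unsigned int i;

        if (!total)
                return -EINVAL;

        obj->chunk_size = min_t(size_t, dma_max_mapping_size(dev), total);
        obj->nr = DIV_ROUND_UP(total, obj->chunk_size);
        obj->cpu = kcalloc(obj->nr, sizeof(*obj->cpu), GFP_KERNEL);
        obj->dma = kcalloc(obj->nr, sizeof(*obj->dma), GFP_KERNEL);
        if (!obj->cpu || !obj->dma)
                return -ENOMEM;

        for (i = 0; i < obj->nr; i++) {
                /* Each chunk fits the swiotlb segment limit; the last
                 * chunk may be over-allocated for simplicity. */
                obj->cpu[i] = dma_alloc_coherent(dev, obj->chunk_size,
                                                 &obj->dma[i], GFP_KERNEL);
                if (!obj->cpu[i])
                        return -ENOMEM; /* caller unwinds */
        }
        return 0;
}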

Another alternative, if the protection mechanism allows, is to hook into
the set_memory_(de,en)crypted() APIs to dynamically share buffers as
they are allocated instead of using a static pool. This should mostly
work as-is - I think the only impediment is pgprot_decrypted, which
would need some kind of hook in vmap() to detect it and make the
corresponding hypervisor call.

Robin.
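
For reference, that dynamic-sharing alternative might look roughly like
the sketch below, assuming the platform implements
set_memory_decrypted()/set_memory_encrypted(). Sticking to the linear
map (page_address()) sidesteps the vmap()/pgprot_decrypted wrinkle
mentioned above:

#include <linux/gfp.h>
#include <linux/mm.h>
#include <linux/set_memory.h>

/* Allocate ordinary pages, then ask the architecture/hypervisor to mark
 * them shared (decrypted) before the device ever sees them. */
static void *alloc_shared_pages(unsigned int order)
{
        struct page *page = alloc_pages(GFP_KERNEL | __GFP_ZERO, order);
        void *vaddr;

        if (!page)
                return NULL;

        vaddr = page_address(page);
        if (set_memory_decrypted((unsigned long)vaddr, 1 << order)) {
                /* Real code would need to know whether the conversion
                 * partially succeeded before reusing these pages. */
                __free_pages(page, order);
                return NULL;
        }
        return vaddr;
}

static void free_shared_pages(void *vaddr, unsigned int order)
{
        /* Flip the pages back to private before returning them to the
         * page allocator. */
        set_memory_encrypted((unsigned long)vaddr, 1 << order);
        free_pages((unsigned long)vaddr, order);
}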

