Re: [virtio-dev] Memory sharing device

Hi all,

I'm Frank; I've been using Roman's goldfish address space driver for Vulkan host visible memory in the emulator. Some more in-depth replies are inline.

On Tue, Feb 5, 2019 at 2:04 AM Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:

* Roman Kiryanov (rkir@google.com) wrote:
> Hi Gerd,
>
> > virtio-gpu specifically needs that to support vulkan and opengl
> > extensions for coherent buffers, which must be allocated by the host gpu
> > driver.  It's WIP still.
>

Hi Roman,

> the proposed spec says:
>
> +Shared memory regions MUST NOT be used to control the operation
> +of the device, nor to stream data; those should still be performed
> +using virtqueues.

Yes, I put that in.

> Is there a strong reason to prohibit using memory regions for control purposes?
> Our long term goal is to have as few kernel drivers as possible and to move
> "drivers" into userspace. If we go with the virtqueues, is there
> a general-purpose
> device/driver to talk between our host and guest to support custom hardware
> (with own blobs)? Could you please advise if we can use something else to
> achieve this goal?

My reason for that paragraph was to try and think about what should
still be in the virtqueues; after all a device that *just* shares a
block of memory and does everything in the block of memory itself isn't
really a virtio device - it's the standardised queue structure that
makes it a virtio device.
However, I'd be happy to accept the 'MUST NOT' might be a bit strong for
some cases where there's stuff that makes sense in the queues and
stuff that makes sense differently.

Currently, we drive the gl/vk host coherent memory with a host memory sharing device and a meta pipe device used in tandem. The pipe device, goldfish_pipe, sends control messages (it is also currently used to send the API call parameters themselves, and to drive other devices like sensors and camera; it's a bit of a catch-all). The host memory sharing device shares memory from the host and tells the guest which physical addresses are to be sent with the glMapBufferRange/vkMapMemory calls over the pipe to the host.

In the interest of having fewer custom kernel drivers for the emulator, we were thinking of two major approaches to upstreaming the control message / meta pipe part:

1. Come up with a new virtio driver that captures what goldfish_pipe does. It would have a virtqueue, and it would be something like a virtio driver for drivers defined in userspace that interacts closely with a host memory sharing driver (virtio-userspace?). It would be used with the host memory sharing driver not just to share coherent mappings, but also to deliver the API call parameters. It'd have a single ioctl that pushes a message into the virtqueue, notifying the host a) what kind of userspace driver it is and b) how much data to send/receive.

   On the host side, the choice of which virtual device code to run for a given control message would be made by a plugin DLL loaded into QEMU. So when we add new functionality, we would at most need to increment a version number sent in an initial control message, or change an enumeration in a handshake at the beginning; no changes would have to be made to the guest kernel or QEMU itself.

   This is useful for standardizing the Android Emulator drivers in the short term, but in the long term it could be useful for quickly specifying new drivers/devices in situations where the developer has some control over both the guest and host bits. We'd also use this for:

   - Media codecs: the guest is given a pointer to host codec input/output buffers, downloads compressed data to the input buffer, and pings the host via ioctl. The host then asynchronously decodes and populates the codec output buffer.
   - One-off extension functionality for Vulkan, such as VK_KHR_external_memory_fd/win32. Suppose we want to simulate an OPAQUE_FD Vulkan external memory in the guest, but we are running on a win32 host (this will be an important part of our use case). Define a new driver type in userspace, say enum VulkanOpaqueFdWrapper = 55, then open virtio-userspace and run ioctls to define that fd as that kind of driver. The host side would then associate the filp with a host-side win32 Vulkan handle. This can be a modular way to handle further functionality in Vulkan as it comes up, without requiring kernel / QEMU changes.

2. Add a raw ioctl for the above control messages to the proposed host memory sharing driver, and make those control messages part of the host memory sharing driver's virtqueue.
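
To make the control message idea a bit more concrete, here is a rough sketch of what such a message might look like on the wire. Nothing here is a proposed ABI: the struct, field names, the `vus_` prefix and the 24-byte layout are all made up for illustration; only the example driver type value 55 comes from the text above. It assumes little-endian fields, as virtio uses elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical driver-type IDs; only the value 55 appears in this thread. */
enum vus_driver_type {
    VUS_DRIVER_VULKAN_OPAQUE_FD_WRAPPER = 55,
};

/* Hypothetical control message: which userspace driver, and how much data. */
struct vus_ctrl_msg {
    uint32_t version;     /* bumped when new functionality is added */
    uint32_t driver_type; /* one of enum vus_driver_type */
    uint64_t send_size;   /* bytes the guest will send to the host */
    uint64_t recv_size;   /* bytes the guest expects back */
};

#define VUS_CTRL_MSG_WIRE_SIZE 24

/* Store the low nbytes of v at dst, least significant byte first. */
static void put_le(uint8_t *dst, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++)
        dst[i] = (uint8_t)(v >> (8 * i));
}

/* Load an nbytes-wide little-endian integer from src. */
static uint64_t get_le(const uint8_t *src, size_t nbytes)
{
    uint64_t v = 0;
    for (size_t i = 0; i < nbytes; i++)
        v |= (uint64_t)src[i] << (8 * i);
    return v;
}

/* Guest side: serialize the message before pushing it into the virtqueue. */
void vus_ctrl_msg_encode(const struct vus_ctrl_msg *m,
                         uint8_t wire[VUS_CTRL_MSG_WIRE_SIZE])
{
    put_le(wire + 0, m->version, 4);
    put_le(wire + 4, m->driver_type, 4);
    put_le(wire + 8, m->send_size, 8);
    put_le(wire + 16, m->recv_size, 8);
}

/* Host side: parse the message so a plugin can be picked by driver_type. */
void vus_ctrl_msg_decode(const uint8_t wire[VUS_CTRL_MSG_WIRE_SIZE],
                         struct vus_ctrl_msg *m)
{
    m->version = (uint32_t)get_le(wire + 0, 4);
    m->driver_type = (uint32_t)get_le(wire + 4, 4);
    m->send_size = get_le(wire + 8, 8);
    m->recv_size = get_le(wire + 16, 8);
}
```

The point of keeping the message this small is that the guest kernel only has to forward it; everything that gives `driver_type` a meaning would live in guest userspace and in the host-side plugin.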

I heard somewhere that having this kind of thing might run up against virtio design philosophies of having fewer 'generic' pipes; however, it could be valuable to have a generic way of defining driver/device functionality that is configurable without needing to change guest kernels / qemu directly.

> I saw there were registers added, could you please elaborate how new address
> regions are added and associated with the host memory (and backwards)?

In virtio-fs we have two separate stages:
  a) A shared arena is set up (and that's what the spec Stefan pointed to is about) -
     it's statically allocated at device creation and corresponds to a chunk
     of guest physical address space

This is quite like what we're doing for goldfish address space and Vulkan host visible currently.

Our address space device reserves a fixed region (16 GB) in guest physical address space on device realization.
At the level of Vulkan, on Vulkan device creation, we map a sizable amount of host visible memory on the host, and then use the address space device to expose it to the guest. It then occupies some offset into the address space device's PCI resource.
At the level of the guest Vulkan user, we satisfy host visible VkDeviceMemory allocations by faking them: we create guest-only handles, suballocate into that initial host visible memory, and then edit the memory offset/size parameters to correspond to the actual memory before the API calls get to the host driver.
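
As a rough illustration of the suballocation and offset rewriting described above (not our actual implementation), the following sketch uses a simple bump allocator. The struct and function names are hypothetical, and a real allocator would also need to free and reuse ranges.

```c
#include <stdbool.h>
#include <stdint.h>

/* The one large host visible allocation made at Vulkan device creation. */
struct hv_block {
    uint64_t size;      /* total bytes exposed to the guest */
    uint64_t next_free; /* bump pointer: first unused byte */
};

/* A guest-only "fake" VkDeviceMemory carved out of the block. */
struct guest_mem {
    uint64_t block_offset; /* where it starts inside the block */
    uint64_t size;
};

/* Carve size bytes out of the block; align must be a power of two. */
bool hv_suballoc(struct hv_block *b, uint64_t size, uint64_t align,
                 struct guest_mem *out)
{
    if (size == 0 || align == 0 || (align & (align - 1)) != 0)
        return false;
    uint64_t start = (b->next_free + align - 1) & ~(align - 1);
    if (start > b->size || size > b->size - start)
        return false; /* block exhausted */
    out->block_offset = start;
    out->size = size;
    b->next_free = start + size;
    return true;
}

/*
 * Rewrite an (offset, size) pair that the guest expressed relative to its
 * fake handle into an offset relative to the real host allocation, as is
 * done before the call reaches the host driver.
 */
bool hv_translate(const struct guest_mem *m, uint64_t offset, uint64_t size,
                  uint64_t *real_offset)
{
    if (offset > m->size || size > m->size - offset)
        return false; /* range escapes the guest allocation */
    *real_offset = m->block_offset + offset;
    return true;
}
```

For example, if a second allocation lands at block offset 256, a guest map request at offset 16 within it is rewritten to offset 272 in the real memory.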

  b) During operation the guest kernel asks for files to be mapped into
     part of that arena dynamically, using commands sent over the queue
     - our queue carries FUSE commands, and we've added two new FUSE
     commands to perform the map/unmap.  They talk in terms of offsets
     within the shared arena, rather than GPAs.

Yes, we'll most likely be operating in a similar manner for OpenGL and Vulkan.

So I'd tried to start by doing the spec for (a).

> We allocate a region from the guest first and pass its offset to the
> host to plug
> real RAM into it and then we mmap this offset:
>
> https://photos.app.goo.gl/NJvPBvvFS3S3n9mn6

How do you transmit the glMapBufferRange command from QEMU driver to
host?

This is done through an ioctl in the address space driver together with meta pipe commands:

1. Using the address space driver, run an ioctl to "Allocate" a region, which reserves some space. An offset into the region is returned.
2. Using the meta pipe driver, tell the host about the offset and the API call parameters of glMapBufferRange. On the host, glMapBufferRange is run for real, and the resulting host pointer is passed to KVM_SET_USER_MEMORY_REGION, with the guest physical address set to the PCI resource start + that offset.
3. mmap the region with the supplied offset in the guest.
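
The host-side address arithmetic in the second step can be sketched roughly like this. The descriptor struct mirrors the fields of struct kvm_userspace_memory_region from <linux/kvm.h> but is declared locally so the sketch stands alone; the function name, the bounds checks and the 4 KiB page size are assumptions, not what the emulator necessarily does.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED 4096ULL /* assumption: 4 KiB pages */

/* Local mirror of the fields of struct kvm_userspace_memory_region. */
struct mem_region_desc {
    uint32_t slot;
    uint32_t flags;
    uint64_t guest_phys_addr;
    uint64_t memory_size;
    uint64_t userspace_addr;
};

static bool page_aligned(uint64_t v)
{
    return (v & (PAGE_SIZE_ASSUMED - 1)) == 0;
}

/*
 * Describe the mapping of a host pointer (e.g. the result of
 * glMapBufferRange) at "PCI resource start + offset" in the guest.
 * The caller would hand the result to the KVM_SET_USER_MEMORY_REGION ioctl.
 */
bool describe_mapping(uint64_t pci_resource_start, uint64_t pci_resource_size,
                      uint64_t offset, uint64_t size, uint64_t host_addr,
                      uint32_t slot, struct mem_region_desc *out)
{
    /* The range must stay inside the region reserved at realization. */
    if (size == 0 || offset > pci_resource_size ||
        size > pci_resource_size - offset)
        return false;
    /* KVM expects page-aligned addresses and sizes. */
    if (!page_aligned(pci_resource_start + offset) || !page_aligned(size) ||
        !page_aligned(host_addr))
        return false;
    out->slot = slot;
    out->flags = 0;
    out->guest_phys_addr = pci_resource_start + offset;
    out->memory_size = size;
    out->userspace_addr = host_addr;
    return true;
}
```

The guest never sees the host pointer; it only knows the offset it got back from "Allocate", which it later passes to mmap.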

Dave

> Thank you.
>
> Regards,
> Roman.
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
