[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
On 17.04.23 16:43, Cornelia Huck wrote:
On Mon, Apr 17 2023, Alexander Gordeev <alexander.gordeev@opensynergy.com> wrote:Hi Alexandre, Thanks for you letter! Sorry, it took me some time to write an answer. First of all I'd like to describe my perspective a little bit because it seems, that in many cases we (and other people writing their feedbacks) simply have very different priorities and background.Thank you for describing the environment you want to use this in, this helps to understand the different use cases.
Yeah, I hope too. I should have done that earlier. I think Dmitry described our use-case, but it was several years ago, so nobody remembers of course. We should reiterate all the use-cases once in a while.
OpenSynergy, the company that I work for, develops a proprietary hypervisor called COQOS mainly for automotive and aerospace domains. We have our proprietary device implementations, but overall our goal is to bring open standards into these quite closed domains and we're betting big on virtio. The idea is to run safety-critical functions like cockpit controller alongside with multimedia stuff in different VMs on the same physical board. Right now they have it on separate physical devices. So they already have maximum isolation. And we're trying to make this equally safe on a single board. The benefit is the reduced costs and some additional features. Of course, we also need features here, but at the same time security and ease of certification are among the top of our priorities. Nobody wants cars or planes to have security problems, right? Also nobody really needs DVB and even more exotic devices in cars and planes AFAIK. For the above mentioned reasons our COQOS hypervisor is running on bare metal. Also memory management for the guests is mostly static. It is possible to make a shared memory region between a device and a driver managed by device in advance. But definitely no mapping of random host pages on the fly is supported. AFAIU crosvm is about making Chrome OS more secure by putting every app in its own virtualized environment, right? Both the host and guest are linux. In this case I totally understand why V4L2 UAPI pass-through feels like a right move. I guess, you'd like to make the switch to virtualized apps as seemless as possible for your users. If they can't use their DVBs anymore, they complain. And adding the virtualization makes the whole thing more secure anyway. So I understand the desire to have the range of supported devices as broad as possible. It is also understandable that priorities are different with desktop virtualization. Also I'm not trying to diminish the great work, that you have done. It is just that from my perspective this looks like a step in the wrong direction because of the mentioned concerns. So I'm going to continue being a skeptic here, sorry. Of course, I don't expect that you continue working on the old approach now as you have put that many efforts into the V4L2 UAPI pass-through. So I think it is best to do the evolutionary changes in scope of virtio video device specification, and create a new device specification (virtio-v4l2 ?) for the revolutionary changes. Then I'd be glad to continue the virtio-video development. In fact I already started making draft v7 of the spec according to the comments. I hope it will be ready for review soon. I hope this approach will also help fix issues with virtio-video spec and driver development misalignment as well as V4L2 compliance issues with the driver. I believe the problems were caused partly by poor communication between us and by misalignment of our development cycles, not by the driver complexity. So in my opinion it is OK to have different specs with overlapping functionality for some time. My only concern is if this would be accepted by the community and the committee. How the things usually go here: preferring features and tolerating possible security issues or the other way around? Also how acceptable is having linux-specific protocols at all?My main question is: What would be something that we can merge as a spec, that would either cover the different use cases already, or that could be easily extended to cover the use cases it does not handle initially? For example, can some of the features that would be useful in crosvm be tucked behind some feature bit(s), so that the more restricted COQOS hypervisor would simply not offer them? (Two feature bits covering two different mechanisms, like the current approach and the v4l2 approach, would also be good, as long as there's enough common ground between the two.) If a staged approach (adding features controled by feature bits) would be possible, that would be my preferred way to do it.
Hmm, I see several ways how we can use the feature flags: 1. Basically making two feature flags: one for the current video spec and one for the V4L2 UAPI pass through. Kind of the same as having two different specs, but within one device. Not sure which way is better. Probably having two separate devices would be easier to review and merge. 2. Finding a subset of V4L2, that closely matches the current draft, and restrict everything else. A perfectly reasoned answer for this case will require a lot of work going through all the V4L2 structures I think. And even if we have a concrete plan, it has to be implemented first. I doubt it is possible. Based on the things, that I already know, this is going to be a compromise on security anyway, so we're not happy about that. More on that below. 3. Stop trying to simply pass the V4L2 UAPI through and focus on making the virtio video spec as close to the V4L2 UAPI as possible, but with the appropriate security model. So that the video device can be extended with a feature flag to something very close to full V4L2 UAPI. A lot of work as well, I think. And this won't allow us to simply link the V4L2 UAPI in the spec and therefore reduce its size, which is Alexandre's current goal. So Alexandre and his team are not happy this way probably. From the security point of view these are our goals from most to less important AFAIU: 1. Make the device secure. If a device is compromised, the whole physical machine is at risk. Complexity is the enemy here. It helps a lot to make the device as straightforward and easy to implement as possible. Therefore it is good to make the spec device/function-centric from this PoV. 2. Ensure, that drivers are also secure at least from user-space side. Maybe from device side too. 3. Implementing secure playback and making sure media doesn't leak. For this case it is nice to have these object UUIDs as buffers. Please correct me if there's something wrong here. When we start looking from this perspective even things like naming video buffer queues "output" for device input and "capture" for device output are problematic. In my experience this naming scheme takes some time and for sure several coding mistakes to get used to. Unfortunately this can't be turned off with some feature flags. In contrast virtio video v6 names these queues "input" and "output". This is perfectly fine if we look from the device side. It is understandable, that Alexandre's list of differences between V4L2 UAPI and the current state of virtio video doesn't include these things. But we have to count them, I think. That's why it takes me so long to make the list. :) So throwing away this simplicity is still going to be a compromise from the security perspective, that we're not happy about. This is mostly because V4L2 UAPI brings a hard dependency on V4L2, its security model, legacy, use-cases, developers, etc. It can be changed over time, but this is a long process because this means changing the Linux UAPI. Also this means, that nothing can be removed from it, only added (if the V4L2 community agrees). For example, we can't simply add a new way of sharing buffers for potential new use-cases once we switch to V4L2 UAPI AFAIU. The V4L2 community can simply reject changes because this is a UAPI after all. We kind of have a weak dependency already, because the only driver implementation is based on V4L2 and we'd like to keep the spec as close to V4L2 as possible, but it is not the same thing. So at the moment it looks like the V4L2 UAPI proposal is not super flexible. Alexandre said, that we can simply not implement some of the ioctls. Well, this definitely doesn't cover all the complexity like the structures and other subtle details. Also adding the feature flags would probably defeat the stated purpose of switching to V4L2 UAPI anyway: the simplicity of the spec and of the V4L2 driver. So I have a lot of doubts about the feasibility of adding feature flags. If Alexandre and his team want the V4L2 UAPI as is, then looks like it is best to simply have two specs: 1. virtio-video for those interested in building from ground up with the security model appropriate for virtualization in mind. This is going to take time and not going to reach feature parity with V4L2 ever I think. I mean some old devices might never get support this way. 2. virtio-v4l2 as a shortcut for those interested in having feature parity with V4L2 fast. Like a compatibility layer. Probably this is going to be used in linux host + linux guest use-cases only. Maybe it gets obsoleted by the first spec in several years for most modern use-cases. Or maybe have these two cases within a single device spec as I wrote above. This makes a lot of sense to me. If V4L2 UAPI pass through is in fact only needed for compatibility, then this way we can avoid a lot of work going through all of the V4L2 and trying to find different subsets or trying to construct something, that is close to V4L2 UAPI, but doesn't compromise on the security. I'm not really interested in doing all this work because we're already more or less satisfied with the current state. We don't need feature parity with V4L2. On the other hand for Alexandre the feature-parity with V4L2 is clearly of higher priority, than all these subtle security model differences. In my opinion it also doesn't make sense to invest that much time in something, that looks like a compatibility layer. So it seems both of us are interested in avoiding all this extra work. Then I'd just prefer to have two different specs so that everyone can work according to their priorities.
Regarding the protocol: I think Linux-originating protocols (that can be implemented on non-Linux setups) are fine, Linux-only protocols probably not so much.
Thanks for the information. Well, it looks like the V4L2 UAPI could be implemented on any platform unless it needs a completely new way of memory management since all the V4L2_MEMORY_* constants are going to be used already AFAIU.
a. So V4L2 subsystem and the current virtio-video driver are already reducing the complexity. And this seems as the right place to do this, because the complexity is caused by the amount of V4L2 use cases and its legacy. If somebody wants to use virtio-video in a Windows guest, they would prefer a simpler API, right? I think this use-case is not purely abstract at all.The V4L2 subsystem is there to factorize code that can be shared between drivers and manage their internal state. Our target is the V4L2 UAPI, so a Windows driver needs not be concerned about these details - it does what it would have done with virtio-video, and just uses the V4L2 structures to communicate with the host instead of the virtio-video ones.It can also reuse the virtio-video structures. So I think despite the ability to reuse V4L2 structures, having to implement a linux-specific interface would still be a bigger pain.Hm. Do the v4l2 structures drag in too many adjacent things that need to be implemented? Can we match the video-video structures from the current proposal with some v4l2 structures and extract a common wrapper for those that match, with a feature-bit controlled backend? It would be fine if any of those backends supported a slightly different subset of the common parts, as long as the parts implemented by both would be enough to implement a working device. (Mostly thinking out loud here.)
I don't think this is realistic unfortunately. On per ioctl level it is possible to disable some functionality probably, but the V4L2 structures are set in stone. We can only extend them.
The guest driver that I wrote is, I think, a good example of the complexity you can expect in terms of guest driver size (as it is pretty functional already with its 1000 and some LoCs). For the UAPI complexity, the host device basically unpacks the information it needs and rebuilds the V4L2 structures before calling into the host device, and I don't see this process as more complex that the unpacking of virtio-video structs which we also did in crosvm.Unfortunately our hypervisor doesn't support mapping random host pages in the guest. Static allocations of shared memory regions are possible. But then we have to tell V4L2 to allocate buffers there. Then we'll need a region per virtual device. This is just very tedious and inflexible. That's why we're mainly interested in having the guest pages sharing in the virtio video spec.This really sounds like you'll want a different approach -- two mechanisms covered by two feature bits might indeed be the way to go.
Well, basically this is the way we have it now. I'm not sure what is Alexandre's plan with the V4L2 UAPI approach. And if this is going to be solved, the solution already doesn't look future-proof anyway unfortunately.
I hope I have somehow addressed your points. The main point here is to discuss whether the V4L2 UAPI is a suitable transport for guest/host accelerated codec work, regardless of what the guest or host ultimately uses as UAPI. The goal of the PoC is to demonstrate that this is a viable solution. This PoC is largely simplified by the fact that V4L2 is used all along the way, but this is irrelevant - yes, actual devices will likely talk to other APIs and maintain more state, like a virtio-video device would do. What I want to demonstrate is that we can send encoding work and receive a valid stream, and that it is not costly, and only marginally more complex than our virtio-video spec attempts. ... and we can support cameras too, but that's just a convenient side-effect, not the ultimate solution to the camera virtualization problem (that's for the camera folks to decide).Thanks for your answer!Thanks everyone -- do you think the "two feature bits to cover different approaches, but using a common infrastructure" idea could work? If yes, I think that's the direction we should take. If we can implement this with just one feature bit, that might also be a good route to extend it later, but I'm not familiar enough with the whole infrastructure to make any judgement here.
Thanks for your suggestions. Hopefully we end up with a good solution. -- Alexander Gordeev Senior Software Engineer OpenSynergy GmbH Rotherstr. 20, 10245 Berlin Phone: +49 30 60 98 54 0 - 88 Fax: +49 (30) 60 98 54 0 - 99 EMail: alexander.gordeev@opensynergy.com www.opensynergy.com Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B GeschÃftsfÃhrer/Managing Director: RÃgis Adjamah Please mind our privacy notice<https://www.opensynergy.com/datenschutzerklaerung/privacy-notice-for-business-partners-pursuant-to-article-13-of-the-general-data-protection-regulation-gdpr/> pursuant to Art. 13 GDPR. // Unsere Hinweise zum Datenschutz gem. Art. 13 DSGVO finden Sie hier.<https://www.opensynergy.com/de/datenschutzerklaerung/datenschutzhinweise-fuer-geschaeftspartner-gem-art-13-dsgvo/>
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]