[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]
Subject: Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
On Thu, Apr 27, 2023 at 12:52âAM Alexander Gordeev <alexander.gordeev@opensynergy.com> wrote: > > On 21.04.23 06:02, Alexandre Courbot wrote: > > On Wed, Apr 19, 2023 at 4:39âPM Alexander Gordeev > > <alexander.gordeev@opensynergy.com> wrote: > >> > >> On 17.04.23 16:43, Cornelia Huck wrote: > >>> On Mon, Apr 17 2023, Alexander Gordeev <alexander.gordeev@opensynergy.com> wrote: > >>> > >>>> OpenSynergy, the company that I work for, develops a proprietary > >>>> hypervisor called COQOS mainly for automotive and aerospace domains. We > >>>> have our proprietary device implementations, but overall our goal is to > >>>> bring open standards into these quite closed domains and we're betting > >>>> big on virtio. The idea is to run safety-critical functions like cockpit > >>>> controller alongside with multimedia stuff in different VMs on the same > >>>> physical board. Right now they have it on separate physical devices. So > >>>> they already have maximum isolation. And we're trying to make this > >>>> equally safe on a single board. The benefit is the reduced costs and > >>>> some additional features. Of course, we also need features here, but at > >>>> the same time security and ease of certification are among the top of > >>>> our priorities. Nobody wants cars or planes to have security problems, > >>>> right? Also nobody really needs DVB and even more exotic devices in cars > >>>> and planes AFAIK. > >>>> > >>>> For the above mentioned reasons our COQOS hypervisor is running on bare > >>>> metal. Also memory management for the guests is mostly static. It is > >>>> possible to make a shared memory region between a device and a driver > >>>> managed by device in advance. But definitely no mapping of random host > >>>> pages on the fly is supported. > >>>> > >>>> AFAIU crosvm is about making Chrome OS more secure by putting every app > >>>> in its own virtualized environment, right? Both the host and guest are > >>>> linux. In this case I totally understand why V4L2 UAPI pass-through > >>>> feels like a right move. I guess, you'd like to make the switch to > >>>> virtualized apps as seemless as possible for your users. If they can't > >>>> use their DVBs anymore, they complain. And adding the virtualization > >>>> makes the whole thing more secure anyway. So I understand the desire to > >>>> have the range of supported devices as broad as possible. It is also > >>>> understandable that priorities are different with desktop > >>>> virtualization. Also I'm not trying to diminish the great work, that you > >>>> have done. It is just that from my perspective this looks like a step in > >>>> the wrong direction because of the mentioned concerns. So I'm going to > >>>> continue being a skeptic here, sorry. > >>>> > >>>> Of course, I don't expect that you continue working on the old approach > >>>> now as you have put that many efforts into the V4L2 UAPI pass-through. > >>>> So I think it is best to do the evolutionary changes in scope of virtio > >>>> video device specification, and create a new device specification > >>>> (virtio-v4l2 ?) for the revolutionary changes. Then I'd be glad to > >>>> continue the virtio-video development. In fact I already started making > >>>> draft v7 of the spec according to the comments. I hope it will be ready > >>>> for review soon. > >>>> > >>>> I hope this approach will also help fix issues with virtio-video spec > >>>> and driver development misalignment as well as V4L2 compliance issues > >>>> with the driver. I believe the problems were caused partly by poor > >>>> communication between us and by misalignment of our development cycles, > >>>> not by the driver complexity. > >>>> > >>>> So in my opinion it is OK to have different specs with overlapping > >>>> functionality for some time. My only concern is if this would be > >>>> accepted by the community and the committee. How the things usually go > >>>> here: preferring features and tolerating possible security issues or the > >>>> other way around? Also how acceptable is having linux-specific protocols > >>>> at all? > >>> My main question is: What would be something that we can merge as a > >>> spec, that would either cover the different use cases already, or that > >>> could be easily extended to cover the use cases it does not handle > >>> initially? > >>> > >>> For example, can some of the features that would be useful in crosvm be > >>> tucked behind some feature bit(s), so that the more restricted COQOS > >>> hypervisor would simply not offer them? (Two feature bits covering two > >>> different mechanisms, like the current approach and the v4l2 approach, > >>> would also be good, as long as there's enough common ground between the > >>> two.) > >>> > >>> If a staged approach (adding features controled by feature bits) would > >>> be possible, that would be my preferred way to do it. > >> > >> Hmm, I see several ways how we can use the feature flags: > >> 1. Basically making two feature flags: one for the current video spec > >> and one for the V4L2 UAPI pass through. Kind of the same as having two > >> different specs, but within one device. Not sure which way is better. > >> Probably having two separate devices would be easier to review and merge. > > > > Having two different devices with their own IDs would indeed be less > > confusing than using feature bits. > > > > That being said, the whole point of proposing virtio-v4l2 is to end up > > with *less* specification, not more. Having two concurrent and largely > > overlapping approaches will result in fragmentation and duplicated > > work, so my suggestion would be to decide on one or the other and > > stick to it. > > Hmm, Enrico pointed out, that having virtio-v4l2 would also be good > because of much better compatibility with Android right now. I don't > think the specification length should be our ultimate goal. Cornelia > said, that her ultimate goal is to have a spec everyone is happy with, > regardless on how we arrive there. Well, I can only say, that I also > think this should be our goal. Try to put yourself into the shoes of someone who needs to write a new video device using virtio. Oh, there are two ways to do it. This guest OS only supports virtio-video. But this guest OS only supports virtio-v4l2. Great, now you need to support two interfaces with your device, or write two different devices. > > >> 2. Finding a subset of V4L2, that closely matches the current draft, and > >> restrict everything else. A perfectly reasoned answer for this case will > >> require a lot of work going through all the V4L2 structures I think. And > > > > It's pretty trivial, on the contrary. For a decoder device one would > > just need to restrict to the structs and operations mentioned here: > > https://www.kernel.org/doc/html/v5.18/userspace-api/media/v4l/dev-decoder.html > > I don't agree. But we already talk about this in other threads. > > >> even if we have a concrete plan, it has to be implemented first. I doubt > >> it is possible. Based on the things, that I already know, this is going > >> to be a compromise on security anyway, so we're not happy about that. > >> More on that below. > >> 3. Stop trying to simply pass the V4L2 UAPI through and focus on making > >> the virtio video spec as close to the V4L2 UAPI as possible, but with > >> the appropriate security model. So that the video device can be extended > >> with a feature flag to something very close to full V4L2 UAPI. A lot of > >> work as well, I think. And this won't allow us to simply link the V4L2 > >> UAPI in the spec and therefore reduce its size, which is Alexandre's > >> current goal. So Alexandre and his team are not happy this way probably. > > > > That would indeed be reinventing the wheel and a completely pointless > > exercise IMHO. > > I agree. > > >> From the security point of view these are our goals from most to less > >> important AFAIU: > >> 1. Make the device secure. If a device is compromised, the whole > >> physical machine is at risk. Complexity is the enemy here. It helps a > >> lot to make the device as straightforward and easy to implement as > >> possible. Therefore it is good to make the spec device/function-centric > >> from this PoV. > > > > This seems to be the main point of contention we have right here. I do > > not believe that V4L2 introduces significant complexity to the video > > use-case, and certainly not to a degree that would justify writing a > > new spec just for that. > > Ok, then we should try to agree on a benchmark, I think. > > >> 2. Ensure, that drivers are also secure at least from user-space side. > >> Maybe from device side too. > > > > FWIW V4L2 has been in use for a looong time (including in Chromebooks > > that do secure playback) and I am not aware of fundamental security > > issues with it. > > Already being discussed separately... > > >> 3. Implementing secure playback and making sure media doesn't leak. For > >> this case it is nice to have these object UUIDs as buffers. > >> > >> Please correct me if there's something wrong here. > >> > >> When we start looking from this perspective even things like naming > >> video buffer queues "output" for device input and "capture" for device > >> output are problematic. In my experience this naming scheme takes some > >> time and for sure several coding mistakes to get used to. Unfortunately > >> this can't be turned off with some feature flags. In contrast virtio > >> video v6 names these queues "input" and "output". This is perfectly fine > >> if we look from the device side. It is understandable, that Alexandre's > >> list of differences between V4L2 UAPI and the current state of virtio > >> video doesn't include these things. But we have to count them, I think. > >> That's why it takes me so long to make the list. :) So throwing away > >> this simplicity is still going to be a compromise from the security > >> perspective, that we're not happy about. > >> > >> This is mostly because V4L2 UAPI brings a hard dependency on V4L2, its > >> security model, legacy, use-cases, developers, etc. It can be changed > >> over time, but this is a long process because this means changing the > >> Linux UAPI. Also this means, that nothing can be removed from it, only > >> added (if the V4L2 community agrees). For example, we can't simply add a > >> new way of sharing buffers for potential new use-cases once we switch to > >> V4L2 UAPI AFAIU. > > > > You certainly can, it would just need to be in the virtio > > specification. And again that's a very hypothetical case. > > Ok, maybe it is hypothetical. We won't know until it happens. I still > don't like it. Overriding things in the virtio specification is not > particularly nice. It is much easier to understand when things are kept > together, not as patches. > > >> The V4L2 community can simply reject changes because > >> this is a UAPI after all. We kind of have a weak dependency already, > >> because the only driver implementation is based on V4L2 and we'd like to > >> keep the spec as close to V4L2 as possible, but it is not the same > >> thing. So at the moment it looks like the V4L2 UAPI proposal is not > >> super flexible. Alexandre said, that we can simply not implement some of > >> the ioctls. Well, this definitely doesn't cover all the complexity like > >> the structures and other subtle details. > > > > Mmm? Where did I say that? That sounds like a misunderstanding. > > Here is the quote from your email on March 17: > > > V4L2 has a larger UAPI surface because it manages more kinds of > > devices, but drivers only need to implement the ioctls they need. For > > the rest, they just return -ENOTTY, and evil actors are hopefully kept > > at bay. > > > >> Also adding the feature flags would probably defeat the stated purpose > >> of switching to V4L2 UAPI anyway: the simplicity of the spec and of the > >> V4L2 driver. > >> > >> So I have a lot of doubts about the feasibility of adding feature flags. > >> If Alexandre and his team want the V4L2 UAPI as is, then looks like it > >> is best to simply have two specs: > >> 1. virtio-video for those interested in building from ground up with the > >> security model appropriate for virtualization in mind. This is going to > >> take time and not going to reach feature parity with V4L2 ever I think. > >> I mean some old devices might never get support this way. > >> 2. virtio-v4l2 as a shortcut for those interested in having feature > >> parity with V4L2 fast. Like a compatibility layer. Probably this is > >> going to be used in linux host + linux guest use-cases only. Maybe it > >> gets obsoleted by the first spec in several years for most modern use-cases. > >> > >> Or maybe have these two cases within a single device spec as I wrote above. > >> > >> This makes a lot of sense to me. If V4L2 UAPI pass through is in fact > >> only needed for compatibility, then this way we can avoid a lot of work > >> going through all of the V4L2 and trying to find different subsets or > >> trying to construct something, that is close to V4L2 UAPI, but doesn't > >> compromise on the security. I'm not really interested in doing all this > >> work because we're already more or less satisfied with the current > >> state. We don't need feature parity with V4L2. On the other hand for > >> Alexandre the feature-parity with V4L2 is clearly of higher priority, > >> than all these subtle security model differences. In my opinion it also > >> doesn't make sense to invest that much time in something, that looks > >> like a compatibility layer. So it seems both of us are interested in > >> avoiding all this extra work. Then I'd just prefer to have two different > >> specs so that everyone can work according to their priorities. > >> > >> > >>> Regarding the protocol: I think Linux-originating protocols (that can be > >>> implemented on non-Linux setups) are fine, Linux-only protocols probably > >>> not so much. > >> > >> Thanks for the information. Well, it looks like the V4L2 UAPI could be > >> implemented on any platform unless it needs a completely new way of > >> memory management since all the V4L2_MEMORY_* constants are going to be > >> used already AFAIU. > >> > >> > >>>>>> a. So V4L2 subsystem and the current virtio-video driver are already > >>>>>> reducing the complexity. And this seems as the right place to do this, > >>>>>> because the complexity is caused by the amount of V4L2 use cases and its > >>>>>> legacy. If somebody wants to use virtio-video in a Windows guest, they > >>>>>> would prefer a simpler API, right? I think this use-case is not purely > >>>>>> abstract at all. > >>>>> The V4L2 subsystem is there to factorize code that can be shared > >>>>> between drivers and manage their internal state. Our target is the > >>>>> V4L2 UAPI, so a Windows driver needs not be concerned about these > >>>>> details - it does what it would have done with virtio-video, and just > >>>>> uses the V4L2 structures to communicate with the host instead of the > >>>>> virtio-video ones. > >>>> It can also reuse the virtio-video structures. So I think despite the > >>>> ability to reuse V4L2 structures, having to implement a linux-specific > >>>> interface would still be a bigger pain. > >>> Hm. Do the v4l2 structures drag in too many adjacent things that need to > >>> be implemented? Can we match the video-video structures from the current > >>> proposal with some v4l2 structures and extract a common wrapper for > >>> those that match, with a feature-bit controlled backend? It would be > >>> fine if any of those backends supported a slightly different subset of > >>> the common parts, as long as the parts implemented by both would be > >>> enough to implement a working device. (Mostly thinking out loud here.) > >> > >> I don't think this is realistic unfortunately. On per ioctl level it is > >> possible to disable some functionality probably, but the V4L2 structures > >> are set in stone. We can only extend them. > >> > >> > >>>>> The guest driver that I wrote is, I think, a good example of the > >>>>> complexity you can expect in terms of guest driver size (as it is > >>>>> pretty functional already with its 1000 and some LoCs). For the UAPI > >>>>> complexity, the host device basically unpacks the information it needs > >>>>> and rebuilds the V4L2 structures before calling into the host device, > >>>>> and I don't see this process as more complex that the unpacking of > >>>>> virtio-video structs which we also did in crosvm. > >>>> Unfortunately our hypervisor doesn't support mapping random host pages > >>>> in the guest. Static allocations of shared memory regions are possible. > >>>> But then we have to tell V4L2 to allocate buffers there. Then we'll need > >>>> a region per virtual device. This is just very tedious and inflexible. > >>>> That's why we're mainly interested in having the guest pages sharing in > >>>> the virtio video spec. > >>> This really sounds like you'll want a different approach -- two > >>> mechanisms covered by two feature bits might indeed be the way to go. > >> > >> Well, basically this is the way we have it now. I'm not sure what is > >> Alexandre's plan with the V4L2 UAPI approach. And if this is going to be > >> solved, the solution already doesn't look future-proof anyway unfortunately. > >> > >> > >>>>> I hope I have somehow addressed your points. The main point here is to > >>>>> discuss whether the V4L2 UAPI is a suitable transport for guest/host > >>>>> accelerated codec work, regardless of what the guest or host > >>>>> ultimately uses as UAPI. The goal of the PoC is to demonstrate that > >>>>> this is a viable solution. This PoC is largely simplified by the fact > >>>>> that V4L2 is used all along the way, but this is irrelevant - yes, > >>>>> actual devices will likely talk to other APIs and maintain more state, > >>>>> like a virtio-video device would do. What I want to demonstrate is > >>>>> that we can send encoding work and receive a valid stream, and that it > >>>>> is not costly, and only marginally more complex than our virtio-video > >>>>> spec attempts. > >>>>> > >>>>> ... and we can support cameras too, but that's just a convenient > >>>>> side-effect, not the ultimate solution to the camera virtualization > >>>>> problem (that's for the camera folks to decide). > >>>> Thanks for your answer! > >>> Thanks everyone -- do you think the "two feature bits to cover different > >>> approaches, but using a common infrastructure" idea could work? If yes, > >>> I think that's the direction we should take. If we can implement this > >>> with just one feature bit, that might also be a good route to extend it > >>> later, but I'm not familiar enough with the whole infrastructure to make > >>> any judgement here. > >> > >> Thanks for your suggestions. Hopefully we end up with a good solution. > > > > To summarize my position: > > > > * I am still not convinced that V4L2 is lacking from a security > > perspective. It would take just one valid example to change my mind > > (and no, the way the queues are named is not valid). And btw, if it > > really introduces security issues, then this makes it invalid for > > inclusion in virtio entirely, just not OpSy's hypervisor. > > Already being discussed separately... > > > * Having two overlapping specifications for video is overkill and will > > just fragment virtio (as tempting as it is, I won't link to XKCD). I > > strongly advise against that. > > I think they're not going to create more problems, than virtio-blk, > virtio-scsi and virtio-fs, for example. At least these devices work at different layers, that makes them more justifiable. virtio-video and virtio-v4l2 are just going to provide the same API for video devices, only with different structures and commands. > The decision can be made like this: > 1. You have a V4L2 device, you don't need any more processing, just want > it inside a Linux/Android VM => use virtio-v4l2. > 2. You don't have a V4L2 device, or your host is not Linux, or your > maybe your guest is not Linux/Android, or you want some extra processing > on the host (say you have a third-party proprietary library or whatever) > => use virtio-video. That would make sense if 2. could not be done just as easily by also using virtio-v4l2, which I believe it can be.
[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]