OASIS Mailing List ArchivesView the OASIS mailing list archive below
or browse/search using MarkMail.

 


Help: OASIS Mailing Lists Help | MarkMail Help

virtio-dev message

[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]


Subject: Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification


On Thu, Apr 27, 2023 at 11:20âPM Alexander Gordeev
<alexander.gordeev@opensynergy.com> wrote:
>
> On 26.04.23 07:52, Alexandre Courbot wrote:
> > On Mon, Apr 24, 2023 at 4:52âPM Alexander Gordeev
> > <alexander.gordeev@opensynergy.com> wrote:
> >>
> >> On 21.04.23 18:01, Alexander Gordeev wrote:
> >>
> >>> On 21.04.23 06:02, Alexandre Courbot wrote:
> >>>
> >>>> * I am still not convinced that V4L2 is lacking from a security
> >>>> perspective. It would take just one valid example to change my mind
> >>>> (and no, the way the queues are named is not valid). And btw, if it
> >>>> really introduces security issues, then this makes it invalid for
> >>>> inclusion in virtio entirely, just not OpSy's hypervisor.
> >>>
> >>>
> >>> I'd like to start with this and then answer everything else later.
> >>>
> >>> Let's compare VIRTIO_VIDEO_CMD_RESOURCE_QUEUE with
> >>> VIDIOC_QBUF+VIDIOC_DQBUF. Including the parameters, of course. First,
> >>> let's compare the word count to get a very rough estimate of complexity.
> >>> I counted 585 words for VIRTIO_VIDEO_CMD_RESOURCE_QUEUE, including the
> >>> parameters. VIDIOC_QBUF+VIDIOC_DQBUF are defined together and take 1206
> >>> words, they both use struct v4l2_buffer as a parameter. The struct takes
> >>> 2716 words to be described. So the whole thing takes 3922 words. This is
> >>> 6.7 times more, than VIRTIO_VIDEO_CMD_RESOURCE_QUEUE. If we check the
> >>> definitions of the structs, it is also very obvious, that V4L2 UAPI is
> >>> almost like an order of magnitude more complex.
> >>
> >>
> >> I think, it is best to add all the steps necessary to reproduce my calculations just in case.
> >>
> >> VIRTIO_VIDEO_CMD_RESOURCE_QUEUE is doing essentially the same thing as VIDIOC_QBUF+VIDIOC_DQBUF, so we're comparing apples to apples (if we don't forget to compare their parameters too).
> >>
> >> To get the word count for the VIRTIO_VIDEO_CMD_RESOURCE_QUEUE I opened the rendered PDF of video section only from the first email in this thread. Here is the link: https://drive.google.com/file/d/1Sm6LSqvKqQiwYmDE9BXZ0po3XTKnKYlD/view?usp=sharing . Then I scrolled to page 11 and copied everything related a text file. This is around two pages in the PDF. Then I removed page numbers from the copied text and used 'wc -w' to count words.
> >>
> >> To get the word count for VIDIOC_QBUF+VIDIOC_DQBUF I opened this link: https://docs.kernel.org/userspace-api/media/v4l/vidioc-qbuf.html . Then I selected all the text except table of contents and did followed the same procedure.
> >>
> >> To get the word count for struct v4l2_buffer and other types, that are referenced from it, I opened this link: https://docs.kernel.org/userspace-api/media/v4l/buffer.html#struct-v4l2-buffer . Then I selected all the text except the table of contents and the text above struct v4l2_buffer definition. The rest is the same.
> >>
> >> Also it's quite obvious if you look at them how much bigger struct v4l2_buffer (including the referenced types) is compared to struct virtio_video_resource_queue.
> >
> > You are comparing not the complexity of the structures but the
> > verbosity of their documentation, which are written in a different
> > style, format, and by different people.
>
> I agree to some extent. At least this benchmark is simple and it
> provokes to actually go and look at the definitions, which IMO should be
> enough to already see the difference. What could be a better benchmark?
> Maybe counting the number of various fields and flags and enum cases,
> that one has to read through?

I give you another point of comparison literally 5 lines down my email.

>
> > And the V4L2 page also
> > contains the description of memory types, which is part of another
> > section in the virtio-video spec.
>
> You mean only enum v4l2_memory? Or anything else too?
>
> > There is no way to draw a meaningful
> > conclusion from this.
> >
> > If you want to compare, do it with how the structures are actually
> > used. Here is how you would queue an input buffer with virtio-video:
> >
> >    struct virtio_video_resource_queue queue_buf = {
> >        .cmd_type = VIRTIO_VIDEO_CMD_RESOURCE_QUEUE,
> >        .stream_id = 42,
> >        .queue_type = VIRTIO_VIDEO_QUEUE_TYPE_INPUT,
> >        .resource_id = 1,
> >        .timestamp = 0x10,
> >        .data_sizes = {
> >          [0] = 0x1000,
> >        },
> >    };
> >
> > Now the same with virtio-v4l2:
> >
> >    struct virtio_v4l2_queue_buf queue_buf = {
> >        .cmd = VIRTIO_V4L2_CMD_IOCTL,
> >        .code = VIDIOC_QBUF,
> >        .session_id = 42,
> >        .buffer.type = V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE,
> >        .buffer.index = 1,
> >        .buffer.timestamp.tv_usec = 0x10,
> >        .buffer.memory = V4L2_MEMORY_MMAP,
> >        .planes = {
> >          [0] = { .bytesused = 0x1000 },
> >        }
> >    };
> >
> > In both cases, you pass a structure with some members set, and the
> > rest to 0. The host receives basically the same thing - it's the same
> > data! The only difference is how it is laid out.
>
> How do I know this from the text? How do I verify, that this is correct?
> Can I be sure this code is going to work or not between any device and
> any driver?
>
> With virtio-video you can just go to the spec/header file, take the
> struct definition and simply set all the fields.
>
> With V4L2 UAPI I don't see any other way except going through the whole
> buffer.html file, filtering potentially irrelevant stuff, then doing
> some trials, and maybe looking into the device or driver code. Maybe
> also asking you for an advice?

The relevant fields for each decoding operation are clearly detailed
in https://www.kernel.org/doc/html/v5.18/userspace-api/media/v4l/dev-decoder.html,
which I already linked to multiple times.

> I think it is worth trying to imagine you're a newcomer in the project.
> Or a security auditor.
>
> So I think we can't draw conclusions about the spec quality from these
> code samples.
>
> > Also as mentioned by Bart, the apparent simplicity of
> > VIRTIO_VIDEO_CMD_RESOURCE_QUEUE, which does not require a dequeue
> > operation as the dequeue is sent with its response, is actually a
> > fallacy: that design choice makes the specification simpler, but at
> > the cost of more complexity on the device side and the potential to
> > starve on command descriptors. By contrast, adopting the V4L2 model
> > resulted in simpler code on both sides and no possibility to deadlock.
> > That point could be addressed by revising the virtio-video spec, but
> > then you get even closer to V4L2.
>
> Hmm, I understand, thanks. Well, I can fix this in the spec. I have some
> ideas. I think I'd better describe them in an answer to Bart's email.
>
> >> Do we agree now, that V4L2 UAPI is not only marginally more complex?
> >
> > No, and I rest my case on the code samples above.
> >
> >>>
> >>>
> >>> Also please read:
> >>>
> >>> https://medium.com/starting-up-security/evidence-of-absence-8148958da092
> >>>
> >>
> >> This reference probably needs a clarification. You argued, that V4L2 has a good track record so far. Here is the quote:
> >>
> >>> FWIW V4L2 has been in use for a looong time (including in Chromebooks
> >>> that do secure playback) and I am not aware of fundamental security
> >>> issues with it.
> >>
> >> But absence of found major security issues doesn't tell us much about the number of not found ones. Absence of evidence is not an evidence of absence. At the link above a former Director of Security at Facebook shares his thoughts about what could be a good evidence of absence of major security problems.
> >
> > You are just FUDing now.
>
> These are well established security concepts AFAIK. Are you gaslighting?
>
> > The exact same argument could be made about
> > virtio-video. If you want to borrow from the security world, I can ask
> > you how going with virtio-video when V4L2 exists is different from
> > rolling your own crypto.
>
> It is very different because obviously neither of them is a crypto.
>
> >> So a bug bounty program with high premiums that covers V4L2 would be a better argument in favor of *already written code* in my opinion. Not for new code. Also probably it is also an argument in favor of the spec, that is the V4L2 UAPI. Like that it is polished enough. Not so sure about that though.
> >>
> >> There actually are several bug bounty programs, that cover the kernel. These are Google's kctf, ZDI's pwn2own, and zerodium AFAIK. However the premiums are not even close to the ones mentioned in my reference. Anyway this means, that using *the existing V4L2 code in the kernel* is probably OK. But this creates some limitations if we want the actual code to still be covered with these bug bounties, right? This means, that the host OS has to be Linux and the actual hardware has to be exposed through a stable V4L2 driver, that is in mainline for some time, and there has to be no or little processing on top. For us this is not possible unfortunately. In the end both things could be secure:
> >>
> >> 1. V4L2 pass through can be secure because of the bug bounty programs and a lot of attention to the kernel in general.
> >> 2. For the new code this doesn't work, so the spec should be as simple and device-centric as possible. Because, all other things being equal, there are fewer errors in simpler programs. So defining a subset of V4L2 UAPI including the data types looks like a good idea to me. The stateful decoder interface, that you point to, does not define a subset in the data types.
> >
> > Wouldn't you have exactly the same problem by using a new guest-host
> > protocol?
>
> I think this is far-fetched. Especially if our new protocol is
> intentionally kept close to the reference protocol with a few things
> made better and simpler. That would be my preferred route.
>
> > Are you going to start a bounty program for virtio-video to
> > get an assurance that it is secure?
>
> I'm not sure it is possible because the device is proprietary. At least
> the driver is going to be covered hopefully. So for the device we're
> going to rely on simplicity, tests and audits. But who knows. I'm not
> the one to make this decision.
>
> >> This is basically my reasoning.
> >>
> >> Also these two specs don't need to compete with each other. They have different limitations and they are for different audiences. If you check the XKCD's comic, it is about competing standards.
> >
> > They allow exactly the same thing (virtualization of video
> > decoding/encoding) and there isn't any use-case of virtio-video that
> > could not be covered equally well by V4L2. Reinventing a new video
> > specification is pointless and will lead to unneeded fragmentation.
>
> I don't agree with these statements.
>
> > We should also expect reticence from the Linux community to upstream
> > two virtual video drivers that basically do the same thing.
>
> This is hypothetical.
>
> --
> Alexander Gordeev
> Senior Software Engineer
>
> OpenSynergy GmbH
> Rotherstr. 20, 10245 Berlin
>
> Phone: +49 30 60 98 54 0 - 88
> Fax: +49 (30) 60 98 54 0 - 99
> EMail: alexander.gordeev@opensynergy.com
>
> www.opensynergy.com
>
> Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B
> GeschÃftsfÃhrer/Managing Director: RÃgis Adjamah
>
> Please mind our privacy notice<https://www.opensynergy.com/datenschutzerklaerung/privacy-notice-for-business-partners-pursuant-to-article-13-of-the-general-data-protection-regulation-gdpr/> pursuant to Art. 13 GDPR. // Unsere Hinweise zum Datenschutz gem. Art. 13 DSGVO finden Sie hier.<https://www.opensynergy.com/de/datenschutzerklaerung/datenschutzhinweise-fuer-geschaeftspartner-gem-art-13-dsgvo/>


[Date Prev] | [Thread Prev] | [Thread Next] | [Date Next] -- [Date Index] | [Thread Index] | [List Home]