virtio-dev message

Subject: Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification

From: Alexander Gordeev <alexander.gordeev@opensynergy.com>
To: Alexandre Courbot <acourbot@chromium.org>
Date: Tue, 16 May 2023 16:53:59 +0200

On 12.05.23 06:09, Alexandre Courbot wrote:

On Thu, May 11, 2023 at 5:50âPM Alexander Gordeev
<alexander.gordeev@opensynergy.com> wrote:


On 08.05.23 06:55, Alexandre Courbot wrote:

On Fri, May 5, 2023 at 8:55âPM Alexander Gordeev
<alexander.gordeev@opensynergy.com> wrote:


On 03.05.23 16:04, Cornelia Huck wrote:

On Fri, Apr 28 2023, Alexander Gordeev <alexander.gordeev@opensynergy.com> wrote:

On 27.04.23 15:16, Alexandre Courbot wrote:

But in any case, that's irrelevant to the guest-host interface, and I
think a big part of the disagreement stems from the misconception that
V4L2 absolutely needs to be used on either the guest or the host,
which is absolutely not the case.


I understand this, of course. I'm arguing, that it is harder to
implement it, get it straight and then maintain it over years. Also it
brings limitations, that sometimes can be workarounded in the virtio
spec, but this always comes at a cost of decreased readability and
increased complexity. Overall it looks clearly as a downgrade compared
to virtio-video for our use-case. And I believe it would be the same for
every developer, that has to actually implement the spec, not just do
the pass through. So if we think of V4L2 UAPI pass through as a
compatibility device (which I believe it is), then it is fine to have
both and keep improving the virtio-video, including taking the best
ideas from the V4L2 and overall using it as a reference to make writing
the driver simpler.


Let me jump in here and ask another question:

Imagine that, some years in the future, somebody wants to add a virtio
device for handling video encoding/decoding to their hypervisor.

Option 1: There are different devices to chose from. How is the person
implementing this supposed to pick a device? They might have a narrow
use case, where it is clear which of the devices is the one that needs to
be supported; but they also might have multiple, diverse use cases, and
end up needing to implement all of the devices.


I think in this case virtio-v4l2 should be used as a compatibility
device exclusively. This means discouraging increasing its complexity
even more with more patches in the spec. virtio-video should eventually
cover all the use-cases of V4L2, so I think it is reasonable to use it
in both complex use-cases and in simple use-cases, where there is no
decoder/encoder V4L2 device on the host.

Option 2: There is one device with various optional features. The person
implementing this can start off with a certain subset of features
depending on their expected use cases, and add to it later, if needed;
but the upfront complexity might be too high for specialized use cases.


I don't see that many negociable features we can provide for a
decoder/encoder device - at least not many that are not considered
basic (like guest buffers). In terms of provided features for codecs
virtio-video and virtio-v4l2 are essentially equivalent.


Actually I see a lot of potential in using the virtio feature flag
negotiation for virtio-video:

1. We already have some feature flags related to memory management.

2. I think it would be great to take V4L2 controls negotiation and turn
it into the feature flags negotiation. I really like this idea and I'd
like to implement it. Not all the controls at once, of course. Still it
would be very easy to port more given the existing process. They
correspond well enough to each other, I think. This way we don't need to
introduce something like the VIDIOC_QUERYCTRL/VIDIOC_QUERY_EXT_CTRL, we
don't need two mechanisms for feature negotiations (like it would be
with virtio-v4l2, right?), also all the features would be in one place.
Then we can directly reference some enums from V4L2, like
v4l2_mpeg_video_h264_profile or v4l2_mpeg_video_h264_level. That's what
I call taking the best from V4L2.

3. We can have things, that V4L2 doesn't support in their stateful UAPI.
For example, dequeuing output buffers in decoder order.


... provided the host can do that (i.e. has a stateless decoder
interface). What is the use-case for this btw?


We have discussed this already. All the relevant quotes can be seen
closer to the end of the email here:
https://markmail.org/message/skotipxlqfiijj7c#query:+page:1+mid:hvkyxsj4tjyq56hj+state:results

The question is more: do we want a decoder/encoder specification and
another one for cameras or do we want something that covers video
devices in general. And is it desirable to reuse an already existing
protocol for communication between a non-privileged entity and a
privileged one (V4L2) or define our own?


Please note, that virtio-video could be extended to support everything
V4L2 does in the future. Including cameras if necessary. So it can cover
video devices in general too.


That's a very, very bold claim, and if you do you will inevitably need
to add things that are not relevant to the codec case, which is your
complaint about V4L2. Not to mention the many new pages of spec that
this will require.


Hmm, you're right. My statement was not specific enough. Indeed V4L2
does a lot of stuff. I'll rephrase.
I think, virtio-video could be extended to support at least the more or
less basic devices like USB cameras, maybe a little more. Everything
beyond this is possible, but should be avoided.

About the latter point, Alex BennÃe mentioned that it was difficult to
find the engineering time to define virtio-camera. And virtio-video
has also not been particularly fast to come together. virtio-v4l2
basically serves us both on a plate for a lower effort.


If we talk only about codecs, the effort is lower only in case you have
V4L2 codecs on the host. Otherwise the effort seems higher.


The effort is lower in terms of spec writing, and also lower if your
guest is Linux as you can support all video devices with a single
driver. That's a very large portion of the virtio users here.


As we found out already, the V4L2 probably has to be patched to be
actually useful in virtio. So I'm not convinced that more complex
devices wouldn't require more patches. So in the end it is hard to
compare the effort.
For the spec I'm also not so sure about that. I think we'll have to see
it first and let it pass a couple of review rounds.
Anyway the effort writing the devices would be higher in cases, when
pass through is not possible. That's our use-case. I think the goal of
making it easy to write devices should have higher priority compared to
the length of the device part in the virtio spec.
But I think we already wrote all these points, so we're making circles.

I also hope to be able to update virtio-video at a faster pace. Please
let me try.


Please hold on a bit - there are two things here.

1) I'd like to settle the virtio-v4l2/virtio-video argument first to
make sure we don't get two things clashing head-on. As far as codecs
are concerned we certainly don't need both. Cornelia, I think we'll
need you to make a call on this, or at least tell us what you need to
make the call. If it helps I can send a draft of what the virtio-v4l2
spec would look like, it should be relatively short.

2) I (and other ChromeOS contributors) have been driving this spec so
far and while I think virtio-v4l2 is a better solution, I have not
said I would give up on virtio-video if virtio-v4l2 was not adopted
and will keep iterating on it in that case.


It actually very much looks like you gave up. I mean, you developed it
for years, and now you'd like to throw it away. Well, I meant something
like "please have some faith in my ability to update virtio-video at a
faster pace". I think I already have the permission to continue the
development. You know, OpenSynergy also spent some time on the
virtio-video. This includes the draft v1 version and the V4L2 driver. I
think this was kind of an informal agreement between us, that you do the
spec and we do the driver. It didn't work well enough since draft v4, I
think. Now as our interests are not aligned anymore, it is fine to
continue separately. I'm not going to wait until you change your mind on
virtio-v4l2. Obviously you're very busy with it right now. If I stop and
wait now, we won't have the video device in the 1.3 release for sure. I
think when we are ready to develop virtio-video together again, we'll
need a better agreement.
That said, I very much appreciate your efforts with the spec. It
definitely received a lot of improvements.

That said, your contributions are of course welcome if you have stuff
written down that you want to include. The current virtio-video spec
is available here:

https://github.com/Gnurou/virtio-video-spec

I'm writing it in Markdown (virtio-video.md) to avoid dealing with
LaTeX directly and use a pandoc filter to convert it before submission
- there is a Makefile that takes care of that. Feel free to send a
pull request with changes you have worked on, including your
Signed-off-by in the patches so I can carry it on into the v7 patch if
we go that route.


Sorry, I tried this, but I had a lot of troubles with any maths in
Markdown. So after some time I just moved to TeX. Now I have a great
development environment for TeX. I have the final pdf file on the right
side, and it gets updated every time I save a file. I can immediately
see the formatted result as it would be in the spec. So I'd prefer to
stay with TeX.

Also I'm not completely sure if the OASIS IPR license allows this kind
of workflow. Simply sending patches to the mailing list seems like a
safer choice. Could anyone please help understand if it is OK when
several developers work in a separate repository and then one of them
publishes the combined changes to the mailing list?

This is understandable, that working on the specs takes time. That's why
I'm all for making room for everybody to work. I think eventually
virtio-video or virtio-video + virtio-camera can replace virtio-v4l2. I
strongly believe this is a better solution for the long term.

In this case I'd prefer to have the simpler device first, that is the
current virtio-video, then to add features incrementally using feature
flags and taking into account the virtualization context. V4L2 is a
complex thing from a different context. They already tried to carve out
some of the use-cases like stateful decoder/encoder API, but this work
is not finished (struct v4l2_buffer can serve as an evidence). This is
like dissecting a monolith. Also it has to be patched to make it more
appropriate for virtualization (we can see this in Alexandre's PoC already).

Leaving concrete references to V4L2 out of the picture, we're currently
trying to decide whether our future will be more like Option 1 or Option
2, with their respective trade-offs.


I'd like to rely on opinions of people, who know more about virtio
development and goals. I would be happy to present or reiterate my
arguments to anyone interested if necessary.

I'm slightly biased towards Option 2; does it look feasible at all, or
am I missing something essential here? (I had the impression that some
previous confusion had been cleared up; apologies in advance if I'm
misrepresenting things.)


Indeed some of the previous confusion has been cleared up. But not the
key thing. Alexandre still claims, that this patched V4L2 UAPI pass
through is only marginally more complex, for example. I don't agree with
this and I have evidence. We haven't finished discussing this evidence.


Are you talking about v4l2_buffer?

https://www.kernel.org/doc/html/v4.9/media/uapi/v4l/buffer.html#struct-v4l2-buffer

I think you implied that some of its fields were not relevant for
video decoding or encoding, which if you examine them is again
incorrect. That also answers your question of why the stateful decoder
spec did not mention the valid fields - because it is documented on
this page, which tells exactly which fields the driver/device are
expected to set for each queue.


I'm talking about struct v4l2_buffer, yes, but not only. Also about
struct v4l2_plane, enum v4l2_buf_type, the buffer flags, enum
v4l2_memory (but this one is comparable to virtio-video), timecodes.
For example, the way the fields in the struct v4l2_buffer and struct
v4l2_plane are filled and interpreted depends a lot on the type. Here is
the enum v4l2_buf_type:

enum v4l2_buf_type {
          V4L2_BUF_TYPE_VIDEO_CAPTURE        = 1,
          V4L2_BUF_TYPE_VIDEO_OUTPUT         = 2,
          V4L2_BUF_TYPE_VIDEO_OVERLAY        = 3,
          V4L2_BUF_TYPE_VBI_CAPTURE          = 4,
          V4L2_BUF_TYPE_VBI_OUTPUT           = 5,
          V4L2_BUF_TYPE_SLICED_VBI_CAPTURE   = 6,
          V4L2_BUF_TYPE_SLICED_VBI_OUTPUT    = 7,
          V4L2_BUF_TYPE_VIDEO_OUTPUT_OVERLAY = 8,
          V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE = 9,
          V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE  = 10,
          V4L2_BUF_TYPE_SDR_CAPTURE          = 11,
          V4L2_BUF_TYPE_SDR_OUTPUT           = 12,
          V4L2_BUF_TYPE_META_CAPTURE         = 13,
          V4L2_BUF_TYPE_META_OUTPUT          = 14,
          /* Deprecated, do not use */
          V4L2_BUF_TYPE_PRIVATE              = 0x80,
};

Of these 14 cases we only need 2 for the codecs. Right?
Also the flags. There are 22 of them. Are they all needed too? I don't
think so. We only have like 5 in virtio-video at the moment.


Upon reading the V4L2 spec it is pretty clear which queue types are
valid for each device, and anything other than CAPTURE_MPLANE and
OUTPUT_MPLANE is unlikely to be used anyway.

We are really splitting hairs here and this looks like a wild goose
chase for software purity - even within virtio-video there are already
flags that only make sense for an encoder, and you cannot remove them
without defining new more specific structures and complicating things
overall. If you extend virtio-video to support more use-cases, you
will end up with more of these as well. So seriously, why is this such
a big deal when the instructions on how to use these structures for
each use case are at the other end of a link click?


I don't agree with this description. I have written my arguments several
times. It took me a lot of time to write all these emails. It is very
disappointing, that you still don't seem to take them seriously.

You posted code for filling v4l2_buffer in one of your previous emails.
What I'm trying to say is that a person, who doesn't know this in
advance, will have a hard time writing this same code if they only have
the virtio-v4l2 spec.


Well yes, the counterpart of virtio-v4l2 being shorter is that you
need to refer to V4L2 as well - that's actually the point, to reduce
the burden on virtio by reusing a spec that already exists and
referring to it.


As I wrote, I don't agree, that reducing the burden on virtio is our
primary goal.

Earlier in this email you mentioned reusing structs like
v4l2_mpeg_video_h264_profile in virtio-video, that creates the same
dependency to the V4L2 spec. The difference is how much we are taking
from it.


Yes, exactly. I think taking all of the V4L2 spec is too much. I'd
prefer to only take the parts, that are well-defined, well aligned with
virtio and don't need patches on top.

--
Alexander Gordeev
Senior Software Engineer

OpenSynergy GmbH
Rotherstr. 20, 10245 Berlin

Phone: +49 30 60 98 54 0 - 88
Fax: +49 (30) 60 98 54 0 - 99
EMail: alexander.gordeev@opensynergy.com

www.opensynergy.com

Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B
GeschÃftsfÃhrer/Managing Director: RÃgis Adjamah

Please mind our privacy notice<https://www.opensynergy.com/datenschutzerklaerung/privacy-notice-for-business-partners-pursuant-to-article-13-of-the-general-data-protection-regulation-gdpr/> pursuant to Art. 13 GDPR. // Unsere Hinweise zum Datenschutz gem. Art. 13 DSGVO finden Sie hier.<https://www.opensynergy.com/de/datenschutzerklaerung/datenschutzhinweise-fuer-geschaeftspartner-gem-art-13-dsgvo/>

Follow-Ups:
- Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
  - From: Cornelia Huck <cohuck@redhat.com>

References:
- Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
  - From: Cornelia Huck <cohuck@redhat.com>
- Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
  - From: Alexander Gordeev <alexander.gordeev@opensynergy.com>
- Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
  - From: Alexandre Courbot <acourbot@chromium.org>
- Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
  - From: Alexander Gordeev <alexander.gordeev@opensynergy.com>
- Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification
  - From: Alexandre Courbot <acourbot@chromium.org>