
Subject: Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification


On 28.04.23 05:22, Alexandre Courbot wrote:
On Thu, Apr 27, 2023 at 11:35 PM Alexander Gordeev
<alexander.gordeev@opensynergy.com> wrote:

On 27.04.23 12:13, Bartłomiej Grzesik wrote:
Hi Alexander

On Wed, Apr 26, 2023 at 6:00 PM Alexander Gordeev
<alexander.gordeev@opensynergy.com> wrote:

Hi Bartłomiej,

On 21.04.23 13:49, Bartłomiej Grzesik wrote:
+CC chromeos-arc-video-eng team that also works on virtio-video

Hi everyone!

From my experience of working on virtio-video I can definitely agree
with Alex Courbot that moving to virtio-v4l2 would be a great move.
It will not only simplify things a lot but also allow things like a
vhost-net-like implementation for some devices (although this is
thinking way ahead and Linux-only).

One added benefit of this move that I'd like to point out now, and that
probably hasn't been mentioned before, is the move to an asynchronous
resource queue call. Previously `VIRTIO_VIDEO_CMD_RESOURCE_QUEUE` was
synchronous, and this flawed design caused one hard-to-debug bug. While
the command is executing, the virtqueue descriptors stay blocked,
potentially leading to deadlocking the device. The implementation of
virtio-v4l2 (even as is - btw nice work Alex!) eliminates this issue by
making the response to the resource queue command (VIDIOC_QBUF)
asynchronous.

Thanks for your valuable feedback! Could you please share some details
about the bug? That would be very helpful. I'm working on the next
version of the virtio-video draft, so I can change it there. I like the
idea of using V4L2 as a reference, so we should probably do it the way it
is done there, only simpler. Still, it would be interesting to know the
details, because we didn't have issues with the current design.

In this bug, an app preallocated and enqueued all output buffers (to the
CAPTURE queue). This is in line with V4L2 and, in the case of a virtualized
video stack, helps with latency. Those enqueued buffers hold virtqueue
descriptors until they are filled with a decoded frame. For one stream this
is not an issue, but with more simultaneous streams it quickly becomes a
serious problem. In our case all descriptors from the virtqueue were
consumed by enqueued output buffers and no other command could be issued
to the hypervisor. This deadlocked the entire driver by starving it of
descriptors - even a STREAM_DESTROY command could not be issued and the
only solution was to reboot the guest.
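
To make the descriptor arithmetic concrete, here is a rough illustration;
the queue size, stream count and buffer count below are made-up numbers,
not values from the spec or from our setup:

/* Illustrative only: each pre-queued CAPTURE buffer pins one descriptor
 * chain in the command queue until the device fills it with a frame. */
#include <stdio.h>

int main(void)
{
        unsigned commandq_size = 256;            /* assumed virtqueue size */
        unsigned streams = 16;                   /* simultaneous streams */
        unsigned capture_bufs_per_stream = 16;   /* pre-queued CAPTURE buffers */

        unsigned held = streams * capture_bufs_per_stream;  /* 256 held */
        int free_descs = (int)commandq_size - (int)held;    /* 0 left */

        printf("descriptors held by pending RESOURCE_QUEUE: %u\n", held);
        printf("descriptors free for any other command:     %d\n", free_descs);
        /* with 0 free descriptors even STREAM_DESTROY cannot be submitted */
        return 0;
}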

While this is easily solvable by adjusting the size of the virtqueue, it is
a flaw in the design. The queue size would always have to be a function of
the maximum number of supported streams - and it rises rather quickly.

I remember having a few thoughts on how it could be solved, and I think
that removing the need to block those descriptors is the best approach.
One could argue that preallocating descriptors for this purpose, or
splitting the command queue into input, output and control queues (or
per-stream queues), might be a viable solution. However, that would only
delay the issue or could cause other streams to "starve".

Thank you for the detailed description. This makes total sense to me.
I thought about this problem for some time and discussed it with my
colleagues. It indeed looks like it would be best to stop blocking these
descriptors. We could add more queues, but that doesn't look scalable.
There are several ways to unblock the descriptors, I think. First, we can
do the same thing as in V4L2: add a separate (blocking?) DEQUEUE command.
But then we could theoretically have the same problem with DRAIN, because
it also blocks. So why not just use the event queue to receive the
completion events for both QUEUE and DRAIN commands asynchronously? One
could argue that the errors should maybe come out of band. But at the same
time we already use the event queue to deliver dynamic resolution change
events, which clearly should be delivered with the flow of buffers.

FWIW that's exactly how virtio-v4l2 handles things.

I'm not sure you read my email to the end. I'm proposing a different
thing. We already have an event queue. It is even used for something that
should actually go with the flow of output buffers: dynamic resolution
change events. So we can simply add a BUFFER_DEQUEUED event instead of
doing the same thing on the command queue by implementing a blocking
DEQUEUE command as it is done in V4L2. I think this is even better. This
way we have blocked descriptors only on the event queue, and that should
never be a problem. Also, this way we automatically get correct ordering
of DEQUEUE and DRC. Add a DRAIN completion event here, and the whole thing
becomes even better, since it also has to be synchronized with the flow of
buffers. This way we don't have to block on DRAIN or dequeue empty buffers
with an EOS flag. The whole thing becomes simpler and more consistent.
Right?
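
As a rough sketch of what I have in mind (the event names, numeric values
and payload fields below are hypothetical placeholders, not proposed spec
wording):

#include <stdint.h>

/* hypothetical additions to the existing event types; the real spec would
 * use le32 fields and pick non-conflicting values */
#define VIRTIO_VIDEO_EVENT_BUFFER_DEQUEUED  0x0300
#define VIRTIO_VIDEO_EVENT_DRAIN_COMPLETE   0x0400

/* delivered on the event queue, so no command queue descriptor stays
 * blocked while a buffer waits to be filled */
struct virtio_video_event_buffer_dequeued {
        uint32_t event_type;   /* VIRTIO_VIDEO_EVENT_BUFFER_DEQUEUED */
        uint32_t stream_id;    /* stream the buffer belongs to */
        uint32_t queue_type;   /* input or output (CAPTURE) queue */
        uint32_t resource_id;  /* buffer the device has finished with */
        uint32_t data_size;    /* bytes of valid payload, for output buffers */
};

/* sent once the device has dequeued everything queued before the DRAIN,
 * so the driver never blocks a descriptor waiting for DRAIN to finish */
struct virtio_video_event_drain_complete {
        uint32_t event_type;   /* VIRTIO_VIDEO_EVENT_DRAIN_COMPLETE */
        uint32_t stream_id;
};

With both events on the event queue, the only descriptors that stay posted
long-term are the event buffers the driver provides anyway, and the queue's
delivery order naturally interleaves BUFFER_DEQUEUED with dynamic resolution
change events.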

--
Alexander Gordeev
Senior Software Engineer

OpenSynergy GmbH
Rotherstr. 20, 10245 Berlin

Phone: +49 30 60 98 54 0 - 88
Fax: +49 (30) 60 98 54 0 - 99
EMail: alexander.gordeev@opensynergy.com

www.opensynergy.com

Handelsregister/Commercial Registry: Amtsgericht Charlottenburg, HRB 108616B
Geschäftsführer/Managing Director: Régis Adjamah


