Subject: Re: [virtio-dev] Re: [RFC PATCH v6] virtio-video: Add virtio video device specification


On Thu, Apr 27, 2023 at 11:35 PM Alexander Gordeev
<alexander.gordeev@opensynergy.com> wrote:
>
> > On 27.04.23 12:13, Bartłomiej Grzesik wrote:
> > Hi Alexander
> >
> > On Wed, Apr 26, 2023 at 6:00 PM Alexander Gordeev
> > <alexander.gordeev@opensynergy.com> wrote:
> >>
> >> Hi Bartłomiej,
> >>
> >> On 21.04.23 13:49, Bartłomiej Grzesik wrote:
> >>> +CC chromeos-arc-video-eng team that also works on virtio-video
> >>>
> >>> Hi everyone!
> >>>
> >>>   From the experience of working on virtio-video, I can definitely agree
> >>> with Alex Courbot that moving to virtio-v4l2 would be a great move.
> >>> It would not only simplify things a lot but also allow things like a
> >>> vhost-net-style implementation for some devices (although that is
> >>> thinking way ahead and Linux-only).
> >>>
> >>> One added benefit of this move that I'd like to point out now, which
> >>> probably hasn't been mentioned before, is moving to an asynchronous
> >>> resource queue call. Previously `VIRTIO_VIDEO_CMD_RESOURCE_QUEUE` was
> >>> synchronous, and this flawed design caused one hard-to-debug bug.
> >>> During the command's execution the virtqueue descriptors are held,
> >>> potentially deadlocking the device. The implementation of virtio-v4l2
> >>> (even as is - btw nice work Alex!) eliminates this issue by making
> >>> the response to the resource queue (VIDIOC_QBUF) asynchronous.
> >>
> >> Thanks for your valuable feedback! Could you please share some details
> >> about the bug? That would be very helpful. I'm working on the next
> >> version of the virtio-video draft, so I can change it there. I like the
> >> idea of using V4L2 as a reference, so we should probably do it like it is
> >> done there, only simpler. Still, it would be interesting to know the
> >> details, because we haven't had issues with the current design.
> >
> > In this bug, an app preallocated and enqueued all output buffers (to the
> > CAPTURE queue). This is in line with V4L2 and, in a virtualized video
> > stack, helps with latency. Those enqueued buffers hold virtqueue
> > descriptors until they are filled with a decoded frame. While this is
> > not an issue for one stream, for several simultaneous streams it quickly
> > becomes a serious problem. In our case all descriptors in the virtqueue
> > were consumed by enqueued output buffers and no other command could be
> > issued to the hypervisor. This deadlocked the entire driver by starving
> > it of descriptors - even the STREAM_DESTROY command could not be issued,
> > and the only solution was to reboot the guest.
> >
> > While this is easily worked around by increasing the size of the
> > virtqueue, it is a flaw in the design. The queue size would always have
> > to be a function of the maximum number of supported streams - growing
> > rather quickly.
> >
> > I remember having a few thoughts on how it could be solved, and removing
> > the need to hold those descriptors is the best approach in my opinion.
> > One could argue that preallocating descriptors for this purpose, or
> > splitting the command queue into input, output and control queues (or
> > per-stream queues), might be a viable solution. However, that would
> > only delay the issue or could cause other streams to "starve".
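The exhaustion described above is easy to put rough numbers on. A minimal sketch (hypothetical queue sizes and buffer counts, assuming each queued CAPTURE buffer pins one descriptor chain until a decoded frame completes it):

```python
# Hypothetical model of the descriptor exhaustion: each pre-queued
# CAPTURE buffer pins one descriptor chain until a decoded frame
# arrives, so the pool of free descriptors shrinks with stream count.
def free_descriptors(virtqueue_size, streams, buffers_per_stream):
    """Descriptors left for all other commands (STREAM_DESTROY etc.)."""
    pinned = streams * buffers_per_stream
    return max(virtqueue_size - pinned, 0)

# A 64-entry virtqueue survives one stream pre-queuing 16 buffers...
assert free_descriptors(64, 1, 16) == 48
# ...but four simultaneous streams pin every descriptor: no command,
# not even STREAM_DESTROY, can be issued, and the guest is stuck.
assert free_descriptors(64, 4, 16) == 0
```

This is why sizing the virtqueue as a function of the maximum stream count only postpones the problem rather than removing it.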
>
> Thank you for the detailed description. This makes total sense to me
> indeed. I thought about this problem for some time and discussed it with
> my colleagues. It indeed looks like it would be best to stop holding
> these descriptors. We could add more queues, but that doesn't look
> scalable. There are several ways to unblock the descriptors, I think.
> First, we could do the same thing as in V4L2: add a separate (blocking?)
> DEQUEUE command. But then we could theoretically have the same problem
> with DRAIN, because it also blocks. So why not just use the event queue
> to receive the completion events for both QUEUE and DRAIN commands
> asynchronously? One could argue that the errors should maybe come out
> of band. But at the same time we already use the event queue to deliver
> dynamic resolution change events, which clearly should be delivered in
> line with the flow of buffers.

FWIW that's exactly how virtio-v4l2 handles things.
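A minimal model of that asynchronous flow (hypothetical names, not the actual virtio-v4l2 structures): the QUEUE command is acknowledged immediately, so its descriptor is recycled at once, and the buffer completion arrives later on the event queue.

```python
# Hypothetical sketch of the asynchronous design being discussed:
# queueing a buffer returns immediately (its descriptor is freed),
# and completion is reported later via the event queue, so nothing
# stays pinned while the device waits for a decoded frame.
from collections import deque

class EventQueueModel:
    def __init__(self):
        self.pending = {}      # buffer_id -> stream awaiting a frame
        self.events = deque()  # completion events for the driver

    def queue_buffer(self, stream, buffer_id):
        # Immediate ack: the command descriptor does not block on
        # the frame, so it can be reused for other commands.
        self.pending[buffer_id] = stream
        return "OK"

    def frame_decoded(self, buffer_id):
        # Device side: deliver the completion on the event queue.
        stream = self.pending.pop(buffer_id)
        self.events.append(("BUFFER_DONE", stream, buffer_id))

q = EventQueueModel()
assert q.queue_buffer("s0", 1) == "OK"  # returns without waiting
q.frame_decoded(1)
assert q.events.popleft() == ("BUFFER_DONE", "s0", 1)
```

Errors could travel on the same event queue, which keeps them ordered with the buffer flow, matching how dynamic resolution change events are already delivered.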

