V4L2 Stateless Codec Support
Embedded Linux Conference Europe, Microsoft Sponsor Suite / the bar
Attendees
- Alexandre Courbot
- Chris Healy
- Ezequiel Garcia
- Hans Verkuil
- Kieran Bingham
- Laurent Pinchart
- Maxime Ripard
- Mauro Carvalho Chehab
- Nicolas Dufresne
- Niklas Söderlund
- Philipp Zabel
- Sakari Ailus
- Tomasz Figa
- Victor Jáquez
Nicolas reported an issue with buffer management. V4L2 decouples buffers from buffer layouts, assuming that all buffers used on a queue share the same layout. This makes importing buffers difficult.
Tomasz proposed using the format passed to VIDIOC_CREATE_BUFS to communicate the buffer format, and keep that format associated with the buffer internally. An ioctl to delete buffers selectively would then be needed.
Could we take a stream parser from an existing project (such as gstreamer) and make it a standalone component that could then be used by different implementations (gstreamer, libva, ...)?
Maxime said that VLC recently released a new parser meant to be a library which could be useful. Nicolas believes we need a parser library split out from any other code base, to avoid pulling in other libraries, and being able to maintain it separately from gstreamer, VLC or any other project. One or several people need to step up and maintain a parser, and it might be difficult to find volunteers.
Many existing parsers are meant to be used internally in the project they're part of. Nicolas explained that the ffmpeg parser, for instance, was used by gstreamer, but then got dropped, as it only offered the APIs that ffmpeg needed, but not the APIs needed by gstreamer.
Does ChromeOS have its own in-house parser? Alexandre believes it does, but isn't sure whether it was initially developed internally.
There's also a language problem: ffmpeg and gstreamer are written in C, the ChromeOS parser in C++, and VLC is moving to Rust. What do we pick, and how do we ensure interoperability?
A parser isn't just about extracting a few tables, it's really part of the decoder. Decisions such as error concealment (i.e. what do we do if a frame is missing from the stream?) are part of the userspace codec component. Applications may have different requirements, so this is possibly best left to applications, which may delegate it to libva.
- LibVA re-use - Allows both ARM / x86 commonality
As a short-term solution, implementing a generic libva backend on top of the V4L2 stateless codec kernel API would give all applications that currently use libva support for our codecs, including applications based on ffmpeg and gstreamer, as both have a libva backend (gstreamer uses libva directly; it doesn't go through ffmpeg to do so).
70% of applications pick one stack: ffmpeg. It has a software codec API almost identical to the V4L2 stateful codecs. It would be trivial for applications to switch to V4L2 natively.
Mauro would like us to approach Intel to explain our plans with libva, to avoid later surprises.
AV1 is a complex codec. It's a successor to VP9/VP10 requiring more tables and parameters, which will come through soon.
libva is hosted on freedesktop. Should we host the libva-v4l2-codec backend there, or host it on linuxtv.org? Hans would prefer linuxtv as it's "closer to our kernel implementation".
libva loads backends in order, and picks the first one that reports it can support the platform. There is also an environment variable that can specify a backend. Ezequiel enquired how to support platforms that have multiple hardware codecs. libva doesn't seem to support this at the moment. Nicolas reported that there's an Intel SoC that has both an Intel graphics core and a Vega64 graphics core, each with its own codec.
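The environment variable mentioned is libva's standard backend override. A config sketch, assuming a backend named "v4l2_request" (the name is hypothetical; the variables are real libva ones):

```shell
# Force a specific libva backend instead of taking the first one that
# probes successfully. "v4l2_request" is a hypothetical backend name.
export LIBVA_DRIVER_NAME=v4l2_request
# Optionally point libva at a non-standard backend directory.
export LIBVA_DRIVERS_PATH=/usr/local/lib/dri
# Running vainfo afterwards would show which backend was loaded.
```

This per-process override is a workaround, not a solution, for the multi-codec platform case: it still selects a single backend for the whole process.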
Hans said that a platform that exposes multiple codecs will likely be used for specialized applications, and that requiring those to implement codec support directly is acceptable. Our main focus should be to support the common case.
NVidia is following our progress and is interested in using the V4L2 stateless API. On the userspace side, vdpau is pretty much dead; they have moved to nvdec. Usage of OMX is being phased out.
On Raspberry Pi, OMX is going away.
bootlin has developed a debugging tool called v4l2-request-test (https://github.com/bootlin/v4l2-request-test) that has been very useful to debug the codec driver without going through the full userspace stack. This is worth mentioning and integrating.
- Using buffer indices as handles to reference frames
- This has been proposed by Tomasz, and Hans has serious concerns, he believes that having userspace predict what buffer indices will be used in the future is very fragile and would prefer using a separate 64-bit cookie associated with v4l2_buffers.
- Using capture buffer indices as reference frame handles requires predicting which capture queue buffer each output queue frame will be decoded into. We could use the output queue buffer index instead, but that wouldn't work with multi-slice decoding (multiple output buffers for a single capture buffer). Using a cookie set by userspace on the output side, then copied to the capture queue by the driver, solves that problem. All slices queued on the output queue for the same decoded picture will have the same cookie value (userspace will have to ensure that).
- Tomasz would prefer a buffer index-based solution, to avoid keeping a cookie-index map in userspace. Due to how V4L2 works, enqueuing a new dmabuf handle on the capture side for a V4L2 buffer with a given index will effectively delete the corresponding cookie, so userspace would need to ensure it doesn't overwrite buffers. (Tomasz: To clarify, I don't see a significant benefit of using cookies over indices. It makes things easier for userspace, because it doesn't have to predict the CAPTURE buffers, but it is still error-prone because of the buffer requeuing problem. For now it would be good to see how it translates into real code, though. In the meantime I can try to find a better idea.)
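The cookie lifecycle discussed above can be sketched as plain userspace bookkeeping. Everything here is illustrative, not a real V4L2 API: a hypothetical 64-bit cookie is set on the output (slice) buffers, the driver conceptually copies it to the capture buffer on decode completion, and userspace resolves reference frame cookies back to capture buffer indices.

```c
/* Sketch of the proposed cookie scheme, with hypothetical names.
 * Each decoded picture gets a userspace-chosen 64-bit cookie; all
 * slices for that picture carry the same cookie, and the driver is
 * assumed to copy it to the capture buffer that receives the frame. */
#include <stdint.h>

#define MAX_CAPTURE_BUFS 32

/* Cookie last seen on each capture buffer index (0 = unused). */
static uint64_t capture_cookie[MAX_CAPTURE_BUFS];

/* Conceptual driver side: on decode completion, the cookie from the
 * output (slice) buffers lands on the capture buffer. */
static void on_capture_done(uint32_t index, uint64_t cookie)
{
	capture_cookie[index] = cookie;
}

/* Userspace: resolve a reference frame cookie to the capture buffer
 * currently holding that picture. Returns -1 if the buffer has been
 * requeued/overwritten and the cookie is gone — the error case both
 * Hans and Tomasz worry about. */
static int find_reference(uint64_t cookie)
{
	for (unsigned int i = 0; i < MAX_CAPTURE_BUFS; i++)
		if (capture_cookie[i] == cookie)
			return (int)i;
	return -1;
}
```

The map makes the trade-off visible: cookies free userspace from predicting capture indices, but the requeuing hazard (a cookie silently disappearing) remains, as Tomasz notes.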