[RFC,0/2] VP8 stateless V4L2 encoding uAPI + driver

Message ID 20230309125651.23911-1-andrzej.p@collabora.com (mailing list archive)
Headers
Series VP8 stateless V4L2 encoding uAPI + driver |

Message

Andrzej Pietrasiewicz March 9, 2023, 12:56 p.m. UTC
  Dear All,

This two-patch series adds uAPI for stateless VP8 encoding
and an accompanying driver using it.

It has been tested on an rk3399 board and there exists
a gstreamer user:

https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3736

example pipeline:

gst-launch-1.0 videotestsrc num-buffers=500 ! video/x-raw,width=640,height=480 \
! v4l2slvp8enc ! queue ! matroskamux ! filesink location=test_vp8.mkv

I kindly ask for comments.

Notably, the documentation for the added uAPI is missing,
that is to be addressed when sending a patch series proper (not RFC).

For the RFC I also did not care to replace a BUG_ON() in the boolean encoder.

Rebased onto v6.2.

Regards,

Andrzej

Andrzej Pietrasiewicz (2):
  media: uapi: Add VP8 stateless encoder controls
  media: rkvdec: Add VP8 encoder

 drivers/media/platform/verisilicon/Makefile   |    2 +
 drivers/media/platform/verisilicon/hantro.h   |   10 +
 .../platform/verisilicon/hantro_boolenc.c     |   69 +
 .../platform/verisilicon/hantro_boolenc.h     |   21 +
 .../media/platform/verisilicon/hantro_drv.c   |   18 +-
 .../media/platform/verisilicon/hantro_hw.h    |   90 +
 .../media/platform/verisilicon/hantro_v4l2.c  |    5 +-
 .../media/platform/verisilicon/hantro_vp8.c   |  118 ++
 .../verisilicon/rockchip_vpu2_hw_vp8_enc.c    | 1574 +++++++++++++++++
 .../platform/verisilicon/rockchip_vpu2_regs.h |    1 +
 .../platform/verisilicon/rockchip_vpu_hw.c    |   23 +-
 drivers/media/v4l2-core/v4l2-ctrls-core.c     |   16 +
 drivers/media/v4l2-core/v4l2-ctrls-defs.c     |    5 +
 include/media/v4l2-ctrls.h                    |    1 +
 include/uapi/linux/v4l2-controls.h            |   91 +
 include/uapi/linux/videodev2.h                |    3 +
 16 files changed, 2041 insertions(+), 6 deletions(-)
 create mode 100644 drivers/media/platform/verisilicon/hantro_boolenc.c
 create mode 100644 drivers/media/platform/verisilicon/hantro_boolenc.h
 create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu2_hw_vp8_enc.c


base-commit: c9c3395d5e3dcc6daee66c6908354d47bf98cb0c
  

Comments

Nicolas Frattaroli March 18, 2023, 9:20 a.m. UTC | #1
On Thursday, 9 March 2023 13:56:49 CET Andrzej Pietrasiewicz wrote:
> Dear All,
> 
> This two-patch series adds uAPI for stateless VP8 encoding
> and an accompanying driver using it.
> 
> It has been tested on an rk3399 board and there exists
> a gstreamer user:
> 
> https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3736
> 
> example pipeline:
> 
> gst-launch-1.0 videotestsrc num-buffers=500 !
> video/x-raw,width=640,height=480 \ ! v4l2slvp8enc ! queue ! matroskamux !
> filesink location=test_vp8.mkv
> 
> I kindly ask for comments.
> 
> Notably, the documentation for the added uAPI is missing,
> that is to be addressed when sending a patch series proper (not RFC).
> 
> For the RFC I also did not care to replace a BUG_ON() in the boolean
> encoder.
> 
> Rebased onto v6.2.
> 
> Regards,
> 
> Andrzej

Hello,

I can't offer much in terms on technical comments on the implementation,
but thank you for your work on this. A more general question: Is the
rate control done by the userspace component or the kernel or even the
hardware?

I tried this patchset (and the gstreamer merge request) out last night
and ran into quite noticable i-frame pulsing, and am wondering who the
culprit of that is. Looking at the vp8 encode params in the uAPI, it
looks like it'll be userspace in charge of rate control?

On a related side note, since I let this run all night with different
parameters I can happily report that it seems to be quite stable, no
problems encountered at all.

Kind regards,
Nicolas Frattaroli
  
Andrzej Pietrasiewicz March 20, 2023, 10:07 a.m. UTC | #2
Hi Nicolas,

+Cc Benjamin

W dniu 18.03.2023 o 10:20, Nicolas Frattaroli pisze:
> On Thursday, 9 March 2023 13:56:49 CET Andrzej Pietrasiewicz wrote:
>> Dear All,
>>
>> This two-patch series adds uAPI for stateless VP8 encoding
>> and an accompanying driver using it.
>>
>> It has been tested on an rk3399 board and there exists
>> a gstreamer user:
>>
>> https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/3736
>>
>> example pipeline:
>>
>> gst-launch-1.0 videotestsrc num-buffers=500 !
>> video/x-raw,width=640,height=480 \ ! v4l2slvp8enc ! queue ! matroskamux !
>> filesink location=test_vp8.mkv
>>
>> I kindly ask for comments.
>>
>> Notably, the documentation for the added uAPI is missing,
>> that is to be addressed when sending a patch series proper (not RFC).
>>
>> For the RFC I also did not care to replace a BUG_ON() in the boolean
>> encoder.
>>
>> Rebased onto v6.2.
>>
>> Regards,
>>
>> Andrzej
> 
> Hello,
> 
> I can't offer much in terms on technical comments on the implementation,
> but thank you for your work on this. A more general question: Is the
> rate control done by the userspace component or the kernel or even the
> hardware?
> 
> I tried this patchset (and the gstreamer merge request) out last night
> and ran into quite noticable i-frame pulsing, and am wondering who the
> culprit of that is. Looking at the vp8 encode params in the uAPI, it
> looks like it'll be userspace in charge of rate control?
> 

Yes, rate control is entirely on userspace.

Modern codec specs (not just vp8) only mandate what constitutes a valid
bitstream and how do decode it. No word about encoding, which means that
actually many different strategies can be applied to produce a valid
bitstream. As such, these strategies look a lot like policies which do not
belong to the kernel and I'd rather provide a tool than a policy.

As far as I know the rate control mechanism used in the gst element is
nothing sophisticated, not even a (full) PID algorithm. Which, I think,
was exactly intended to get the thing running in the first place. I would
assume that the expected algorithm would be PID proper in this case.
Specifically, PID being PID will not prevent the encoding stack from
overrunning the set bandwidth for short periods of time, but exactly
because rate control belongs to userspace anyone is welcome to develop
a solution which provides hard bandwidth guarantees.

> On a related side note, since I let this run all night with different
> parameters I can happily report that it seems to be quite stable, no
> problems encountered at all.

Thank you for reporting. In the (expected) case this turns into a
patchset proper I would kindly ask for your Tested-by then.

Can you share what you used for the nightly tests, both in terms of
testing harness and unencoded video material?

Regards,

Andrzej
  
Nicolas Frattaroli March 20, 2023, 2:37 p.m. UTC | #3
On Monday, 20 March 2023 11:07:19 CET Andrzej Pietrasiewicz wrote:
> Hi Nicolas,
> 
> 
> W dniu 18.03.2023 o 10:20, Nicolas Frattaroli pisze:
> > On a related side note, since I let this run all night with different
> > parameters I can happily report that it seems to be quite stable, no
> > problems encountered at all.
> 
> Thank you for reporting. In the (expected) case this turns into a
> patchset proper I would kindly ask for your Tested-by then.

Will do, I'll be closely tracking this patchset and might also throw
a patch your way to enable it on RK356x at some point, since that has
the same Hantro encode IP as well as far as I know.

> Can you share what you used for the nightly tests, both in terms of
> testing harness and unencoded video material?

The source material I used is the "Original" quality of the short film
"Wanderers" by Erik Wernquist: https://vimeo.com/108650530

You can click on "Download" and choose "Original" from there, which
gives you a 4:2:2 10-bit Apple ProRes .mov file. It's quite high
quality and includes some interesting segments, such as the asteroid
field shot, as well as plenty of grain, both of which really stress an 
encoder.

My testing harness is a little primitive, the precise gst pipeline I
used is in this command I ran:

for i in {0..63}; do echo "q $i"; \
gst-launch-1.0 filesrc location=~/Wanderers.mov ! \
qtdemux name=demux demux.video_0 ! decodebin ! videoconvert ! \
v4l2slvp8enc min-quality=$i max-quality=$i ! queue ! matroskamux ! \
filesink location="/mnt/usb/w2/vp8_wanderers_q_$i.mkv"; done

I figured I'd try out all the quantiser levels this way. It is worth
noting that the resultant mkv files are somewhat odd, I had to remux
them with ffmpeg (with -c:v copy to copy the vp8 bitstream over) to
get mpv to seek properly in them and show bitrate information in the
stats overlay (shift+i by default). There's probably a gstreamer
thing I'm unaware of to make it properly generate the matroska
container as well. Either way, not a problem with the encoder, just
the muxer.

I'm pretty sure videoconvert gets rid of the 4:2:2-ness and 10-bit-ness
of it, since I don't think this hardware encoder is capable of handling
that, which is fine by me.

However, since this does nicely transcode to a 10-bit video with a
software encoder of your choice, Collabora might also be interested
in using this footage for any upcoming 10-bit patches to the video
decode side of things. :)

> Regards,
> 
> Andrzej

Kind regards,
Nicolas Frattaroli