[RFC,v2,09/10] media: uapi: h264: Add DPB entry field reference flags

Message ID HE1PR06MB401142C1E45B302094AD1149AC610@HE1PR06MB4011.eurprd06.prod.outlook.com (mailing list archive)
State Superseded, archived
Delegated to: Hans Verkuil
Headers
Series media: hantro: H264 fixes and improvements |

Commit Message

Jonas Karlman Oct. 29, 2019, 1:26 a.m. UTC
  Add DPB entry flags to help indicate when a reference frame is a field picture
and how the DPB entry is referenced, top or bottom field or full frame.

Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
---
 Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
 include/media/h264-ctrls.h                       |  4 ++++
 2 files changed, 16 insertions(+)
  

Comments

Boris Brezillon Oct. 31, 2019, 10:20 a.m. UTC | #1
On Tue, 29 Oct 2019 01:26:01 +0000
Jonas Karlman <jonas@kwiboo.se> wrote:

> Add DPB entry flags to help indicate when a reference frame is a field picture
> and how the DPB entry is referenced, top or bottom field or full frame.
> 
> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> ---
>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
>  include/media/h264-ctrls.h                       |  4 ++++
>  2 files changed, 16 insertions(+)
> 
> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> index 28313c0f4e7c..d472a54d1c4d 100644
> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> @@ -2028,6 +2028,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>        - 0x00000004
>        - The DPB entry is a long term reference frame
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> +      - 0x00000008
> +      - The DPB entry is a field picture
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> +      - 0x00000010
> +      - The DPB entry is a top field reference
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> +      - 0x00000020
> +      - The DPB entry is a bottom field reference
> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> +      - 0x00000030
> +      - The DPB entry is a reference frame
>  
>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>      Specifies the decoding mode to use. Currently exposes slice-based and
> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> index e877bf1d537c..76020ebd1e6c 100644
> --- a/include/media/h264-ctrls.h
> +++ b/include/media/h264-ctrls.h
> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30

I don't remember all the details, but do we really need 3 flags?
Maybe I'm wrong, but it looks like the following combination doesn't
make sense:

- FIELD_PICTURE + REF_FRAME: if it's a full frame ref it should
  contain both top and bottom fields right, so it's no longer a
  FIELD_PICTURE, is it?

Can't we just have 2 flags?

FIELD_PICTURE		0x08
FIELD_REF_TOP		0x10 (meaning that FIELD_REF_BOTTOM is
			      0x00)

and then have the following combinations:

top field ref => FIELD_PICTURE | FIELD_REF_TOP
bottom field ref => FIELD_PICTURE
full frame ref => 0x0

>  
>  struct v4l2_h264_dpb_entry {
>  	__u64 reference_ts;
  
Jonas Karlman Nov. 15, 2019, 6:26 a.m. UTC | #2
On 2019-10-31 11:20, Boris Brezillon wrote:
> On Tue, 29 Oct 2019 01:26:01 +0000
> Jonas Karlman <jonas@kwiboo.se> wrote:
>
>> Add DPB entry flags to help indicate when a reference frame is a field picture
>> and how the DPB entry is referenced, top or bottom field or full frame.
>>
>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>> ---
>>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
>>  include/media/h264-ctrls.h                       |  4 ++++
>>  2 files changed, 16 insertions(+)
>>
>> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>> index 28313c0f4e7c..d472a54d1c4d 100644
>> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>> @@ -2028,6 +2028,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>>        - 0x00000004
>>        - The DPB entry is a long term reference frame
>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
>> +      - 0x00000008
>> +      - The DPB entry is a field picture
>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
>> +      - 0x00000010
>> +      - The DPB entry is a top field reference
>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
>> +      - 0x00000020
>> +      - The DPB entry is a bottom field reference
>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
>> +      - 0x00000030
>> +      - The DPB entry is a reference frame
>>  
>>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>>      Specifies the decoding mode to use. Currently exposes slice-based and
>> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
>> index e877bf1d537c..76020ebd1e6c 100644
>> --- a/include/media/h264-ctrls.h
>> +++ b/include/media/h264-ctrls.h
>> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
>>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
>>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
>> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
> I don't remember all the details, but do we really need 3 flags?
> Maybe I'm wrong, but it looks like the following combination doesn't
> make sense:
>
> - FIELD_PICTURE + REF_FRAME: if it's a full frame ref it should
>   contain both top and bottom fields right, so it's no longer a
>   FIELD_PICTURE, is it?
>
> Can't we just have 2 flags?
>
> FIELD_PICTURE		0x08
> FIELD_REF_TOP		0x10 (meaning that FIELD_REF_BOTTOM is
> 			      0x00)
>
> and then have the following combinations:
>
> top field ref => FIELD_PICTURE | FIELD_REF_TOP
> bottom field ref => FIELD_PICTURE
> full frame ref => 0x0

I am not sure and will need to look closer at spec and what ffmpeg is doing.
These flags was mostly inspired by the information ffmpeg stores in
H264Picture->reference and H264Picture->field_picture.

I also believe that the new FLAG_REF_TOP/BOTTOM may make FLAG_ACTIVE obsolete.

active => flags & FLAG_REF_FRAME

Hopefully I will have some time to rework and respin this at the end of next week.

Regards,
Jonas

>
>>  
>>  struct v4l2_h264_dpb_entry {
>>  	__u64 reference_ts;
  
Boris Brezillon Nov. 15, 2019, 7:45 a.m. UTC | #3
On Fri, 15 Nov 2019 06:26:54 +0000
Jonas Karlman <jonas@kwiboo.se> wrote:

> On 2019-10-31 11:20, Boris Brezillon wrote:
> > On Tue, 29 Oct 2019 01:26:01 +0000
> > Jonas Karlman <jonas@kwiboo.se> wrote:
> >  
> >> Add DPB entry flags to help indicate when a reference frame is a field picture
> >> and how the DPB entry is referenced, top or bottom field or full frame.
> >>
> >> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
> >> ---
> >>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
> >>  include/media/h264-ctrls.h                       |  4 ++++
> >>  2 files changed, 16 insertions(+)
> >>
> >> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> >> index 28313c0f4e7c..d472a54d1c4d 100644
> >> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> >> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
> >> @@ -2028,6 +2028,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
> >>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
> >>        - 0x00000004
> >>        - The DPB entry is a long term reference frame
> >> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
> >> +      - 0x00000008
> >> +      - The DPB entry is a field picture
> >> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
> >> +      - 0x00000010
> >> +      - The DPB entry is a top field reference
> >> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
> >> +      - 0x00000020
> >> +      - The DPB entry is a bottom field reference
> >> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
> >> +      - 0x00000030
> >> +      - The DPB entry is a reference frame
> >>  
> >>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
> >>      Specifies the decoding mode to use. Currently exposes slice-based and
> >> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
> >> index e877bf1d537c..76020ebd1e6c 100644
> >> --- a/include/media/h264-ctrls.h
> >> +++ b/include/media/h264-ctrls.h
> >> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
> >>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
> >>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
> >>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
> >> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
> >> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
> >> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
> >> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30  
> > I don't remember all the details, but do we really need 3 flags?
> > Maybe I'm wrong, but it looks like the following combination doesn't
> > make sense:
> >
> > - FIELD_PICTURE + REF_FRAME: if it's a full frame ref it should
> >   contain both top and bottom fields right, so it's no longer a
> >   FIELD_PICTURE, is it?
> >
> > Can't we just have 2 flags?
> >
> > FIELD_PICTURE		0x08
> > FIELD_REF_TOP		0x10 (meaning that FIELD_REF_BOTTOM is
> > 			      0x00)
> >
> > and then have the following combinations:
> >
> > top field ref => FIELD_PICTURE | FIELD_REF_TOP
> > bottom field ref => FIELD_PICTURE
> > full frame ref => 0x0  
> 
> I am not sure and will need to look closer at spec and what ffmpeg is doing.
> These flags was mostly inspired by the information ffmpeg stores in
> H264Picture->reference and H264Picture->field_picture.
> 
> I also believe that the new FLAG_REF_TOP/BOTTOM may make FLAG_ACTIVE obsolete.
> 
> active => flags & FLAG_REF_FRAME

Can't we keep the ACTIVE flag and drop the REF_FRAME one then? AFAIU,
all we need to know in addition to what we already have is:

* is the reference a full-frame or a field?
* if it is a field, which one (top or bottom)?

Hence my suggestion to keep only 2 flags.
  
Jonas Karlman Nov. 15, 2019, 8:15 a.m. UTC | #4
On 2019-11-15 08:45, Boris Brezillon wrote:
> On Fri, 15 Nov 2019 06:26:54 +0000
> Jonas Karlman <jonas@kwiboo.se> wrote:
>
>> On 2019-10-31 11:20, Boris Brezillon wrote:
>>> On Tue, 29 Oct 2019 01:26:01 +0000
>>> Jonas Karlman <jonas@kwiboo.se> wrote:
>>>  
>>>> Add DPB entry flags to help indicate when a reference frame is a field picture
>>>> and how the DPB entry is referenced, top or bottom field or full frame.
>>>>
>>>> Signed-off-by: Jonas Karlman <jonas@kwiboo.se>
>>>> ---
>>>>  Documentation/media/uapi/v4l/ext-ctrls-codec.rst | 12 ++++++++++++
>>>>  include/media/h264-ctrls.h                       |  4 ++++
>>>>  2 files changed, 16 insertions(+)
>>>>
>>>> diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>> index 28313c0f4e7c..d472a54d1c4d 100644
>>>> --- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>> +++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
>>>> @@ -2028,6 +2028,18 @@ enum v4l2_mpeg_video_h264_hierarchical_coding_type -
>>>>      * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
>>>>        - 0x00000004
>>>>        - The DPB entry is a long term reference frame
>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
>>>> +      - 0x00000008
>>>> +      - The DPB entry is a field picture
>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
>>>> +      - 0x00000010
>>>> +      - The DPB entry is a top field reference
>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
>>>> +      - 0x00000020
>>>> +      - The DPB entry is a bottom field reference
>>>> +    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
>>>> +      - 0x00000030
>>>> +      - The DPB entry is a reference frame
>>>>  
>>>>  ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
>>>>      Specifies the decoding mode to use. Currently exposes slice-based and
>>>> diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
>>>> index e877bf1d537c..76020ebd1e6c 100644
>>>> --- a/include/media/h264-ctrls.h
>>>> +++ b/include/media/h264-ctrls.h
>>>> @@ -185,6 +185,10 @@ struct v4l2_ctrl_h264_slice_params {
>>>>  #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
>>>>  #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
>>>>  #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
>>>> +#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30  
>>> I don't remember all the details, but do we really need 3 flags?
>>> Maybe I'm wrong, but it looks like the following combination doesn't
>>> make sense:
>>>
>>> - FIELD_PICTURE + REF_FRAME: if it's a full frame ref it should
>>>   contain both top and bottom fields right, so it's no longer a
>>>   FIELD_PICTURE, is it?
>>>
>>> Can't we just have 2 flags?
>>>
>>> FIELD_PICTURE		0x08
>>> FIELD_REF_TOP		0x10 (meaning that FIELD_REF_BOTTOM is
>>> 			      0x00)
>>>
>>> and then have the following combinations:
>>>
>>> top field ref => FIELD_PICTURE | FIELD_REF_TOP
>>> bottom field ref => FIELD_PICTURE
>>> full frame ref => 0x0  
>> I am not sure and will need to look closer at spec and what ffmpeg is doing.
>> These flags was mostly inspired by the information ffmpeg stores in
>> H264Picture->reference and H264Picture->field_picture.
>>
>> I also believe that the new FLAG_REF_TOP/BOTTOM may make FLAG_ACTIVE obsolete.
>>
>> active => flags & FLAG_REF_FRAME
> Can't we keep the ACTIVE flag and drop the REF_FRAME one then? AFAIU,
> all we need to know in addition to what we already have is:
>
> * is the reference a full-frame or a field?
> * if it is a field, which one (top or bottom)?
>
> Hence my suggestion to keep only 2 flags.

I think we also need to know if the reference frame was a field or a full-frame picture
not just how it is referenced, I think there was an issue when I originally tried a similar
approach as you suggested, but it was too long time ago as I seem to have forgotten the details.

Will do more testing and see if I can refresh my memory.

Best regards,
Jonas
  

Patch

diff --git a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
index 28313c0f4e7c..d472a54d1c4d 100644
--- a/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
+++ b/Documentation/media/uapi/v4l/ext-ctrls-codec.rst
@@ -2028,6 +2028,18 @@  enum v4l2_mpeg_video_h264_hierarchical_coding_type -
     * - ``V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM``
       - 0x00000004
       - The DPB entry is a long term reference frame
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE``
+      - 0x00000008
+      - The DPB entry is a field picture
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_TOP``
+      - 0x00000010
+      - The DPB entry is a top field reference
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM``
+      - 0x00000020
+      - The DPB entry is a bottom field reference
+    * - ``V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME``
+      - 0x00000030
+      - The DPB entry is a reference frame
 
 ``V4L2_CID_MPEG_VIDEO_H264_DECODE_MODE (enum)``
     Specifies the decoding mode to use. Currently exposes slice-based and
diff --git a/include/media/h264-ctrls.h b/include/media/h264-ctrls.h
index e877bf1d537c..76020ebd1e6c 100644
--- a/include/media/h264-ctrls.h
+++ b/include/media/h264-ctrls.h
@@ -185,6 +185,10 @@  struct v4l2_ctrl_h264_slice_params {
 #define V4L2_H264_DPB_ENTRY_FLAG_VALID		0x01
 #define V4L2_H264_DPB_ENTRY_FLAG_ACTIVE		0x02
 #define V4L2_H264_DPB_ENTRY_FLAG_LONG_TERM	0x04
+#define V4L2_H264_DPB_ENTRY_FLAG_FIELD_PICTURE	0x08
+#define V4L2_H264_DPB_ENTRY_FLAG_REF_TOP	0x10
+#define V4L2_H264_DPB_ENTRY_FLAG_REF_BOTTOM	0x20
+#define V4L2_H264_DPB_ENTRY_FLAG_REF_FRAME	0x30
 
 struct v4l2_h264_dpb_entry {
 	__u64 reference_ts;