[RESEND,v6,2/4] media: chips-media: wave5: Support runtime suspend/resume

Message ID 20240617104818.221-3-jackson.lee@chipsnmedia.com (mailing list archive)
State Changes Requested
Delegated to: Sebastian Fricke
Series Add features to an existing driver

Commit Message

jackson.lee June 17, 2024, 10:48 a.m. UTC
  From: "jackson.lee" <jackson.lee@chipsnmedia.com>

Add support for runtime suspend/resume in the encoder and decoder. This is
achieved by saving the VPU state and powering it off while the VPU is idle.

Signed-off-by: Jackson.lee <jackson.lee@chipsnmedia.com>
Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>
---
 .../platform/chips-media/wave5/wave5-hw.c     |  4 +-
 .../chips-media/wave5/wave5-vpu-dec.c         | 16 ++++++-
 .../chips-media/wave5/wave5-vpu-enc.c         | 15 +++++++
 .../platform/chips-media/wave5/wave5-vpu.c    | 43 +++++++++++++++++++
 .../platform/chips-media/wave5/wave5-vpuapi.c | 14 ++++--
 .../media/platform/chips-media/wave5/wave5.h  |  3 ++
 6 files changed, 88 insertions(+), 7 deletions(-)
  

Comments

Devarsh Thakkar June 19, 2024, 1 p.m. UTC | #1
Hi Jackson,

Thanks for the patch.
On 17/06/24 16:18, Jackson.lee wrote:
> From: "jackson.lee" <jackson.lee@chipsnmedia.com>
> 
> Add support for runtime suspend/resume in the encoder and decoder. This is
> achieved by saving the VPU state and powering it off while the VPU idle.
> 
> Signed-off-by: Jackson.lee <jackson.lee@chipsnmedia.com>
> Signed-off-by: Nas Chung <nas.chung@chipsnmedia.com>
> Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

[..]
>  static int wave5_vpu_probe(struct platform_device *pdev)
>  {
>  	int ret;
> @@ -268,6 +301,12 @@ static int wave5_vpu_probe(struct platform_device *pdev)
>  		 (match_data->flags & WAVE5_IS_DEC) ? "'DECODE'" : "");
>  	dev_info(&pdev->dev, "Product Code:      0x%x\n", dev->product_code);
>  	dev_info(&pdev->dev, "Firmware Revision: %u\n", fw_revision);
> +
> +	pm_runtime_set_autosuspend_delay(&pdev->dev, 5000);

Why are we using a 5s delay for autosuspend? Even without an autosuspend delay,
we can go directly to the suspended state when the last instance is closed and
resume when the first instance is opened.

I don't think an autosuspend delay (especially one of 5s) bodes well for
low-power devices such as AM62A, where we would prefer to suspend as soon as
possible once the last instance is closed.

Also, apologies for the delay in review; this didn't catch my eye earlier, as
the commit message did not mention it either.

Regards
Devarsh
  
jackson.lee June 19, 2024, 11:56 p.m. UTC | #2
Hi Devarsh

If no bitstream is being fed during encoding or decoding, the driver is switched to the suspended state automatically by autosuspend.
If we don't use autosuspend, it is very difficult for us to detect whether data is being fed while a pipeline is running.
So autosuspend is a very efficient way to manage the power state.

If the delay is too large, we can adjust it.

Thanks
Jackson
  
jackson.lee June 20, 2024, 12:11 a.m. UTC | #3
One more thing, 

When an instance is closed or started, we currently switch the power state to suspended or resumed immediately.
The autosuspend feature is only used when nothing is being fed while a pipeline is running.
I don't think the delay is too large.

Thanks
  
Devarsh Thakkar June 20, 2024, 9:35 a.m. UTC | #4
Hi Jackson,

On 20/06/24 05:41, jackson.lee wrote:
> 
> 
>> -----Original Message-----
>> From: jackson.lee
>> Sent: Thursday, June 20, 2024 8:57 AM
>> To: Devarsh Thakkar <devarsht@ti.com>; mchehab@kernel.org;
>> nicolas@ndufresne.ca; sebastian.fricke@collabora.com
>> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
>> hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>; lafley.kim
>> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; Nicolas Dufresne
>> <nicolas.dufresne@collabora.com>
>> Subject: RE: [RESEND PATCH v6 2/4] media: chips-media: wave5: Support runtime
>> suspend/resume
>>
>> Hi Devarsh
>>
>> If there is no feeding bitstreams during encoding and decoding frames, then
>> driver's status is switched to suspended automatically by autosuspend.

I think the pm_runtime_*_autosuspend helpers are meant to schedule a delayed
suspend, i.e. after the PM counter goes to 0, the device is suspended after a
timeout period, which is set to 5s in this case.

Even without the pm_runtime_*_autosuspend helpers, i.e. if you use
pm_runtime_resume_and_get on start streaming and pm_runtime_put_sync on stop
streaming, the device gets suspended automatically when not in use, albeit
immediately after the PM counter goes to 0. This is what many codec
device drivers do today [1]. Isn't that sufficient for what we want?

In my view, delayed suspend is generally helpful for devices with higher
resume latencies; e.g. this light sensor driver [2] uses it because the sensor
takes 250ms to stabilize after resuming. I don't see it being used in codec
drivers generally, since there is no such large resume latency. Please let me
know if I am missing something or if there is a strong reason to have delayed
suspend for wave5.
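
To make the two behaviours described above concrete, here is a minimal user-space sketch. It is a toy model, not the kernel runtime PM API: the struct toy_pm type and every toy_* function are hypothetical names invented for illustration, time is a plain millisecond counter passed in by the caller, and the "immediate" path assumes autosuspend is not enabled on the device.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a runtime-PM usage counter; NOT the kernel API. */
struct toy_pm {
	int usage;           /* outstanding "get" references */
	bool active;         /* true while the modelled device is powered */
	long delay_ms;       /* modelled autosuspend delay */
	long idle_since_ms;  /* time at which usage last dropped to 0 */
};

/* Models a resume-and-get: take a reference and power the device up. */
static void toy_get(struct toy_pm *pm)
{
	pm->usage++;
	pm->active = true;
}

/* Models an immediate put: suspend as soon as the counter reaches 0. */
static void toy_put_sync(struct toy_pm *pm)
{
	assert(pm->usage > 0);
	if (--pm->usage == 0)
		pm->active = false;
}

/* Models a delayed put: only record when the idle period started. */
static void toy_put_autosuspend(struct toy_pm *pm, long now_ms)
{
	assert(pm->usage > 0);
	if (--pm->usage == 0)
		pm->idle_since_ms = now_ms;
}

/* Advance simulated time: suspend once idle for at least delay_ms. */
static void toy_tick(struct toy_pm *pm, long now_ms)
{
	if (pm->active && pm->usage == 0 &&
	    now_ms - pm->idle_since_ms >= pm->delay_ms)
		pm->active = false;
}
```

In this model, toy_put_sync() powers the device off on the spot, while toy_put_autosuspend() leaves it powered until toy_tick() observes that the delay has elapsed. A toy_get() arriving inside that window keeps the device powered without a suspend/resume cycle.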

>> And if we don’t use autosuspend, it is very difficult for us to catch if
>> there is feeding or not while working a pipeline.
>> So it is very efficient for managing power status.

As mentioned above, if by autosuspend you mean that the device should
automatically suspend when not in use, then you don't need the
pm_runtime_*_autosuspend helpers (those are actually for delayed suspend).
Instead, use the generic PM helpers pm_runtime_resume_and_get and
pm_runtime_put_sync; the PM core will automatically suspend the device when the
PM counter drops to 0 and resume it when the PM counter is incremented.

>>
>> If the delay is very great value, we can adjust it.
>>

As mentioned above, I feel we don't need the pm_runtime_*_autosuspend
helpers in the first place.

>> Thanks
>> Jackson
>>
> 
> One more thing, 
> 
> When an instance is closed or started, we are currently putting a power status to suspend or resumed immediately.
So I tested this series and see the issues below:

1) It seems to break VPU operation on AM62A with upstream linux-next by
colliding with the polling functionality there, since that device does not have
an IRQ and relies on polling. I see the logs below on bootup:

[2024-06-20 13:01:12] root@am62axx-evm:~# dmesg | tail
[2024-06-20 13:01:16] [   23.744372] x8 : ffff000804248a50 x7 : ffff00087f6ba0c0 x6 : 0000000000000000
[2024-06-20 13:01:16] [   23.744381] x5 : 0000000000f42400 x4 : 0000000000000000 x3 : 0000000000000001
[2024-06-20 13:01:16] [   23.744390] x2 : ffff0008041ad808 x1 : 0000000000000044 x0 : ffff800082150044
[2024-06-20 13:01:16] [   23.744400] Call trace:
[2024-06-20 13:01:16] [   23.744404]  wave5_vdi_read_register+0x8/0x20 [wave5]
[2024-06-20 13:01:16] [   23.744420]  kthread_worker_fn+0xcc/0x184
[2024-06-20 13:01:16] [   23.744432]  kthread+0x118/0x11c
[2024-06-20 13:01:16] [   23.744440]  ret_from_fork+0x10/0x20
[2024-06-20 13:01:16] [   23.744452] Code: b9000022 d65f03c0 f940b000 8b214000 (b9400000)
[2024-06-20 13:01:16] [   23.744458] ---[ end trace 0000000000000000 ]---

I think care needs to be taken to make sure the timer is started after the
device is powered on and stopped before the device is powered off.
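
The ordering constraint above can be sketched in user space as follows. This is only an illustration under stated assumptions: struct toy_vpu and the toy_* functions are hypothetical names, the "timer" is a boolean flag rather than an hrtimer, and a register read is modelled as a plain field access. It is not the wave5 driver code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a polled device whose registers need power to be read. */
struct toy_vpu {
	bool powered;        /* modelled power state */
	bool timer_running;  /* modelled polling timer state */
	unsigned int reg;    /* modelled status register */
};

/* Resume path: power the device first, then start the polling timer. */
static void toy_resume(struct toy_vpu *v)
{
	v->powered = true;
	v->timer_running = true;
}

/* Suspend path: stop the polling timer first, then cut power. */
static void toy_suspend(struct toy_vpu *v)
{
	v->timer_running = false;
	v->powered = false;
}

/*
 * One poll attempt. Returns false when the timer is stopped, otherwise
 * stores the register value in *out and returns true.
 */
static bool toy_poll(const struct toy_vpu *v, unsigned int *out)
{
	if (!v->timer_running)
		return false;
	assert(v->powered);  /* invariant: a running timer implies power */
	*out = v->reg;
	return true;
}
```

Because toy_resume() and toy_suspend() are the only places that change either flag, and they do so in that order, the assertion in toy_poll() cannot fire in this model. Reversing the order in either function would open a window where the poller touches an unpowered device.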


> The autospend feature is being only used when there is no feeding while working a pipeline.

2) I think the above doesn't seem to work. Brandon had a hack patch on the
vendor tree [3] for the AM62A timer; with that I no longer see the crash above,
but I observe a 5 second wait before the device powers off even after the last
instance is closed, as seen here [4]. It seems the power counter is not dropping
to 0 on instance close. You may try to reproduce the same on the j721s2 EVM too.


[1]:
https://gitlab.com/linux-kernel/linux-next/-/blob/master/drivers/media/platform/nxp/imx-jpeg/mxc-jpeg.c?ref_type=heads#L1637
[2]:
https://gitlab.com/linux-kernel/linux-next/-/blob/next-20240619/drivers/iio/light/bh1780.c?ref_type=tags#L179
[3]:
https://git.ti.com/cgit/ti-linux-kernel/ti-linux-kernel/commit/?h=ti-linux-6.6.y-cicd&id=0be8de03825c2834a39af603b088cbf31e19d55d
[4]: https://gist.github.com/devarsht/009075d8706001f447733ed859152d90

Regards
Devarsh
  
Nicolas Dufresne June 20, 2024, 2:03 p.m. UTC | #5
Hi Jackson, Devarsh,

Le mercredi 19 juin 2024 à 23:56 +0000, jackson.lee a écrit :
> Hi Devarsh
> 
> If there is no feeding bitstreams during encoding and decoding frames, then driver's status is switched to suspended automatically by autosuspend.
> And if we don’t use autosuspend, it is very difficult for us to catch if there is feeding or not while working a pipeline.
> So it is very efficient for managing power status.
> 
> If the delay is very great value, we can adjust it.

One way to resolve this would be for someone to share measurements of the
suspend / resume cycle duration. With firmware (a third-party OS) like this, the
cost and duration are a few orders of magnitude higher than with more basic
ASICs like Hantro and other single-function HW.

5s might be too much (but is clearly safe), but going too low may mean that
we suspend "between two frames", and if that happens, we may end up with a
range of side effects, like reduced throughput due to suspend collisions, or
even a worse power footprint. Some lab testing to adjust the value will be
needed; as I understand it, very little of that is happening at the moment.
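
As a rough illustration of that trade-off, the sketch below counts how many idle gaps in a stream would be long enough to trigger a suspend for a given delay. The function name toy_count_suspends and the sample gap values used with it are hypothetical; real tuning would need measured suspend/resume latencies, which this thread does not yet have.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Count the idle gaps (in ms) that are at least as long as the
 * autosuspend delay, i.e. the gaps that would end in a suspend.
 */
static size_t toy_count_suspends(const long *gaps_ms, size_t n, long delay_ms)
{
	size_t count = 0;

	assert(delay_ms >= 0);
	for (size_t i = 0; i < n; i++) {
		if (gaps_ms[i] >= delay_ms)
			count++;
	}
	return count;
}
```

With inter-frame gaps of roughly 33 ms and one long stall, a delay well above the frame interval only suspends on the stall, while a delay below the frame interval suspends between every pair of frames.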

Nicolas
  
Nicolas Dufresne June 20, 2024, 2:05 p.m. UTC | #6
Hi Devarsh,

Le jeudi 20 juin 2024 à 15:05 +0530, Devarsh Thakkar a écrit :
> In my view the delayed suspend functionality is generally helpful for devices
> where resume latencies are higher for e.g. this light sensor driver [2] uses
> it because it takes 250ms to stabilize after resumption and I don't see this
> being used in codec drivers generally since there is no such large resume
> latency. Please let me know if I am missing something or there is a strong
> reason to have delayed suspend for wave5.

It sounds like you did proper scientific testing of the suspend/resume calls;
mind sharing the actual data?

Nicolas
  
Devarsh Thakkar June 20, 2024, 2:20 p.m. UTC | #7
Hi Nicolas,

On 20/06/24 19:35, Nicolas Dufresne wrote:
> Hi Devarsh,
> 
> Le jeudi 20 juin 2024 à 15:05 +0530, Devarsh Thakkar a écrit :
>> In my view the delayed suspend functionality is generally helpful for devices
>> where resume latencies are higher for e.g. this light sensor driver [2] uses
>> it because it takes 250ms to stabilize after resumption and I don't see this
>> being used in codec drivers generally since there is no such large resume
>> latency. Please let me know if I am missing something or there is a strong
>> reason to have delayed suspend for wave5.
> 
> It sounds like you did proper scientific testing of the suspend results calls,
> mind sharing the actual data ?

Nope, I did not do that, but yes, I agree it is good to profile and evaluate
the trade-off. That said, I am not expecting a 250ms kind of latency. I would
suggest that Jackson profile the resume latencies.

As a perhaps separate issue, I noticed that the intention of the patchset was
to suspend without waiting for the timeout if no application holds a handle to
the wave5 device, but even if I close the last instance, I still see the IP
stay on for 5 seconds, as seen in these logs [1]. This could perhaps be because
extra PM counter references are being held.

[2024-06-20 12:32:50] Freeing pipeline ...

and after 5 seconds..

[2024-06-20 12:32:55] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_ON |
[2024-06-20 12:32:56] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_OFF

[1]: https://gist.github.com/devarsht/009075d8706001f447733ed859152d90

Regards
Devarsh
  
Devarsh Thakkar June 20, 2024, 2:52 p.m. UTC | #8
Hi Jackson, Nicolas,

On 20/06/24 19:33, Nicolas Dufresne wrote:
> Hi Jackson, Devarsh,
> 
> Le mercredi 19 juin 2024 à 23:56 +0000, jackson.lee a écrit :
>> Hi Devarsh
>>
>> If there is no feeding bitstreams during encoding and decoding frames, then driver's status is switched to suspended automatically by autosuspend.
>> And if we don’t use autosuspend, it is very difficult for us to catch if there is feeding or not while working a pipeline.
>> So it is very efficient for managing power status.
>>
>> If the delay is very great value, we can adjust it.
> 
> One way to resolve this, would be if someone share measurement of the suspend /
> resume cycle duration. With firmware (third party OS) like this, the cost and
> duration is few order of magnitude higher then with more basic ASIC like Hantro
> and other single function HW.
> 
> Yet, 5s might be to much (but clearly safe), but getting two low may means that
> we suspect "between two frames", and if that happens, we may endup with various
> range of side effect, like reduce throughput due to suspend collisions, or even
> worse power footprint. Some lab testing to adjust the value will be needed, we
> have very little of that happening at the moment as I understood.
> 

Okay, I see. The intention here is that if a process is holding the VPU
device handle and the input feed is stalled for some seconds due to network
delay or CPU throughput, then after a specified timeout, say 5 seconds, we want
to suspend even if the process is still active and holding the VPU device
handle? I agree that if we want to support this feature, a safer, slightly
larger value is required to avoid frequent suspend/resume due to network
jitter or any other bottleneck, and maybe 5s is a good value to start with.

But if the last instance is closed or stops streaming and no process holds
the device handle anymore, then I think we should suspend immediately without
any delay.

Regards
Devarsh
  
Nicolas Dufresne June 20, 2024, 3:20 p.m. UTC | #9
Hi,

One more thing I noticed below when comparing with Hantro ...

Le lundi 17 juin 2024 à 19:48 +0900, Jackson.lee a écrit :
> From: "jackson.lee" <jackson.lee@chipsnmedia.com>
> 

[...]

>  
>  err_enc_unreg:
> @@ -295,6 +334,9 @@ static void wave5_vpu_remove(struct platform_device *pdev)
>  		hrtimer_cancel(&dev->hrtimer);
>  	}
>  
> +	pm_runtime_put_sync(&pdev->dev);

I don't know if it's strictly needed, but I noticed that Hantro calls
pm_runtime_dont_use_autosuspend() in its remove function. Can you check whether
this is needed here? We don't want anything to be called again later if we are
removing the module, so better check.

Nicolas

> +	pm_runtime_disable(&pdev->dev);
> +
>  	mutex_destroy(&dev->dev_lock);
>  	mutex_destroy(&dev->hw_lock);
>  	clk_bulk_disable_unprepare(dev->num_clks, dev->clks);
> @@ -320,6 +362,7 @@ static struct platform_driver wave5_vpu_driver = {
>  	.driver = {
>  		.name = VPU_PLATFORM_DEVICE_NAME,
>  		.of_match_table = of_match_ptr(wave5_dt_ids),
> +		.pm = &wave5_pm_ops,
>  		},
>  	.probe = wave5_vpu_probe,
>  	.remove_new = wave5_vpu_remove,
  
Nicolas Dufresne June 20, 2024, 3:24 p.m. UTC | #10
Le jeudi 20 juin 2024 à 20:22 +0530, Devarsh Thakkar a écrit :
> Hi Jackson, Nicolas,
> 
> On 20/06/24 19:33, Nicolas Dufresne wrote:
> > Hi Jackson, Devarsh,
> > 
> > Le mercredi 19 juin 2024 à 23:56 +0000, jackson.lee a écrit :
> > > Hi Devarsh
> > > 
> > > If there is no feeding bitstreams during encoding and decoding frames, then driver's status is switched to suspended automatically by autosuspend.
> > > And if we don’t use autosuspend, it is very difficult for us to catch if there is feeding or not while working a pipeline.
> > > So it is very efficient for managing power status.
> > > 
> > > If the delay is very great value, we can adjust it.
> > 
> > One way to resolve this, would be if someone share measurement of the suspend /
> > resume cycle duration. With firmware (third party OS) like this, the cost and
> > duration is few order of magnitude higher then with more basic ASIC like Hantro
> > and other single function HW.
> > 
> > Yet, 5s might be to much (but clearly safe), but getting two low may means that
> > we suspect "between two frames", and if that happens, we may endup with various
> > range of side effect, like reduce throughput due to suspend collisions, or even
> > worse power footprint. Some lab testing to adjust the value will be needed, we
> > have very little of that happening at the moment as I understood.
> > 
> 
> Okay I see the intention here is that if there is a process holding the vpu
> device handle and the input feed is stalled for some seconds due to network
> delay or CPU throughput then after a specified timeout say 5 seconds we want
> to suspend even if the process is still active and holding the vpu device
> handle ? I agree then if we want to support this feature a safer/slightly
> larger value is required to avoid frequent suspend/resume due to network
> jitter or any other bottleneck and maybe 5s is a good value to start with.
> 
> But if last instance is closed/stops streaming and there is no process holding
> the device handle anymore then I think we should suspend immediately without
> any delay.

Our emails crossed each other, but see my explanation about gapless playback
transitions, where userspace may destroy and create a new video session. I
believe 5s is way too long, to be honest.

Nicolas

> 
> Regards
> Devarsh
  
Nicolas Dufresne June 20, 2024, 5:32 p.m. UTC | #11
Le jeudi 20 juin 2024 à 19:50 +0530, Devarsh Thakkar a écrit :
> Hi Nicolas,
> 
> On 20/06/24 19:35, Nicolas Dufresne wrote:
> > Hi Devarsh,
> > 
> > Le jeudi 20 juin 2024 à 15:05 +0530, Devarsh Thakkar a écrit :
> > > In my view the delayed suspend functionality is generally helpful for devices
> > > where resume latencies are higher for e.g. this light sensor driver [2] uses
> > > it because it takes 250ms to stabilize after resumption and I don't see this
> > > being used in codec drivers generally since there is no such large resume
> > > latency. Please let me know if I am missing something or there is a strong
> > > reason to have delayed suspend for wave5.
> > 
> > It sounds like you did proper scientific testing of the suspend results calls,
> > mind sharing the actual data ?
> 
> Nopes, I did not do that but yes I agree it is good to profile and evaluate
> the trade-off but I am not expecting 250ms kind of latency. I would suggest
> Jackson to do the profiling for the resume latencies.

I'd clearly like to see numbers before we proceed.

> 
> But perhaps a separate issue, I did notice that intention of the patchset was
> to suspend without waiting for the timeout if there is no application having a
> handle to the wave5 device but even if I close the last instance I still see
> the IP stays on for 5seconds as seen in this logs [1] and this perhaps could
> be because extra pm counter references being hold.

Not sure where this comes from, I'm not aware of drivers doing that with M2M
instances. Only 

> 
> [2024-06-20 12:32:50] Freeing pipeline ...
> 
> and after 5 seconds..
> 
> [2024-06-20 12:32:55] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_ON |
> [2024-06-20 12:32:56] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_OFF
> 
> [1]: https://gist.github.com/devarsht/009075d8706001f447733ed859152d90

Apart from the 5s being too long, that is expected. If it fails after that,
this is a bug, and we should hold off on merging this until the problem has been
resolved.

Imagine that userspace is doing gapless playback: if you have, let's say, a 30ms
forced suspend cycle due to close/open of the decoder instance, it won't
actually end up gapless. The delay will ensure that we only suspend when needed.

There are other changes I have asked for in this series, since we always have
the case where userspace just pauses streaming, and we want a prolonged pause
to lead to suspend. Hopefully this has been strongly tested and is not just
added for "completeness".

It's important to note that, as a reviewer only, my time is limited, and I
completely rely on the author's judgment for delay tuning and actual testing.

Nicolas

> 
> Regards
> Devarsh
  
jackson.lee June 21, 2024, 12:30 a.m. UTC | #12
Hi Nicolas / Devarsh


There are a lot of mails in this thread, and I am confused.
I'd like to make a checklist for the "Support runtime suspend/resume" patch.

1. Profile the resume latency.
2. After that, adjust the delay.

Is the patch set okay apart from the above?

Thanks.
Jackson
  
jackson.lee June 21, 2024, 7:36 a.m. UTC | #13
Hi Nicolas and Devarsh

> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Friday, June 21, 2024 12:24 AM
> To: Devarsh Thakkar <devarsht@ti.com>; jackson.lee
> <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> sebastian.fricke@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com
> Subject: Re: [RESEND PATCH v6 2/4] media: chips-media: wave5: Support runtime
> suspend/resume
> 
> Le jeudi 20 juin 2024 à 20:22 +0530, Devarsh Thakkar a écrit :
> > Hi Jackson, Nicolas,
> >
> > On 20/06/24 19:33, Nicolas Dufresne wrote:
> > > Hi Jackson, Devarsh,
> > >
> > > Le mercredi 19 juin 2024 à 23:56 +0000, jackson.lee a écrit :
> > > > Hi Devarsh
> > > >
> > > > If no bitstream is fed while encoding or decoding frames, the driver
> > > > is switched to the suspended state automatically by autosuspend.
> > > > Without autosuspend, it is very difficult for us to detect whether
> > > > bitstream is being fed while a pipeline is running.
> > > > So autosuspend is a very efficient way to manage the power state.
> > > >
> > > > If the delay is too large, we can adjust it.
> > >
> > > One way to resolve this would be for someone to share measurements of
> > > the suspend / resume cycle duration. With firmware (third party OS)
> > > like this, the cost and duration are a few orders of magnitude higher
> > > than with more basic ASICs like Hantro and other single function HW.
> > >
> > > Yet, 5s might be too much (but clearly safe), but going too low may
> > > mean that we suspend "between two frames", and if that happens, we
> > > may end up with a range of side effects, like reduced throughput
> > > due to suspend collisions, or an even worse power footprint. Some lab
> > > testing to adjust the value will be needed; we have very little of that
> > > happening at the moment as I understood.
> > >
> >
> > Okay I see the intention here is that if there is a process holding
> > the vpu device handle and the input feed is stalled for some seconds
> > due to network delay or CPU throughput then after a specified timeout
> > say 5 seconds we want to suspend even if the process is still active
> > and holding the vpu device handle ? I agree then if we want to support
> > this feature a safer/slightly larger value is required to avoid
> > frequent suspend/resume due to network jitter or any other bottleneck and
> > maybe 5s is a good value to start with.
> >
> > But if last instance is closed/stops streaming and there is no process
> > holding the device handle anymore then I think we should suspend
> > immediately without any delay.
> 
> Our emails crossed each other, but see my explanation about the gapless playback
> transition, where userspace may destroy and create a new video session. I
> believe 5s is way too long to be honest.
> 

I investigated why it takes 5 seconds to suspend even after the last instance is closed.
The reason is that when autosuspend is used, the autosuspend timeout must expire before the device suspends, even though power.usage_count is already 0 after the instance is closed.


So I made the modification below: when the last instance is closed, autosuspend is turned off, and when the first instance is opened, autosuspend is turned on again.
I tested the change and it works well.

Can you review the code below?


diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index 1aa5b6788266..87932d1550ce 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -1714,6 +1714,8 @@ static int wave5_vpu_open_dec(struct file *filp)
        struct vpu_device *dev = video_drvdata(filp);
        struct vpu_instance *inst = NULL;
        struct v4l2_m2m_ctx *m2m_ctx;
+       int inst_count = 0;
+       struct vpu_instance *inst_elm;
        int ret = 0;

        inst = kzalloc(sizeof(*inst), GFP_KERNEL);
@@ -1799,6 +1801,12 @@ static int wave5_vpu_open_dec(struct file *filp)
                hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC),
                              HRTIMER_MODE_REL_PINNED);

+       list_for_each_entry(inst_elm, &dev->instances, list)
+               inst_count++;
+
+       if (!inst_count)
+               pm_runtime_use_autosuspend(inst->dev->dev);
+
        list_add_tail(&inst->list, &dev->instances);

        mutex_unlock(&dev->dev_lock);


diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
index b0911fef232f..05b83445c650 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
@@ -197,6 +197,8 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
        int retry = 0;
        struct vpu_device *vpu_dev = inst->dev;
        int i;
+       int inst_count = 0;
+       struct vpu_instance *inst_elm;

        *fail_res = 0;
        if (!inst->codec_info)
@@ -239,8 +241,14 @@ int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
        wave5_vdi_free_dma_memory(vpu_dev, &p_dec_info->vb_task);

 unlock_and_return:
-       mutex_unlock(&vpu_dev->hw_lock);
+       list_for_each_entry(inst_elm, &vpu_dev->instances, list)
+               inst_count++;
+
+       if (inst_count == 1)
+               pm_runtime_dont_use_autosuspend(inst->dev->dev);
+
        pm_runtime_put_sync(inst->dev->dev);
+       mutex_unlock(&vpu_dev->hw_lock);
        return ret;
 }
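
For readers following the thread, the behaviour discussed in this message can be modelled in plain userspace C. This is only a sketch under stated assumptions: `struct toy_pm` and the `toy_*` helpers are hypothetical names and not kernel APIs, and `toy_put()` always records the last-busy time, which the driver does separately via pm_runtime_mark_last_busy(). The model shows why the device stays on for the full delay after the last reference is dropped when autosuspend is in use, and why it can suspend right away when autosuspend is turned off.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model of a runtime-PM device; NOT the kernel's struct dev_pm_info. */
struct toy_pm {
	int usage_count;       /* outstanding get() references */
	bool use_autosuspend;  /* models pm_runtime_use_autosuspend() being active */
	int64_t delay_ms;      /* models the autosuspend delay */
	int64_t last_busy_ms;  /* models the time recorded by mark_last_busy */
};

/* Take a reference: the device must stay powered. */
void toy_get(struct toy_pm *pm)
{
	pm->usage_count++;
}

/* Drop a reference at time now_ms and record it as the last busy time. */
void toy_put(struct toy_pm *pm, int64_t now_ms)
{
	assert(pm->usage_count > 0);
	pm->usage_count--;
	pm->last_busy_ms = now_ms;
}

/*
 * Earliest time at which the model allows a suspend, or -1 while a
 * reference is still held.
 */
int64_t toy_suspend_time(const struct toy_pm *pm)
{
	if (pm->usage_count > 0)
		return -1;
	if (pm->use_autosuspend)
		return pm->last_busy_ms + pm->delay_ms;
	return pm->last_busy_ms;
}
```

With a 5000 ms delay and the last reference dropped at t = 1000 ms, the model allows a suspend only at t = 6000 ms. Clearing `use_autosuspend` before the final put moves that to t = 1000 ms, which is the effect the proposed change aims for.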



> Nicolas
> 
> >
> > Regards
> > Devarsh
  
Devarsh Thakkar June 21, 2024, 11:55 a.m. UTC | #14
Hi Jackson,

On 21/06/24 06:00, jackson.lee wrote:
> Hi Nicolas / Devarsh
> 
> 
> There are a lot of mail threads in the loop and I am confused.
> I'd like to make a checklist for the "Support runtime suspend/resume" patch.
> 
> 1. Profiling resume latency
> 2. after that, adjusting the time.
> 

Beyond the above two points,

3. I think this patchset also breaks hrtimer polling, and therefore VPU operation
on AM62A, which relies completely on polling. You can test this by removing the
interrupt property from your dts file before/after this patch-set. With
polling, care needs to be taken that polling is started only after the device is
in the powered-on state and is stopped before the device gets suspended.

4. There is some discussion going on between me and Nicolas on whether
delayed suspend is really required after the last instance is closed. My
thought was that we should suspend immediately after the last instance is closed, but
Nicolas mentioned some concerns w.r.t. use-cases such as gapless playback, so I
am following up with him.
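
The ordering constraint in point 3 can be sketched as a small userspace state model. This is an illustration only: `struct toy_vpu` and the `toy_vpu_*` functions are hypothetical and do not correspond to the wave5 driver's data structures or to the hrtimer API. It simply encodes "power up, then start polling" and "stop polling, then power down".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy device state; field names are illustrative, not the wave5 driver's. */
struct toy_vpu {
	bool has_irq;  /* false when the dts has no interrupt property */
	bool powered;  /* true between resume and suspend */
	bool polling;  /* true while the polling timer is armed */
};

/* Resume: power up first, then arm the poll timer if there is no IRQ. */
void toy_vpu_resume(struct toy_vpu *vpu)
{
	assert(vpu != NULL);
	vpu->powered = true;
	if (!vpu->has_irq)
		vpu->polling = true;
}

/* Suspend: cancel the poll timer first, then power down. */
void toy_vpu_suspend(struct toy_vpu *vpu)
{
	assert(vpu != NULL);
	vpu->polling = false;
	vpu->powered = false;
}

/* Invariant from the review comment: never poll a powered-off device. */
bool toy_vpu_state_ok(const struct toy_vpu *vpu)
{
	return !vpu->polling || vpu->powered;
}
```

In a real driver the two assignments would presumably be the timer start/cancel calls and the clock or firmware sleep/wake calls; the point of the sketch is only that the invariant holds at every step when they are issued in this order.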

Regards
Devarsh
  
Devarsh Thakkar June 21, 2024, 12:31 p.m. UTC | #15
Hi Nicolas,

On 20/06/24 23:02, Nicolas Dufresne wrote:
> Le jeudi 20 juin 2024 à 19:50 +0530, Devarsh Thakkar a écrit :
[..]
> Imagine that userspace is doing gapless playback: if you have, let's say, a 30ms
> forced suspend cycle due to close/open of the decoder instance, it won't
> actually end up gapless. The delay will ensure that we only suspend when needed.
> 

Shouldn't applications doing gapless playback avoid frequent open/close of
the decoder instance too, as it adds re-instantiation (initializing hw,
allocating buffers) and cleanup (de-initialization and freeing of buffers)
delays for each open and close respectively? Even in the scenario where the
resolution of the next stream is different from the previous one, I guess the application
can still hold on to the file handle and do the necessary setup (stream
off/stream on/REQBUFS etc.) required for re-initialization?

Regards
Devarsh
  
jackson.lee June 21, 2024, 12:45 p.m. UTC | #16
> -----Original Message-----
> From: Devarsh Thakkar <devarsht@ti.com>
> Sent: Friday, June 21, 2024 8:55 PM
> To: jackson.lee <jackson.lee@chipsnmedia.com>; Nicolas Dufresne
> <nicolas.dufresne@collabora.com>; mchehab@kernel.org;
> sebastian.fricke@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; Luthra, Jai <j-luthra@ti.com>;
> Vibhore <vibhore@ti.com>; Dhruva Gole <d-gole@ti.com>; Aradhya <a-
> bhatia1@ti.com>; Raghavendra, Vignesh <vigneshr@ti.com>
> Subject: Re: [RESEND PATCH v6 2/4] media: chips-media: wave5: Support runtime
> suspend/resume
> 
> Hi Jackson,
> 
> On 21/06/24 06:00, jackson.lee wrote:
> > Hi Nicolas / Devarsh
> >
> >
> > There are lots of mail thread in the loop, I have confusion.
> > I'd like to make check-up list for the "Support runtime suspend/resume"
> patch.
> >
> > 1. Profiling resume latency
> > 2. after that, adjusting the time.
> >
> 
> Beyond above two points,
> 

Hi Brandon

According to today's meeting, should we take care of this?

> 3. I think this patchset also breaks hrtimer polling and so the VPU operation
> on AM62A which completely relies on polling, you can test with removing the
> interrupt property from your dts file before/after this patch-set. With the
> polling it needs to be taken care that polling is started only after device
> is on power-on state and is stopped before device gets suspended.
> 

Hi Devarsh

I have already sent some changes to fix this in the previous e-mail. Please refer to the e-mail.

> 4. There is some discussion going on between me and Nicholas on whether
> delayed suspend is really required after last instance close or not. My
> thought was that we should suspend immediately after last instance close, but
> Nicolas mentioned some concerns w.r.t use-cases such as gapless playback so I
> am following up with him.
> 
> Regards
> Devarsh
  
jackson.lee July 12, 2024, 6:10 a.m. UTC | #17
Hi Nicolas

> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Friday, June 21, 2024 2:33 AM
> To: Devarsh Thakkar <devarsht@ti.com>; jackson.lee
> <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> sebastian.fricke@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; Luthra, Jai <j-luthra@ti.com>;
> Vibhore <vibhore@ti.com>; Dhruva Gole <d-gole@ti.com>; Aradhya <a-
> bhatia1@ti.com>; Raghavendra, Vignesh <vigneshr@ti.com>
> Subject: Re: [RESEND PATCH v6 2/4] media: chips-media: wave5: Support runtime
> suspend/resume
> 
> Le jeudi 20 juin 2024 à 19:50 +0530, Devarsh Thakkar a écrit :
> > Hi Nicolas,
> >
> > On 20/06/24 19:35, Nicolas Dufresne wrote:
> > > Hi Devarsh,
> > >
> > > Le jeudi 20 juin 2024 à 15:05 +0530, Devarsh Thakkar a écrit :
> > > > In my view the delayed suspend functionality is generally helpful
> > > > for devices where resume latencies are higher for e.g. this light
> > > > sensor driver [2] uses it because it takes 250ms to stabilize
> > > > after resumption and I don't see this being used in codec drivers
> > > > generally since there is no such large resume latency. Please let
> > > > me know if I am missing something or there is a strong reason to have
> delayed suspend for wave5.
> > >
> > > It sounds like you did proper scientific testing of the suspend
> > > results calls, mind sharing the actual data ?
> >
> > Nopes, I did not do that but yes I agree it is good to profile and
> > evaluate the trade-off but I am not expecting 250ms kind of latency. I
> > would suggest Jackson to do the profiling for the resume latencies.
> 
> I'd clearly like to see numbers before we proceed.
> 

I measured the resume and suspend latency of our hw block.

Resume : 124 microseconds
Suspend : 355 microseconds

I think a delay of 100ms is enough.
How about this?
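
To put the numbers above in proportion, the arithmetic can be written out as a small C sketch. The function names are made up for illustration; only the two latencies and the 100ms and 5000ms delays come from this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Latencies reported in this message, in microseconds. */
#define RESUME_US  124u
#define SUSPEND_US 355u

/* Time spent in one full suspend + resume round trip. */
uint32_t cycle_cost_us(void)
{
	return RESUME_US + SUSPEND_US;
}

/* How many round trips fit into an autosuspend delay given in milliseconds. */
uint32_t cycles_per_delay(uint32_t delay_ms)
{
	/* Keep the multiplication below within 32 bits. */
	assert(delay_ms <= UINT32_MAX / 1000u);
	return (delay_ms * 1000u) / cycle_cost_us();
}
```

One round trip costs 479 microseconds, so a 100ms delay is a little over 200 times the measured cycle cost, while the 5000ms delay in the patch is more than 10000 times that cost.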

> >
> > But perhaps a separate issue, I did notice that intention of the
> > patchset was to suspend without waiting for the timeout if there is no
> > application having a handle to the wave5 device but even if I close
> > the last instance I still see the IP stays on for 5seconds as seen in
> > this logs [1] and this perhaps could be because extra pm counter references
> being hold.
> 
> Not sure where this comes from, I'm not aware of drivers doing that with M2M
> instances. Only
> 
> >
> > [2024-06-20 12:32:50] Freeing pipeline ...
> >
> > and after 5 seconds..
> >
> > [2024-06-20 12:32:55] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_ON |
> > [2024-06-20 12:32:56] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_OFF
> >
> > [1]: https://gist.github.com/devarsht/009075d8706001f447733ed859152d90
> 
> Appart from the 5s being too long, that is expected. If it fails after that,
> this is a bug, we we should hold on merging this until the problem has been
> resolved.
> 

After 5 seconds, the hw does go to suspend, so there is no bug in the current patch-set.


Thanks


> Imagine that userspace is going gapless playback, if you have a lets say 30ms
> on forced suspend cycle due to close/open of the decoder instance, it won't
> actually endup gapless. The delay will ensure that we only suspend when
> needed.
> 
> There is other changes I have asked in this series, since we always have the
> case where userspace just pause on streaming, and we want that prolonged
> paused lead to suspend. Hopefully this has been strongly tested and is not
> just added for "completeness".
> 
> Its important to note that has a reviewer only, my time is limited, and I
> completely rely on the author judgment of delay tuning and actual testing.
> 
> Nicolas
> 
> >
> > Regards
> > Devarsh
  
Nicolas Dufresne July 15, 2024, 5:01 p.m. UTC | #18
Hi,

Le vendredi 21 juin 2024 à 18:01 +0530, Devarsh Thakkar a écrit :
> Hi Nicolas,
> 
> On 20/06/24 23:02, Nicolas Dufresne wrote:
> > Le jeudi 20 juin 2024 à 19:50 +0530, Devarsh Thakkar a écrit :
> [..]
>  > Imagine that userspace is going gapless playback, if you have a lets say
> 30ms on
> > forced suspend cycle due to close/open of the decoder instance, it won't
> > actually endup gapless. The delay will ensure that we only suspend when needed.
> > 
> 
> Shouldn't the applications doing gapless playback avoid frequent open/close of
> the decoder instance too as it will add up re-instantiation (initializing hw,
> allocating buffers) and cleanup (de-initialization and freeing up of buffers)
> delay for each open/close respectively ? Even in case of scenario where
> resolution of next stream is different than previous, I guess the application
> can still hold up the file handle and do the necessary setup (stream
> off/stream on/REQBUFS etc) required for re-initialization ?

I don't have a very strong opinion here; I usually try to avoid optimizing for
what userspace should do. It would be best to build your opinion on your own
testing of existing userspace (perhaps not just GStreamer).

I think if you have a good reason to force suspend when the last instance is
destroyed, please do so (e.g. stability issues, race conditions etc.). So far, I
don't personally know what the issue is with leaving a small delay in order to
avoid a suspend / resume cycle if one quickly closes the last instance and opens
the next one immediately. A comment would be nice, so no one falls into such a trap
later.
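
The trade-off described here can be illustrated with a short userspace C sketch. It assumes a simplified model in which a suspend/resume cycle happens whenever the gap between closing the last instance and opening the next one is at least as long as the autosuspend delay; `count_suspend_cycles` is a hypothetical helper, and the gap values used with it are invented for illustration, not measurements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many close/open gaps are long enough for the autosuspend
 * timer to expire, i.e. how many suspend/resume cycles the model incurs.
 * A delay of 0 models suspending immediately on last close.
 */
size_t count_suspend_cycles(const uint32_t *gaps_ms, size_t n, uint32_t delay_ms)
{
	size_t cycles = 0;

	assert(gaps_ms != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (gaps_ms[i] >= delay_ms)
			cycles++;
	}
	return cycles;
}
```

For example, with gaps of 5, 30, 250 and 6000 ms, an immediate suspend incurs a cycle on every reopen, a 100ms delay incurs one only for the two longer gaps, and a 5000ms delay only for the longest one.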

Nicolas
  
Nicolas Dufresne July 15, 2024, 5:05 p.m. UTC | #19
Hi Jackson,

Le vendredi 12 juillet 2024 à 06:10 +0000, jackson.lee a écrit :
> Hi Nicolas
> 
> > -----Original Message-----
> > From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > Sent: Friday, June 21, 2024 2:33 AM
> > To: Devarsh Thakkar <devarsht@ti.com>; jackson.lee
> > <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> > sebastian.fricke@collabora.com
> > Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> > hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>; lafley.kim
> > <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; Luthra, Jai <j-luthra@ti.com>;
> > Vibhore <vibhore@ti.com>; Dhruva Gole <d-gole@ti.com>; Aradhya <a-
> > bhatia1@ti.com>; Raghavendra, Vignesh <vigneshr@ti.com>
> > Subject: Re: [RESEND PATCH v6 2/4] media: chips-media: wave5: Support runtime
> > suspend/resume
> > 
> > Le jeudi 20 juin 2024 à 19:50 +0530, Devarsh Thakkar a écrit :
> > > Hi Nicolas,
> > > 
> > > On 20/06/24 19:35, Nicolas Dufresne wrote:
> > > > Hi Devarsh,
> > > > 
> > > > Le jeudi 20 juin 2024 à 15:05 +0530, Devarsh Thakkar a écrit :
> > > > > In my view the delayed suspend functionality is generally helpful
> > > > > for devices where resume latencies are higher for e.g. this light
> > > > > sensor driver [2] uses it because it takes 250ms to stabilize
> > > > > after resumption and I don't see this being used in codec drivers
> > > > > generally since there is no such large resume latency. Please let
> > > > > me know if I am missing something or there is a strong reason to have
> > delayed suspend for wave5.
> > > > 
> > > > It sounds like you did proper scientific testing of the suspend
> > > > results calls, mind sharing the actual data ?
> > > 
> > > Nopes, I did not do that but yes I agree it is good to profile and
> > > evaluate the trade-off but I am not expecting 250ms kind of latency. I
> > > would suggest Jackson to do the profiling for the resume latencies.
> > 
> > I'd clearly like to see numbers before we proceed.
> > 
> 
> I measured latency for the resume and suspend of our hw block.
> 
> Resume : 124 microsecond
> Suspend : 355 microsecond
> 
> I think if the delay is 100ms, it is enough.
> How about this ?

That seems to be a very fast operation indeed, so a very small delay seems logical. I believe
this is similar to what other drivers use, so it sounds good to me.

**If** we decide to go lower or drop the delay, then I'd like to see someone's
benchmark showing that suspending during playback does improve power
efficiency without reducing throughput.

Nicolas

> 
> > > 
> > > But perhaps a separate issue, I did notice that intention of the
> > > patchset was to suspend without waiting for the timeout if there is no
> > > application having a handle to the wave5 device but even if I close
> > > the last instance I still see the IP stays on for 5seconds as seen in
> > > this logs [1] and this perhaps could be because extra pm counter references
> > being hold.
> > 
> > Not sure where this comes from, I'm not aware of drivers doing that with M2M
> > instances. Only
> > 
> > > 
> > > [2024-06-20 12:32:50] Freeing pipeline ...
> > > 
> > > and after 5 seconds..
> > > 
> > > [2024-06-20 12:32:55] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_ON |
> > > [2024-06-20 12:32:56] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_OFF
> > > 
> > > [1]: https://gist.github.com/devarsht/009075d8706001f447733ed859152d90
> > 
> > Appart from the 5s being too long, that is expected. If it fails after that,
> > this is a bug, we we should hold on merging this until the problem has been
> > resolved.
> > 
> 
> After 5sec, the hw goes to suspend. So there is no bug in the current patch-set.
> 
> 
> Thanks
> 
> 
> > Imagine that userspace is going gapless playback, if you have a lets say 30ms
> > on forced suspend cycle due to close/open of the decoder instance, it won't
> > actually endup gapless. The delay will ensure that we only suspend when
> > needed.
> > 
> > There is other changes I have asked in this series, since we always have the
> > case where userspace just pause on streaming, and we want that prolonged
> > paused lead to suspend. Hopefully this has been strongly tested and is not
> > just added for "completeness".
> > 
> > Its important to note that has a reviewer only, my time is limited, and I
> > completely rely on the author judgment of delay tuning and actual testing.
> > 
> > Nicolas
> > 
> > > 
> > > Regards
> > > Devarsh
>
  
jackson.lee July 16, 2024, 1:02 a.m. UTC | #20
Hi Nicolas

Thanks for your reply.

> -----Original Message-----
> From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> Sent: Tuesday, July 16, 2024 2:06 AM
> To: jackson.lee <jackson.lee@chipsnmedia.com>; Devarsh Thakkar
> <devarsht@ti.com>; mchehab@kernel.org; sebastian.fricke@collabora.com
> Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>; lafley.kim
> <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; Luthra, Jai <j-luthra@ti.com>;
> Vibhore <vibhore@ti.com>; Dhruva Gole <d-gole@ti.com>; Aradhya <a-
> bhatia1@ti.com>; Raghavendra, Vignesh <vigneshr@ti.com>
> Subject: Re: [RESEND PATCH v6 2/4] media: chips-media: wave5: Support runtime
> suspend/resume
> 
> Hi Jackson,
> 
> Le vendredi 12 juillet 2024 à 06:10 +0000, jackson.lee a écrit :
> > Hi Nicolas
> >
> > > -----Original Message-----
> > > From: Nicolas Dufresne <nicolas.dufresne@collabora.com>
> > > Sent: Friday, June 21, 2024 2:33 AM
> > > To: Devarsh Thakkar <devarsht@ti.com>; jackson.lee
> > > <jackson.lee@chipsnmedia.com>; mchehab@kernel.org;
> > > sebastian.fricke@collabora.com
> > > Cc: linux-media@vger.kernel.org; linux-kernel@vger.kernel.org;
> > > hverkuil@xs4all.nl; Nas Chung <nas.chung@chipsnmedia.com>;
> > > lafley.kim <lafley.kim@chipsnmedia.com>; b-brnich@ti.com; Luthra,
> > > Jai <j-luthra@ti.com>; Vibhore <vibhore@ti.com>; Dhruva Gole
> > > <d-gole@ti.com>; Aradhya <a- bhatia1@ti.com>; Raghavendra, Vignesh
> > > <vigneshr@ti.com>
> > > Subject: Re: [RESEND PATCH v6 2/4] media: chips-media: wave5:
> > > Support runtime suspend/resume
> > >
> > > Le jeudi 20 juin 2024 à 19:50 +0530, Devarsh Thakkar a écrit :
> > > > Hi Nicolas,
> > > >
> > > > On 20/06/24 19:35, Nicolas Dufresne wrote:
> > > > > Hi Devarsh,
> > > > >
> > > > > Le jeudi 20 juin 2024 à 15:05 +0530, Devarsh Thakkar a écrit :
> > > > > > In my view the delayed suspend functionality is generally
> > > > > > helpful for devices where resume latencies are higher for e.g.
> > > > > > this light sensor driver [2] uses it because it takes 250ms to
> > > > > > stabilize after resumption and I don't see this being used in
> > > > > > codec drivers generally since there is no such large resume
> > > > > > latency. Please let me know if I am missing something or there
> > > > > > is a strong reason to have
> > > delayed suspend for wave5.
> > > > >
> > > > > It sounds like you did proper scientific testing of the suspend
> > > > > results calls, mind sharing the actual data ?
> > > >
> > > > Nopes, I did not do that but yes I agree it is good to profile and
> > > > evaluate the trade-off but I am not expecting 250ms kind of
> > > > latency. I would suggest Jackson to do the profiling for the resume
> latencies.
> > >
> > > I'd clearly like to see numbers before we proceed.
> > >
> >
> > I measured latency for the resume and suspend of our hw block.
> >
> > Resume : 124 microsecond
> > Suspend : 355 microsecond
> >
> > I think if the delay is 100ms, it is enough.
> > How about this ?
> 
> Seem very fast operation indeed, so a very small delay seems logical. I
> believe this is similar to what other drivers uses, so sounds good to me.
> 
> **If** we decide to go lower or drop the delay, then I'd like see someones
> benchmark that show that doing suspend during playback does improve power
> efficiently without reducing throughput.
> 
> Nicolas
> 
> >
> > > >
> > > > But perhaps a separate issue, I did notice that intention of the
> > > > patchset was to suspend without waiting for the timeout if there
> > > > is no application having a handle to the wave5 device but even if
> > > > I close the last instance I still see the IP stays on for 5seconds
> > > > as seen in this logs [1] and this perhaps could be because extra
> > > > pm counter references
> > > being hold.
> > >
> > > Not sure where this comes from, I'm not aware of drivers doing that
> > > with M2M instances. Only
> > >
> > > >
> > > > [2024-06-20 12:32:50] Freeing pipeline ...
> > > >
> > > > and after 5 seconds..
> > > >
> > > > [2024-06-20 12:32:55] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_ON
> |
> > > > [2024-06-20 12:32:56] |   204     | AM62AX_DEV_CODEC0 | DEVICE_STATE_OFF
> > > >
> > > > [1]:
> > > > https://gist.github.com/devarsht/009075d8706001f447733ed859152d90
> > >
> > > Appart from the 5s being too long, that is expected. If it fails
> > > after that, this is a bug, we we should hold on merging this until
> > > the problem has been resolved.
> > >
> >
> > After 5sec, the hw goes to suspend. So there is no bug in the current
> patch-set.
> >
> >
> > Thanks
> >
> >
> > > Imagine that userspace is going gapless playback, if you have a lets
> > > say 30ms on forced suspend cycle due to close/open of the decoder
> > > instance, it won't actually endup gapless. The delay will ensure
> > > that we only suspend when needed.
> > >
> > > There is other changes I have asked in this series, since we always
> > > have the case where userspace just pause on streaming, and we want
> > > that prolonged paused lead to suspend. Hopefully this has been
> > > strongly tested and is not just added for "completeness".
> > >
> > > Its important to note that has a reviewer only, my time is limited,
> > > and I completely rely on the author judgment of delay tuning and actual
> testing.
> > >
> > > Nicolas
> > >
> > > >
> > > > Regards
> > > > Devarsh
> >
  

Patch

diff --git a/drivers/media/platform/chips-media/wave5/wave5-hw.c b/drivers/media/platform/chips-media/wave5/wave5-hw.c
index 6ef5bd5fb325..dcdb1eab0174 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-hw.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-hw.c
@@ -1084,8 +1084,8 @@  int wave5_vpu_re_init(struct device *dev, u8 *fw, size_t size)
 	return setup_wave5_properties(dev);
 }
 
-static int wave5_vpu_sleep_wake(struct device *dev, bool i_sleep_wake, const uint16_t *code,
-				size_t size)
+int wave5_vpu_sleep_wake(struct device *dev, bool i_sleep_wake, const uint16_t *code,
+			 size_t size)
 {
 	u32 reg_val;
 	struct vpu_buf *common_vb;
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
index c8624c681fa6..861a0664047c 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c
@@ -5,6 +5,7 @@ 
  * Copyright (C) 2021-2023 CHIPS&MEDIA INC
  */
 
+#include <linux/pm_runtime.h>
 #include "wave5-helper.h"
 
 #define VPU_DEC_DEV_NAME "C&M Wave5 VPU decoder"
@@ -518,6 +519,8 @@  static void wave5_vpu_dec_finish_decode(struct vpu_instance *inst)
 	if (q_status.report_queue_count == 0 &&
 	    (q_status.instance_queue_count == 0 || dec_info.sequence_changed)) {
 		dev_dbg(inst->dev->dev, "%s: finishing job.\n", __func__);
+		pm_runtime_mark_last_busy(inst->dev->dev);
+		pm_runtime_put_autosuspend(inst->dev->dev);
 		v4l2_m2m_job_finish(inst->v4l2_m2m_dev, m2m_ctx);
 	}
 }
@@ -1382,6 +1385,7 @@  static int wave5_vpu_dec_start_streaming(struct vb2_queue *q, unsigned int count
 	int ret = 0;
 
 	dev_dbg(inst->dev->dev, "%s: type: %u\n", __func__, q->type);
+	pm_runtime_resume_and_get(inst->dev->dev);
 
 	v4l2_m2m_update_start_streaming_state(m2m_ctx, q);
 
@@ -1425,13 +1429,15 @@  static int wave5_vpu_dec_start_streaming(struct vb2_queue *q, unsigned int count
 			}
 		}
 	}
-
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 	return ret;
 
 free_bitstream_vbuf:
 	wave5_vdi_free_dma_memory(inst->dev, &inst->bitstream_vbuf);
 return_buffers:
 	wave5_return_bufs(q, VB2_BUF_STATE_QUEUED);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 	return ret;
 }
 
@@ -1517,6 +1523,7 @@  static void wave5_vpu_dec_stop_streaming(struct vb2_queue *q)
 	bool check_cmd = TRUE;
 
 	dev_dbg(inst->dev->dev, "%s: type: %u\n", __func__, q->type);
+	pm_runtime_resume_and_get(inst->dev->dev);
 
 	while (check_cmd) {
 		struct queue_status_info q_status;
@@ -1540,6 +1547,9 @@  static void wave5_vpu_dec_stop_streaming(struct vb2_queue *q)
 		streamoff_output(q);
 	else
 		streamoff_capture(q);
+
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 }
 
 static const struct vb2_ops wave5_vpu_dec_vb2_ops = {
@@ -1626,7 +1636,7 @@  static void wave5_vpu_dec_device_run(void *priv)
 	int ret = 0;
 
 	dev_dbg(inst->dev->dev, "%s: Fill the ring buffer with new bitstream data", __func__);
-
+	pm_runtime_resume_and_get(inst->dev->dev);
 	ret = fill_ringbuffer(inst);
 	if (ret) {
 		dev_warn(inst->dev->dev, "Filling ring buffer failed\n");
@@ -1709,6 +1719,8 @@  static void wave5_vpu_dec_device_run(void *priv)
 
 finish_job_and_return:
 	dev_dbg(inst->dev->dev, "%s: leave and finish job", __func__);
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 	v4l2_m2m_job_finish(inst->v4l2_m2m_dev, m2m_ctx);
 }
 
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
index a23908011a39..703fd8d1c7da 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c
@@ -5,6 +5,7 @@ 
  * Copyright (C) 2021-2023 CHIPS&MEDIA INC
  */
 
+#include <linux/pm_runtime.h>
 #include "wave5-helper.h"
 
 #define VPU_ENC_DEV_NAME "C&M Wave5 VPU encoder"
@@ -1310,6 +1311,7 @@  static int wave5_vpu_enc_start_streaming(struct vb2_queue *q, unsigned int count
 	struct v4l2_m2m_ctx *m2m_ctx = inst->v4l2_fh.m2m_ctx;
 	int ret = 0;
 
+	pm_runtime_resume_and_get(inst->dev->dev);
 	v4l2_m2m_update_start_streaming_state(m2m_ctx, q);
 
 	if (inst->state == VPU_INST_STATE_NONE && q->type == V4L2_BUF_TYPE_VIDEO_OUTPUT_MPLANE) {
@@ -1364,9 +1366,13 @@  static int wave5_vpu_enc_start_streaming(struct vb2_queue *q, unsigned int count
 	if (ret)
 		goto return_buffers;
 
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 	return 0;
 return_buffers:
 	wave5_return_bufs(q, VB2_BUF_STATE_QUEUED);
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 	return ret;
 }
 
@@ -1408,6 +1414,7 @@  static void wave5_vpu_enc_stop_streaming(struct vb2_queue *q)
 	 */
 
 	dev_dbg(inst->dev->dev, "%s: type: %u\n", __func__, q->type);
+	pm_runtime_resume_and_get(inst->dev->dev);
 
 	if (wave5_vpu_both_queues_are_streaming(inst))
 		switch_state(inst, VPU_INST_STATE_STOP);
@@ -1432,6 +1439,9 @@  static void wave5_vpu_enc_stop_streaming(struct vb2_queue *q)
 		streamoff_output(inst, q);
 	else
 		streamoff_capture(inst, q);
+
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 }
 
 static const struct vb2_ops wave5_vpu_enc_vb2_ops = {
@@ -1478,6 +1488,7 @@  static void wave5_vpu_enc_device_run(void *priv)
 	u32 fail_res = 0;
 	int ret = 0;
 
+	pm_runtime_resume_and_get(inst->dev->dev);
 	switch (inst->state) {
 	case VPU_INST_STATE_PIC_RUN:
 		ret = start_encode(inst, &fail_res);
@@ -1491,6 +1502,8 @@  static void wave5_vpu_enc_device_run(void *priv)
 			break;
 		}
 		dev_dbg(inst->dev->dev, "%s: leave with active job", __func__);
+		pm_runtime_mark_last_busy(inst->dev->dev);
+		pm_runtime_put_autosuspend(inst->dev->dev);
 		return;
 	default:
 		WARN(1, "Execution of a job in state %s is invalid.\n",
@@ -1498,6 +1511,8 @@  static void wave5_vpu_enc_device_run(void *priv)
 		break;
 	}
 	dev_dbg(inst->dev->dev, "%s: leave and finish job", __func__);
+	pm_runtime_mark_last_busy(inst->dev->dev);
+	pm_runtime_put_autosuspend(inst->dev->dev);
 	v4l2_m2m_job_finish(inst->v4l2_m2m_dev, m2m_ctx);
 }
 
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c
index 68a519ac412d..0e7c1c255563 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c
@@ -10,6 +10,7 @@ 
 #include <linux/clk.h>
 #include <linux/firmware.h>
 #include <linux/interrupt.h>
+#include <linux/pm_runtime.h>
 #include "wave5-vpu.h"
 #include "wave5-regdefine.h"
 #include "wave5-vpuconfig.h"
@@ -145,6 +146,38 @@  static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name,
 	return 0;
 }
 
+static __maybe_unused int wave5_pm_suspend(struct device *dev)
+{
+	struct vpu_device *vpu = dev_get_drvdata(dev);
+
+	if (pm_runtime_suspended(dev))
+		return 0;
+
+	wave5_vpu_sleep_wake(dev, true, NULL, 0);
+	clk_bulk_disable_unprepare(vpu->num_clks, vpu->clks);
+
+	return 0;
+}
+
+static __maybe_unused int wave5_pm_resume(struct device *dev)
+{
+	struct vpu_device *vpu = dev_get_drvdata(dev);
+	int ret = 0;
+
+	wave5_vpu_sleep_wake(dev, false, NULL, 0);
+	ret = clk_bulk_prepare_enable(vpu->num_clks, vpu->clks);
+	if (ret) {
+		dev_err(dev, "Enabling clocks, fail: %d\n", ret);
+		return ret;
+	}
+
+	return ret;
+}
+
+static const struct dev_pm_ops wave5_pm_ops = {
+	SET_RUNTIME_PM_OPS(wave5_pm_suspend, wave5_pm_resume, NULL)
+};
+
 static int wave5_vpu_probe(struct platform_device *pdev)
 {
 	int ret;
@@ -268,6 +301,12 @@  static int wave5_vpu_probe(struct platform_device *pdev)
 		 (match_data->flags & WAVE5_IS_DEC) ? "'DECODE'" : "");
 	dev_info(&pdev->dev, "Product Code:      0x%x\n", dev->product_code);
 	dev_info(&pdev->dev, "Firmware Revision: %u\n", fw_revision);
+
+	pm_runtime_set_autosuspend_delay(&pdev->dev, 5000);
+	pm_runtime_use_autosuspend(&pdev->dev);
+	pm_runtime_enable(&pdev->dev);
+	wave5_vpu_sleep_wake(&pdev->dev, true, NULL, 0);
+
 	return 0;
 
 err_enc_unreg:
@@ -295,6 +334,9 @@  static void wave5_vpu_remove(struct platform_device *pdev)
 		hrtimer_cancel(&dev->hrtimer);
 	}
 
+	pm_runtime_put_sync(&pdev->dev);
+	pm_runtime_disable(&pdev->dev);
+
 	mutex_destroy(&dev->dev_lock);
 	mutex_destroy(&dev->hw_lock);
 	clk_bulk_disable_unprepare(dev->num_clks, dev->clks);
@@ -320,6 +362,7 @@  static struct platform_driver wave5_vpu_driver = {
 	.driver = {
 		.name = VPU_PLATFORM_DEVICE_NAME,
 		.of_match_table = of_match_ptr(wave5_dt_ids),
+		.pm = &wave5_pm_ops,
 		},
 	.probe = wave5_vpu_probe,
 	.remove_new = wave5_vpu_remove,
diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
index 1a3efb638dde..b0911fef232f 100644
--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.c
@@ -6,6 +6,8 @@ 
  */
 
 #include <linux/bug.h>
+#include <linux/pm_runtime.h>
+#include <linux/delay.h>
 #include "wave5-vpuapi.h"
 #include "wave5-regdefine.h"
 #include "wave5.h"
@@ -200,9 +202,13 @@  int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 	if (!inst->codec_info)
 		return -EINVAL;
 
+	pm_runtime_resume_and_get(inst->dev->dev);
+
 	ret = mutex_lock_interruptible(&vpu_dev->hw_lock);
-	if (ret)
+	if (ret) {
+		pm_runtime_put_sync(inst->dev->dev);
 		return ret;
+	}
 
 	do {
 		ret = wave5_vpu_dec_finish_seq(inst, fail_res);
@@ -234,7 +240,7 @@  int wave5_vpu_dec_close(struct vpu_instance *inst, u32 *fail_res)
 
 unlock_and_return:
 	mutex_unlock(&vpu_dev->hw_lock);
-
+	pm_runtime_put_sync(inst->dev->dev);
 	return ret;
 }
 
@@ -702,6 +708,8 @@  int wave5_vpu_enc_close(struct vpu_instance *inst, u32 *fail_res)
 	if (!inst->codec_info)
 		return -EINVAL;
 
+	pm_runtime_resume_and_get(inst->dev->dev);
+
 	ret = mutex_lock_interruptible(&vpu_dev->hw_lock);
 	if (ret)
 		return ret;
@@ -733,9 +741,9 @@  int wave5_vpu_enc_close(struct vpu_instance *inst, u32 *fail_res)
 	}
 
 	wave5_vdi_free_dma_memory(vpu_dev, &p_enc_info->vb_task);
-
 	mutex_unlock(&vpu_dev->hw_lock);
 
+	pm_runtime_put_sync(inst->dev->dev);
 	return 0;
 }
 
diff --git a/drivers/media/platform/chips-media/wave5/wave5.h b/drivers/media/platform/chips-media/wave5/wave5.h
index 063028eccd3b..6125eff938a8 100644
--- a/drivers/media/platform/chips-media/wave5/wave5.h
+++ b/drivers/media/platform/chips-media/wave5/wave5.h
@@ -56,6 +56,9 @@  int wave5_vpu_get_version(struct vpu_device *vpu_dev, u32 *revision);
 
 int wave5_vpu_init(struct device *dev, u8 *fw, size_t size);
 
+int wave5_vpu_sleep_wake(struct device *dev, bool i_sleep_wake, const uint16_t *code,
+			 size_t size);
+
 int wave5_vpu_reset(struct device *dev, enum sw_reset_mode reset_mode);
 
 int wave5_vpu_build_up_dec_param(struct vpu_instance *inst, struct dec_open_param *param);