media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run

Message ID 20231020040732.2499269-1-zyytlz.wz@163.com (mailing list archive)
State Superseded
Series media: mtk-jpeg: Fix use after free bug due to error path handling in mtk_jpeg_dec_device_run

Commit Message

Zheng Wang Oct. 20, 2023, 4:07 a.m. UTC
In mtk_jpeg_probe, &jpeg->job_timeout_work is bound to
mtk_jpeg_job_timeout_work.

In mtk_jpeg_dec_device_run, if an error happens in
mtk_jpeg_set_dec_dst, the timeout worker has already been started,
yet the job is also marked as finished by invoking v4l2_m2m_job_finish.

There are two ways to trigger the bug. Removing the module calls
mtk_jpeg_remove to clean up. The following sequence is then
possible and causes a use-after-free bug.

CPU0                |CPU1
mtk_jpeg_dec_...    |
  start worker      |
                    |mtk_jpeg_job_timeout_work
mtk_jpeg_remove     |
  v4l2_m2m_release  |
    kfree(m2m_dev); |
                    |
                    |  v4l2_m2m_get_curr_priv
                    |    m2m_dev->curr_ctx //use

Closing the file descriptor, which calls mtk_jpeg_release, leads to
a similar sequence.
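
The remove-path interleaving can be modelled in userspace as a minimal
sketch. Everything in it is hypothetical: struct model_m2m_dev, struct
model_jpeg and the model_* helpers are illustrative stand-ins, not kernel
code. A flag stands in for kfree(m2m_dev) so that the stale access is
counted rather than actually performed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the m2m device; not the kernel structure. */
struct model_m2m_dev {
	bool freed;    /* set where the real code would kfree(m2m_dev) */
	int  curr_ctx; /* stands in for m2m_dev->curr_ctx */
};

/* Hypothetical stand-in for the driver state. */
struct model_jpeg {
	struct model_m2m_dev *m2m_dev;
	bool timeout_pending; /* models a queued job_timeout_work */
	int  stale_accesses;  /* handler runs that saw a freed m2m_dev */
};

/* CPU0: the decode path starts the timeout worker. */
static void model_start_worker(struct model_jpeg *j)
{
	j->timeout_pending = true;
}

/* CPU0: the remove path releases the m2m device without cancelling. */
static void model_remove(struct model_jpeg *j)
{
	j->m2m_dev->freed = true;
}

/* CPU1: the timeout handler runs later and looks at the m2m device. */
static void model_timeout_work(struct model_jpeg *j)
{
	if (!j->timeout_pending)
		return;
	j->timeout_pending = false;
	if (j->m2m_dev->freed)
		j->stale_accesses++; /* in the kernel this read is the use-after-free */
}
```

Calling model_start_worker(), model_remove() and model_timeout_work() in
that order leaves stale_accesses at 1, which corresponds to the
m2m_dev->curr_ctx read in the diagram.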

Fix this bug by starting the timeout worker only if the jpegdec worker
was started successfully, so that v4l2_m2m_job_finish is called from
either mtk_jpeg_job_timeout_work or mtk_jpeg_dec_device_run, but not both.
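
The effect of the reordering can be sketched with a small userspace
model. It is an illustration under stated assumptions, not driver code:
struct model_dev and the model_* functions are hypothetical names, and a
counter stands in for calls to v4l2_m2m_job_finish.

```c
#include <stdbool.h>

/* Userspace model only: hypothetical stand-ins, not kernel APIs. */
struct model_dev {
	bool timeout_pending;  /* models a queued job_timeout_work */
	int  job_finish_calls; /* models calls to v4l2_m2m_job_finish() */
};

/* Models the dec_end error path, which finishes the job immediately. */
static void model_dec_end(struct model_dev *d)
{
	d->job_finish_calls++;
}

/* Ordering before the patch: arm the timeout, then set up the destination. */
static void model_device_run_old(struct model_dev *d, bool set_dst_fails)
{
	d->timeout_pending = true;
	if (set_dst_fails) {
		model_dec_end(d); /* the timeout stays armed */
		return;
	}
	/* the hardware would be programmed here */
}

/* Ordering after the patch: arm the timeout only once setup succeeded. */
static void model_device_run_new(struct model_dev *d, bool set_dst_fails)
{
	if (set_dst_fails) {
		model_dec_end(d); /* nothing armed, nothing left to fire */
		return;
	}
	d->timeout_pending = true;
	/* the hardware would be programmed here */
}

/* Models the timeout handler, which also finishes the job when it fires. */
static void model_timeout_fire(struct model_dev *d)
{
	if (!d->timeout_pending)
		return;
	d->timeout_pending = false;
	d->job_finish_calls++;
}
```

With set_dst_fails set to true, the old ordering followed by
model_timeout_fire() finishes the job twice, while the new ordering
finishes it once and leaves no timeout pending.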

This patch also reverts commit c677d7ae8314
("media: mtk-jpeg: Fix use after free bug due to uncanceled work"),
because this patch also fixes the use-after-free bug addressed there.
Before mtk_jpeg_remove is invoked, mtk_jpeg_release must be invoked
to close any open files. It calls v4l2_m2m_cancel_job, which waits
for the timeout worker to finish, so the cancellation in
mtk_jpeg_remove is unnecessary.

Fixes: b2f0d2724ba4 ("[media] vcodec: mediatek: Add Mediatek JPEG Decoder Driver")
Signed-off-by: Zheng Wang <zyytlz.wz@163.com>
Signed-off-by: Dmitry Osipenko <dmitry.osipenko@collabora.com>
Cc: stable@vger.kernel.org
---
 .../media/platform/mediatek/jpeg/mtk_jpeg_core.c    | 13 ++++++-------
 1 file changed, 6 insertions(+), 7 deletions(-)
  

Comments

AngeloGioacchino Del Regno Oct. 20, 2023, 8:20 a.m. UTC | #1
Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
  
Alexandre Mergnat Oct. 20, 2023, 10:32 a.m. UTC | #2
Reviewed-by: Alexandre Mergnat <amergnat@baylibre.com>

  
Dmitry Osipenko Oct. 24, 2023, 1:18 p.m. UTC | #3
How about splitting this patch into three patches:

1. remove cancel_delayed_work_sync()
2. update mtk_jpeg_dec_device_run()
3. update mtk_jpegdec_worker()

The reason for splitting is that the multi-core mtk_jpegdec_worker()
isn't present in older stable kernels, and thus the patch isn't
backportable as-is.
  
Zheng Hacker Oct. 26, 2023, 2:37 a.m. UTC | #4
Got it. I'll figure out how to split it up.

Thanks,
Zheng

Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote on Tue, Oct 24, 2023 at 21:18:
>
> How about splitting this patch into three patches:
>
> 1. remove cancel_delayed_work_sync()
> 2. update mtk_jpeg_dec_device_run()
> 3. update mtk_jpegdec_worker()
>
> The reason for splitting is that the multi-core mtk_jpegdec_worker()
> isn't present in older stable kernels, and thus the patch isn't
> backportable as-is.
>
> --
> Best regards,
> Dmitry
>
  

Patch

diff --git a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
index 7194f88edc0f..c3456c700c07 100644
--- a/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mediatek/jpeg/mtk_jpeg_core.c
@@ -1021,13 +1021,13 @@ static void mtk_jpeg_dec_device_run(void *priv)
 	if (ret < 0)
 		goto dec_end;
 
-	schedule_delayed_work(&jpeg->job_timeout_work,
-			      msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
-
 	mtk_jpeg_set_dec_src(ctx, &src_buf->vb2_buf, &bs);
 	if (mtk_jpeg_set_dec_dst(ctx, &jpeg_src_buf->dec_param, &dst_buf->vb2_buf, &fb))
 		goto dec_end;
 
+	schedule_delayed_work(&jpeg->job_timeout_work,
+			      msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
+
 	spin_lock_irqsave(&jpeg->hw_lock, flags);
 	mtk_jpeg_dec_reset(jpeg->reg_base);
 	mtk_jpeg_dec_set_config(jpeg->reg_base,
@@ -1403,7 +1403,6 @@ static void mtk_jpeg_remove(struct platform_device *pdev)
 {
 	struct mtk_jpeg_dev *jpeg = platform_get_drvdata(pdev);
 
-	cancel_delayed_work_sync(&jpeg->job_timeout_work);
 	pm_runtime_disable(&pdev->dev);
 	video_unregister_device(jpeg->vdev);
 	v4l2_m2m_release(jpeg->m2m_dev);
@@ -1750,9 +1749,6 @@ static void mtk_jpegdec_worker(struct work_struct *work)
 	v4l2_m2m_src_buf_remove(ctx->fh.m2m_ctx);
 	v4l2_m2m_dst_buf_remove(ctx->fh.m2m_ctx);
 
-	schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
-			      msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
-
 	mtk_jpeg_set_dec_src(ctx, &src_buf->vb2_buf, &bs);
 	if (mtk_jpeg_set_dec_dst(ctx,
 				 &jpeg_src_buf->dec_param,
@@ -1762,6 +1758,9 @@ static void mtk_jpegdec_worker(struct work_struct *work)
 		goto setdst_end;
 	}
 
+	schedule_delayed_work(&comp_jpeg[hw_id]->job_timeout_work,
+			      msecs_to_jiffies(MTK_JPEG_HW_TIMEOUT_MSEC));
+
 	spin_lock_irqsave(&comp_jpeg[hw_id]->hw_lock, flags);
 	ctx->total_frame_num++;
 	mtk_jpeg_dec_reset(comp_jpeg[hw_id]->reg_base);