Message ID | 20230302235356.3148279-16-robdclark@gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers |
From: Rob Clark <robdclark@gmail.com>
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, Daniel Vetter <daniel@ffwll.ch>, Christian König <ckoenig.leichtzumerken@gmail.com>, Michel Dänzer <michel@daenzer.net>, Tvrtko Ursulin <tvrtko.ursulin@intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, Alex Deucher <alexander.deucher@amd.com>, Pekka Paalanen <ppaalanen@gmail.com>, Simon Ser <contact@emersion.fr>, Luben Tuikov <luben.tuikov@amd.com>, Rob Clark <robdclark@chromium.org>, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>, David Airlie <airlied@gmail.com>, Sumit Semwal <sumit.semwal@linaro.org>, Christian König <christian.koenig@amd.com>, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK)
Subject: [PATCH v9 15/15] drm/i915: Add deadline based boost support
Date: Thu, 2 Mar 2023 15:53:37 -0800
Message-Id: <20230302235356.3148279-16-robdclark@gmail.com>
In-Reply-To: <20230302235356.3148279-1-robdclark@gmail.com>
References: <20230302235356.3148279-1-robdclark@gmail.com> |
Series | dma-fence: Deadline awareness |
Commit Message
Rob Clark
March 2, 2023, 11:53 p.m. UTC
From: Rob Clark <robdclark@chromium.org>

v2: rebase

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
Comments
On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>

missing some wording here...

> v2: rebase
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>  drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7503dcb9043b..44491e7e214c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
>  	return i915_request_enable_breadcrumb(to_request(fence));
>  }
>
> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> +{
> +	struct i915_request *rq = to_request(fence);
> +
> +	if (i915_request_completed(rq))
> +		return;
> +
> +	if (i915_request_started(rq))
> +		return;

why do we skip the boost if already started?
don't we want to boost the freq anyway?

> +
> +	/*
> +	 * TODO something more clever for deadlines that are in the
> +	 * future.  I think probably track the nearest deadline in
> +	 * rq->timeline and set timer to trigger boost accordingly?
> +	 */

I'm afraid it will be very hard to find some heuristics of what's
late enough for the boost no?
I mean, how early to boost the freq on an upcoming deadline for the
timer?

> +
> +	intel_rps_boost(rq);
> +}
> +
>  static signed long i915_fence_wait(struct dma_fence *fence,
> 				   bool interruptible,
> 				   signed long timeout)
> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
>  	.signaled = i915_fence_signaled,
>  	.wait = i915_fence_wait,
>  	.release = i915_fence_release,
> +	.set_deadline = i915_fence_set_deadline,
>  };
>
>  static void irq_execute_cb(struct irq_work *wrk)
> --
> 2.39.1
>
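The TODO quoted above — track the nearest deadline in rq->timeline and arm a timer to trigger the boost — can be sketched in miniature. This is a hypothetical userspace model to illustrate the idea, not i915 code; the struct and function names are invented:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: one timeline keeps the earliest outstanding
 * deadline, and a single boost timer is rearmed whenever a sooner
 * deadline arrives, instead of ignoring deadlines still in the future. */

typedef int64_t ktime_ns;
#define NO_DEADLINE INT64_MAX

struct timeline_model {
	ktime_ns nearest_deadline; /* earliest deadline of any pending request */
	ktime_ns timer_armed_at;   /* expiry of the (single) boost timer */
};

/* A new deadline only matters if it is sooner than the current nearest;
 * rearming means one timer can serve the whole timeline. */
static void track_deadline(struct timeline_model *tl, ktime_ns deadline)
{
	if (deadline < tl->nearest_deadline) {
		tl->nearest_deadline = deadline;
		tl->timer_armed_at = deadline; /* boost lead time omitted for brevity */
	}
}
```

In a real implementation the timer callback would check whether the fence for the nearest deadline has signaled and boost only if it has not; that part is deliberately left out here.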
On 03/03/2023 03:21, Rodrigo Vivi wrote:
> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
>> From: Rob Clark <robdclark@chromium.org>
>>
>
> missing some wording here...
>
[snip]
>> +	if (i915_request_started(rq))
>> +		return;
>
> why do we skip the boost if already started?
> don't we want to boost the freq anyway?

I'd wager Rob is just copying the current i915 wait boost logic.

>> +	/*
>> +	 * TODO something more clever for deadlines that are in the
>> +	 * future.  I think probably track the nearest deadline in
>> +	 * rq->timeline and set timer to trigger boost accordingly?
>> +	 */
>
> I'm afraid it will be very hard to find some heuristics of what's
> late enough for the boost no?
> I mean, how early to boost the freq on an upcoming deadline for the
> timer?

We can off load this patch from Rob and deal with it separately, or
after the fact?

It's a half solution without a smarter scheduler too. Like
https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
or if GuC plans to do something like that at any point.

Or bump the priority too if deadline is looming?

IMO it is not very effective to fiddle with the heuristic on an ad-hoc
basis. For instance I have a new heuristics which improves the
problematic OpenCL cases for further 5% (relative to the current
waitboost improvement from adding missing syncobj waitboost). But I
can't really test properly for regressions over platforms, stacks,
workloads.. :(

Regards,

Tvrtko

[snip]
On Fri, Mar 03, 2023 at 09:58:36AM +0000, Tvrtko Ursulin wrote:
>
> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
[snip]
> > I'm afraid it will be very hard to find some heuristics of what's
> > late enough for the boost no?
> > I mean, how early to boost the freq on an upcoming deadline for the
> > timer?
>
> We can off load this patch from Rob and deal with it separately, or after
> the fact?
>
> It's a half solution without a smarter scheduler too. Like
> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
> or if GuC plans to do something like that at any point.

Indeed, we already have the deadline implementation (and not just that),
we just need to have some willingness to apply it.

Andi

> Or bump the priority too if deadline is looming?
>
> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
> basis. For instance I have a new heuristics which improves the problematic
> OpenCL cases for further 5% (relative to the current waitboost improvement
> from adding missing syncobj waitboost). But I can't really test properly for
> regressions over platforms, stacks, workloads.. :(
>
> Regards,
>
> Tvrtko

[snip]
On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
[snip]
> >> +	if (i915_request_started(rq))
> >> +		return;
> >
> > why do we skip the boost if already started?
> > don't we want to boost the freq anyway?
>
> I'd wager Rob is just copying the current i915 wait boost logic.

Yup, and probably incorrectly.. Matt reported fewer boosts/sec
compared to your RFC, this could be the bug

> > I'm afraid it will be very hard to find some heuristics of what's
> > late enough for the boost no?
> > I mean, how early to boost the freq on an upcoming deadline for the
> > timer?
>
> We can off load this patch from Rob and deal with it separately, or
> after the fact?

That is completely my intention, I expect you to replace my i915 patch ;-)

Rough idea when everyone is happy with the core bits is to set up an
immutable branch without the driver specific patches, which could be
merged into drm-next and $driver-next, and then each driver team can
add their own driver patches on top

BR,
-R

> It's a half solution without a smarter scheduler too. Like
> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
> or if GuC plans to do something like that at any point.
>
> Or bump the priority too if deadline is looming?
>
> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
> basis. For instance I have a new heuristics which improves the
> problematic OpenCL cases for further 5% (relative to the current
> waitboost improvement from adding missing syncobj waitboost). But I
> can't really test properly for regressions over platforms, stacks,
> workloads.. :(
>
> Regards,
>
> Tvrtko

[snip]
On Thu, Mar 2, 2023 at 7:21 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>
> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
>
> missing some wording here...

the wording should be "Pls replace this patch, kthx" ;-)

[snip]
> > +	if (i915_request_completed(rq))
> > +		return;
> > +
> > +	if (i915_request_started(rq))
> > +		return;
>
> why do we skip the boost if already started?
> don't we want to boost the freq anyway?
>
> > +
> > +	/*
> > +	 * TODO something more clever for deadlines that are in the
> > +	 * future.  I think probably track the nearest deadline in
> > +	 * rq->timeline and set timer to trigger boost accordingly?
> > +	 */
>
> I'm afraid it will be very hard to find some heuristics of what's
> late enough for the boost no?
> I mean, how early to boost the freq on an upcoming deadline for the
> timer?

So, from my understanding of i915 boosting, it applies more
specifically to a given request (vs msm, which just bumps up the freq
until the next devfreq sampling period, which then recalculates the
target freq based on busy cycles and avg freq over the last sampling
period). For msm I just set a timer for 3ms before the deadline and
boost if the fence isn't signaled when the timer fires.

It is kinda impossible to predict, even for userspace, how long a job
will take to complete, so the goal isn't really to finish the
specified job by the deadline, but instead to avoid devfreq landing at
a local minimum (maximum?)

AFAIU what I _think_ would make sense for i915 is to do the same thing
you do if you miss a vblank pageflip deadline if the deadline passes
without the fence signaling.

BR,
-R

[snip]
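The msm behaviour Rob describes (arm a timer 3ms ahead of the deadline, then boost only if the fence is still unsignaled when it fires) can be modeled roughly like this. All names and the fixed 3ms lead are illustrative assumptions, not the actual msm driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the msm-style deadline boost: arm a
 * timer BOOST_LEAD_NS before the deadline; if the fence has not signaled
 * by the time the timer fires, bump the GPU frequency. */

#define BOOST_LEAD_NS (3LL * 1000 * 1000) /* fire 3 ms ahead of the deadline */

typedef int64_t ktime_ns;

struct fence_model {
	bool signaled;
	ktime_ns timer_expiry; /* when the boost timer fires (if armed) */
	bool boosted;
};

/* Arm the boost timer for a deadline; a deadline already closer than the
 * lead time (or in the past) boosts immediately. */
static void set_deadline(struct fence_model *f, ktime_ns now, ktime_ns deadline)
{
	ktime_ns expiry = deadline - BOOST_LEAD_NS;

	if (expiry <= now)
		f->boosted = true; /* too late to wait: boost right away */
	else
		f->timer_expiry = expiry;
}

/* Timer callback: boost only if the fence is still pending. */
static void timer_fired(struct fence_model *f)
{
	if (!f->signaled)
		f->boosted = true;
}
```

The point of the lead time is exactly what the mail says: not to guarantee the job finishes by the deadline, but to nudge devfreq out of a local minimum before the deadline is blown.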
On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote:
> On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
[snip]
> > I'd wager Rob is just copying the current i915 wait boost logic.
>
> Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> compared to your RFC, this could be the bug

I don't think i915 calls drm_atomic_helper_wait_for_fences()
so that could explain something.
On 03/03/2023 14:48, Rob Clark wrote:
> On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>> On 03/03/2023 03:21, Rodrigo Vivi wrote:
[snip]
>>> why do we skip the boost if already started?
>>> don't we want to boost the freq anyway?
>>
>> I'd wager Rob is just copying the current i915 wait boost logic.
>
> Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> compared to your RFC, this could be the bug

Hm, there I have preserved this same !i915_request_started logic.

Presumably it's not just fewer boosts but lower performance. How is he
setting the deadline? Somehow from clFlush or so?

Regards,

Tvrtko

P.S. Take note that I did not post the latest version of my RFC. The one
where I fix the fence chain and array misses you pointed out. I did not
think it would be worthwhile given no universal love for it, but if
people are testing with it more widely than I was aware perhaps I should.

[snip]
On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote:
> On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
[snip]
> > > I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> I don't think i915 calls drm_atomic_helper_wait_for_fences()
> so that could explain something.

Oh, I guess this wasn't even supposed to take over the current
display boost stuff since you didn't remove the old one.
The current one just boosts after a missed vblank. The deadline
could use your timer approach I suppose and boost already a bit
earlier in the hopes of not missing the vblank.
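Ville's idea — derive the deadline from the upcoming vblank and start boosting a bit ahead of it, rather than only after a vblank has been missed — might look something like this in outline. The helper names, the fixed lead time, and the 60 Hz period in the test are all assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t ktime_ns;

/* Next vblank at or after `now`, given the time of a reference vblank
 * and the refresh period (a rounding-up division of the elapsed time). */
static ktime_ns next_vblank(ktime_ns base, ktime_ns period, ktime_ns now)
{
	if (now <= base)
		return base;
	return base + ((now - base + period - 1) / period) * period;
}

/* Arm the boost a lead time ahead of the vblank deadline, so the GPU has
 * a chance to ramp up before the flip would be missed. */
static ktime_ns boost_expiry(ktime_ns vblank, ktime_ns lead)
{
	return vblank - lead;
}
```

This is only a timing sketch; wiring it into the display code (and deciding a sane lead time) is exactly the open heuristic question discussed in this thread.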
On Fri, Mar 3, 2023 at 7:08 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
[snip]
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?

Yeah, fewer boosts, lower freq/perf.. I cobbled together a quick mesa
hack to set the DEADLINE flag on syncobj waits, but it seems likely
that I missed something somewhere

BR,
-R

> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC. The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely that I was aware perhaps I should.

[snip]
On Fri, Mar 3, 2023 at 7:20 AM Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote: > > On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote: > > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > > <tvrtko.ursulin@linux.intel.com> wrote: > > > > > > > > > > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > > > >> From: Rob Clark <robdclark@chromium.org> > > > > >> > > > > > > > > > > missing some wording here... > > > > > > > > > >> v2: rebase > > > > >> > > > > >> Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > >> --- > > > > >> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++ > > > > >> 1 file changed, 20 insertions(+) > > > > >> > > > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > > > > >> index 7503dcb9043b..44491e7e214c 100644 > > > > >> --- a/drivers/gpu/drm/i915/i915_request.c > > > > >> +++ b/drivers/gpu/drm/i915/i915_request.c > > > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) > > > > >> return i915_request_enable_breadcrumb(to_request(fence)); > > > > >> } > > > > >> > > > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > > > > >> +{ > > > > >> + struct i915_request *rq = to_request(fence); > > > > >> + > > > > >> + if (i915_request_completed(rq)) > > > > >> + return; > > > > >> + > > > > >> + if (i915_request_started(rq)) > > > > >> + return; > > > > > > > > > > why do we skip the boost if already started? > > > > > don't we want to boost the freq anyway? > > > > > > > > I'd wager Rob is just copying the current i915 wait boost logic. > > > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > > compared to your RFC, this could be the bug > > > > I don't think i915 calls drm_atomic_helper_wait_for_fences() > > so that could explain something. 
> > Oh, I guess this wasn't even supposed to take over the current
> > display boost stuff since you didn't remove the old one.

Right, I didn't try to replace the current thing.. but hopefully at
least make it possible for i915 to use more of the atomic helpers in
the future

BR,
-R

> The current one just boosts after a missed vblank. The deadline
> could use your timer approach I suppose and boost already a bit
> earlier in the hopes of not missing the vblank.
>
> --
> Ville Syrjälä
> Intel
On Fri, Mar 3, 2023 at 10:08 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> >>>> From: Rob Clark <robdclark@chromium.org>
> >>>>
> >>>
> >>> missing some wording here...
> >>>
> >>>> v2: rebase
> >>>>
> >>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>>> ---
> >>>>  drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
> >>>>  1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>>> index 7503dcb9043b..44491e7e214c 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
> >>>>  	return i915_request_enable_breadcrumb(to_request(fence));
> >>>>  }
> >>>>
> >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >>>> +{
> >>>> +	struct i915_request *rq = to_request(fence);
> >>>> +
> >>>> +	if (i915_request_completed(rq))
> >>>> +		return;
> >>>> +
> >>>> +	if (i915_request_started(rq))
> >>>> +		return;
> >>>
> >>> why do we skip the boost if already started?
> >>> don't we want to boost the freq anyway?
> >>
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?
>
> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC.
> The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely than I was aware perhaps I should.

Yep, that would be great. We're interested in it for ChromeOS. Please
Cc me on the series when you send it.
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7503dcb9043b..44491e7e214c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
 	return i915_request_enable_breadcrumb(to_request(fence));
 }
 
+static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+	struct i915_request *rq = to_request(fence);
+
+	if (i915_request_completed(rq))
+		return;
+
+	if (i915_request_started(rq))
+		return;
+
+	/*
+	 * TODO something more clever for deadlines that are in the
+	 * future.  I think probably track the nearest deadline in
+	 * rq->timeline and set timer to trigger boost accordingly?
+	 */
+
+	intel_rps_boost(rq);
+}
+
 static signed long i915_fence_wait(struct dma_fence *fence,
 				   bool interruptible,
 				   signed long timeout)
@@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
 	.signaled = i915_fence_signaled,
 	.wait = i915_fence_wait,
 	.release = i915_fence_release,
+	.set_deadline = i915_fence_set_deadline,
 };
 
 static void irq_execute_cb(struct irq_work *wrk)