Message ID | 20230302235356.3148279-16-robdclark@gmail.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers |
From: Rob Clark <robdclark@gmail.com>
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, Daniel Vetter <daniel@ffwll.ch>, Christian König <ckoenig.leichtzumerken@gmail.com>, Michel Dänzer <michel@daenzer.net>, Tvrtko Ursulin <tvrtko.ursulin@intel.com>, Rodrigo Vivi <rodrigo.vivi@intel.com>, Alex Deucher <alexander.deucher@amd.com>, Pekka Paalanen <ppaalanen@gmail.com>, Simon Ser <contact@emersion.fr>, Luben Tuikov <luben.tuikov@amd.com>, Rob Clark <robdclark@chromium.org>, Jani Nikula <jani.nikula@linux.intel.com>, Joonas Lahtinen <joonas.lahtinen@linux.intel.com>, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>, David Airlie <airlied@gmail.com>, Sumit Semwal <sumit.semwal@linaro.org>, Christian König <christian.koenig@amd.com>, intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org (open list), linux-media@vger.kernel.org (open list:DMA BUFFER SHARING FRAMEWORK), linaro-mm-sig@lists.linaro.org (moderated list:DMA BUFFER SHARING FRAMEWORK)
Subject: [PATCH v9 15/15] drm/i915: Add deadline based boost support
Date: Thu, 2 Mar 2023 15:53:37 -0800
Message-Id: <20230302235356.3148279-16-robdclark@gmail.com>
In-Reply-To: <20230302235356.3148279-1-robdclark@gmail.com>
References: <20230302235356.3148279-1-robdclark@gmail.com> |
Series | dma-fence: Deadline awareness |
Commit Message
Rob Clark
March 2, 2023, 11:53 p.m. UTC
From: Rob Clark <robdclark@chromium.org>

v2: rebase

Signed-off-by: Rob Clark <robdclark@chromium.org>
---
 drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)
Comments
On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> From: Rob Clark <robdclark@chromium.org>
>

missing some wording here...

> v2: rebase
>
> Signed-off-by: Rob Clark <robdclark@chromium.org>
> ---
>  drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> index 7503dcb9043b..44491e7e214c 100644
> --- a/drivers/gpu/drm/i915/i915_request.c
> +++ b/drivers/gpu/drm/i915/i915_request.c
> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
>  	return i915_request_enable_breadcrumb(to_request(fence));
>  }
>
> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> +{
> +	struct i915_request *rq = to_request(fence);
> +
> +	if (i915_request_completed(rq))
> +		return;
> +
> +	if (i915_request_started(rq))
> +		return;

why do we skip the boost if already started?
don't we want to boost the freq anyway?

> +
> +	/*
> +	 * TODO something more clever for deadlines that are in the
> +	 * future.  I think probably track the nearest deadline in
> +	 * rq->timeline and set timer to trigger boost accordingly?
> +	 */

I'm afraid it will be very hard to find some heuristics of what's
late enough for the boost no?
I mean, how early to boost the freq on an upcoming deadline for the
timer?

> +
> +	intel_rps_boost(rq);
> +}
> +
>  static signed long i915_fence_wait(struct dma_fence *fence,
> 				   bool interruptible,
> 				   signed long timeout)
> @@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
>  	.signaled = i915_fence_signaled,
>  	.wait = i915_fence_wait,
>  	.release = i915_fence_release,
> +	.set_deadline = i915_fence_set_deadline,
>  };
>
>  static void irq_execute_cb(struct irq_work *wrk)
> --
> 2.39.1
>
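The TODO quoted above — track the nearest deadline in rq->timeline and arm a timer to trigger the boost — can be sketched in miniature. This is a hypothetical userspace model to illustrate the idea, not i915 code; the struct and function names are invented:

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: one timeline keeps the earliest outstanding
 * deadline, and a single boost timer is rearmed whenever a sooner
 * deadline arrives, instead of ignoring deadlines still in the future. */

typedef int64_t ktime_ns;
#define NO_DEADLINE INT64_MAX

struct timeline_model {
	ktime_ns nearest_deadline; /* earliest deadline of any pending request */
	ktime_ns timer_armed_at;   /* expiry of the (single) boost timer */
};

/* A new deadline only matters if it is sooner than the current nearest;
 * rearming means one timer can serve the whole timeline. */
static void track_deadline(struct timeline_model *tl, ktime_ns deadline)
{
	if (deadline < tl->nearest_deadline) {
		tl->nearest_deadline = deadline;
		tl->timer_armed_at = deadline; /* boost lead time omitted for brevity */
	}
}
```

In a real implementation the timer callback would check whether the fence for the nearest deadline has signaled and boost only if it has not; that part is deliberately left out here.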
On 03/03/2023 03:21, Rodrigo Vivi wrote:
> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
>> From: Rob Clark <robdclark@chromium.org>
>>
>
> missing some wording here...
>
[snip]
>> +	if (i915_request_started(rq))
>> +		return;
>
> why do we skip the boost if already started?
> don't we want to boost the freq anyway?

I'd wager Rob is just copying the current i915 wait boost logic.

>> +	/*
>> +	 * TODO something more clever for deadlines that are in the
>> +	 * future.  I think probably track the nearest deadline in
>> +	 * rq->timeline and set timer to trigger boost accordingly?
>> +	 */
>
> I'm afraid it will be very hard to find some heuristics of what's
> late enough for the boost no?
> I mean, how early to boost the freq on an upcoming deadline for the
> timer?

We can off load this patch from Rob and deal with it separately, or
after the fact?

It's a half solution without a smarter scheduler too. Like
https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
or if GuC plans to do something like that at any point.

Or bump the priority too if deadline is looming?

IMO it is not very effective to fiddle with the heuristic on an ad-hoc
basis. For instance I have a new heuristics which improves the
problematic OpenCL cases for further 5% (relative to the current
waitboost improvement from adding missing syncobj waitboost). But I
can't really test properly for regressions over platforms, stacks,
workloads.. :(

Regards,

Tvrtko

[snip]
On Fri, Mar 03, 2023 at 09:58:36AM +0000, Tvrtko Ursulin wrote:
>
> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
[snip]
> > I'm afraid it will be very hard to find some heuristics of what's
> > late enough for the boost no?
> > I mean, how early to boost the freq on an upcoming deadline for the
> > timer?
>
> We can off load this patch from Rob and deal with it separately, or after
> the fact?
>
> It's a half solution without a smarter scheduler too. Like
> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
> or if GuC plans to do something like that at any point.

Indeed, we already have the deadline implementation (and not just that),
we just need to have some willingness to apply it.

Andi

> Or bump the priority too if deadline is looming?
>
> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
> basis. For instance I have a new heuristics which improves the problematic
> OpenCL cases for further 5% (relative to the current waitboost improvement
> from adding missing syncobj waitboost). But I can't really test properly for
> regressions over platforms, stacks, workloads.. :(
>
> Regards,
>
> Tvrtko

[snip]
On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
[snip]
> >> +	if (i915_request_started(rq))
> >> +		return;
> >
> > why do we skip the boost if already started?
> > don't we want to boost the freq anyway?
>
> I'd wager Rob is just copying the current i915 wait boost logic.

Yup, and probably incorrectly.. Matt reported fewer boosts/sec
compared to your RFC, this could be the bug

> > I'm afraid it will be very hard to find some heuristics of what's
> > late enough for the boost no?
> > I mean, how early to boost the freq on an upcoming deadline for the
> > timer?
>
> We can off load this patch from Rob and deal with it separately, or
> after the fact?

That is completely my intention, I expect you to replace my i915 patch ;-)

Rough idea when everyone is happy with the core bits is to set up an
immutable branch without the driver specific patches, which could be
merged into drm-next and $driver-next, and then each driver team can
add their own driver patches on top

BR,
-R

> It's a half solution without a smarter scheduler too. Like
> https://lore.kernel.org/all/20210208105236.28498-10-chris@chris-wilson.co.uk/,
> or if GuC plans to do something like that at any point.
>
> Or bump the priority too if deadline is looming?
>
> IMO it is not very effective to fiddle with the heuristic on an ad-hoc
> basis. For instance I have a new heuristics which improves the
> problematic OpenCL cases for further 5% (relative to the current
> waitboost improvement from adding missing syncobj waitboost). But I
> can't really test properly for regressions over platforms, stacks,
> workloads.. :(
>
> Regards,
>
> Tvrtko

[snip]
On Thu, Mar 2, 2023 at 7:21 PM Rodrigo Vivi <rodrigo.vivi@intel.com> wrote:
>
> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> > From: Rob Clark <robdclark@chromium.org>
> >
>
> missing some wording here...

the wording should be "Pls replace this patch, kthx" ;-)

[snip]
> > +	if (i915_request_completed(rq))
> > +		return;
> > +
> > +	if (i915_request_started(rq))
> > +		return;
>
> why do we skip the boost if already started?
> don't we want to boost the freq anyway?
>
> > +
> > +	/*
> > +	 * TODO something more clever for deadlines that are in the
> > +	 * future.  I think probably track the nearest deadline in
> > +	 * rq->timeline and set timer to trigger boost accordingly?
> > +	 */
>
> I'm afraid it will be very hard to find some heuristics of what's
> late enough for the boost no?
> I mean, how early to boost the freq on an upcoming deadline for the
> timer?

So, from my understanding of i915 boosting, it applies more
specifically to a given request (vs msm, which just bumps up the freq
until the next devfreq sampling period, which then recalculates the
target freq based on busy cycles and avg freq over the last sampling
period). For msm I just set a timer for 3ms before the deadline and
boost if the fence isn't signaled when the timer fires.

It is kinda impossible to predict, even for userspace, how long a job
will take to complete, so the goal isn't really to finish the
specified job by the deadline, but instead to avoid devfreq landing at
a local minimum (maximum?)

AFAIU what I _think_ would make sense for i915 is to do the same thing
you do if you miss a vblank pageflip deadline if the deadline passes
without the fence signaling.

BR,
-R

[snip]
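The msm behaviour Rob describes (arm a timer 3ms ahead of the deadline, then boost only if the fence is still unsignaled when it fires) can be modeled roughly like this. All names and the fixed 3ms lead are illustrative assumptions, not the actual msm driver code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the msm-style deadline boost: arm a
 * timer BOOST_LEAD_NS before the deadline; if the fence has not signaled
 * by the time the timer fires, bump the GPU frequency. */

#define BOOST_LEAD_NS (3LL * 1000 * 1000) /* fire 3 ms ahead of the deadline */

typedef int64_t ktime_ns;

struct fence_model {
	bool signaled;
	ktime_ns timer_expiry; /* when the boost timer fires (if armed) */
	bool boosted;
};

/* Arm the boost timer for a deadline; a deadline already closer than the
 * lead time (or in the past) boosts immediately. */
static void set_deadline(struct fence_model *f, ktime_ns now, ktime_ns deadline)
{
	ktime_ns expiry = deadline - BOOST_LEAD_NS;

	if (expiry <= now)
		f->boosted = true; /* too late to wait: boost right away */
	else
		f->timer_expiry = expiry;
}

/* Timer callback: boost only if the fence is still pending. */
static void timer_fired(struct fence_model *f)
{
	if (!f->signaled)
		f->boosted = true;
}
```

The point of the lead time is exactly what the mail says: not to guarantee the job finishes by the deadline, but to nudge devfreq out of a local minimum before the deadline is blown.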
On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote:
> On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
[snip]
> > I'd wager Rob is just copying the current i915 wait boost logic.
>
> Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> compared to your RFC, this could be the bug

I don't think i915 calls drm_atomic_helper_wait_for_fences()
so that could explain something.
On 03/03/2023 14:48, Rob Clark wrote:
> On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> <tvrtko.ursulin@linux.intel.com> wrote:
>>
>> On 03/03/2023 03:21, Rodrigo Vivi wrote:
[snip]
>>> why do we skip the boost if already started?
>>> don't we want to boost the freq anyway?
>>
>> I'd wager Rob is just copying the current i915 wait boost logic.
>
> Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> compared to your RFC, this could be the bug

Hm, there I have preserved this same !i915_request_started logic.

Presumably it's not just fewer boosts but lower performance. How is he
setting the deadline? Somehow from clFlush or so?

Regards,

Tvrtko

P.S. Take note that I did not post the latest version of my RFC. The one
where I fix the fence chain and array misses you pointed out. I did not
think it would be worthwhile given no universal love for it, but if
people are testing with it more widely than I was aware perhaps I should.

[snip]
On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote:
> On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
[snip]
> > > I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> I don't think i915 calls drm_atomic_helper_wait_for_fences()
> so that could explain something.

Oh, I guess this wasn't even supposed to take over the current
display boost stuff since you didn't remove the old one.
The current one just boosts after a missed vblank. The deadline
could use your timer approach I suppose and boost already a bit
earlier in the hopes of not missing the vblank.
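Ville's idea — derive the deadline from the upcoming vblank and start boosting a bit ahead of it, rather than only after a vblank has been missed — might look something like this in outline. The helper names, the fixed lead time, and the 60 Hz period in the test are all assumptions for illustration:

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t ktime_ns;

/* Next vblank at or after `now`, given the time of a reference vblank
 * and the refresh period (a rounding-up division of the elapsed time). */
static ktime_ns next_vblank(ktime_ns base, ktime_ns period, ktime_ns now)
{
	if (now <= base)
		return base;
	return base + ((now - base + period - 1) / period) * period;
}

/* Arm the boost a lead time ahead of the vblank deadline, so the GPU has
 * a chance to ramp up before the flip would be missed. */
static ktime_ns boost_expiry(ktime_ns vblank, ktime_ns lead)
{
	return vblank - lead;
}
```

This is only a timing sketch; wiring it into the display code (and deciding a sane lead time) is exactly the open heuristic question discussed in this thread.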
On Fri, Mar 3, 2023 at 7:08 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
[snip]
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?

Yeah, fewer boosts, lower freq/perf.. I cobbled together a quick mesa
hack to set the DEADLINE flag on syncobj waits, but it seems likely
that I missed something somewhere

BR,
-R

> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC. The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely that I was aware perhaps I should.

[snip]
On Fri, Mar 3, 2023 at 7:20 AM Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > On Fri, Mar 03, 2023 at 05:00:03PM +0200, Ville Syrjälä wrote: > > On Fri, Mar 03, 2023 at 06:48:43AM -0800, Rob Clark wrote: > > > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin > > > <tvrtko.ursulin@linux.intel.com> wrote: > > > > > > > > > > > > On 03/03/2023 03:21, Rodrigo Vivi wrote: > > > > > On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote: > > > > >> From: Rob Clark <robdclark@chromium.org> > > > > >> > > > > > > > > > > missing some wording here... > > > > > > > > > >> v2: rebase > > > > >> > > > > >> Signed-off-by: Rob Clark <robdclark@chromium.org> > > > > >> --- > > > > >> drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++ > > > > >> 1 file changed, 20 insertions(+) > > > > >> > > > > >> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c > > > > >> index 7503dcb9043b..44491e7e214c 100644 > > > > >> --- a/drivers/gpu/drm/i915/i915_request.c > > > > >> +++ b/drivers/gpu/drm/i915/i915_request.c > > > > >> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence) > > > > >> return i915_request_enable_breadcrumb(to_request(fence)); > > > > >> } > > > > >> > > > > >> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline) > > > > >> +{ > > > > >> + struct i915_request *rq = to_request(fence); > > > > >> + > > > > >> + if (i915_request_completed(rq)) > > > > >> + return; > > > > >> + > > > > >> + if (i915_request_started(rq)) > > > > >> + return; > > > > > > > > > > why do we skip the boost if already started? > > > > > don't we want to boost the freq anyway? > > > > > > > > I'd wager Rob is just copying the current i915 wait boost logic. > > > > > > Yup, and probably incorrectly.. Matt reported fewer boosts/sec > > > compared to your RFC, this could be the bug > > > > I don't think i915 calls drm_atomic_helper_wait_for_fences() > > so that could explain something. 
> > Oh, I guess this wasn't even supposed to take over the current
> > display boost stuff since you didn't remove the old one.

Right, I didn't try to replace the current thing.. but hopefully at
least make it possible for i915 to use more of the atomic helpers in
the future

BR,
-R

> The current one just boosts after a missed vblank. The deadline
> could use your timer approach I suppose and boost already a bit
> earlier in the hopes of not missing the vblank.
>
> --
> Ville Syrjälä
> Intel
On Fri, Mar 3, 2023 at 10:08 AM Tvrtko Ursulin
<tvrtko.ursulin@linux.intel.com> wrote:
>
>
> On 03/03/2023 14:48, Rob Clark wrote:
> > On Fri, Mar 3, 2023 at 1:58 AM Tvrtko Ursulin
> > <tvrtko.ursulin@linux.intel.com> wrote:
> >>
> >>
> >> On 03/03/2023 03:21, Rodrigo Vivi wrote:
> >>> On Thu, Mar 02, 2023 at 03:53:37PM -0800, Rob Clark wrote:
> >>>> From: Rob Clark <robdclark@chromium.org>
> >>>>
> >>>
> >>> missing some wording here...
> >>>
> >>>> v2: rebase
> >>>>
> >>>> Signed-off-by: Rob Clark <robdclark@chromium.org>
> >>>> ---
> >>>>  drivers/gpu/drm/i915/i915_request.c | 20 ++++++++++++++++++++
> >>>>  1 file changed, 20 insertions(+)
> >>>>
> >>>> diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
> >>>> index 7503dcb9043b..44491e7e214c 100644
> >>>> --- a/drivers/gpu/drm/i915/i915_request.c
> >>>> +++ b/drivers/gpu/drm/i915/i915_request.c
> >>>> @@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
> >>>>  	return i915_request_enable_breadcrumb(to_request(fence));
> >>>>  }
> >>>>
> >>>> +static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
> >>>> +{
> >>>> +	struct i915_request *rq = to_request(fence);
> >>>> +
> >>>> +	if (i915_request_completed(rq))
> >>>> +		return;
> >>>> +
> >>>> +	if (i915_request_started(rq))
> >>>> +		return;
> >>>
> >>> why do we skip the boost if already started?
> >>> don't we want to boost the freq anyway?
> >>
> >> I'd wager Rob is just copying the current i915 wait boost logic.
> >
> > Yup, and probably incorrectly.. Matt reported fewer boosts/sec
> > compared to your RFC, this could be the bug
>
> Hm, there I have preserved this same !i915_request_started logic.
>
> Presumably it's not just fewer boosts but lower performance. How is he
> setting the deadline? Somehow from clFlush or so?
>
> Regards,
>
> Tvrtko
>
> P.S. Take note that I did not post the latest version of my RFC.
> The one
> where I fix the fence chain and array misses you pointed out. I did not
> think it would be worthwhile given no universal love for it, but if
> people are testing with it more widely than I was aware perhaps I should.

Yep, that would be great. We're interested in it for ChromeOS. Please
Cc me on the series when you send it.
diff --git a/drivers/gpu/drm/i915/i915_request.c b/drivers/gpu/drm/i915/i915_request.c
index 7503dcb9043b..44491e7e214c 100644
--- a/drivers/gpu/drm/i915/i915_request.c
+++ b/drivers/gpu/drm/i915/i915_request.c
@@ -97,6 +97,25 @@ static bool i915_fence_enable_signaling(struct dma_fence *fence)
 	return i915_request_enable_breadcrumb(to_request(fence));
 }
 
+static void i915_fence_set_deadline(struct dma_fence *fence, ktime_t deadline)
+{
+	struct i915_request *rq = to_request(fence);
+
+	if (i915_request_completed(rq))
+		return;
+
+	if (i915_request_started(rq))
+		return;
+
+	/*
+	 * TODO something more clever for deadlines that are in the
+	 * future.  I think probably track the nearest deadline in
+	 * rq->timeline and set timer to trigger boost accordingly?
+	 */
+
+	intel_rps_boost(rq);
+}
+
 static signed long i915_fence_wait(struct dma_fence *fence,
 				   bool interruptible,
 				   signed long timeout)
@@ -182,6 +201,7 @@ const struct dma_fence_ops i915_fence_ops = {
 	.signaled = i915_fence_signaled,
 	.wait = i915_fence_wait,
 	.release = i915_fence_release,
+	.set_deadline = i915_fence_set_deadline,
 };
 
 static void irq_execute_cb(struct irq_work *wrk)