Message ID | 20210521090959.1663703-4-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | Not Applicable, archived |
Delegated to: | Hans Verkuil |
Headers |
From: Daniel Vetter <daniel.vetter@ffwll.ch>
To: DRI Development <dri-devel@lists.freedesktop.org>
Cc: Intel Graphics Development <intel-gfx@lists.freedesktop.org>, Daniel Vetter <daniel.vetter@ffwll.ch>, Daniel Vetter <daniel.vetter@intel.com>, Rob Herring <robh@kernel.org>, Tomeu Vizoso <tomeu.vizoso@collabora.com>, Steven Price <steven.price@arm.com>, Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>, Sumit Semwal <sumit.semwal@linaro.org>, Christian König <christian.koenig@amd.com>, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org
Subject: [PATCH 04/11] drm/panfrost: Fix implicit sync
Date: Fri, 21 May 2021 11:09:52 +0200
Message-Id: <20210521090959.1663703-4-daniel.vetter@ffwll.ch>
In-Reply-To: <20210521090959.1663703-1-daniel.vetter@ffwll.ch>
References: <20210521090959.1663703-1-daniel.vetter@ffwll.ch>
Series | None |
Commit Message
Daniel Vetter
May 21, 2021, 9:09 a.m. UTC
Currently this has no practical relevance, I think, because there are
not many who can pull off a setup with panfrost and another gpu in the
same system. But the rules are that if you're setting an exclusive
fence, indicating a gpu write access in the implicit fencing system,
then you need to wait for all fences, not just the previous exclusive
fence.

panfrost against itself has no problem, because it always sets the
exclusive fence (but that's probably something that will need to be
fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
There is also no problem with that against display.

With the prep work done to switch over to the dependency helpers, this
is now a oneliner.
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Cc: Steven Price <steven.price@arm.com>
Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: "Christian König" <christian.koenig@amd.com>
Cc: linux-media@vger.kernel.org
Cc: linaro-mm-sig@lists.linaro.org
---
drivers/gpu/drm/panfrost/panfrost_job.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
Comments
Hi,

On Fri, 21 May 2021 at 10:10, Daniel Vetter <daniel.vetter@ffwll.ch> wrote:
> Currently this has no practial relevance I think because there's not
> many who can pull off a setup with panfrost and another gpu in the
> same system. But the rules are that if you're setting an exclusive
> fence, indicating a gpu write access in the implicit fencing system,
> then you need to wait for all fences, not just the previous exclusive
> fence.
>
> panfrost against itself has no problem, because it always sets the
> exclusive fence (but that's probably something that will need to be
> fixed for vulkan and/or multi-engine gpus, or you'll suffer badly).
> Also no problem with that against display.

Yeah, the 'second-generation Valhall' GPUs coming later this year /
early next year are starting to get pretty weird. Firmware-mediated
job scheduling out of multiple queues, userspace having direct access
to the queues and being able to do inter-queue synchronisation (at
least I think so), etc. For bonus points, synchronisation is based on
$addr = $val to signal and $addr == $val to wait, with a separate
fence primitive as well.

Obviously Arm should be part of this conversation here, but I guess
we'll have to wait for a while yet to see how everything's shaken out
with this new gen, and hope that whatever's been designed upstream in
the meantime is actually vaguely compatible ...

Cheers,
Daniel
On 21.05.21 at 14:22, Daniel Stone wrote:
> Yeah, the 'second-generation Valhall' GPUs coming later this year /
> early next year are starting to get pretty weird. Firmware-mediated
> job scheduling out of multiple queues, userspace having direct access
> to the queues and can do inter-queue synchronisation (at least I think
> so), etc. For bonus points, synchronisation is based on $addr = $val
> to signal and $addr == $val to wait, with a separate fence primitive
> as well.

Well that sounds familiar :)

> Obviously Arm should be part of this conversation here, but I guess
> we'll have to wait for a while yet to see how everything's shaken out
> with this new gen, and hope that whatever's been designed upstream in
> the meantime is actually vaguely compatible ...

Yeah, going to keep you in CC when we start to code and review user
fences.

Cheers,
Christian.
On Fri, 21 May 2021 at 13:28, Christian König <christian.koenig@amd.com> wrote:
> Well that sounds familiar :)

I laughed when I first saw it, because it was better than crying I
guess.

If you're curious, the interface definitions are in the csf/ directory
in the 'Bifrost kernel driver' r30p0 download you can get from the Arm
developer site. Unfortunately the exact semantics aren't completely
clear.

> Yeah, going to keep you in CC when we start to code and review user fences.

Awesome, thanks Christian. Appreciate it. :)

Cheers,
Daniel
On 21.05.21 at 14:54, Daniel Stone wrote:
> I laughed when I first saw it, because it was better than crying I
> guess.

In Germany we say "Ich freue mich drauf wie auf Zahnschmerzen" ("I'm
looking forward to it like I would to a toothache").

> If you're curious, the interface definitions are in the csf/ directory
> in the 'Bifrost kernel driver' r30p0 download you can get from the Arm
> developer site. Unfortunately the exact semantics aren't completely
> clear.

Well, it is actually relatively simple. Take a look at the timeline
semaphores from Vulkan; everybody is basically implementing the same
semantics now.

When you queue up a bunch of commands on your hardware, the first one
will write value 1 to a 64-bit memory location, the second one will
write value 2, the third value 3 and so on. After writing the value,
the hardware raises an interrupt signal to everybody interested.

In other words, pretty standard memory fence behavior.

When you now have a second queue which depends on work of the first
one, you look at the memory location and do a compare. If you depend
on the third submission, you just wait for the value to be >= 3 and
are done.

Regards,
Christian.
On Fri, 21 May 2021 at 14:09, Christian König <christian.koenig@amd.com> wrote:
> Well it is actually relatively simple. Take a look at the timeline
> semaphores from Vulkan, everybody is basically implementing the same
> semantics now.
>
> When you queued up a bunch of commands on your hardware, the first one
> will write value 1 to a 64bit memory location, the second one will write
> value 2, the third value 3 and so on. After writing the value the
> hardware raises an interrupt signal to everybody interested.
>
> In other words pretty standard memory fence behavior.
>
> When you now have a second queue which depends on work of the first one
> you look at the memory location and do a compare. If you depend on the
> third submission you just wait for the value to be >= 3 and are done.

Right, it clearly maps to the timeline semaphore semantics; I just
meant that it's not clear how it works at a lower level wrt the
synchronisation and signaling. The simplest possible interpretation is
that wait_addrval blocks infinitely before kick-cmdbuf, but that seems
painful with only 32 queues. And the same for fences, which are a
binary signal. I guess we'll find out.

My tooth hurts.

Cheers,
Daniel
diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c b/drivers/gpu/drm/panfrost/panfrost_job.c
index 707d912ff64a..619d6104040c 100644
--- a/drivers/gpu/drm/panfrost/panfrost_job.c
+++ b/drivers/gpu/drm/panfrost/panfrost_job.c
@@ -203,9 +203,8 @@ static int panfrost_acquire_object_fences(struct drm_gem_object **bos,
 	int i, ret;
 
 	for (i = 0; i < bo_count; i++) {
-		struct dma_fence *fence = dma_resv_get_excl_rcu(bos[i]->resv);
-
-		ret = drm_gem_fence_array_add(deps, fence);
+		/* panfrost always uses write mode in its current uapi */
+		ret = drm_gem_fence_array_add_implicit(deps, bos[i], true);
 		if (ret)
 			return ret;
 	}