Message ID | 3535c205b9bce52556abbf2f63384fb38e009df9.1597931876.git.robin.murphy@arm.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers |
Received: from vger.kernel.org ([23.128.96.18]) by www.linuxtv.org with esmtp (Exim 4.92) (envelope-from <linux-media-owner@vger.kernel.org>) id 1k8m7v-00AK6W-5B; Thu, 20 Aug 2020 15:05:15 +0000 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728862AbgHTPKD (ORCPT <rfc822;mkrufky@linuxtv.org> + 1 other); Thu, 20 Aug 2020 11:10:03 -0400 Received: from foss.arm.com ([217.140.110.172]:41298 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728799AbgHTPJz (ORCPT <rfc822;linux-media@vger.kernel.org>); Thu, 20 Aug 2020 11:09:55 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id B02CC1529; Thu, 20 Aug 2020 08:09:54 -0700 (PDT) Received: from e121345-lin.cambridge.arm.com (e121345-lin.cambridge.arm.com [10.1.196.37]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPA id 200693F6CF; Thu, 20 Aug 2020 08:09:51 -0700 (PDT) From: Robin Murphy <robin.murphy@arm.com> To: hch@lst.de, joro@8bytes.org, linux@armlinux.org.uk Cc: will@kernel.org, inki.dae@samsung.com, sw0312.kim@samsung.com, kyungmin.park@samsung.com, m.szyprowski@samsung.com, agross@kernel.org, bjorn.andersson@linaro.org, thierry.reding@gmail.com, jonathanh@nvidia.com, vdumpa@nvidia.com, digetx@gmail.com, matthias.bgg@gmail.com, yong.wu@mediatek.com, geert+renesas@glider.be, magnus.damm@gmail.com, t-kristo@ti.com, s-anna@ti.com, laurent.pinchart@ideasonboard.com, linux-arm-kernel@lists.infradead.org, iommu@lists.linux-foundation.org, linux-samsung-soc@vger.kernel.org, linux-tegra@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mediatek@lists.infradead.org, dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 16/18] staging/media/tegra-vde: Clean up IOMMU workaround Date: Thu, 20 Aug 2020 16:08:35 +0100 Message-Id: <3535c205b9bce52556abbf2f63384fb38e009df9.1597931876.git.robin.murphy@arm.com> X-Mailer: git-send-email 2.28.0.dirty In-Reply-To: <cover.1597931875.git.robin.murphy@arm.com> References: <cover.1597931875.git.robin.murphy@arm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-media-owner@vger.kernel.org Precedence: bulk List-ID: <linux-media.vger.kernel.org> X-Mailing-List: linux-media@vger.kernel.org X-LSpam-Score: -2.4 (--) X-LSpam-Report: No, score=-2.4 required=5.0 tests=BAYES_00=-1.9,HEADER_FROM_DIFFERENT_DOMAINS=0.5,MAILING_LIST_MULTI=-1 autolearn=ham autolearn_force=no |
Series |
Convert arch/arm to use iommu-dma
|
|
Commit Message
Robin Murphy
Aug. 20, 2020, 3:08 p.m. UTC
Now that arch/arm is wired up for default domains and iommu-dma, we no
longer need to work around the arch-private mapping.
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
drivers/staging/media/tegra-vde/iommu.c | 12 ------------
1 file changed, 12 deletions(-)
Comments
20.08.2020 18:08, Robin Murphy пишет: > Now that arch/arm is wired up for default domains and iommu-dma, we no > longer need to work around the arch-private mapping. > > Signed-off-by: Robin Murphy <robin.murphy@arm.com> > --- > drivers/staging/media/tegra-vde/iommu.c | 12 ------------ > 1 file changed, 12 deletions(-) > > diff --git a/drivers/staging/media/tegra-vde/iommu.c b/drivers/staging/media/tegra-vde/iommu.c > index 6af863d92123..4f770189ed34 100644 > --- a/drivers/staging/media/tegra-vde/iommu.c > +++ b/drivers/staging/media/tegra-vde/iommu.c > @@ -10,10 +10,6 @@ > #include <linux/kernel.h> > #include <linux/platform_device.h> > > -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) > -#include <asm/dma-iommu.h> > -#endif > - > #include "vde.h" > > int tegra_vde_iommu_map(struct tegra_vde *vde, > @@ -70,14 +66,6 @@ int tegra_vde_iommu_init(struct tegra_vde *vde) > if (!vde->group) > return 0; > > -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) > - if (dev->archdata.mapping) { > - struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); > - > - arm_iommu_detach_device(dev); > - arm_iommu_release_mapping(mapping); > - } > -#endif > vde->domain = iommu_domain_alloc(&platform_bus_type); > if (!vde->domain) { > err = -ENOMEM; > Hello, Robin! Thank you for yours work! Some drivers, like this Tegra VDE (Video Decoder Engine) driver for example, do not want to use implicit IOMMU domain. Tegra VDE driver relies on explicit IOMMU domain in a case of Tegra SMMU because VDE hardware can't access last page of the AS and because driver wants to reserve some fixed addresses [1]. [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100 Tegra30 SoC supports up to 4 domains, hence it's not possible to afford wasting unused implicit domains. I think this needs to be addressed before this patch could be applied. Would it be possible for IOMMU drivers to gain support for filtering out devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and dev=tegra-vde. Alternatively, the Tegra SMMU could be changed such that the devices will be attached to a domain at the time of a first IOMMU mapping invocation instead of attaching at the time of attach_dev() callback invocation. Or maybe even IOMMU core could be changed to attach devices at the time of the first IOMMU mapping invocation? This could be a universal solution for all drivers.
20.08.2020 22:51, Dmitry Osipenko пишет: > Alternatively, the Tegra SMMU could be changed such that the devices > will be attached to a domain at the time of a first IOMMU mapping > invocation instead of attaching at the time of attach_dev() callback > invocation. > > Or maybe even IOMMU core could be changed to attach devices at the time > of the first IOMMU mapping invocation? This could be a universal > solution for all drivers. Although, please scratch this :) I'll need to revisit how DMA mapping API works.
On 2020-08-20 20:51, Dmitry Osipenko wrote: > 20.08.2020 18:08, Robin Murphy пишет: >> Now that arch/arm is wired up for default domains and iommu-dma, we no >> longer need to work around the arch-private mapping. >> >> Signed-off-by: Robin Murphy <robin.murphy@arm.com> >> --- >> drivers/staging/media/tegra-vde/iommu.c | 12 ------------ >> 1 file changed, 12 deletions(-) >> >> diff --git a/drivers/staging/media/tegra-vde/iommu.c b/drivers/staging/media/tegra-vde/iommu.c >> index 6af863d92123..4f770189ed34 100644 >> --- a/drivers/staging/media/tegra-vde/iommu.c >> +++ b/drivers/staging/media/tegra-vde/iommu.c >> @@ -10,10 +10,6 @@ >> #include <linux/kernel.h> >> #include <linux/platform_device.h> >> >> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) >> -#include <asm/dma-iommu.h> >> -#endif >> - >> #include "vde.h" >> >> int tegra_vde_iommu_map(struct tegra_vde *vde, >> @@ -70,14 +66,6 @@ int tegra_vde_iommu_init(struct tegra_vde *vde) >> if (!vde->group) >> return 0; >> >> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) >> - if (dev->archdata.mapping) { >> - struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); >> - >> - arm_iommu_detach_device(dev); >> - arm_iommu_release_mapping(mapping); >> - } >> -#endif >> vde->domain = iommu_domain_alloc(&platform_bus_type); >> if (!vde->domain) { >> err = -ENOMEM; >> > > Hello, Robin! Thank you for yours work! > > Some drivers, like this Tegra VDE (Video Decoder Engine) driver for > example, do not want to use implicit IOMMU domain. That isn't (intentionally) changing here - the only difference should be that instead of having the ARM-special implicit domain, which you have to kick out of the way with the ARM-specific API before you're able to attach your own domain, the implicit domain is now a proper IOMMU API default domain, which automatically gets bumped by your attach. The default domains should still only be created in the same cases that the ARM dma_iommu_mappings were. > Tegra VDE driver > relies on explicit IOMMU domain in a case of Tegra SMMU because VDE > hardware can't access last page of the AS and because driver wants to > reserve some fixed addresses [1]. > > [1] > https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100 > > Tegra30 SoC supports up to 4 domains, hence it's not possible to afford > wasting unused implicit domains. I think this needs to be addressed > before this patch could be applied. Yeah, there is one subtle change in behaviour from removing the ARM layer on top of the core API, in that the IOMMU driver will no longer see an explicit detach call. Thus it does stand to benefit from being a bit cleverer about noticing devices being moved from one domain to another by an attach call, either by releasing the hardware context for the inactive domain once the device(s) are moved across to the new one, or by simply reprogramming the hardware context in-place for the new domain's address space without allocating a new one at all (most of the drivers that don't have multiple contexts already handle the latter approach quite well). > Would it be possible for IOMMU drivers to gain support for filtering out > devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver > could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and > dev=tegra-vde. If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant devices to bypass translation entirely without needing a hardware context (or at worst, can spare one context which all identity-mapped logical domains can share), then you could certainly do that kind of filtering with the .def_domain_type callback if you really wanted to. As above, the intent is that that shouldn't be necessary for this particular case, since only one of a group's default domain and explicitly attached domain can be live at any given time, so the driver should be able to take advantage of that. If you simply have more active devices (groups) than available contexts then yes, you probably would want to do some filtering to decide who deserves a translation domain and who doesn't, but in that case you should already have had a long-standing problem with the ARM implicit domains. > Alternatively, the Tegra SMMU could be changed such that the devices > will be attached to a domain at the time of a first IOMMU mapping > invocation instead of attaching at the time of attach_dev() callback > invocation. > > Or maybe even IOMMU core could be changed to attach devices at the time > of the first IOMMU mapping invocation? This could be a universal > solution for all drivers. I suppose technically you could do that within an IOMMU driver already (similar to how some defer most of setup that logically belongs to ->domain_alloc until the first ->attach_dev). It's a bit grim from the caller's PoV though, in terms of the failure mode being non-obvious and having no real way to recover. Again, you'd be better off simply making decisions up-front at domain_alloc or attach time based on the domain type. Robin.
21.08.2020 03:11, Robin Murphy пишет: ... >> Hello, Robin! Thank you for yours work! >> >> Some drivers, like this Tegra VDE (Video Decoder Engine) driver for >> example, do not want to use implicit IOMMU domain. > > That isn't (intentionally) changing here - the only difference should be > that instead of having the ARM-special implicit domain, which you have > to kick out of the way with the ARM-specific API before you're able to > attach your own domain, the implicit domain is now a proper IOMMU API > default domain, which automatically gets bumped by your attach. The > default domains should still only be created in the same cases that the > ARM dma_iommu_mappings were. > >> Tegra VDE driver >> relies on explicit IOMMU domain in a case of Tegra SMMU because VDE >> hardware can't access last page of the AS and because driver wants to >> reserve some fixed addresses [1]. >> >> [1] >> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100 >> >> >> Tegra30 SoC supports up to 4 domains, hence it's not possible to afford >> wasting unused implicit domains. I think this needs to be addressed >> before this patch could be applied. > > Yeah, there is one subtle change in behaviour from removing the ARM > layer on top of the core API, in that the IOMMU driver will no longer > see an explicit detach call. Thus it does stand to benefit from being a > bit cleverer about noticing devices being moved from one domain to > another by an attach call, either by releasing the hardware context for > the inactive domain once the device(s) are moved across to the new one, > or by simply reprogramming the hardware context in-place for the new > domain's address space without allocating a new one at all (most of the > drivers that don't have multiple contexts already handle the latter > approach quite well). > >> Would it be possible for IOMMU drivers to gain support for filtering out >> devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver >> could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and >> dev=tegra-vde. > > If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant > devices to bypass translation entirely without needing a hardware > context (or at worst, can spare one context which all identity-mapped > logical domains can share), then you could certainly do that kind of > filtering with the .def_domain_type callback if you really wanted to. As > above, the intent is that that shouldn't be necessary for this > particular case, since only one of a group's default domain and > explicitly attached domain can be live at any given time, so the driver > should be able to take advantage of that. > > If you simply have more active devices (groups) than available contexts > then yes, you probably would want to do some filtering to decide who > deserves a translation domain and who doesn't, but in that case you > should already have had a long-standing problem with the ARM implicit > domains. > >> Alternatively, the Tegra SMMU could be changed such that the devices >> will be attached to a domain at the time of a first IOMMU mapping >> invocation instead of attaching at the time of attach_dev() callback >> invocation. >> >> Or maybe even IOMMU core could be changed to attach devices at the time >> of the first IOMMU mapping invocation? This could be a universal >> solution for all drivers. > > I suppose technically you could do that within an IOMMU driver already > (similar to how some defer most of setup that logically belongs to > ->domain_alloc until the first ->attach_dev). It's a bit grim from the > caller's PoV though, in terms of the failure mode being non-obvious and > having no real way to recover. Again, you'd be better off simply making > decisions up-front at domain_alloc or attach time based on the domain type. Robin, thank you very much for the clarifications! In accordance to yours comments, this patch can't be applied until Tegra SMMU will support IOMMU_DOMAIN_IDENTITY and implement def_domain_type() callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device. Otherwise you're breaking the VDE driver because dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit domain which is then mapped into the VDE's explicit domain [2], and this is a nonsense. [1] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L102 [2] https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L122 Hence, either VDE driver should bypass iommu_dma_ops from the start or it needs a way to kick out the ops, like it does this using ARM's arm_iommu_detach_device(). The same applies to the Tegra GPU devices, otherwise you're breaking them as well because Tegra DRM is sensible to implicit vs explicit domain. BTW, I tried to apply this series and T30 doesn't boot anymore. I don't have more info for now.
On 2020-08-23 22:34, Dmitry Osipenko wrote: > 21.08.2020 03:11, Robin Murphy пишет: > ... >>> Hello, Robin! Thank you for yours work! >>> >>> Some drivers, like this Tegra VDE (Video Decoder Engine) driver for >>> example, do not want to use implicit IOMMU domain. >> >> That isn't (intentionally) changing here - the only difference should be >> that instead of having the ARM-special implicit domain, which you have >> to kick out of the way with the ARM-specific API before you're able to >> attach your own domain, the implicit domain is now a proper IOMMU API >> default domain, which automatically gets bumped by your attach. The >> default domains should still only be created in the same cases that the >> ARM dma_iommu_mappings were. >> >>> Tegra VDE driver >>> relies on explicit IOMMU domain in a case of Tegra SMMU because VDE >>> hardware can't access last page of the AS and because driver wants to >>> reserve some fixed addresses [1]. >>> >>> [1] >>> https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/iommu.c#L100 >>> >>> >>> Tegra30 SoC supports up to 4 domains, hence it's not possible to afford >>> wasting unused implicit domains. I think this needs to be addressed >>> before this patch could be applied. >> >> Yeah, there is one subtle change in behaviour from removing the ARM >> layer on top of the core API, in that the IOMMU driver will no longer >> see an explicit detach call. Thus it does stand to benefit from being a >> bit cleverer about noticing devices being moved from one domain to >> another by an attach call, either by releasing the hardware context for >> the inactive domain once the device(s) are moved across to the new one, >> or by simply reprogramming the hardware context in-place for the new >> domain's address space without allocating a new one at all (most of the >> drivers that don't have multiple contexts already handle the latter >> approach quite well). >> >>> Would it be possible for IOMMU drivers to gain support for filtering out >>> devices in iommu_domain_alloc(dev, type)? Then perhaps Tegra SMMU driver >>> could simply return NULL in a case of type=IOMMU_DOMAIN_DMA and >>> dev=tegra-vde. >> >> If you can implement IOMMU_DOMAIN_IDENTITY by allowing the relevant >> devices to bypass translation entirely without needing a hardware >> context (or at worst, can spare one context which all identity-mapped >> logical domains can share), then you could certainly do that kind of >> filtering with the .def_domain_type callback if you really wanted to. As >> above, the intent is that that shouldn't be necessary for this >> particular case, since only one of a group's default domain and >> explicitly attached domain can be live at any given time, so the driver >> should be able to take advantage of that. >> >> If you simply have more active devices (groups) than available contexts >> then yes, you probably would want to do some filtering to decide who >> deserves a translation domain and who doesn't, but in that case you >> should already have had a long-standing problem with the ARM implicit >> domains. >> >>> Alternatively, the Tegra SMMU could be changed such that the devices >>> will be attached to a domain at the time of a first IOMMU mapping >>> invocation instead of attaching at the time of attach_dev() callback >>> invocation. >>> >>> Or maybe even IOMMU core could be changed to attach devices at the time >>> of the first IOMMU mapping invocation? This could be a universal >>> solution for all drivers. >> >> I suppose technically you could do that within an IOMMU driver already >> (similar to how some defer most of setup that logically belongs to >> ->domain_alloc until the first ->attach_dev). It's a bit grim from the >> caller's PoV though, in terms of the failure mode being non-obvious and >> having no real way to recover. Again, you'd be better off simply making >> decisions up-front at domain_alloc or attach time based on the domain type. > > Robin, thank you very much for the clarifications! > > In accordance to yours comments, this patch can't be applied until Tegra > SMMU will support IOMMU_DOMAIN_IDENTITY and implement def_domain_type() > callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device. > > Otherwise you're breaking the VDE driver because > dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit > domain which is then mapped into the VDE's explicit domain [2], and this > is a nonsense. It's true that iommu_dma_ops will do some work in the unattached default domain, but non-coherent cache maintenance will still be performed correctly on the underlying memory, which is really all that you care about for this case. As for tegra_vde_iommu_map(), that seems to do the right thing in only referencing the physical side of the scatterlist (via iommu_map_sg()) and ignoring the DMA side, so things ought to work out OK even if it is a little non-obvious. > [1] > https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L102 > > [2] > https://elixir.bootlin.com/linux/v5.9-rc1/source/drivers/staging/media/tegra-vde/dmabuf-cache.c#L122 > > Hence, either VDE driver should bypass iommu_dma_ops from the start or > it needs a way to kick out the ops, like it does this using ARM's > arm_iommu_detach_device(). > > > The same applies to the Tegra GPU devices, otherwise you're breaking > them as well because Tegra DRM is sensible to implicit vs explicit domain. Note that Tegra DRM will only be as broken as its current state on arm64, and I was under the impression that that was OK now - at least I don't recall seeing any complaints since 43c5bf11a610. Although that commit and the one before it are resolving the scalability issue that they describe, it was very much in my mind at the time that they also have the happy side-effect described above - the default domain isn't *completely* out of the way, but it's far enough that sensible cases should be able to work as expected. > BTW, I tried to apply this series and T30 doesn't boot anymore. I don't > have more info for now. Yeah, I'm still trying to get to the bottom of whether it's actually working as intended at all, even on my RK3288. So far my debugging instrumentation has been confusingly inconclusive :/ Robin.
24.08.2020 17:01, Robin Murphy пишет: ... >> Robin, thank you very much for the clarifications! >> >> In accordance to yours comments, this patch can't be applied until Tegra >> SMMU will support IOMMU_DOMAIN_IDENTITY and implement def_domain_type() >> callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device. >> >> Otherwise you're breaking the VDE driver because >> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit >> domain which is then mapped into the VDE's explicit domain [2], and this >> is a nonsense. > > It's true that iommu_dma_ops will do some work in the unattached default > domain, but non-coherent cache maintenance will still be performed > correctly on the underlying memory, which is really all that you care > about for this case. As for tegra_vde_iommu_map(), that seems to do the > right thing in only referencing the physical side of the scatterlist > (via iommu_map_sg()) and ignoring the DMA side, so things ought to work > out OK even if it is a little non-obvious. I'll need to double-check this, it's indeed not clear to me right now. I see that if Tegra DRM driver uses implicit IOMMU domain, then when VDE driver imports DMA-buf from Terga DRM and the imported buffer will be auto-mapped to the implicit VDE IOVA [1]. [1] https://elixir.bootlin.com/linux/v5.9-rc2/source/drivers/gpu/drm/tegra/gem.c#L574 >> Hence, either VDE driver should bypass iommu_dma_ops from the start or >> it needs a way to kick out the ops, like it does this using ARM's >> arm_iommu_detach_device(). >> >> >> The same applies to the Tegra GPU devices, otherwise you're breaking >> them as well because Tegra DRM is sensible to implicit vs explicit >> domain. > > Note that Tegra DRM will only be as broken as its current state on > arm64, and I was under the impression that that was OK now - at least I > don't recall seeing any complaints since 43c5bf11a610. Although that > commit and the one before it are resolving the scalability issue that > they describe, it was very much in my mind at the time that they also > have the happy side-effect described above - the default domain isn't > *completely* out of the way, but it's far enough that sensible cases > should be able to work as expected. The Tegra DRM has a very special quirk for ARM32 that was added in this commit [2] and driver relies on checking of whether explicit or implicit IOMMU is used in order to activate the quirk. [2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa Once the implicit IOMMU is used for the DRM driver, the quirk no longer works (if I'm not missing something). This problem needs to be resolved before implicit IOMMU could be used by the Tegra DRM on ARM32. >> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't >> have more info for now. > > Yeah, I'm still trying to get to the bottom of whether it's actually > working as intended at all, even on my RK3288. So far my debugging > instrumentation has been confusingly inconclusive :/ Surely it will take some time to resolve all the problems and it's great that you're pushing this work! I'll try to help with fixing the ARM32 Tegra side of the problems. I added this to my "TODO" list and should be able to take a closer look during of this/next weeks!
On Thu, Aug 27, 2020 at 10:05:14AM +0300, Dmitry Osipenko wrote: > 24.08.2020 17:01, Robin Murphy пишет: > ... > >> Robin, thank you very much for the clarifications! > >> > >> In accordance to yours comments, this patch can't be applied until Tegra > >> SMMU will support IOMMU_DOMAIN_IDENTITY and implement def_domain_type() > >> callback that returns IOMMU_DOMAIN_IDENTITY for the VDE device. > >> > >> Otherwise you're breaking the VDE driver because > >> dma_buf_map_attachment() [1] returns the IOMMU SGT of the implicit > >> domain which is then mapped into the VDE's explicit domain [2], and this > >> is a nonsense. > > > > It's true that iommu_dma_ops will do some work in the unattached default > > domain, but non-coherent cache maintenance will still be performed > > correctly on the underlying memory, which is really all that you care > > about for this case. As for tegra_vde_iommu_map(), that seems to do the > > right thing in only referencing the physical side of the scatterlist > > (via iommu_map_sg()) and ignoring the DMA side, so things ought to work > > out OK even if it is a little non-obvious. > > I'll need to double-check this, it's indeed not clear to me right now. > > I see that if Tegra DRM driver uses implicit IOMMU domain, then when VDE > driver imports DMA-buf from Terga DRM and the imported buffer will be > auto-mapped to the implicit VDE IOVA [1]. > > [1] > https://elixir.bootlin.com/linux/v5.9-rc2/source/drivers/gpu/drm/tegra/gem.c#L574 > > >> Hence, either VDE driver should bypass iommu_dma_ops from the start or > >> it needs a way to kick out the ops, like it does this using ARM's > >> arm_iommu_detach_device(). > >> > >> > >> The same applies to the Tegra GPU devices, otherwise you're breaking > >> them as well because Tegra DRM is sensible to implicit vs explicit > >> domain. > > > > Note that Tegra DRM will only be as broken as its current state on > > arm64, and I was under the impression that that was OK now - at least I > > don't recall seeing any complaints since 43c5bf11a610. Although that > > commit and the one before it are resolving the scalability issue that > > they describe, it was very much in my mind at the time that they also > > have the happy side-effect described above - the default domain isn't > > *completely* out of the way, but it's far enough that sensible cases > > should be able to work as expected. > > The Tegra DRM has a very special quirk for ARM32 that was added in this > commit [2] and driver relies on checking of whether explicit or implicit > IOMMU is used in order to activate the quirk. > > [2] > https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa > > Once the implicit IOMMU is used for the DRM driver, the quirk no longer > works (if I'm not missing something). This problem needs to be resolved > before implicit IOMMU could be used by the Tegra DRM on ARM32. > > >> BTW, I tried to apply this series and T30 doesn't boot anymore. I don't > >> have more info for now. > > > > Yeah, I'm still trying to get to the bottom of whether it's actually > > working as intended at all, even on my RK3288. So far my debugging > > instrumentation has been confusingly inconclusive :/ > > Surely it will take some time to resolve all the problems and it's great > that you're pushing this work! > > I'll try to help with fixing the ARM32 Tegra side of the problems. I > added this to my "TODO" list and should be able to take a closer look > during of this/next weeks! I do have a patch lying around somewhere that implements the mapping cache that was referenced in the above commit. Let me know if I should dig that up and send it out. Thierry
27.08.2020 18:54, Thierry Reding пишет: ... >> The Tegra DRM has a very special quirk for ARM32 that was added in this >> commit [2] and driver relies on checking of whether explicit or implicit >> IOMMU is used in order to activate the quirk. >> >> [2] >> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=273da5a046965ccf0ec79eb63f2d5173467e20fa >> >> Once the implicit IOMMU is used for the DRM driver, the quirk no longer >> works (if I'm not missing something). This problem needs to be resolved >> before implicit IOMMU could be used by the Tegra DRM on ARM32. ... > I do have a patch lying around somewhere that implements the mapping > cache that was referenced in the above commit. Let me know if I should > dig that up and send it out. Hello, Thierry! It certainly will be interesting to take a look at yours patch! I think that the caching shouldn't be strictly necessary for keeping the current workaround working and it should be possible to keep the code as-is by replacing the domain-type checking with the SoC-generation check in the Tegra DRM driver. In general, IMO it should be better to stash the complex changes until we'll get closer to adopting the new UAPI as it will certainly touch the aspect of the per-device mappings. But if yours patch is less than 100 LOC, then maybe we could consider applying it right now!
diff --git a/drivers/staging/media/tegra-vde/iommu.c b/drivers/staging/media/tegra-vde/iommu.c index 6af863d92123..4f770189ed34 100644 --- a/drivers/staging/media/tegra-vde/iommu.c +++ b/drivers/staging/media/tegra-vde/iommu.c @@ -10,10 +10,6 @@ #include <linux/kernel.h> #include <linux/platform_device.h> -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) -#include <asm/dma-iommu.h> -#endif - #include "vde.h" int tegra_vde_iommu_map(struct tegra_vde *vde, @@ -70,14 +66,6 @@ int tegra_vde_iommu_init(struct tegra_vde *vde) if (!vde->group) return 0; -#if IS_ENABLED(CONFIG_ARM_DMA_USE_IOMMU) - if (dev->archdata.mapping) { - struct dma_iommu_mapping *mapping = to_dma_iommu_mapping(dev); - - arm_iommu_detach_device(dev); - arm_iommu_release_mapping(mapping); - } -#endif vde->domain = iommu_domain_alloc(&platform_bus_type); if (!vde->domain) { err = -ENOMEM;