Message ID | 151001624873.16354.2551756846133945335.stgit@dwillia2-desk3.amr.corp.intel.com (mailing list archive) |
---|---|
State | Accepted, archived |
Headers |
Received: from vger.kernel.org ([209.132.180.67]) by www.linuxtv.org with esmtp (Exim 4.84_2) (envelope-from <linux-media-owner@vger.kernel.org>) id 1eBsLC-0001Rl-IP; Tue, 07 Nov 2017 01:06:10 +0000 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935070AbdKGBFy (ORCPT <rfc822;mkrufky@linuxtv.org> + 1 other); Mon, 6 Nov 2017 20:05:54 -0500 Received: from mga07.intel.com ([134.134.136.100]:41667 "EHLO mga07.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S935062AbdKGBFn (ORCPT <rfc822;linux-media@vger.kernel.org>); Mon, 6 Nov 2017 20:05:43 -0500 Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga105.jf.intel.com with ESMTP; 06 Nov 2017 17:05:43 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.44,355,1505804400"; d="scan'208";a="4246191" Received: from dwillia2-desk3.jf.intel.com (HELO dwillia2-desk3.amr.corp.intel.com) ([10.54.39.16]) by orsmga002.jf.intel.com with ESMTP; 06 Nov 2017 17:05:43 -0800 Subject: [PATCH 3/3] [media] v4l2: disable filesystem-dax mapping support From: Dan Williams <dan.j.williams@intel.com> To: akpm@linux-foundation.org Cc: Jan Kara <jack@suse.cz>, linux-kernel@vger.kernel.org, stable@vger.kernel.org, linux-mm@kvack.org, Mauro Carvalho Chehab <mchehab@kernel.org>, linux-media@vger.kernel.org Date: Mon, 06 Nov 2017 16:57:28 -0800 Message-ID: <151001624873.16354.2551756846133945335.stgit@dwillia2-desk3.amr.corp.intel.com> In-Reply-To: <151001623063.16354.14661493921524115663.stgit@dwillia2-desk3.amr.corp.intel.com> References: <151001623063.16354.14661493921524115663.stgit@dwillia2-desk3.amr.corp.intel.com> User-Agent: StGit/0.17.1-9-g687f MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Sender: linux-media-owner@vger.kernel.org Precedence: bulk List-ID: <linux-media.vger.kernel.org> X-Mailing-List: linux-media@vger.kernel.org |
Commit Message
Dan Williams
Nov. 7, 2017, 12:57 a.m. UTC
V4L2 memory registrations are incompatible with filesystem-dax that
needs the ability to revoke dma access to a mapping at will, or
otherwise allow the kernel to wait for completion of DMA. The
filesystem-dax implementation breaks the traditional solution of
truncate of active file backed mappings since there is no page-cache
page we can orphan to sustain ongoing DMA.
If v4l2 wants to support long lived DMA mappings it needs to arrange to
hold a file lease or use some other mechanism so that the kernel can
coordinate revoking DMA access when the filesystem needs to truncate
mappings.
Reported-by: Jan Kara <jack@suse.cz>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: linux-media@vger.kernel.org
Cc: <stable@vger.kernel.org>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/media/v4l2-core/videobuf-dma-sg.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Comments
Em Mon, 06 Nov 2017 16:57:28 -0800 Dan Williams <dan.j.williams@intel.com> escreveu: > V4L2 memory registrations are incompatible with filesystem-dax that > needs the ability to revoke dma access to a mapping at will, or > otherwise allow the kernel to wait for completion of DMA. The > filesystem-dax implementation breaks the traditional solution of > truncate of active file backed mappings since there is no page-cache > page we can orphan to sustain ongoing DMA. > > If v4l2 wants to support long lived DMA mappings it needs to arrange to > hold a file lease or use some other mechanism so that the kernel can > coordinate revoking DMA access when the filesystem needs to truncate > mappings. Not sure if I understand this your comment here... what happens if FS_DAX is enabled? The new err = get_user_pages_longterm() would cause DMA allocation to fail? If so, that doesn't sound right. Instead, mm should somehow mark this mapping to be out of FS_DAX control range. Also, it is not only videobuf-dma-sg.c that does long lived DMA mappings. VB2 also does that (and videobuf-vmalloc). Regards, Mauro > > Reported-by: Jan Kara <jack@suse.cz> > Cc: Mauro Carvalho Chehab <mchehab@kernel.org> > Cc: linux-media@vger.kernel.org > Cc: <stable@vger.kernel.org> > Signed-off-by: Dan Williams <dan.j.williams@intel.com> > --- > drivers/media/v4l2-core/videobuf-dma-sg.c | 5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2-core/videobuf-dma-sg.c > index 0b5c43f7e020..f412429cf5ba 100644 > --- a/drivers/media/v4l2-core/videobuf-dma-sg.c > +++ b/drivers/media/v4l2-core/videobuf-dma-sg.c > @@ -185,12 +185,13 @@ static int videobuf_dma_init_user_locked(struct videobuf_dmabuf *dma, > dprintk(1, "init user [0x%lx+0x%lx => %d pages]\n", > data, size, dma->nr_pages); > > - err = get_user_pages(data & PAGE_MASK, dma->nr_pages, > + err = get_user_pages_longterm(data & PAGE_MASK, dma->nr_pages, > flags, dma->pages, NULL); > > if (err != dma->nr_pages) { > dma->nr_pages = (err >= 0) ? err : 0; > - dprintk(1, "get_user_pages: err=%d [%d]\n", err, dma->nr_pages); > + dprintk(1, "get_user_pages_longterm: err=%d [%d]\n", err, > + dma->nr_pages); > return err < 0 ? err : -EINVAL; > } > return 0; > Thanks, Mauro
On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote: > Em Mon, 06 Nov 2017 16:57:28 -0800 > Dan Williams <dan.j.williams@intel.com> escreveu: > >> V4L2 memory registrations are incompatible with filesystem-dax that >> needs the ability to revoke dma access to a mapping at will, or >> otherwise allow the kernel to wait for completion of DMA. The >> filesystem-dax implementation breaks the traditional solution of >> truncate of active file backed mappings since there is no page-cache >> page we can orphan to sustain ongoing DMA. >> >> If v4l2 wants to support long lived DMA mappings it needs to arrange to >> hold a file lease or use some other mechanism so that the kernel can >> coordinate revoking DMA access when the filesystem needs to truncate >> mappings. > > > Not sure if I understand this your comment here... what happens > if FS_DAX is enabled? The new err = get_user_pages_longterm() > would cause DMA allocation to fail? Correct, any attempt to specify a filesystem-dax mapping range to get_user_pages_longterm will fail with EOPNOTSUPP. In the future we want to add something like a 'struct file_lock *' argument to get_user_pages_longterm so that the kernel has a handle to revoke access to the returned pages. Once we have a safe way for the kernel to undo elevated page counts we can stop failing the longterm vs filesystem-dax case. Here is more background on why _longterm gup is a problem for filesystem-dax: https://lwn.net/Articles/737273/ > If so, that doesn't sound > right. Instead, mm should somehow mark this mapping to be out > of FS_DAX control range. DAX is currently global setting for the entire backing device of the filesystem, so any mapping of any file when the "-o dax" mount option is set is in the "FS_DAX control range". In other words there's currently no way to prevent FS_DAX mappings from being exposed to V4L2 outside of CONFIG_FS_DAX=n. > Also, it is not only videobuf-dma-sg.c that does long lived > DMA mappings. VB2 also does that (and videobuf-vmalloc). Without finding the code videobuf-vmalloc sounds like it should be ok if the kernel is allocating memory separate from a file-backed DAX mapping. Where is the VB2 get_user_pages call?
Em Tue, 7 Nov 2017 09:43:41 -0800 Dan Williams <dan.j.williams@intel.com> escreveu: > On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehab > <mchehab@s-opensource.com> wrote: > > Em Mon, 06 Nov 2017 16:57:28 -0800 > > Dan Williams <dan.j.williams@intel.com> escreveu: > > > >> V4L2 memory registrations are incompatible with filesystem-dax that > >> needs the ability to revoke dma access to a mapping at will, or > >> otherwise allow the kernel to wait for completion of DMA. The > >> filesystem-dax implementation breaks the traditional solution of > >> truncate of active file backed mappings since there is no page-cache > >> page we can orphan to sustain ongoing DMA. > >> > >> If v4l2 wants to support long lived DMA mappings it needs to arrange to > >> hold a file lease or use some other mechanism so that the kernel can > >> coordinate revoking DMA access when the filesystem needs to truncate > >> mappings. > > > > > > Not sure if I understand this your comment here... what happens > > if FS_DAX is enabled? The new err = get_user_pages_longterm() > > would cause DMA allocation to fail? > > Correct, any attempt to specify a filesystem-dax mapping range to > get_user_pages_longterm will fail with EOPNOTSUPP. In the future we > want to add something like a 'struct file_lock *' argument to > get_user_pages_longterm so that the kernel has a handle to revoke > access to the returned pages. Once we have a safe way for the kernel > to undo elevated page counts we can stop failing the longterm vs > filesystem-dax case. Argh! Perhaps we should make it depend on BROKEN while not fixed :-/ > Here is more background on why _longterm gup is a problem for filesystem-dax: > > https://lwn.net/Articles/737273/ > > > If so, that doesn't sound > > right. Instead, mm should somehow mark this mapping to be out > > of FS_DAX control range. > > DAX is currently global setting for the entire backing device of the > filesystem, so any mapping of any file when the "-o dax" mount option > is set is in the "FS_DAX control range". In other words there's > currently no way to prevent FS_DAX mappings from being exposed to V4L2 > outside of CONFIG_FS_DAX=n. Grrr... > > Also, it is not only videobuf-dma-sg.c that does long lived > > DMA mappings. VB2 also does that (and videobuf-vmalloc). > > Without finding the code videobuf-vmalloc sounds like it should be ok > if the kernel is allocating memory separate from a file-backed DAX > mapping. videobuf-vmalloc do DMA mapping for pages allocated via vmalloc(), via vmalloc_user()/remap_vmalloc_range(). There aren't much drivers using VB1 anymore, but a change at VB2 will likely break support for almost all webcams if fs DAX is in usage. > Where is the VB2 get_user_pages call? Before changeset 3336c24f25ec, the logic for get_user_pages() were at drivers/media/v4l2-core/videobuf2-dma-sg.c. Now, the logic it uses is inside mm/frame_vector.c. Thanks, Mauro
On Tue, Nov 7, 2017 at 12:39 PM, Mauro Carvalho Chehab <mchehab@s-opensource.com> wrote: > Em Tue, 7 Nov 2017 09:43:41 -0800 > Dan Williams <dan.j.williams@intel.com> escreveu: > >> On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehab >> <mchehab@s-opensource.com> wrote: >> > Em Mon, 06 Nov 2017 16:57:28 -0800 >> > Dan Williams <dan.j.williams@intel.com> escreveu: >> > >> >> V4L2 memory registrations are incompatible with filesystem-dax that >> >> needs the ability to revoke dma access to a mapping at will, or >> >> otherwise allow the kernel to wait for completion of DMA. The >> >> filesystem-dax implementation breaks the traditional solution of >> >> truncate of active file backed mappings since there is no page-cache >> >> page we can orphan to sustain ongoing DMA. >> >> >> >> If v4l2 wants to support long lived DMA mappings it needs to arrange to >> >> hold a file lease or use some other mechanism so that the kernel can >> >> coordinate revoking DMA access when the filesystem needs to truncate >> >> mappings. >> > >> > >> > Not sure if I understand this your comment here... what happens >> > if FS_DAX is enabled? The new err = get_user_pages_longterm() >> > would cause DMA allocation to fail? >> >> Correct, any attempt to specify a filesystem-dax mapping range to >> get_user_pages_longterm will fail with EOPNOTSUPP. In the future we >> want to add something like a 'struct file_lock *' argument to >> get_user_pages_longterm so that the kernel has a handle to revoke >> access to the returned pages. Once we have a safe way for the kernel >> to undo elevated page counts we can stop failing the longterm vs >> filesystem-dax case. > > Argh! Perhaps we should make it depend on BROKEN while not fixed :-/ Small consolation, but we do warn that filesystem-dax is still considered experimental when mounting a filesystem with "-o dax" >> Here is more background on why _longterm gup is a problem for filesystem-dax: >> >> https://lwn.net/Articles/737273/ >> >> > If so, that doesn't sound >> > right. Instead, mm should somehow mark this mapping to be out >> > of FS_DAX control range. >> >> DAX is currently global setting for the entire backing device of the >> filesystem, so any mapping of any file when the "-o dax" mount option >> is set is in the "FS_DAX control range". In other words there's >> currently no way to prevent FS_DAX mappings from being exposed to V4L2 >> outside of CONFIG_FS_DAX=n. > > Grrr... > >> > Also, it is not only videobuf-dma-sg.c that does long lived >> > DMA mappings. VB2 also does that (and videobuf-vmalloc). >> >> Without finding the code videobuf-vmalloc sounds like it should be ok >> if the kernel is allocating memory separate from a file-backed DAX >> mapping. > > videobuf-vmalloc do DMA mapping for pages allocated via vmalloc(), > via vmalloc_user()/remap_vmalloc_range(). Ok, that's completely safe since filesystem-dax mappings are not involved in a vmalloc backed virtual address range. > There aren't much drivers using VB1 anymore, but a change at VB2 > will likely break support for almost all webcams if fs DAX is > in usage. Yes, unless / until we can switch userspace to using a new memory registration api that includes a way for the kernel to revoke access to a dax mapping. Another mitigation is following through on support for moving dax support from a global mount flag to a per-inode flag to at least prevent dax from leaking to use cases that need explicit coordination. >> Where is the VB2 get_user_pages call? > > Before changeset 3336c24f25ec, the logic for get_user_pages() were > at drivers/media/v4l2-core/videobuf2-dma-sg.c. Now, the logic > it uses is inside mm/frame_vector.c. Ok, I'll take a look.
diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2-core/videobuf-dma-sg.c index 0b5c43f7e020..f412429cf5ba 100644 --- a/drivers/media/v4l2-core/videobuf-dma-sg.c +++ b/drivers/media/v4l2-core/videobuf-dma-sg.c @@ -185,12 +185,13 @@ static int videobuf_dma_init_user_locked(struct videobuf_dmabuf *dma, dprintk(1, "init user [0x%lx+0x%lx => %d pages]\n", data, size, dma->nr_pages); - err = get_user_pages(data & PAGE_MASK, dma->nr_pages, + err = get_user_pages_longterm(data & PAGE_MASK, dma->nr_pages, flags, dma->pages, NULL); if (err != dma->nr_pages) { dma->nr_pages = (err >= 0) ? err : 0; - dprintk(1, "get_user_pages: err=%d [%d]\n", err, dma->nr_pages); + dprintk(1, "get_user_pages_longterm: err=%d [%d]\n", err, + dma->nr_pages); return err < 0 ? err : -EINVAL; } return 0;