Message ID | 20221107161740.144456-6-david@redhat.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers |
Received: from vger.kernel.org ([23.128.96.18]) by www.linuxtv.org with esmtp (Exim 4.92) (envelope-from <linux-media-owner@vger.kernel.org>) id 1os4rE-00GwZw-27; Mon, 07 Nov 2022 16:20:20 +0000 Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232499AbiKGQUR (ORCPT <rfc822;mkrufky@linuxtv.org> + 1 other); Mon, 7 Nov 2022 11:20:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42532 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231933AbiKGQUH (ORCPT <rfc822;linux-media@vger.kernel.org>); Mon, 7 Nov 2022 11:20:07 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A0716615C for <linux-media@vger.kernel.org>; Mon, 7 Nov 2022 08:18:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667837904; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3+oJWZfKWoSUzQaG7FZQskadqtB6uOuGd99hd1wAZnI=; b=Jde86wVJizzxcxDxnFsBZC0prROVDdcqGUeZBTQR4kTybm76E/Xgwzvq9IliIoLqp+R++8 M+u+1wUjCa9731Al0/O+VEglXJmrTpipL5VpICP8Ubnu/8fZDf5B3bJgOEI8cwaveAx+jZ 1rHWysLc4hhH0cCaHgq/zpFcVYJPmAM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-424-yxYkz-LWMkeOtOv1hyz8Xg-1; Mon, 07 Nov 2022 11:18:18 -0500 X-MC-Unique: yxYkz-LWMkeOtOv1hyz8Xg-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5E88B806001; Mon, 7 Nov 2022 16:18:17 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.195.106]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3DB084B3FC6; Mon, 7 Nov 2022 16:18:12 +0000 (UTC) From: David Hildenbrand <david@redhat.com> To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, etnaviv@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-kernel@lists.infradead.org, linux-samsung-soc@vger.kernel.org, linux-rdma@vger.kernel.org, linux-media@vger.kernel.org, linux-kselftest@vger.kernel.org, David Hildenbrand <david@redhat.com>, Linus Torvalds <torvalds@linux-foundation.org>, Andrew Morton <akpm@linux-foundation.org>, Jason Gunthorpe <jgg@ziepe.ca>, John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Andrea Arcangeli <aarcange@redhat.com>, Hugh Dickins <hughd@google.com>, Nadav Amit <namit@vmware.com>, Vlastimil Babka <vbabka@suse.cz>, Matthew Wilcox <willy@infradead.org>, Mike Kravetz <mike.kravetz@oracle.com>, Muchun Song <songmuchun@bytedance.com>, Lucas Stach <l.stach@pengutronix.de>, David Airlie <airlied@gmail.com>, Oded Gabbay <ogabbay@kernel.org>, Arnd Bergmann <arnd@arndb.de> Subject: [PATCH RFC 05/19] mm: add early FAULT_FLAG_WRITE consistency checks Date: Mon, 7 Nov 2022 17:17:26 +0100 Message-Id: <20221107161740.144456-6-david@redhat.com> In-Reply-To: <20221107161740.144456-1-david@redhat.com> References: <20221107161740.144456-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-media.vger.kernel.org> X-Mailing-List: linux-media@vger.kernel.org X-LSpam-Score: -2.5 (--) X-LSpam-Report: No, score=-2.5 required=5.0 tests=BAYES_00=-1.9,DKIMWL_WL_HIGH=0.001,DKIM_SIGNED=0.1,DKIM_VALID=-0.1,DKIM_VALID_AU=-0.1,HEADER_FROM_DIFFERENT_DOMAINS=0.5,MAILING_LIST_MULTI=-1 autolearn=ham autolearn_force=no |
Series |
mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning)
|
|
Commit Message
David Hildenbrand
Nov. 7, 2022, 4:17 p.m. UTC
Let's catch abuse of FAULT_FLAG_WRITE early, such that we don't have to
care in all other handlers and might get "surprises" if we forget to do
so.
Write faults without VM_MAYWRITE don't make any sense, and our
maybe_mkwrite() logic could have hidden such abuse for now.
Write faults without VM_WRITE on something that is not a COW mapping is
similarly broken, and e.g., do_wp_page() could end up placing an
anonymous page into a shared mapping, which would be bad.
This is a preparation for reliable R/O long-term pinning of pages in
private mappings, whereby we want to make sure that we will never break
COW in a read-only private mapping.
Signed-off-by: David Hildenbrand <david@redhat.com>
---
mm/memory.c | 8 ++++++++
1 file changed, 8 insertions(+)
Comments
On Nov 7, 2022, at 8:17 AM, David Hildenbrand <david@redhat.com> wrote: > !! External Email > > Let's catch abuse of FAULT_FLAG_WRITE early, such that we don't have to > care in all other handlers and might get "surprises" if we forget to do > so. > > Write faults without VM_MAYWRITE don't make any sense, and our > maybe_mkwrite() logic could have hidden such abuse for now. > > Write faults without VM_WRITE on something that is not a COW mapping is > similarly broken, and e.g., do_wp_page() could end up placing an > anonymous page into a shared mapping, which would be bad. > > This is a preparation for reliable R/O long-term pinning of pages in > private mappings, whereby we want to make sure that we will never break > COW in a read-only private mapping. > > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > mm/memory.c | 8 ++++++++ > 1 file changed, 8 insertions(+) > > diff --git a/mm/memory.c b/mm/memory.c > index fe131273217a..826353da7b23 100644 > --- a/mm/memory.c > +++ b/mm/memory.c > @@ -5159,6 +5159,14 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma, > */ > if (!is_cow_mapping(vma->vm_flags)) > *flags &= ~FAULT_FLAG_UNSHARE; > + } else if (*flags & FAULT_FLAG_WRITE) { > + /* Write faults on read-only mappings are impossible ... */ > + if (WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE))) > + return VM_FAULT_SIGSEGV; > + /* ... and FOLL_FORCE only applies to COW mappings. */ > + if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE) && > + !is_cow_mapping(vma->vm_flags))) > + return VM_FAULT_SIGSEGV; Not sure about the WARN_*(). Seems as if it might trigger in benign even if rare scenarios, e.g., mprotect() racing with page-fault.
On 07.11.22 20:03, Nadav Amit wrote: > On Nov 7, 2022, at 8:17 AM, David Hildenbrand <david@redhat.com> wrote: > >> !! External Email >> >> Let's catch abuse of FAULT_FLAG_WRITE early, such that we don't have to >> care in all other handlers and might get "surprises" if we forget to do >> so. >> >> Write faults without VM_MAYWRITE don't make any sense, and our >> maybe_mkwrite() logic could have hidden such abuse for now. >> >> Write faults without VM_WRITE on something that is not a COW mapping is >> similarly broken, and e.g., do_wp_page() could end up placing an >> anonymous page into a shared mapping, which would be bad. >> >> This is a preparation for reliable R/O long-term pinning of pages in >> private mappings, whereby we want to make sure that we will never break >> COW in a read-only private mapping. >> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> mm/memory.c | 8 ++++++++ >> 1 file changed, 8 insertions(+) >> >> diff --git a/mm/memory.c b/mm/memory.c >> index fe131273217a..826353da7b23 100644 >> --- a/mm/memory.c >> +++ b/mm/memory.c >> @@ -5159,6 +5159,14 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma, >> */ >> if (!is_cow_mapping(vma->vm_flags)) >> *flags &= ~FAULT_FLAG_UNSHARE; >> + } else if (*flags & FAULT_FLAG_WRITE) { >> + /* Write faults on read-only mappings are impossible ... */ >> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE))) >> + return VM_FAULT_SIGSEGV; >> + /* ... and FOLL_FORCE only applies to COW mappings. */ >> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE) && >> + !is_cow_mapping(vma->vm_flags))) >> + return VM_FAULT_SIGSEGV; > > Not sure about the WARN_*(). Seems as if it might trigger in benign even if > rare scenarios, e.g., mprotect() racing with page-fault. > We most certainly would want to catch any such broken/racy cases. There are no benign cases I could possibly think of. Page faults need the mmap lock in read. mprotect() / VMA changes need the mmap lock in write. Whoever calls handle_mm_fault() is supposed to properly check VMA permissions.
On Nov 7, 2022, at 11:27 AM, David Hildenbrand <david@redhat.com> wrote: > !! External Email > > On 07.11.22 20:03, Nadav Amit wrote: >> On Nov 7, 2022, at 8:17 AM, David Hildenbrand <david@redhat.com> wrote: >> >>> !! External Email >>> >>> Let's catch abuse of FAULT_FLAG_WRITE early, such that we don't have to >>> care in all other handlers and might get "surprises" if we forget to do >>> so. >>> >>> Write faults without VM_MAYWRITE don't make any sense, and our >>> maybe_mkwrite() logic could have hidden such abuse for now. >>> >>> Write faults without VM_WRITE on something that is not a COW mapping is >>> similarly broken, and e.g., do_wp_page() could end up placing an >>> anonymous page into a shared mapping, which would be bad. >>> >>> This is a preparation for reliable R/O long-term pinning of pages in >>> private mappings, whereby we want to make sure that we will never break >>> COW in a read-only private mapping. >>> >>> Signed-off-by: David Hildenbrand <david@redhat.com> >>> --- >>> mm/memory.c | 8 ++++++++ >>> 1 file changed, 8 insertions(+) >>> >>> diff --git a/mm/memory.c b/mm/memory.c >>> index fe131273217a..826353da7b23 100644 >>> --- a/mm/memory.c >>> +++ b/mm/memory.c >>> @@ -5159,6 +5159,14 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma, >>> */ >>> if (!is_cow_mapping(vma->vm_flags)) >>> *flags &= ~FAULT_FLAG_UNSHARE; >>> + } else if (*flags & FAULT_FLAG_WRITE) { >>> + /* Write faults on read-only mappings are impossible ... */ >>> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE))) >>> + return VM_FAULT_SIGSEGV; >>> + /* ... and FOLL_FORCE only applies to COW mappings. */ >>> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE) && >>> + !is_cow_mapping(vma->vm_flags))) >>> + return VM_FAULT_SIGSEGV; >> >> Not sure about the WARN_*(). Seems as if it might trigger in benign even if >> rare scenarios, e.g., mprotect() racing with page-fault. > > We most certainly would want to catch any such broken/racy cases. There > are no benign cases I could possibly think of. > > Page faults need the mmap lock in read. mprotect() / VMA changes need > the mmap lock in write. Whoever calls handle_mm_fault() is supposed to > properly check VMA permissions. My bad. I now see it. Thanks for explaining.
diff --git a/mm/memory.c b/mm/memory.c index fe131273217a..826353da7b23 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -5159,6 +5159,14 @@ static vm_fault_t sanitize_fault_flags(struct vm_area_struct *vma, */ if (!is_cow_mapping(vma->vm_flags)) *flags &= ~FAULT_FLAG_UNSHARE; + } else if (*flags & FAULT_FLAG_WRITE) { + /* Write faults on read-only mappings are impossible ... */ + if (WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE))) + return VM_FAULT_SIGSEGV; + /* ... and FOLL_FORCE only applies to COW mappings. */ + if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE) && + !is_cow_mapping(vma->vm_flags))) + return VM_FAULT_SIGSEGV; } return 0; }