<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/userfaultfd.c, branch v5.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-04-19T16:46:05Z</updated>
<entry>
<title>coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping</title>
<updated>2019-04-19T16:46:05Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2019-04-19T00:50:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=04f5866e41fb70690e28397487d8bd8eea7d712a'/>
<id>urn:sha1:04f5866e41fb70690e28397487d8bd8eea7d712a</id>
<content type='text'>
The core dumping code has always run without holding the mmap_sem for
writing, despite that is the only way to ensure that the entire vma
layout will not change from under it.  Only using some signal
serialization on the processes belonging to the mm is not nearly enough.
This was pointed out earlier.  For example in Hugh's post from Jul 2017:

  https://lkml.kernel.org/r/alpine.LSU.2.11.1707191716030.2055@eggly.anvils

  "Not strictly relevant here, but a related note: I was very surprised
   to discover, only quite recently, how handle_mm_fault() may be called
   without down_read(mmap_sem) - when core dumping. That seems a
   misguided optimization to me, which would also be nice to correct"

In particular because the growsdown and growsup can move the
vm_start/vm_end the various loops the core dump does around the vma will
not be consistent if page faults can happen concurrently.

Pretty much all users calling mmget_not_zero()/get_task_mm() and then
taking the mmap_sem had the potential to introduce unexpected side
effects in the core dumping code.

Adding mmap_sem for writing around the -&gt;core_dump invocation is a
viable long term fix, but it requires removing all copy user and page
faults and to replace them with get_dump_page() for all binary formats
which is not suitable as a short term fix.

For the time being this solution manually covers the places that can
confuse the core dump either by altering the vma layout or the vma flags
while it runs.  Once -&gt;core_dump runs under mmap_sem for writing the
function mmget_still_valid() can be dropped.

Allowing mmap_sem protected sections to run in parallel with the
coredump provides some minor parallelism advantage to the swapoff code
(which seems to be safe enough by never mangling any vma field and can
keep doing swapins in parallel to the core dumping) and to some other
corner case.

In order to facilitate the backporting I added "Fixes: 86039bd3b4e6"
however the side effect of this same race condition in /proc/pid/mem
should be reproducible since before 2.6.12-rc2 so I couldn't add any
other "Fixes:" because there's no hash beyond the git genesis commit.

Because find_extend_vma() is the only location outside of the process
context that could modify the "mm" structures under mmap_sem for
reading, by adding the mmget_still_valid() check to it, all other cases
that take the mmap_sem for reading don't need the new check after
mmget_not_zero()/get_task_mm().  The expand_stack() in page fault
context also doesn't need the new check, because all tasks under core
dumping are frozen.

Link: http://lkml.kernel.org/r/20190325224949.11068-1-aarcange@redhat.com
Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Jann Horn &lt;jannh@google.com&gt;
Acked-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>userfaultfd: clear flag if remap event not enabled</title>
<updated>2018-12-28T20:11:51Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2018-12-28T08:38:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3cfd22be0ad663248fadfc8f6ffa3e255c394552'/>
<id>urn:sha1:3cfd22be0ad663248fadfc8f6ffa3e255c394552</id>
<content type='text'>
When the process being tracked does mremap() without
UFFD_FEATURE_EVENT_REMAP on the corresponding tracking uffd file handle,
we should not generate the remap event, and at the same time we should
clear all the uffd flags on the new VMA.  Without this patch, we can still
have the VM_UFFD_MISSING|VM_UFFD_WP flags on the new VMA even the fault
handling process does not even know the existance of the VMA.

Link: http://lkml.kernel.org/r/20181211053409.20317-1-peterx@redhat.com
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Acked-by: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Reviewed-by: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Pavel Emelyanov &lt;xemul@virtuozzo.com&gt;
Cc: Pravin Shedge &lt;pravin.shedge4linux@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>userfaultfd: convert userfaultfd_ctx::refcount to refcount_t</title>
<updated>2018-12-28T20:11:47Z</updated>
<author>
<name>Eric Biggers</name>
<email>ebiggers@google.com</email>
</author>
<published>2018-12-28T08:34:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ca880420665dbc8beec3693bee9f5eccb89de4a6'/>
<id>urn:sha1:ca880420665dbc8beec3693bee9f5eccb89de4a6</id>
<content type='text'>
Reference counters should use refcount_t rather than atomic_t, since the
refcount_t implementation can prevent overflows, reducing the
exploitability of reference leak bugs.  userfaultfd_ctx::refcount is a
reference counter with the usual semantics, so convert it to refcount_t.

Note: I replaced the BUG() on incrementing a 0 refcount with just
refcount_inc(), since part of the semantics of refcount_t is that that
incrementing a 0 refcount is not allowed; with CONFIG_REFCOUNT_FULL,
refcount_inc() already checks for it and warns.

Link: http://lkml.kernel.org/r/20181115003916.63381-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers &lt;ebiggers@google.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reviewed-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2018-12-26T21:07:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-26T21:07:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=792bf4d871dea8b69be2aaabdd320d7c6ed15985'/>
<id>urn:sha1:792bf4d871dea8b69be2aaabdd320d7c6ed15985</id>
<content type='text'>
Pull RCU updates from Ingo Molnar:
 "The biggest RCU changes in this cycle were:

   - Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

   - Replace calls of RCU-bh and RCU-sched update-side functions to
     their vanilla RCU counterparts. This series is a step towards
     complete removal of the RCU-bh and RCU-sched update-side functions.

     ( Note that some of these conversions are going upstream via their
       respective maintainers. )

   - Documentation updates, including a number of flavor-consolidation
     updates from Joel Fernandes.

   - Miscellaneous fixes.

   - Automate generation of the initrd filesystem used for rcutorture
     testing.

   - Convert spin_is_locked() assertions to instead use lockdep.

     ( Note that some of these conversions are going upstream via their
       respective maintainers. )

   - SRCU updates, especially including a fix from Dennis Krein for a
     bag-on-head-class bug.

   - RCU torture-test updates"

* 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (112 commits)
  rcutorture: Don't do busted forward-progress testing
  rcutorture: Use 100ms buckets for forward-progress callback histograms
  rcutorture: Recover from OOM during forward-progress tests
  rcutorture: Print forward-progress test age upon failure
  rcutorture: Print time since GP end upon forward-progress failure
  rcutorture: Print histogram of CB invocation at OOM time
  rcutorture: Print GP age upon forward-progress failure
  rcu: Print per-CPU callback counts for forward-progress failures
  rcu: Account for nocb-CPU callback counts in RCU CPU stall warnings
  rcutorture: Dump grace-period diagnostics upon forward-progress OOM
  rcutorture: Prepare for asynchronous access to rcu_fwd_startat
  torture: Remove unnecessary "ret" variables
  rcutorture: Affinity forward-progress test to avoid housekeeping CPUs
  rcutorture: Break up too-long rcu_torture_fwd_prog() function
  rcutorture: Remove cbflood facility
  torture: Bring any extra CPUs online during kernel startup
  rcutorture: Add call_rcu() flooding forward-progress tests
  rcutorture/formal: Replace synchronize_sched() with synchronize_rcu()
  tools/kernel.h: Replace synchronize_sched() with synchronize_rcu()
  net/decnet: Replace rcu_barrier_bh() with rcu_barrier()
  ...
</content>
</entry>
<entry>
<title>userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered</title>
<updated>2018-12-14T23:05:45Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2018-12-14T22:17:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=01e881f5a1fca4677e82733061868c6d6ea05ca7'/>
<id>urn:sha1:01e881f5a1fca4677e82733061868c6d6ea05ca7</id>
<content type='text'>
Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger an harmless false positive WARN_ON.  Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.

Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com
Acked-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu into core/rcu</title>
<updated>2018-12-04T06:52:30Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2018-12-04T06:52:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4bbfd7467cfc7d42e18d3008fa6a28ffd56e901a'/>
<id>urn:sha1:4bbfd7467cfc7d42e18d3008fa6a28ffd56e901a</id>
<content type='text'>
Pull RCU changes from Paul E. McKenney:

- Convert RCU's BUG_ON() and similar calls to WARN_ON() and similar.

- Replace calls of RCU-bh and RCU-sched update-side functions
  to their vanilla RCU counterparts.  This series is a step
  towards complete removal of the RCU-bh and RCU-sched update-side
  functions.

  ( Note that some of these conversions are going upstream via their
    respective maintainers. )

- Documentation updates, including a number of flavor-consolidation
  updates from Joel Fernandes.

- Miscellaneous fixes.

- Automate generation of the initrd filesystem used for
  rcutorture testing.

- Convert spin_is_locked() assertions to instead use lockdep.

  ( Note that some of these conversions are going upstream via their
    respective maintainers. )

- SRCU updates, especially including a fix from Dennis Krein
  for a bag-on-head-class bug.

- RCU torture-test updates.

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas</title>
<updated>2018-11-30T22:56:14Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2018-11-30T22:09:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=29ec90660d68bbdd69507c1c8b4e33aa299278b1'/>
<id>urn:sha1:29ec90660d68bbdd69507c1c8b4e33aa299278b1</id>
<content type='text'>
After the VMA to register the uffd onto is found, check that it has
VM_MAYWRITE set before allowing registration.  This way we inherit all
common code checks before allowing to fill file holes in shmem and
hugetlbfs with UFFDIO_COPY.

The userfaultfd memory model is not applicable for readonly files unless
it's a MAP_PRIVATE.

Link: http://lkml.kernel.org/r/20181126173452.26955-4-aarcange@redhat.com
Fixes: ff62a3421044 ("hugetlb: implement memfd sealing")
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reviewed-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Reviewed-by: Hugh Dickins &lt;hughd@google.com&gt;
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Fixes: 4c27fe4c4c84 ("userfaultfd: shmem: add shmem_mcopy_atomic_pte for userfaultfd support")
Cc: &lt;stable@vger.kernel.org&gt;
Cc: "Dr. David Alan Gilbert" &lt;dgilbert@redhat.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>userfaultfd: Replace spin_is_locked() with lockdep</title>
<updated>2018-11-12T17:06:22Z</updated>
<author>
<name>Lance Roy</name>
<email>ldr709@gmail.com</email>
</author>
<published>2018-10-05T06:45:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=456a737896b25e4853bbf3d6ca5a1c8ee4df4ee9'/>
<id>urn:sha1:456a737896b25e4853bbf3d6ca5a1c8ee4df4ee9</id>
<content type='text'>
lockdep_assert_held() is better suited to checking locking requirements,
since it only checks if the current thread holds the lock regardless of
whether someone else does. This is also a step towards possibly removing
spin_is_locked().

Signed-off-by: Lance Roy &lt;ldr709@gmail.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;linux-fsdevel@vger.kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>userfaultfd: disable irqs when taking the waitqueue lock</title>
<updated>2018-10-26T23:25:18Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-10-26T22:02:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ae62c16e105a869524afcf8a07ee85c5ae5d0479'/>
<id>urn:sha1:ae62c16e105a869524afcf8a07ee85c5ae5d0479</id>
<content type='text'>
userfaultfd contains howe-grown locking of the waitqueue lock, and does
not disable interrupts.  This relies on the fact that no one else takes it
from interrupt context and violates an invariat of the normal waitqueue
locking scheme.  With aio poll it is easy to trigger other locks that
disable interrupts (or are called from interrupt context).

Link: http://lkml.kernel.org/r/20181018154101.18750-1-hch@lst.de
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.19.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: Change return type int to vm_fault_t for fault handlers</title>
<updated>2018-08-24T01:48:44Z</updated>
<author>
<name>Souptick Joarder</name>
<email>jrdr.linux@gmail.com</email>
</author>
<published>2018-08-24T00:01:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2b7403035459c75e193c6b04a293e518a4212de0'/>
<id>urn:sha1:2b7403035459c75e193c6b04a293e518a4212de0</id>
<content type='text'>
Use new return type vm_fault_t for fault handler.  For now, this is just
documenting that the function returns a VM_FAULT value rather than an
errno.  Once all instances are converted, vm_fault_t will become a
distinct type.

Ref-&gt; commit 1c8f422059ae ("mm: change return type to vm_fault_t")

The aim is to change the return type of finish_fault() and
handle_mm_fault() to vm_fault_t type.  As part of that clean up return
type of all other recursively called functions have been changed to
vm_fault_t type.

The places from where handle_mm_fault() is getting invoked will be
change to vm_fault_t type but in a separate patch.

vmf_error() is the newly introduce inline function in 4.17-rc6.

[akpm@linux-foundation.org: don't shadow outer local `ret' in __do_huge_pmd_anonymous_page()]
Link: http://lkml.kernel.org/r/20180604171727.GA20279@jordon-HP-15-Notebook-PC
Signed-off-by: Souptick Joarder &lt;jrdr.linux@gmail.com&gt;
Reviewed-by: Matthew Wilcox &lt;mawilcox@microsoft.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
