<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/exec.c, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-09-25T15:42:30Z</updated>
<entry>
<title>sched/membarrier: Fix p-&gt;mm-&gt;membarrier_state racy load</title>
<updated>2019-09-25T15:42:30Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2019-09-19T17:37:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=227a4aadc75ba22fcb6c4e1c078817b8cbaae4ce'/>
<id>urn:sha1:227a4aadc75ba22fcb6c4e1c078817b8cbaae4ce</id>
<content type='text'>
The membarrier_state field is located within the mm_struct, which
is not guaranteed to exist when used from runqueue-lock-free iteration
on runqueues by the membarrier system call.

Copy the membarrier_state from the mm_struct into the scheduler runqueue
when the scheduler switches between mm.

When registering membarrier for mm, after setting the registration bit
in the mm membarrier state, issue a synchronize_rcu() to ensure the
scheduler observes the change. In order to take care of the case
where a runqueue keeps executing the target mm without swapping to
other mm, iterate over each runqueue and issue an IPI to copy the
membarrier_state from the mm_struct into each runqueue which have the
same mm which state has just been modified.

Move the mm membarrier_state field closer to pgd in mm_struct to use
a cache line already touched by the scheduler switch_mm.

The membarrier_execve() (now membarrier_exec_mmap) hook now needs to
clear the runqueue's membarrier state in addition to clear the mm
membarrier state, so move its implementation into the scheduler
membarrier code so it can access the runqueue structure.

Add memory barrier in membarrier_exec_mmap() prior to clearing
the membarrier state, ensuring memory accesses executed prior to exec
are not reordered with the stores clearing the membarrier state.

As suggested by Linus, move all membarrier.c RCU read-side locks outside
of the for each cpu loops.

Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Chris Metcalf &lt;cmetcalf@ezchip.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Russell King - ARM Linux admin &lt;linux@armlinux.org.uk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20190919173705.2181-5-mathieu.desnoyers@efficios.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Don't free p-&gt;numa_faults with concurrent readers</title>
<updated>2019-07-25T13:37:04Z</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2019-07-16T15:20:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=16d51a590a8ce3befb1308e0e7ab77f3b661af33'/>
<id>urn:sha1:16d51a590a8ce3befb1308e0e7ab77f3b661af33</id>
<content type='text'>
When going through execve(), zero out the NUMA fault statistics instead of
freeing them.

During execve, the task is reachable through procfs and the scheduler. A
concurrent /proc/*/sched reader can read data from a freed -&gt;numa_faults
allocation (confirmed by KASAN) and write it back to userspace.
I believe that it would also be possible for a use-after-free read to occur
through a race between a NUMA fault and execve(): task_numa_fault() can
lead to task_numa_compare(), which invokes task_weight() on the currently
running task of a different CPU.

Another way to fix this would be to make -&gt;numa_faults RCU-managed or add
extra locking, but it seems easier to wipe the NUMA fault statistics on
execve.

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Fixes: 82727018b0d3 ("sched/numa: Call task_numa_free() from do_execve()")
Link: https://lkml.kernel.org/r/20190716152047.14424-1-jannh@google.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2019-07-09T04:48:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-09T04:48:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5ad18b2e60b75c7297a998dea702451d33a052ed'/>
<id>urn:sha1:5ad18b2e60b75c7297a998dea702451d33a052ed</id>
<content type='text'>
Pull force_sig() argument change from Eric Biederman:
 "A source of error over the years has been that force_sig has taken a
  task parameter when it is only safe to use force_sig with the current
  task.

  The force_sig function is built for delivering synchronous signals
  such as SIGSEGV where the userspace application caused a synchronous
  fault (such as a page fault) and the kernel responded with a signal.

  Because the name force_sig does not make this clear, and because the
  force_sig takes a task parameter the function force_sig has been
  abused for sending other kinds of signals over the years. Slowly those
  have been fixed when the oopses have been tracked down.

  This set of changes fixes the remaining abusers of force_sig and
  carefully rips out the task parameter from force_sig and friends
  making this kind of error almost impossible in the future"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
  signal/x86: Move tsk inside of CONFIG_MEMORY_FAILURE in do_sigbus
  signal: Remove the signal number and task parameters from force_sig_info
  signal: Factor force_sig_info_to_task out of force_sig_info
  signal: Generate the siginfo in force_sig
  signal: Move the computation of force into send_signal and correct it.
  signal: Properly set TRACE_SIGNAL_LOSE_INFO in __send_signal
  signal: Remove the task parameter from force_sig_fault
  signal: Use force_sig_fault_to_task for the two calls that don't deliver to current
  signal: Explicitly call force_sig_fault on current
  signal/unicore32: Remove tsk parameter from __do_user_fault
  signal/arm: Remove tsk parameter from __do_user_fault
  signal/arm: Remove tsk parameter from ptrace_break
  signal/nds32: Remove tsk parameter from send_sigtrap
  signal/riscv: Remove tsk parameter from do_trap
  signal/sh: Remove tsk parameter from force_sig_info_fault
  signal/um: Remove task parameter from send_sigtrap
  signal/x86: Remove task parameter from send_sigtrap
  signal: Remove task parameter from force_sig_mceerr
  signal: Remove task parameter from force_sig
  signal: Remove task parameter from force_sigsegv
  ...
</content>
</entry>
<entry>
<title>signal: Remove task parameter from force_sigsegv</title>
<updated>2019-05-27T14:36:28Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2019-05-21T15:03:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cb44c9a0ab21a9ae4dfcabac1ed8e38aa872d1af'/>
<id>urn:sha1:cb44c9a0ab21a9ae4dfcabac1ed8e38aa872d1af</id>
<content type='text'>
The function force_sigsegv is always called on the current task
so passing in current is redundant and not passing in current
makes this fact obvious.

This also makes it clear force_sigsegv always calls force_sig
on the current task.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>treewide: Add SPDX license identifier for missed files</title>
<updated>2019-05-21T08:50:45Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-19T12:08:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=457c89965399115e5cd8bf38f9c597293405703d'/>
<id>urn:sha1:457c89965399115e5cd8bf38f9c597293405703d</id>
<content type='text'>
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>fs/exec.c: move -&gt;recursion_depth out of critical sections</title>
<updated>2019-05-15T02:52:50Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2019-05-14T22:44:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d53ddd0181d1c284c28c0d70a3d7039db41c6f7e'/>
<id>urn:sha1:d53ddd0181d1c284c28c0d70a3d7039db41c6f7e</id>
<content type='text'>
-&gt;recursion_depth is changed only by current, therefore decrementing can
be done without taking any locks.

Link: http://lkml.kernel.org/r/20190417213150.GA26474@avx2
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>exec: increase BINPRM_BUF_SIZE to 256</title>
<updated>2019-03-08T02:32:01Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2019-03-08T00:29:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6eb3c3d0a52dca337e327ae8868ca1f44a712e02'/>
<id>urn:sha1:6eb3c3d0a52dca337e327ae8868ca1f44a712e02</id>
<content type='text'>
Large enterprise clients often run applications out of networked file
systems where the IT mandated layout of project volumes can end up
leading to paths that are longer than 128 characters.  Bumping this up
to the next order of two solves this problem in all but the most
egregious case while still fitting into a 512b slab.

[oleg@redhat.com: update comment, per Kees]
Link: http://lkml.kernel.org/r/20181112160956.GA28472@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Ben Woodard &lt;woodard@redhat.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/exec.c: replace opencoded set_mask_bits()</title>
<updated>2019-03-08T02:32:01Z</updated>
<author>
<name>Vineet Gupta</name>
<email>vineet.gupta1@synopsys.com</email>
</author>
<published>2019-03-08T00:29:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=26e152252e92999b975fe935c666a090c46905f7'/>
<id>urn:sha1:26e152252e92999b975fe935c666a090c46905f7</id>
<content type='text'>
Link: http://lkml.kernel.org/r/1548275584-18096-2-git-send-email-vgupta@synopsys.com
Link: http://lkml.kernel.org/g/20150807115710.GA16897@redhat.com
Signed-off-by: Vineet Gupta &lt;vgupta@synopsys.com&gt;
Reviewed-by: Anthony Yznaga &lt;anthony.yznaga@oracle.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jani Nikula &lt;jani.nikula@intel.com&gt;
Cc: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2019-03-06T16:14:05Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-06T16:14:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=45802da05e666a81b421422d3e302930c0e24e77'/>
<id>urn:sha1:45802da05e666a81b421422d3e302930c0e24e77</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:
 "The main changes in this cycle were:

   - refcount conversions

   - Solve the rq-&gt;leaf_cfs_rq_list can of worms for real.

   - improve power-aware scheduling

   - add sysctl knob for Energy Aware Scheduling

   - documentation updates

   - misc other changes"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
  kthread: Do not use TIMER_IRQSAFE
  kthread: Convert worker lock to raw spinlock
  sched/fair: Use non-atomic cpumask_{set,clear}_cpu()
  sched/fair: Remove unused 'sd' parameter from select_idle_smt()
  sched/wait: Use freezable_schedule() when possible
  sched/fair: Prune, fix and simplify the nohz_balancer_kick() comment block
  sched/fair: Explain LLC nohz kick condition
  sched/fair: Simplify nohz_balancer_kick()
  sched/topology: Fix percpu data types in struct sd_data &amp; struct s_data
  sched/fair: Simplify post_init_entity_util_avg() by calling it with a task_struct pointer argument
  sched/fair: Fix O(nr_cgroups) in the load balancing path
  sched/fair: Optimize update_blocked_averages()
  sched/fair: Fix insertion in rq-&gt;leaf_cfs_rq_list
  sched/fair: Add tmp_alone_branch assertion
  sched/core: Use READ_ONCE()/WRITE_ONCE() in move_queued_task()/task_rq_lock()
  sched/debug: Initialize sd_sysctl_cpus if !CONFIG_CPUMASK_OFFSTACK
  sched/pelt: Skip updating util_est when utilization is higher than CPU's capacity
  sched/fair: Update scale invariance of PELT
  sched/fair: Move the rq_of() helper function
  sched/core: Convert task_struct.stack_refcount to refcount_t
  ...
</content>
</entry>
<entry>
<title>exec: Fix mem leak in kernel_read_file</title>
<updated>2019-02-19T02:26:24Z</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2019-02-19T02:10:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f612acfae86af7ecad754ae6a46019be9da05b8e'/>
<id>urn:sha1:f612acfae86af7ecad754ae6a46019be9da05b8e</id>
<content type='text'>
syzkaller report this:
BUG: memory leak
unreferenced object 0xffffc9000488d000 (size 9195520):
  comm "syz-executor.0", pid 2752, jiffies 4294787496 (age 18.757s)
  hex dump (first 32 bytes):
    ff ff ff ff ff ff ff ff a8 00 00 00 01 00 00 00  ................
    02 00 00 00 00 00 00 00 80 a1 7a c1 ff ff ff ff  ..........z.....
  backtrace:
    [&lt;000000000863775c&gt;] __vmalloc_node mm/vmalloc.c:1795 [inline]
    [&lt;000000000863775c&gt;] __vmalloc_node_flags mm/vmalloc.c:1809 [inline]
    [&lt;000000000863775c&gt;] vmalloc+0x8c/0xb0 mm/vmalloc.c:1831
    [&lt;000000003f668111&gt;] kernel_read_file+0x58f/0x7d0 fs/exec.c:924
    [&lt;000000002385813f&gt;] kernel_read_file_from_fd+0x49/0x80 fs/exec.c:993
    [&lt;0000000011953ff1&gt;] __do_sys_finit_module+0x13b/0x2a0 kernel/module.c:3895
    [&lt;000000006f58491f&gt;] do_syscall_64+0x147/0x600 arch/x86/entry/common.c:290
    [&lt;00000000ee78baf4&gt;] entry_SYSCALL_64_after_hwframe+0x49/0xbe
    [&lt;00000000241f889b&gt;] 0xffffffffffffffff

It should goto 'out_free' lable to free allocated buf while kernel_read
fails.

Fixes: 39d637af5aa7 ("vfs: forbid write access when reading a file into memory")
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
</feed>
