<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/exec.c, branch v6.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-09-13T17:38:43Z</updated>
<entry>
<title>Revert "fs/exec: allow to unshare a time namespace on vfork+exec"</title>
<updated>2022-09-13T17:38:43Z</updated>
<author>
<name>Andrei Vagin</name>
<email>avagin@gmail.com</email>
</author>
<published>2022-09-13T10:25:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=33a2d6bc3480f9f8ac8c8def29854f98cc8bfee2'/>
<id>urn:sha1:33a2d6bc3480f9f8ac8c8def29854f98cc8bfee2</id>
<content type='text'>
This reverts commit 133e2d3e81de5d9706cab2dd1d52d231c27382e5.

Alexey pointed out a few undesirable side effects of the reverted change.
First, it doesn't take into account that CLONE_VFORK can be used with
CLONE_THREAD. Second, a child process doesn't enter a target time name-space,
if its parent dies before the child calls exec. It happens because the parent
clears vfork_done.

Eric W. Biederman suggests installing a time namespace as a task gets a new mm.
It includes all new processes cloned without CLONE_VM and all tasks that call
exec(). This is an user API change, but we think there aren't users that depend
on the old behavior.

It is too late to make such changes in this release, so let's roll back
this patch and introduce the right one in the next release.

Cc: Alexey Izbyshev &lt;izbyshev@ispras.ru&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Cc: Dmitry Safonov &lt;0x7f454c46@gmail.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Florian Weimer &lt;fweimer@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20220913102551.1121611-3-avagin@google.com
</content>
</entry>
<entry>
<title>Merge tag 'execve-v6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2022-08-19T21:02:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-19T21:02:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=50cd95ac46548429e5bba7ca75cc97d11a697947'/>
<id>urn:sha1:50cd95ac46548429e5bba7ca75cc97d11a697947</id>
<content type='text'>
Pull execve fix from Kees Cook:

 - Replace remaining kmap() uses with kmap_local_page() (Fabio M. De
   Francesco)

* tag 'execve-v6.0-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  exec: Replace kmap{,_atomic}() with kmap_local_page()
</content>
</entry>
<entry>
<title>exec: Replace kmap{,_atomic}() with kmap_local_page()</title>
<updated>2022-08-16T19:11:27Z</updated>
<author>
<name>Fabio M. De Francesco</name>
<email>fmdefrancesco@gmail.com</email>
</author>
<published>2022-08-03T18:28:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3a608cfee97e99b3fff9ffe62246a098042e725d'/>
<id>urn:sha1:3a608cfee97e99b3fff9ffe62246a098042e725d</id>
<content type='text'>
The use of kmap() and kmap_atomic() are being deprecated in favor of
kmap_local_page().

There are two main problems with kmap(): (1) It comes with an overhead as
mapping space is restricted and protected by a global lock for
synchronization and (2) it also requires global TLB invalidation when the
kmap’s pool wraps and it might block when the mapping space is fully
utilized until a slot becomes available.

With kmap_local_page() the mappings are per thread, CPU local, can take
page faults, and can be called from any context (including interrupts).
It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
the tasks can be preempted and, when they are scheduled to run again, the
kernel virtual addresses are restored and are still valid.

Since the use of kmap_local_page() in exec.c is safe, it should be
preferred everywhere in exec.c.

As said, since kmap_local_page() can be also called from atomic context,
and since remove_arg_zero() doesn't (and shouldn't ever) rely on an
implicit preempt_disable(), this function can also safely replace
kmap_atomic().

Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
fs/exec.c.

Tested with xfstests on a QEMU/KVM x86_32 VM, 6GB RAM, booting a kernel
with HIGHMEM64GB enabled.

Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Suggested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Fabio M. De Francesco &lt;fmdefrancesco@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20220803182856.28246-1-fmdefrancesco@gmail.com
</content>
</entry>
<entry>
<title>posix-cpu-timers: Cleanup CPU timers before freeing them during exec</title>
<updated>2022-08-09T18:02:13Z</updated>
<author>
<name>Thadeu Lima de Souza Cascardo</name>
<email>cascardo@canonical.com</email>
</author>
<published>2022-08-09T17:07:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e362359ace6f87c201531872486ff295df306d13'/>
<id>urn:sha1:e362359ace6f87c201531872486ff295df306d13</id>
<content type='text'>
Commit 55e8c8eb2c7b ("posix-cpu-timers: Store a reference to a pid not a
task") started looking up tasks by PID when deleting a CPU timer.

When a non-leader thread calls execve, it will switch PIDs with the leader
process. Then, as it calls exit_itimers, posix_cpu_timer_del cannot find
the task because the timer still points out to the old PID.

That means that armed timers won't be disarmed, that is, they won't be
removed from the timerqueue_list. exit_itimers will still release their
memory, and when that list is later processed, it leads to a
use-after-free.

Clean up the timers from the de-threaded task before freeing them. This
prevents a reported use-after-free.

Fixes: 55e8c8eb2c7b ("posix-cpu-timers: Store a reference to a pid not a task")
Signed-off-by: Thadeu Lima de Souza Cascardo &lt;cascardo@canonical.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lore.kernel.org/r/20220809170751.164716-1-cascardo@canonical.com

</content>
</entry>
<entry>
<title>Merge tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2022-08-02T21:36:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-02T21:36:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d7b767b5088d57ff9b5f9a0060c9ad0f9410b1c0'/>
<id>urn:sha1:d7b767b5088d57ff9b5f9a0060c9ad0f9410b1c0</id>
<content type='text'>
Pull execve updates from Kees Cook:

 - Allow unsharing time namespace on vfork+exec (Andrei Vagin)

 - Replace usage of deprecated kmap APIs (Fabio M. De Francesco)

 - Fix spelling mistake (Zhang Jiaming)

* tag 'execve-v5.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  exec: Call kmap_local_page() in copy_string_kernel()
  exec: Fix a spelling mistake
  selftests/timens: add a test for vfork+exit
  fs/exec: allow to unshare a time namespace on vfork+exec
</content>
</entry>
<entry>
<title>exec: Call kmap_local_page() in copy_string_kernel()</title>
<updated>2022-07-27T21:15:09Z</updated>
<author>
<name>Fabio M. De Francesco</name>
<email>fmdefrancesco@gmail.com</email>
</author>
<published>2022-07-24T21:25:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c6e8e36c6ae4b11bed5643317afb66b6c3cadba8'/>
<id>urn:sha1:c6e8e36c6ae4b11bed5643317afb66b6c3cadba8</id>
<content type='text'>
The use of kmap_atomic() is being deprecated in favor of kmap_local_page().

With kmap_local_page(), the mappings are per thread, CPU local and not
globally visible. Furthermore, the mappings can be acquired from any
context (including interrupts).

Therefore, replace kmap_atomic() with kmap_local_page() in
copy_string_kernel(). Instead of open-coding local mapping + memcpy(),
use memcpy_to_page(). Delete a redundant call to flush_dcache_page().

Tested with xfstests on a QEMU/ KVM x86_32 VM, 6GB RAM, booting a kernel
with HIGHMEM64GB enabled.

Suggested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Signed-off-by: Fabio M. De Francesco &lt;fmdefrancesco@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20220724212523.13317-1-fmdefrancesco@gmail.com
</content>
</entry>
<entry>
<title>fix race between exit_itimers() and /proc/pid/timers</title>
<updated>2022-07-11T16:52:59Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2022-07-11T16:16:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d5b36a4dbd06c5e8e36ca8ccc552f679069e2946'/>
<id>urn:sha1:d5b36a4dbd06c5e8e36ca8ccc552f679069e2946</id>
<content type='text'>
As Chris explains, the comment above exit_itimers() is not correct,
we can race with proc_timers_seq_ops. Change exit_itimers() to clear
signal-&gt;posix_timers with -&gt;siglock held.

Cc: &lt;stable@vger.kernel.org&gt;
Reported-by: chris@accessvector.net
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>exec: Fix a spelling mistake</title>
<updated>2022-07-01T22:06:14Z</updated>
<author>
<name>Zhang Jiaming</name>
<email>jiaming@nfschina.com</email>
</author>
<published>2022-06-29T07:29:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5036793d7dbd0b14aec51526441a50b01c7bf66d'/>
<id>urn:sha1:5036793d7dbd0b14aec51526441a50b01c7bf66d</id>
<content type='text'>
Change 'wont't' to 'won't'.

Signed-off-by: Zhang Jiaming &lt;jiaming@nfschina.com&gt;
Reviewed-by: Souptick Joarder (HPE) &lt;jrdr.linux@gmail.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20220629072932.27506-1-jiaming@nfschina.com
</content>
</entry>
<entry>
<title>fs/exec: allow to unshare a time namespace on vfork+exec</title>
<updated>2022-06-15T14:58:04Z</updated>
<author>
<name>Andrei Vagin</name>
<email>avagin@gmail.com</email>
</author>
<published>2022-06-13T06:07:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=133e2d3e81de5d9706cab2dd1d52d231c27382e5'/>
<id>urn:sha1:133e2d3e81de5d9706cab2dd1d52d231c27382e5</id>
<content type='text'>
Right now, a new process can't be forked in another time namespace
if it shares mm with its parent. It is prohibited, because each time
namespace has its own vvar page that is mapped into a process address
space.

When a process calls exec, it gets a new mm and so it could be "legal"
to switch time namespace in that case. This was not implemented and
now if we want to do this, we need to add another clone flag to not
break backward compatibility.

We don't have any user requests to switch times on exec except the
vfork+exec combination, so there is no reason to add a new clone flag.
As for vfork+exec, this should be safe to allow switching timens with
the current clone flag. Right now, vfork (CLONE_VFORK | CLONE_VM) fails
if a child is forked into another time namespace. With this change,
vfork creates a new process in parent's timens, and the following exec
does the actual switch to the target time namespace.

Suggested-by: Florian Weimer &lt;fweimer@redhat.com&gt;
Signed-off-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Acked-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20220613060723.197407-1-avagin@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2022-06-03T23:03:05Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-06-03T23:03:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1ec6574a3c0a22c130c08e8c36c825cb87d68f8e'/>
<id>urn:sha1:1ec6574a3c0a22c130c08e8c36c825cb87d68f8e</id>
<content type='text'>
Pull kthread updates from Eric Biederman:
 "This updates init and user mode helper tasks to be ordinary user mode
  tasks.

  Commit 40966e316f86 ("kthread: Ensure struct kthread is present for
  all kthreads") caused init and the user mode helper threads that call
  kernel_execve to have struct kthread allocated for them. This struct
  kthread going away during execve in turned made a use after free of
  struct kthread possible.

  Here, commit 343f4c49f243 ("kthread: Don't allocate kthread_struct for
  init and umh") is enough to fix the use after free and is simple
  enough to be backportable.

  The rest of the changes pass struct kernel_clone_args to clean things
  up and cause the code to make sense.

  In making init and the user mode helpers tasks purely user mode tasks
  I ran into two complications. The function task_tick_numa was
  detecting tasks without an mm by testing for the presence of
  PF_KTHREAD. The initramfs code in populate_initrd_image was using
  flush_delayed_fput to ensuere the closing of all it's file descriptors
  was complete, and flush_delayed_fput does not work in a userspace
  thread.

  I have looked and looked and more complications and in my code review
  I have not found any, and neither has anyone else with the code
  sitting in linux-next"

* tag 'kthread-cleanups-for-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  sched: Update task_tick_numa to ignore tasks without an mm
  fork: Stop allowing kthreads to call execve
  fork: Explicitly set PF_KTHREAD
  init: Deal with the init process being a user mode process
  fork: Generalize PF_IO_WORKER handling
  fork: Explicity test for idle tasks in copy_thread
  fork: Pass struct kernel_clone_args into copy_thread
  kthread: Don't allocate kthread_struct for init and umh
</content>
</entry>
</feed>
