<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/ptrace.c, branch v5.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-03-27T20:09:10Z</updated>
<entry>
<title>Revert "kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals"</title>
<updated>2021-03-27T20:09:10Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2021-03-26T00:18:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e8b33b8cfafcfcef287ae4c0f23a173bfcf617f3'/>
<id>urn:sha1:e8b33b8cfafcfcef287ae4c0f23a173bfcf617f3</id>
<content type='text'>
This reverts commit 6fb8f43cede0e4bd3ead847de78d531424a96be9.

The IO threads do allow signals now, including SIGSTOP, and we can allow
ptrace attach. Attaching won't reveal anything interesting for the IO
threads, but it will allow eg gdb to attach to a task with io_urings
and IO threads without complaining. And once attached, it will allow
the usual introspection into regular threads.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>kernel: treat PF_IO_WORKER like PF_KTHREAD for ptrace/signals</title>
<updated>2021-02-22T00:25:22Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2021-02-18T14:01:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6fb8f43cede0e4bd3ead847de78d531424a96be9'/>
<id>urn:sha1:6fb8f43cede0e4bd3ead847de78d531424a96be9</id>
<content type='text'>
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2020-12-15T20:53:37Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-12-15T20:53:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ac73e3dc8acd0a3be292755db30388c3580f5674'/>
<id>urn:sha1:ac73e3dc8acd0a3be292755db30388c3580f5674</id>
<content type='text'>
Merge misc updates from Andrew Morton:

 - a few random little subsystems

 - almost all of the MM patches which are staged ahead of linux-next
   material. I'll trickle to post-linux-next work in as the dependents
   get merged up.

Subsystems affected by this patch series: kthread, kbuild, ide, ntfs,
ocfs2, arch, and mm (slab-generic, slab, slub, dax, debug, pagecache,
gup, swap, shmem, memcg, pagemap, mremap, hmm, vmalloc, documentation,
kasan, pagealloc, memory-failure, hugetlb, vmscan, z3fold, compaction,
oom-kill, migration, cma, page-poison, userfaultfd, zswap, zsmalloc,
uaccess, zram, and cleanups).

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (200 commits)
  mm: cleanup kstrto*() usage
  mm: fix fall-through warnings for Clang
  mm: slub: convert sysfs sprintf family to sysfs_emit/sysfs_emit_at
  mm: shmem: convert shmem_enabled_show to use sysfs_emit_at
  mm:backing-dev: use sysfs_emit in macro defining functions
  mm: huge_memory: convert remaining use of sprintf to sysfs_emit and neatening
  mm: use sysfs_emit for struct kobject * uses
  mm: fix kernel-doc markups
  zram: break the strict dependency from lzo
  zram: add stat to gather incompressible pages since zram set up
  zram: support page writeback
  mm/process_vm_access: remove redundant initialization of iov_r
  mm/zsmalloc.c: rework the list_add code in insert_zspage()
  mm/zswap: move to use crypto_acomp API for hardware acceleration
  mm/zswap: fix passing zero to 'PTR_ERR' warning
  mm/zswap: make struct kernel_param_ops definitions const
  userfaultfd/selftests: hint the test runner on required privilege
  userfaultfd/selftests: fix retval check for userfaultfd_open()
  userfaultfd/selftests: always dump something in modes
  userfaultfd: selftests: make __{s,u}64 format specifiers portable
  ...
</content>
</entry>
<entry>
<title>mm: cleanup: remove unused tsk arg from __access_remote_vm</title>
<updated>2020-12-15T20:13:40Z</updated>
<author>
<name>John Hubbard</name>
<email>jhubbard@nvidia.com</email>
</author>
<published>2020-12-15T03:07:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d3f5ffcacd1528736471bc78f03f06da6c4551cc'/>
<id>urn:sha1:d3f5ffcacd1528736471bc78f03f06da6c4551cc</id>
<content type='text'>
Despite a comment that said that page fault accounting would be charged to
whatever task_struct* was passed into __access_remote_vm(), the tsk
argument was actually unused.

Making page fault accounting actually use this task struct is quite a
project, so there is no point in keeping the tsk argument.

Delete both the comment, and the argument.

[rppt@linux.ibm.com: changelog addition]

Link: https://lkml.kernel.org/r/20201026074137.4147787-1-jhubbard@nvidia.com
Signed-off-by: John Hubbard &lt;jhubbard@nvidia.com&gt;
Reviewed-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-12-15T01:13:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-12-15T01:13:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1ac0884d5474fea8dc6ceabbd0e870d1bf4b7b42'/>
<id>urn:sha1:1ac0884d5474fea8dc6ceabbd0e870d1bf4b7b42</id>
<content type='text'>
Pull core entry/exit updates from Thomas Gleixner:
 "A set of updates for entry/exit handling:

   - More generalization of entry/exit functionality

   - The consolidation work to reclaim TIF flags on x86 and also for
     non-x86 specific TIF flags which are solely relevant for syscall
     related work and have been moved into their own storage space. The
     x86 specific part had to be merged in to avoid a major conflict.

   - The TIF_NOTIFY_SIGNAL work which replaces the inefficient signal
     delivery mode of task work and results in an impressive performance
     improvement for io_uring. The non-x86 consolidation of this is
     going to come seperate via Jens.

   - The selective syscall redirection facility which provides a clean
     and efficient way to support the non-Linux syscalls of WINE by
     catching them at syscall entry and redirecting them to the user
     space emulation. This can be utilized for other purposes as well
     and has been designed carefully to avoid overhead for the regular
     fastpath. This includes the core changes and the x86 support code.

   - Simplification of the context tracking entry/exit handling for the
     users of the generic entry code which guarantee the proper ordering
     and protection.

   - Preparatory changes to make the generic entry code accomodate S390
     specific requirements which are mostly related to their syscall
     restart mechanism"

* tag 'core-entry-2020-12-14' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  entry: Add syscall_exit_to_user_mode_work()
  entry: Add exit_to_user_mode() wrapper
  entry_Add_enter_from_user_mode_wrapper
  entry: Rename exit_to_user_mode()
  entry: Rename enter_from_user_mode()
  docs: Document Syscall User Dispatch
  selftests: Add benchmark for syscall user dispatch
  selftests: Add kselftest for syscall user dispatch
  entry: Support Syscall User Dispatch on common syscall entry
  kernel: Implement selective syscall userspace redirection
  signal: Expose SYS_USER_DISPATCH si_code type
  x86: vdso: Expose sigreturn address on vdso to the kernel
  MAINTAINERS: Add entry for common entry code
  entry: Fix boot for !CONFIG_GENERIC_ENTRY
  x86: Support HAVE_CONTEXT_TRACKING_OFFSTACK
  context_tracking: Only define schedule_user() on !HAVE_CONTEXT_TRACKING_OFFSTACK archs
  sched: Detect call to schedule from critical entry code
  context_tracking: Don't implement exception_enter/exit() on CONFIG_HAVE_CONTEXT_TRACKING_OFFSTACK
  context_tracking: Introduce HAVE_CONTEXT_TRACKING_OFFSTACK
  x86: Reclaim unused x86 TI flags
  ...
</content>
</entry>
<entry>
<title>ptrace: Set PF_SUPERPRIV when checking capability</title>
<updated>2020-11-17T20:53:22Z</updated>
<author>
<name>Mickaël Salaün</name>
<email>mic@linux.microsoft.com</email>
</author>
<published>2020-10-30T12:38:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cf23705244c947151179f929774fabf71e239eee'/>
<id>urn:sha1:cf23705244c947151179f929774fabf71e239eee</id>
<content type='text'>
Commit 69f594a38967 ("ptrace: do not audit capability check when outputing
/proc/pid/stat") replaced the use of ns_capable() with
has_ns_capability{,_noaudit}() which doesn't set PF_SUPERPRIV.

Commit 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in
ptrace_has_cap()") replaced has_ns_capability{,_noaudit}() with
security_capable(), which doesn't set PF_SUPERPRIV neither.

Since commit 98f368e9e263 ("kernel: Add noaudit variant of ns_capable()"), a
new ns_capable_noaudit() helper is available.  Let's use it!

As a result, the signature of ptrace_has_cap() is restored to its original one.

Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Serge E. Hallyn &lt;serge@hallyn.com&gt;
Cc: Tyler Hicks &lt;tyhicks@linux.microsoft.com&gt;
Cc: stable@vger.kernel.org
Fixes: 6b3ad6649a4c ("ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()")
Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
Signed-off-by: Mickaël Salaün &lt;mic@linux.microsoft.com&gt;
Reviewed-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20201030123849.770769-2-mic@digikod.net
</content>
</entry>
<entry>
<title>ptrace: Migrate TIF_SYSCALL_EMU to use SYSCALL_WORK flag</title>
<updated>2020-11-16T20:53:16Z</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@collabora.com</email>
</author>
<published>2020-11-16T17:42:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=64eb35f701f04b30706e21d1b02636b5d31a37d2'/>
<id>urn:sha1:64eb35f701f04b30706e21d1b02636b5d31a37d2</id>
<content type='text'>
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_EMU, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@collabora.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lore.kernel.org/r/20201116174206.2639648-8-krisman@collabora.com
</content>
</entry>
<entry>
<title>ptrace: Migrate to use SYSCALL_TRACE flag</title>
<updated>2020-11-16T20:53:16Z</updated>
<author>
<name>Gabriel Krisman Bertazi</name>
<email>krisman@collabora.com</email>
</author>
<published>2020-11-16T17:42:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=64c19ba29b66e98af9306b4a7525fb22c895d252'/>
<id>urn:sha1:64c19ba29b66e98af9306b4a7525fb22c895d252</id>
<content type='text'>
On architectures using the generic syscall entry code the architecture
independent syscall work is moved to flags in thread_info::syscall_work.
This removes architecture dependencies and frees up TIF bits.

Define SYSCALL_WORK_SYSCALL_TRACE, use it in the generic entry code and
convert the code which uses the TIF specific helper functions to use the
new *_syscall_work() helpers which either resolve to the new mode for users
of the generic entry code or to the TIF based functions for the other
architectures.

Signed-off-by: Gabriel Krisman Bertazi &lt;krisman@collabora.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lore.kernel.org/r/20201116174206.2639648-7-krisman@collabora.com


</content>
</entry>
<entry>
<title>ptrace: reintroduce usage of subjective credentials in ptrace_has_cap()</title>
<updated>2020-01-18T12:51:39Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2020-01-15T13:42:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6b3ad6649a4c75504edeba242d3fd36b3096a57f'/>
<id>urn:sha1:6b3ad6649a4c75504edeba242d3fd36b3096a57f</id>
<content type='text'>
Commit 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
introduced the ability to opt out of audit messages for accesses to various
proc files since they are not violations of policy.  While doing so it
somehow switched the check from ns_capable() to
has_ns_capability{_noaudit}(). That means it switched from checking the
subjective credentials of the task to using the objective credentials. This
is wrong since. ptrace_has_cap() is currently only used in
ptrace_may_access() And is used to check whether the calling task (subject)
has the CAP_SYS_PTRACE capability in the provided user namespace to operate
on the target task (object). According to the cred.h comments this would
mean the subjective credentials of the calling task need to be used.
This switches ptrace_has_cap() to use security_capable(). Because we only
call ptrace_has_cap() in ptrace_may_access() and in there we already have a
stable reference to the calling task's creds under rcu_read_lock() there's
no need to go through another series of dereferences and rcu locking done
in ns_capable{_noaudit}().

As one example where this might be particularly problematic, Jann pointed
out that in combination with the upcoming IORING_OP_OPENAT feature, this
bug might allow unprivileged users to bypass the capability checks while
asynchronously opening files like /proc/*/mem, because the capability
checks for this would be performed against kernel credentials.

To illustrate on the former point about this being exploitable: When
io_uring creates a new context it records the subjective credentials of the
caller. Later on, when it starts to do work it creates a kernel thread and
registers a callback. The callback runs with kernel creds for
ktask-&gt;real_cred and ktask-&gt;cred. To prevent this from becoming a
full-blown 0-day io_uring will call override_cred() and override
ktask-&gt;cred with the subjective credentials of the creator of the io_uring
instance. With ptrace_has_cap() currently looking at ktask-&gt;real_cred this
override will be ineffective and the caller will be able to open arbitray
proc files as mentioned above.
Luckily, this is currently not exploitable but will turn into a 0-day once
IORING_OP_OPENAT{2} land in v5.6. Fix it now!

Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eric Paris &lt;eparis@redhat.com&gt;
Cc: stable@vger.kernel.org
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Reviewed-by: Jann Horn &lt;jannh@google.com&gt;
Fixes: 69f594a38967 ("ptrace: do not audit capability check when outputing /proc/pid/stat")
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
</content>
</entry>
<entry>
<title>ptrace: add PTRACE_GET_SYSCALL_INFO request</title>
<updated>2019-07-17T02:23:24Z</updated>
<author>
<name>Elvira Khabirova</name>
<email>lineprinter@altlinux.org</email>
</author>
<published>2019-07-16T23:29:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=201766a20e30f982ccfe36bebfad9602c3ff574a'/>
<id>urn:sha1:201766a20e30f982ccfe36bebfad9602c3ff574a</id>
<content type='text'>
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot
retrieve necessary information about syscalls.  Some examples include:

 * The notorious int-0x80-from-64-bit-task issue. See [1] for details.
   In short, if a 64-bit task performs a syscall through int 0x80, its
   tracer has no reliable means to find out that the syscall was, in
   fact, a compat syscall, and misidentifies it.

 * Syscall-enter-stop and syscall-exit-stop look the same for the
   tracer. Common practice is to keep track of the sequence of
   ptrace-stops in order not to mix the two syscall-stops up. But it is
   not as simple as it looks; for example, strace had a (just recently
   fixed) long-standing bug where attaching strace to a tracee that is
   performing the execve system call led to the tracer identifying the
   following syscall-exit-stop as syscall-enter-stop, which messed up
   all the state tracking.

 * Since the introduction of commit 84d77d3f06e7 ("ptrace: Don't allow
   accessing an undumpable mm"), both PTRACE_PEEKDATA and
   process_vm_readv become unavailable when the process dumpable flag is
   cleared. On such architectures as ia64 this results in all syscall
   arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee.  For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

ptrace(2) man page:

long ptrace(enum __ptrace_request request, pid_t pid,
            void *addr, void *data);
...
PTRACE_GET_SYSCALL_INFO
       Retrieve information about the syscall that caused the stop.
       The information is placed into the buffer pointed by "data"
       argument, which should be a pointer to a buffer of type
       "struct ptrace_syscall_info".
       The "addr" argument contains the size of the buffer pointed to
       by "data" argument (i.e., sizeof(struct ptrace_syscall_info)).
       The return value contains the number of bytes available
       to be written by the kernel.
       If the size of data to be written by the kernel exceeds the size
       specified by "addr" argument, the output is truncated.

[ldv@altlinux.org: selftests/seccomp/seccomp_bpf: update for PTRACE_GET_SYSCALL_INFO]
  Link: http://lkml.kernel.org/r/20190708182904.GA12332@altlinux.org
Link: http://lkml.kernel.org/r/20190510152842.GF28558@altlinux.org
Signed-off-by: Elvira Khabirova &lt;lineprinter@altlinux.org&gt;
Co-developed-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Signed-off-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Eugene Syromyatnikov &lt;esyr@redhat.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Greentime Hu &lt;greentime@andestech.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;	[parisc]
Cc: James E.J. Bottomley &lt;jejb@parisc-linux.org&gt;
Cc: James Hogan &lt;jhogan@kernel.org&gt;
Cc: kbuild test robot &lt;lkp@intel.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Paul Burton &lt;paul.burton@mips.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Cc: Vincent Chen &lt;deanbo422@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
