<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/seccomp.c, branch v5.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-01-15T17:43:12Z</updated>
<entry>
<title>seccomp: fix UAF in user-trap code</title>
<updated>2019-01-15T17:43:12Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2019-01-12T18:24:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a811dc61559e0c8003f1086c2a4dc8e4d5ae4cb8'/>
<id>urn:sha1:a811dc61559e0c8003f1086c2a4dc8e4d5ae4cb8</id>
<content type='text'>
On the failure path, we do an fput() of the listener fd if the filter fails
to install (e.g. because of a TSYNC race that's lost, or if the thread is
killed, etc.). fput() doesn't actually release the fd, it just ads it to a
work queue. Then the thread proceeds to free the filter, even though the
listener struct file has a reference to it.

To fix this, on the failure path let's set the private data to null, so we
know in -&gt;release() to ignore the filter.

Reported-by: syzbot+981c26489b2d1c6316ba@syzkaller.appspotmail.com
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: James Morris &lt;james.morris@microsoft.com&gt;
</content>
</entry>
<entry>
<title>seccomp: fix poor type promotion</title>
<updated>2018-12-14T00:49:01Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2018-12-13T02:46:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=319deec7db6c0aab276d2447f778e7cffed24c7c'/>
<id>urn:sha1:319deec7db6c0aab276d2447f778e7cffed24c7c</id>
<content type='text'>
sparse complains,

kernel/seccomp.c:1172:13: warning: incorrect type in assignment (different base types)
kernel/seccomp.c:1172:13:    expected restricted __poll_t [usertype] ret
kernel/seccomp.c:1172:13:    got int
kernel/seccomp.c:1173:13: warning: restricted __poll_t degrades to integer

Instead of assigning this to ret, since we don't use this anywhere, let's
just test it against 0 directly.

Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Reported-by: 0day robot &lt;lkp@intel.com&gt;
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: add a return code to trap to userspace</title>
<updated>2018-12-12T00:28:41Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2018-12-09T18:24:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6a21cc50f0c7f87dae5259f6cfefe024412313f6'/>
<id>urn:sha1:6a21cc50f0c7f87dae5259f6cfefe024412313f6</id>
<content type='text'>
This patch introduces a means for syscalls matched in seccomp to notify
some other task that a particular filter has been triggered.

The motivation for this is primarily for use with containers. For example,
if a container does an init_module(), we obviously don't want to load this
untrusted code, which may be compiled for the wrong version of the kernel
anyway. Instead, we could parse the module image, figure out which module
the container is trying to load and load it on the host.

As another example, containers cannot mount() in general since various
filesystems assume a trusted image. However, if an orchestrator knows that
e.g. a particular block device has not been exposed to a container for
writing, it want to allow the container to mount that block device (that
is, handle the mount for it).

This patch adds functionality that is already possible via at least two
other means that I know about, both of which involve ptrace(): first, one
could ptrace attach, and then iterate through syscalls via PTRACE_SYSCALL.
Unfortunately this is slow, so a faster version would be to install a
filter that does SECCOMP_RET_TRACE, which triggers a PTRACE_EVENT_SECCOMP.
Since ptrace allows only one tracer, if the container runtime is that
tracer, users inside the container (or outside) trying to debug it will not
be able to use ptrace, which is annoying. It also means that older
distributions based on Upstart cannot boot inside containers using ptrace,
since upstart itself uses ptrace to monitor services while starting.

The actual implementation of this is fairly small, although getting the
synchronization right was/is slightly complex.

Finally, it's worth noting that the classic seccomp TOCTOU of reading
memory data from the task still applies here, but can be avoided with
careful design of the userspace handler: if the userspace handler reads all
of the task memory that is necessary before applying its security policy,
the tracee's subsequent memory edits will not be read by the tracer.

Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
CC: Kees Cook &lt;keescook@chromium.org&gt;
CC: Andy Lutomirski &lt;luto@amacapital.net&gt;
CC: Oleg Nesterov &lt;oleg@redhat.com&gt;
CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
CC: "Serge E. Hallyn" &lt;serge@hallyn.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
CC: Christian Brauner &lt;christian@brauner.io&gt;
CC: Tyler Hicks &lt;tyhicks@canonical.com&gt;
CC: Akihiro Suda &lt;suda.akihiro@lab.ntt.co.jp&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: switch system call argument type to void *</title>
<updated>2018-12-12T00:28:41Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2018-12-09T18:24:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a5662e4d81c4d4b08140c625d0f3c50b15786252'/>
<id>urn:sha1:a5662e4d81c4d4b08140c625d0f3c50b15786252</id>
<content type='text'>
The const qualifier causes problems for any code that wants to write to the
third argument of the seccomp syscall, as we will do in a future patch in
this series.

The third argument to the seccomp syscall is documented as void *, so
rather than just dropping the const, let's switch everything to use void *
as well.

I believe this is safe because of 1. the documentation above, 2. there's no
real type information exported about syscalls anywhere besides the man
pages.

Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
CC: Kees Cook &lt;keescook@chromium.org&gt;
CC: Andy Lutomirski &lt;luto@amacapital.net&gt;
CC: Oleg Nesterov &lt;oleg@redhat.com&gt;
CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
CC: "Serge E. Hallyn" &lt;serge@hallyn.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
CC: Christian Brauner &lt;christian@brauner.io&gt;
CC: Tyler Hicks &lt;tyhicks@canonical.com&gt;
CC: Akihiro Suda &lt;suda.akihiro@lab.ntt.co.jp&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: hoist struct seccomp_data recalculation higher</title>
<updated>2018-12-12T00:28:40Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2018-12-09T18:24:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=db5113911abaa7eb20cf115d4339959c1aecea95'/>
<id>urn:sha1:db5113911abaa7eb20cf115d4339959c1aecea95</id>
<content type='text'>
In the next patch, we're going to use the sd pointer passed to
__seccomp_filter() as the data to pass to userspace. Except that in some
cases (__seccomp_filter(SECCOMP_RET_TRACE), emulate_vsyscall(), every time
seccomp is inovked on power, etc.) the sd pointer will be NULL in order to
force seccomp to recompute the register data. Previously this recomputation
happened one level lower, in seccomp_run_filters(); this patch just moves
it up a level higher to __seccomp_filter().

Thanks Oleg for spotting this.

Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
CC: Kees Cook &lt;keescook@chromium.org&gt;
CC: Andy Lutomirski &lt;luto@amacapital.net&gt;
CC: Oleg Nesterov &lt;oleg@redhat.com&gt;
CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
CC: "Serge E. Hallyn" &lt;serge@hallyn.com&gt;
Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
CC: Christian Brauner &lt;christian@brauner.io&gt;
CC: Tyler Hicks &lt;tyhicks@canonical.com&gt;
CC: Akihiro Suda &lt;suda.akihiro@lab.ntt.co.jp&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security</title>
<updated>2018-10-24T10:49:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-10-24T10:49:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=638820d8da8ededd6dc609beaef02d5396599c03'/>
<id>urn:sha1:638820d8da8ededd6dc609beaef02d5396599c03</id>
<content type='text'>
Pull security subsystem updates from James Morris:
 "In this patchset, there are a couple of minor updates, as well as some
  reworking of the LSM initialization code from Kees Cook (these prepare
  the way for ordered stackable LSMs, but are a valuable cleanup on
  their own)"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  LSM: Don't ignore initialization failures
  LSM: Provide init debugging infrastructure
  LSM: Record LSM name in struct lsm_info
  LSM: Convert security_initcall() into DEFINE_LSM()
  vmlinux.lds.h: Move LSM_TABLE into INIT_DATA
  LSM: Convert from initcall to struct lsm_info
  LSM: Remove initcall tracing
  LSM: Rename .security_initcall section to .lsm_info
  vmlinux.lds.h: Avoid copy/paste of security_init section
  LSM: Correctly announce start of LSM initialization
  security: fix LSM description location
  keys: Fix the use of the C++ keyword "private" in uapi/linux/keyctl.h
  seccomp: remove unnecessary unlikely()
  security: tomoyo: Fix obsolete function
  security/capabilities: remove check for -EINVAL
</content>
</entry>
<entry>
<title>signal: Distinguish between kernel_siginfo and siginfo</title>
<updated>2018-10-03T14:47:43Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2018-09-25T09:27:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ae7795bc6187a15ec51cf258abae656a625f9980'/>
<id>urn:sha1:ae7795bc6187a15ec51cf258abae656a625f9980</id>
<content type='text'>
Linus recently observed that if we did not worry about the padding
member in struct siginfo it is only about 48 bytes, and 48 bytes is
much nicer than 128 bytes for allocating on the stack and copying
around in the kernel.

The obvious thing of only adding the padding when userspace is
including siginfo.h won't work as there are sigframe definitions in
the kernel that embed struct siginfo.

So split siginfo in two; kernel_siginfo and siginfo.  Keeping the
traditional name for the userspace definition.  While the version that
is used internally to the kernel and ultimately will not be padded to
128 bytes is called kernel_siginfo.

The definition of struct kernel_siginfo I have put in include/signal_types.h

A set of buildtime checks has been added to verify the two structures have
the same field offsets.

To make it easy to verify the change kernel_siginfo retains the same
size as siginfo.  The reduction in size comes in a following change.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>seccomp: remove unnecessary unlikely()</title>
<updated>2018-09-06T20:29:59Z</updated>
<author>
<name>Igor Stoppa</name>
<email>igor.stoppa@gmail.com</email>
</author>
<published>2018-09-05T20:34:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0d42d73a37ff91028785e42a6bf12fc020a277c1'/>
<id>urn:sha1:0d42d73a37ff91028785e42a6bf12fc020a277c1</id>
<content type='text'>
WARN_ON() already contains an unlikely(), so it's not necessary to wrap it
into another.

Signed-off-by: Igor Stoppa &lt;igor.stoppa@huawei.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: linux-security-module@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: James Morris &lt;james.morris@microsoft.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'audit-pr-20180605' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit</title>
<updated>2018-06-06T23:34:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-06-06T23:34:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8b5c6a3a49d9ebc7dc288870b9c56c4f946035d8'/>
<id>urn:sha1:8b5c6a3a49d9ebc7dc288870b9c56c4f946035d8</id>
<content type='text'>
Pull audit updates from Paul Moore:
 "Another reasonable chunk of audit changes for v4.18, thirteen patches
  in total.

  The thirteen patches can mostly be broken down into one of four
  categories: general bug fixes, accessor functions for audit state
  stored in the task_struct, negative filter matches on executable
  names, and extending the (relatively) new seccomp logging knobs to the
  audit subsystem.

  The main driver for the accessor functions from Richard are the
  changes we're working on to associate audit events with containers,
  but I think they have some standalone value too so I figured it would
  be good to get them in now.

  The seccomp/audit patches from Tyler apply the seccomp logging
  improvements from a few releases ago to audit's seccomp logging;
  starting with this patchset the changes in
  /proc/sys/kernel/seccomp/actions_logged should apply to both the
  standard kernel logging and audit.

  As usual, everything passes the audit-testsuite and it happens to
  merge cleanly with your tree"

[ Heh, except it had trivial merge conflicts with the SELinux tree that
  also came in from Paul   - Linus ]

* tag 'audit-pr-20180605' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit:
  audit: Fix wrong task in comparison of session ID
  audit: use existing session info function
  audit: normalize loginuid read access
  audit: use new audit_context access funciton for seccomp_actions_logged
  audit: use inline function to set audit context
  audit: use inline function to get audit context
  audit: convert sessionid unset to a macro
  seccomp: Don't special case audited processes when logging
  seccomp: Audit attempts to modify the actions_logged sysctl
  seccomp: Configurable separator for the actions_logged string
  seccomp: Separate read and write code for actions_logged sysctl
  audit: allow not equal op for audit by executable
  audit: add syscall information to FEATURE_CHANGE records
</content>
</entry>
<entry>
<title>seccomp: Don't special case audited processes when logging</title>
<updated>2018-05-08T06:04:23Z</updated>
<author>
<name>Tyler Hicks</name>
<email>tyhicks@canonical.com</email>
</author>
<published>2018-05-04T01:08:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=326bee0286d7f6b0d780f5b75a35ea9fe489a802'/>
<id>urn:sha1:326bee0286d7f6b0d780f5b75a35ea9fe489a802</id>
<content type='text'>
Seccomp logging for "handled" actions such as RET_TRAP, RET_TRACE, or
RET_ERRNO can be very noisy for processes that are being audited. This
patch modifies the seccomp logging behavior to treat processes that are
being inspected via the audit subsystem the same as processes that
aren't under inspection. Handled actions will no longer be logged just
because the process is being inspected. Since v4.14, applications have
the ability to request logging of handled actions by using the
SECCOMP_FILTER_FLAG_LOG flag when loading seccomp filters.

With this patch, the logic for deciding if an action will be logged is:

  if action == RET_ALLOW:
    do not log
  else if action not in actions_logged:
    do not log
  else if action == RET_KILL:
    log
  else if action == RET_LOG:
    log
  else if filter-requests-logging:
    log
  else:
    do not log

Reported-by: Steve Grubb &lt;sgrubb@redhat.com&gt;
Signed-off-by: Tyler Hicks &lt;tyhicks@canonical.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
</feed>
