| Age | Commit message (Collapse) | Author | Lines |
|
The BUILD_BUG_ON check in domain_is_scoped() and
unmask_scoped_access() should check that the loop that counts down
client_layer finishes. We therefore check that the numbers
LANDLOCK_MAX_NUM_LAYERS-1 and -1 are both representable by that
integer. If they are representable, the numbers in between are
representable too, and the loop finishes.
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260327164838.38231-6-gnoack3000@gmail.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
* Add a new access right LANDLOCK_ACCESS_FS_RESOLVE_UNIX, which
controls the lookup operations for named UNIX domain sockets. The
resolution happens during connect() and sendmsg() (depending on
socket type).
* Change access_mask_t from u16 to u32 (see below)
* Hook into the path lookup in unix_find_bsd() in af_unix.c, using a
LSM hook. Make policy decisions based on the new access rights
* Increment the Landlock ABI version.
* Minor test adaptations to keep the tests working.
* Document the design rationale for scoped access rights,
and cross-reference it from the header documentation.
With this access right, access is granted if either of the following
conditions is met:
* The target socket's filesystem path was allow-listed using a
LANDLOCK_RULE_PATH_BENEATH rule, *or*:
* The target socket was created in the same Landlock domain in which
LANDLOCK_ACCESS_FS_RESOLVE_UNIX was restricted.
In case of a denial, connect() and sendmsg() return EACCES, which is
the same error as it is returned if the user does not have the write
bit in the traditional UNIX file system permissions of that file.
The access_mask_t type grows from u16 to u32 to make space for the new
access right. This also doubles the size of struct layer_access_masks
from 32 byte to 64 byte. To avoid memory layout inconsistencies between
architectures (especially m68k), pack and align struct access_masks [2].
Document the (possible future) interaction between scoped flags and
other access rights in struct landlock_ruleset_attr, and summarize the
rationale, as discussed in code review leading up to [3].
This feature was created with substantial discussion and input from
Justin Suess, Tingmao Wang and Mickaël Salaün.
Cc: Tingmao Wang <m@maowtm.org>
Cc: Justin Suess <utilityemal77@gmail.com>
Cc: Kuniyuki Iwashima <kuniyu@google.com>
Suggested-by: Jann Horn <jannh@google.com>
Link[1]: https://github.com/landlock-lsm/linux/issues/36
Link[2]: https://lore.kernel.org/all/20260401.Re1Eesu1Yaij@digikod.net/
Link[3]: https://lore.kernel.org/all/20260205.8531e4005118@gnoack.org/
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20260327164838.38231-5-gnoack3000@gmail.com
[mic: Fix kernel-doc formatting, pack and align access_masks]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This is equivalent, but expresses the intent a bit clearer.
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260327164838.38231-3-gnoack3000@gmail.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
The insert_rule() and create_rule() functions take a
pointer-to-flexible-array parameter declared as:
const struct landlock_layer (*const layers)[]
The kernel-doc parser cannot handle a qualifier between * and the
parameter name in this syntax, producing spurious "Invalid param" and
"not described" warnings.
Remove the const qualifier of the "layers" argument to avoid this
parsing issue.
Cc: Günther Noack <gnoack@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260310172004.1839864-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Fix comment formatting in tsync.c to fit in 80 columns.
Cc: Günther Noack <gnoack@google.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260304193134.250495-4-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
The canonical kernel-doc form is "Return:" (singular, without trailing
"s"). Normalize all existing "Returns:" occurrences across the Landlock
source tree to the canonical form.
Also fix capitalization for consistency. Balance descriptions to
describe all possible returned values.
Consolidate bullet-point return descriptions into inline text for
functions with simple two-value or three-value returns for consistency.
Cc: Günther Noack <gnoack@google.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260304193134.250495-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
The kernel-doc -Wreturn check warns about functions with documentation
comments that lack a "Return:" section. Add "Return:" documentation to
all functions missing it so that kernel-doc -Wreturn passes cleanly.
Convert existing function descriptions into a formal "Return:" section.
Also fix the inaccurate return documentation for
landlock_merge_ruleset() which claimed to return @parent directly, and
document the previously missing ERR_PTR() error return path. Document
the ABI version and errata return paths for landlock_create_ruleset()
which were previously only implied by the prose.
Cc: Günther Noack <gnoack@google.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260304193134.250495-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
LANDLOCK_RESTRICT_SELF_TSYNC does not allow
LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF with ruleset_fd=-1, preventing
a multithreaded process from atomically propagating subdomain log muting
to all threads without creating a domain layer. Relax the fd=-1
condition to accept TSYNC alongside LOG_SUBDOMAINS_OFF, and update the
documentation accordingly.
Add flag validation tests for all TSYNC combinations with ruleset_fd=-1,
and audit tests verifying both transition directions: muting via TSYNC
(logged to not logged) and override via TSYNC (not logged to logged).
Cc: Günther Noack <gnoack@google.com>
Cc: stable@vger.kernel.org
Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()")
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260407164107.2012589-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
hook_cred_transfer() only copies the Landlock security blob when the
source credential has a domain. This is inconsistent with
landlock_restrict_self() which can set LOG_SUBDOMAINS_OFF on a
credential without creating a domain (via the ruleset_fd=-1 path): the
field is committed but not preserved across fork() because the child's
prepare_creds() calls hook_cred_transfer() which skips the copy when
domain is NULL.
This breaks the documented use case where a process mutes subdomain logs
before forking sandboxed children: the children lose the muting and
their domains produce unexpected audit records.
Fix this by unconditionally copying the Landlock credential blob.
Cc: Günther Noack <gnoack@google.com>
Cc: Jann Horn <jannh@google.com>
Cc: stable@vger.kernel.org
Fixes: ead9079f7569 ("landlock: Add LANDLOCK_RESTRICT_SELF_LOG_SUBDOMAINS_OFF")
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260407164107.2012589-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
In landlock_restrict_sibling_threads(), when the calling thread is
interrupted while waiting for sibling threads to prepare, it executes
a recovery path.
Previously, this path included a wait_for_completion() call on
all_prepared to prevent a Use-After-Free of the local shared_ctx.
However, this wait is redundant. Exiting the main do-while loop
already leads to a bottom cleanup section that unconditionally waits
for all_finished. Therefore, replacing the wait with a simple break
is safe, prevents UAF, and correctly unblocks the remaining task_works.
Clean up the error path by breaking the loop and updating the
surrounding comments to accurately reflect the state machine.
Suggested-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
Tested-by: Günther Noack <gnoack3000@gmail.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260306021651.744723-3-dingyihan@uniontech.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
syzbot found a deadlock in landlock_restrict_sibling_threads().
When multiple threads concurrently call landlock_restrict_self() with
sibling thread restriction enabled, they can deadlock by mutually
queueing task_works on each other and then blocking in kernel space
(waiting for the other to finish).
Fix this by serializing the TSYNC operations within the same process
using the exec_update_lock. This prevents concurrent invocations
from deadlocking.
We use down_write_trylock() and restart the syscall if the lock
cannot be acquired immediately. This ensures that if a thread fails
to get the lock, it will return to userspace, allowing it to process
any pending TSYNC task_works from the lock holder, and then
transparently restart the syscall.
Fixes: 42fc7e6543f6 ("landlock: Multithreading support for landlock_restrict_self()")
Reported-by: syzbot+7ea2f5e9dfd468201817@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=7ea2f5e9dfd468201817
Suggested-by: Günther Noack <gnoack3000@gmail.com>
Suggested-by: Tingmao Wang <m@maowtm.org>
Tested-by: Justin Suess <utilityemal77@gmail.com>
Signed-off-by: Yihan Ding <dingyihan@uniontech.com>
Tested-by: Günther Noack <gnoack3000@gmail.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260306021651.744723-2-dingyihan@uniontech.com
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Constify pointers when it makes sense.
Consistently use size_t for loops, especially to match works->size type.
Add new lines to improve readability.
Cc: Jann Horn <jannh@google.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20260217122341.2359582-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
If task_work_add() failed, ctx->task is put but the tsync_works struct
is not reset to its previous state. The first consequence is that the
kernel allocates memory for dying threads, which could lead to
user-accounted memory exhaustion (not very useful nor specific to this
case). The second consequence is that task_work_cancel(), called by
cancel_tsync_works(), can dereference a NULL task pointer.
Fix this issues by keeping a consistent works->size wrt the added task
work. This is done in a new tsync_works_trim() helper which also cleans
up the shared_ctx and work fields.
As a safeguard, add a pointer check to cancel_tsync_works() and update
tsync_works_release() accordingly.
Cc: Jann Horn <jannh@google.com>
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20260217122341.2359582-1-mic@digikod.net
[mic: Replace memset() with compound literal]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Auto-format with clang-format -i security/landlock/*.[ch]
Cc: Günther Noack <gnoack@google.com>
Cc: Kees Cook <kees@kernel.org>
Fixes: 69050f8d6d07 ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types")
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260303173632.88040-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This is the exact same thing as the 'alloc_obj()' version, only much
smaller because there are a lot fewer users of the *alloc_flex()
interface.
As with alloc_obj() version, this was done entirely with mindless brute
force, using the same script, except using 'flex' in the pattern rather
than 'objs*'.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
The layer masks data structure tracks the requested but unfulfilled
access rights during an operation's security check. It stores one bit
for each combination of access right and layer index. If the bit is
set, that access right is not granted (yet) in the given layer and we
have to traverse the path further upwards to grant it.
Previously, the layer masks were stored as arrays mapping from access
right indices to layer_mask_t. The layer_mask_t value then indicates
all layers in which the given access right is still (tentatively)
denied.
This patch introduces struct layer_access_masks instead: This struct
contains an array with the access_mask_t of each (tentatively) denied
access right in that layer.
The hypothesis of this patch is that this simplifies the code enough
so that the resulting code will run faster:
* We can use bitwise operations in multiple places where we previously
looped over bits individually with macros. (Should require less
branch speculation and lends itself to better loop unrolling.)
* Code is ~75 lines smaller.
Other noteworthy changes:
* In no_more_access(), call a new helper function may_refer(), which
only solves the asymmetric case. Previously, the code interleaved
the checks for the two symmetric cases in RENAME_EXCHANGE. It feels
that the code is clearer when renames without RENAME_EXCHANGE are
more obviously the normal case.
Tradeoffs:
This change improves performance, at a slight size increase to the
layer masks data structure.
This fixes the size of the data structure at 32 bytes for all types of
access rights. (64, once we introduce a 17th filesystem access right).
For filesystem access rights, at the moment, the data structure has
the same size as before, but once we introduce the 17th filesystem
access right, it will double in size (from 32 to 64 bytes), as
access_mask_t grows from 16 to 32 bit [1].
Link: https://lore.kernel.org/all/20260120.haeCh4li9Vae@digikod.net/ [1]
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260206151154.97915-5-gnoack3000@gmail.com
[mic: Cosmetic fixes, moved struct layer_access_masks definition]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This helper function checks whether an access_mask_t has a subset of the
bits enabled than another one. This expresses the intent a bit smoother
in the code and does not cost us anything when it gets inlined.
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260206151154.97915-4-gnoack3000@gmail.com
[mic: Improve subject]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Add errata section with code examples for querying errata and a warning
that most applications should not check errata. Use kernel-doc directives
to include errata descriptions from the header files instead of manual
links.
Also enhance existing DOC sections in security/landlock/errata/abi-*.h
files with Impact sections, and update the code comment in syscalls.c
to remind developers to update errata documentation when applicable.
This addresses the gap where the kernel implements errata tracking
but provides no user-facing documentation on how to use it, while
improving the existing technical documentation in-place rather than
duplicating it.
Signed-off-by: Samasth Norway Ananda <samasth.norway.ananda@oracle.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20260128031814.2945394-3-samasth.norway.ananda@oracle.com
[mic: Cosmetic fix]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Move the socket type check earlier, so that we will later be able to add
elseifs for other types. Ordering of checks (socket is of a type we
enforce restrictions on) / (current creds have Landlock restrictions)
should not change anything.
Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
Link: https://lore.kernel.org/r/20251212163704.142301-3-matthieu@buffet.re
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Introduce the LANDLOCK_RESTRICT_SELF_TSYNC flag. With this flag, a
given Landlock ruleset is applied to all threads of the calling
process, instead of only the current one.
Without this flag, multithreaded userspace programs currently resort
to using the nptl(7)/libpsx hack for multithreaded policy enforcement,
which is also used by libcap and for setuid(2). Using this
userspace-based scheme, the threads of a process enforce the same
Landlock policy, but the resulting Landlock domains are still
separate. The domains being separate causes multiple problems:
* When using Landlock's "scoped" access rights, the domain identity is
used to determine whether an operation is permitted. As a result,
when using LANLDOCK_SCOPE_SIGNAL, signaling between sibling threads
stops working. This is a problem for programming languages and
frameworks which are inherently multithreaded (e.g. Go).
* In audit logging, the domains of separate threads in a process will
get logged with different domain IDs, even when they are based on
the same ruleset FD, which might confuse users.
Cc: Andrew G. Morgan <morgan@kernel.org>
Cc: John Johansen <john.johansen@canonical.com>
Cc: Paul Moore <paul@paul-moore.com>
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/20251127115136.3064948-2-gnoack@google.com
[mic: Fix restrict_self_flags test, clean up Makefile, allign comments,
reduce local variable scope, add missing includes]
Closes: https://github.com/landlock-lsm/linux/issues/2
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Currently it is not obvious what "scoped" mean, and the fact that the
function returns true when access should be denied is slightly surprising
and in need of documentation.
Cc: Tahera Fahimi <fahimitahera@gmail.com>
Signed-off-by: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/06393bc18aee5bc278df5ef31c64a05b742ebc10.1766885035.git.m@maowtm.org
[mic: Fix formatting and improve consistency]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Until now, each landlock_request struct were allocated on the stack, even
if not really used, because is_access_to_paths_allowed() unconditionally
modified the passed references. Even if the changed landlock_request
variables are not used, the compiler is not smart enough to detect this
case.
To avoid this issue, explicitly disable the related code when
CONFIG_AUDIT is not set, which enables elision of log_request_parent*
and associated caller's stack variables thanks to dead code elimination.
This makes it possible to reduce the stack frame by 32 bytes for the
path_link and path_rename hooks, and by 20 bytes for most other
filesystem hooks.
Here is a summary of scripts/stackdelta before and after this change
when CONFIG_AUDIT is disabled:
current_check_refer_path 560 320 -240
current_check_access_path 328 184 -144
hook_file_open 328 184 -144
is_access_to_paths_allowed 376 360 -16
Also, add extra pointer checks to be more future-proof.
Cc: Günther Noack <gnoack@google.com>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/eb86863b-53b0-460b-b223-84dd31d765b9@maowtm.org
Fixes: 2fc80c69df82 ("landlock: Log file-related denials")
Link: https://lore.kernel.org/r/20251219142302.744917-2-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
[mic: Improve stack usage measurement accuracy with scripts/stackdelta]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Cc: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20251219193855.825889-4-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Make variable's scope minimal in hook_ptrace_access_check().
Cc: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20251219193855.825889-3-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Improve description about scoped signal handling.
Reported-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20251219193855.825889-2-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Remove useless audit.h include.
Cc: Günther Noack <gnoack@google.com>
Fixes: 33e65b0d3add ("landlock: Add AUDIT_LANDLOCK_ACCESS and log ptrace denials")
Link: https://lore.kernel.org/r/20251219193855.825889-1-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
I think, based on my best understanding, that this type is likely a typo
(even though in the end both are u16)
Signed-off-by: Tingmao Wang <m@maowtm.org>
Fixes: 2fc80c69df82 ("landlock: Log file-related denials")
Reviewed-by: Günther Noack <gnoack@google.com>
Link: https://lore.kernel.org/r/7339ad7b47f998affd84ca629a334a71f913616d.1765040503.git.m@maowtm.org
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
current_check_access_socket() treats AF_UNSPEC addresses as
AF_INET ones, and only later adds special case handling to
allow connect(AF_UNSPEC), and on IPv4 sockets
bind(AF_UNSPEC+INADDR_ANY).
This would be fine except AF_UNSPEC addresses can be as
short as a bare AF_UNSPEC sa_family_t field, and nothing
more. The AF_INET code path incorrectly enforces a length of
sizeof(struct sockaddr_in) instead.
Move AF_UNSPEC edge case handling up inside the switch-case,
before the address is (potentially incorrectly) treated as
AF_INET.
Fixes: fff69fb03dde ("landlock: Support network rules with TCP bind and connect")
Signed-off-by: Matthieu Buffet <matthieu@buffet.re>
Link: https://lore.kernel.org/r/20251027190726.626244-4-matthieu@buffet.re
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Format with clang-format -i security/landlock/*.[ch]
Cc: Christian Brauner <brauner@kernel.org>
Cc: Günther Noack <gnoack3000@gmail.com>
Cc: Mateusz Guzik <mjguzik@gmail.com>
Fixes: b4dbfd8653b3 ("Coccinelle-based conversion to use ->i_state accessors")
Link: https://lore.kernel.org/r/20251219193855.825889-5-mic@digikod.net
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock updates from Mickaël Salaün:
"This mainly fixes handling of disconnected directories and adds new
tests"
* tag 'landlock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
selftests/landlock: Add disconnected leafs and branch test suites
selftests/landlock: Add tests for access through disconnected paths
landlock: Improve variable scope
landlock: Fix handling of disconnected directories
selftests/landlock: Fix makefile header list
landlock: Make docs in cred.h and domain.h visible
landlock: Minor comments improvements
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm
Pull LSM updates from Paul Moore:
- Rework the LSM initialization code
What started as a "quick" patch to enable a notification event once
all of the individual LSMs were initialized, snowballed a bit into a
30+ patch patchset when everything was done. Most of the patches, and
diffstat, is due to splitting out the initialization code into
security/lsm_init.c and cleaning up some of the mess that was there.
While not strictly necessary, it does cleanup the code signficantly,
and hopefully makes the upkeep a bit easier in the future.
Aside from the new LSM_STARTED_ALL notification, these changes also
ensure that individual LSM initcalls are only called when the LSM is
enabled at boot time. There should be a minor reduction in boot times
for those who build multiple LSMs into their kernels, but only enable
a subset at boot.
It is worth mentioning that nothing at present makes use of the
LSM_STARTED_ALL notification, but there is work in progress which is
dependent upon LSM_STARTED_ALL.
- Make better use of the seq_put*() helpers in device_cgroup
* tag 'lsm-pr-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (36 commits)
lsm: use unrcu_pointer() for current->cred in security_init()
device_cgroup: Refactor devcgroup_seq_show to use seq_put* helpers
lsm: add a LSM_STARTED_ALL notification event
lsm: consolidate all of the LSM framework initcalls
selinux: move initcalls to the LSM framework
ima,evm: move initcalls to the LSM framework
lockdown: move initcalls to the LSM framework
apparmor: move initcalls to the LSM framework
safesetid: move initcalls to the LSM framework
tomoyo: move initcalls to the LSM framework
smack: move initcalls to the LSM framework
ipe: move initcalls to the LSM framework
loadpin: move initcalls to the LSM framework
lsm: introduce an initcall mechanism into the LSM framework
lsm: group lsm_order_parse() with the other lsm_order_*() functions
lsm: output available LSMs when debugging
lsm: cleanup the debug and console output in lsm_init.c
lsm: add/tweak function header comment blocks in lsm_init.c
lsm: fold lsm_init_ordered() into security_init()
lsm: cleanup initialize_lsm() and rename to lsm_init_single()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs inode updates from Christian Brauner:
"Features:
- Hide inode->i_state behind accessors. Open-coded accesses prevent
asserting they are done correctly. One obvious aspect is locking,
but significantly more can be checked. For example it can be
detected when the code is clearing flags which are already missing,
or is setting flags when it is illegal (e.g., I_FREEING when
->i_count > 0)
- Provide accessors for ->i_state, converts all filesystems using
coccinelle and manual conversions (btrfs, ceph, smb, f2fs, gfs2,
overlayfs, nilfs2, xfs), and makes plain ->i_state access fail to
compile
- Rework I_NEW handling to operate without fences, simplifying the
code after the accessor infrastructure is in place
Cleanups:
- Move wait_on_inode() from writeback.h to fs.h
- Spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
for clarity
- Cosmetic fixes to LRU handling
- Push list presence check into inode_io_list_del()
- Touch up predicts in __d_lookup_rcu()
- ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
- Assert on ->i_count in iput_final()
- Assert ->i_lock held in __iget()
Fixes:
- Add missing fences to I_NEW handling"
* tag 'vfs-6.19-rc1.inode' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (22 commits)
dcache: touch up predicts in __d_lookup_rcu()
fs: push list presence check into inode_io_list_del()
fs: cosmetic fixes to lru handling
fs: rework I_NEW handling to operate without fences
fs: make plain ->i_state access fail to compile
xfs: use the new ->i_state accessors
nilfs2: use the new ->i_state accessors
overlayfs: use the new ->i_state accessors
gfs2: use the new ->i_state accessors
f2fs: use the new ->i_state accessors
smb: use the new ->i_state accessors
ceph: use the new ->i_state accessors
btrfs: use the new ->i_state accessors
Manual conversion to use ->i_state accessors of all places not covered by coccinelle
Coccinelle-based conversion to use ->i_state accessors
fs: provide accessors for ->i_state
fs: spell out fenced ->i_state accesses with explicit smp_wmb/smp_rmb
fs: move wait_on_inode() from writeback.h to fs.h
fs: add missing fences to I_NEW handling
ocfs2: retire ocfs2_drop_inode() and I_WILL_FREE usage
...
|
|
This is now possible thanks to the disconnected directory fix.
Cc: Günther Noack <gnoack@google.com>
Cc: Song Liu <song@kernel.org>
Cc: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/20251128172200.760753-3-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Disconnected files or directories can appear when they are visible and
opened from a bind mount, but have been renamed or moved from the source
of the bind mount in a way that makes them inaccessible from the mount
point (i.e. out of scope).
Previously, access rights tied to files or directories opened through a
disconnected directory were collected by walking the related hierarchy
down to the root of the filesystem, without taking into account the
mount point because it couldn't be found. This could lead to
inconsistent access results, potential access right widening, and
hard-to-debug renames, especially since such paths cannot be printed.
For a sandboxed task to create a disconnected directory, it needs to
have write access (i.e. FS_MAKE_REG, FS_REMOVE_FILE, and FS_REFER) to
the underlying source of the bind mount, and read access to the related
mount point. Because a sandboxed task cannot acquire more access
rights than those defined by its Landlock domain, this could lead to
inconsistent access rights due to missing permissions that should be
inherited from the mount point hierarchy, while inheriting permissions
from the filesystem hierarchy hidden by this mount point instead.
Landlock now handles files and directories opened from disconnected
directories by taking into account the filesystem hierarchy when the
mount point is not found in the hierarchy walk, and also always taking
into account the mount point from which these disconnected directories
were opened. This ensures that a rename is not allowed if it would
widen access rights [1].
The rationale is that, even if disconnected hierarchies might not be
visible or accessible to a sandboxed task, relying on the collected
access rights from them improves the guarantee that access rights will
not be widened during a rename because of the access right comparison
between the source and the destination (see LANDLOCK_ACCESS_FS_REFER).
It may look like this would grant more access on disconnected files and
directories, but the security policies are always enforced for all the
evaluated hierarchies. This new behavior should be less surprising to
users and safer from an access control perspective.
Remove a wrong WARN_ON_ONCE() canary in collect_domain_accesses() and
fix the related comment.
Because opened files have their access rights stored in the related file
security properties, there is no impact for disconnected or unlinked
files.
Cc: Christian Brauner <brauner@kernel.org>
Cc: Günther Noack <gnoack@google.com>
Cc: Song Liu <song@kernel.org>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org
Closes: https://lore.kernel.org/r/09b24128f86973a6022e6aa8338945fcfb9a33e4.1749925391.git.m@maowtm.org
Fixes: b91c3e4ea756 ("landlock: Add support for file reparenting with LANDLOCK_ACCESS_FS_REFER")
Fixes: cb2c7d1a1776 ("landlock: Support filesystem access-control")
Link: https://lore.kernel.org/r/b0f46246-f2c5-42ca-93ce-0d629702a987@maowtm.org [1]
Reviewed-by: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/20251128172200.760753-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
This patch contains some small comment changes. The first three
comments for ruleset.c, I sort of made along the way while working on /
trying to understand Landlock, and the one from ruleset.h was from the
hashtable patch but extracted here. In fs.c, one comment which I found
would have been helpful to me when reading this.
Signed-off-by: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/20250602134150.67189-1-m@maowtm.org
Link: https://lore.kernel.org/r/20297185fd71ffbb5ce4fec14b38e5444c719c96.1748379182.git.m@maowtm.org
[mic: Squash patches with updated description, cosmetic fixes]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
At this point it is guaranteed this is not the last reference.
However, a recent addition of might_sleep() at top of iput() started
generating false-positives as it was executing for all values.
Remedy the problem by using the newly introduced iput_not_last().
Reported-by: syzbot+12479ae15958fc3f54ec@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68d32659.a70a0220.4f78.0012.GAE@google.com/
Fixes: 2ef435a872ab ("fs: add might_sleep() annotation to iput() and more")
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20251105212025.807549-2-mjguzik@gmail.com
Reviewed-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Reduce the duplication between the lsm_id struct and the DEFINE_LSM()
definition by linking the lsm_id struct directly into the individual
LSM's DEFINE_LSM() instance.
Linking the lsm_id into the LSM definition also allows us to simplify
the security_add_hooks() function by removing the code which populates
the lsm_idlist[] array and moving it into the normal LSM startup code
where the LSM list is parsed and the individual LSMs are enabled,
making for a cleaner implementation with less overhead at boot.
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: John Johansen <john.johansen@canonical.com>
Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>
|
|
All places were patched by coccinelle with the default expecting that
->i_lock is held, afterwards entries got fixed up by hand to use
unlocked variants as needed.
The script:
@@
expression inode, flags;
@@
- inode->i_state & flags
+ inode_state_read(inode) & flags
@@
expression inode, flags;
@@
- inode->i_state &= ~flags
+ inode_state_clear(inode, flags)
@@
expression inode, flag1, flag2;
@@
- inode->i_state &= ~flag1 & ~flag2
+ inode_state_clear(inode, flag1 | flag2)
@@
expression inode, flags;
@@
- inode->i_state |= flags
+ inode_state_set(inode, flags)
@@
expression inode, flags;
@@
- inode->i_state = flags
+ inode_state_assign(inode, flags)
@@
expression inode, flags;
@@
- flags = inode->i_state
+ flags = inode_state_read(inode)
@@
expression inode, flags;
@@
- READ_ONCE(inode->i_state) & flags
+ inode_state_read(inode) & flags
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Instead of doing direct access to ->i_count, add a helper to handle
this. This will make it easier to convert i_count to a refcount later.
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Link: https://lore.kernel.org/9bc62a84c6b9d6337781203f60837bd98fbc4a96.1756222464.git.josef@toxicpanda.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux
Pull landlock update from Mickaël Salaün:
"Fix test issues, improve build compatibility, and add new tests"
* tag 'landlock-6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
landlock: Fix cosmetic change
samples/landlock: Fix building on musl libc
landlock: Fix warning from KUnit tests
selftests/landlock: Add test to check rule tied to covered mount point
selftests/landlock: Fix build of audit_test
selftests/landlock: Fix readlink check
|
|
This line removal should not be there and it makes it more difficult to
backport the following patch.
Cc: Günther Noack <gnoack@google.com>
Cc: Konstantin Meskhidze <konstantin.meskhidze@huawei.com>
Fixes: 7a11275c3787 ("landlock: Refactor layer helpers")
Link: https://lore.kernel.org/r/20250719104204.545188-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
get_id_range() expects a positive value as first argument but
get_random_u8() can return 0. Fix this by clamping it.
Validated by running the test in a for loop for 1000 times.
Note that MAX() is wrong as it is only supposed to be used for
constants, but max() is good here.
[..] ok 9 test_range2_rand1
[..] ok 10 test_range2_rand2
[..] ok 11 test_range2_rand15
[..] ------------[ cut here ]------------
[..] WARNING: CPU: 6 PID: 104 at security/landlock/id.c:99 test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1))
[..] Modules linked in:
[..] CPU: 6 UID: 0 PID: 104 Comm: kunit_try_catch Tainted: G N 6.16.0-rc1-dev-00001-g314a2f98b65f #1 PREEMPT(undef)
[..] Tainted: [N]=TEST
[..] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
[..] RIP: 0010:test_range2_rand16 (security/landlock/id.c:99 (discriminator 1) security/landlock/id.c:234 (discriminator 1))
[..] Code: 49 c7 c0 10 70 30 82 4c 89 ff 48 c7 c6 a0 63 1e 83 49 c7 45 a0 e0 63 1e 83 e8 3f 95 17 00 e9 1f ff ff ff 0f 0b e9 df fd ff ff <0f> 0b ba 01 00 00 00 e9 68 fe ff ff 49 89 45 a8 49 8d 4d a0 45 31
[..] RSP: 0000:ffff888104eb7c78 EFLAGS: 00010246
[..] RAX: 0000000000000000 RBX: 000000000870822c RCX: 0000000000000000
^^^^^^^^^^^^^^^^
[..]
[..] Call Trace:
[..]
[..] ---[ end trace 0000000000000000 ]---
[..] ok 12 test_range2_rand16
[..] # landlock_id: pass:12 fail:0 skip:0 total:12
[..] # Totals: pass:12 fail:0 skip:0 total:12
[..] ok 1 landlock_id
Fixes: d9d2a68ed44b ("landlock: Add unique ID generator")
Signed-off-by: Tingmao Wang <m@maowtm.org>
Link: https://lore.kernel.org/r/73e28efc5b8cc394608b99d5bc2596ca917d7c4a.1750003733.git.m@maowtm.org
[mic: Minor cosmetic improvements]
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Reviewed-by: Christian Brauner <brauner@kernel.org>
Acked-by: Mickaël Salaün <mic@digikod.net>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Use the BIT() and BIT_ULL() macros in the new audit code instead of
explicit shifts to improve readability. Use bitmask instead of modulo
operation to simplify code.
Add test_range1_rand15() and test_range2_rand15() KUnit tests to improve
get_id_range() coverage.
Signed-off-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250512093732.1408485-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
A KUnit test checking boundaries triggers a canary warning, which may be
disturbing. Let's remove this test for now. Hopefully, KUnit will soon
get support for suppressing warning backtraces [1].
Cc: Alessandro Carminati <acarmina@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Günther Noack <gnoack@google.com>
Reported-by: Tingmao Wang <m@maowtm.org>
Closes: https://lore.kernel.org/r/20250327213807.12964-1-m@maowtm.org
Link: https://lore.kernel.org/r/20250425193249.78b45d2589575c15f483c3d8@linux-foundation.org [1]
Link: https://lore.kernel.org/r/20250503065359.3625407-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Fix, deduplicate, and improve rendering of landlock_restrict_self(2)'s
flags documentation.
The flags are now rendered like the syscall's parameters and
description.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250416154716.1799902-2-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
Move and fix the flags documentation, and improve formatting.
It makes more sense and it eases maintenance to document syscall flags
in landlock.h, where they are defined. This is already the case for
landlock_restrict_self(2)'s flags.
The flags are now rendered like the syscall's parameters and
description.
Cc: Günther Noack <gnoack@google.com>
Cc: Paul Moore <paul@paul-moore.com>
Link: https://lore.kernel.org/r/20250416154716.1799902-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
As for other Audit's "pid" fields, Landlock should use the task's TGID
instead of its TID. Fix this issue by keeping a reference to the TGID
of the domain creator.
Existing tests already check for the PID but only with the thread group
leader, so always the TGID. A following patch adds dedicated tests for
non-leader thread.
Remove the current_real_cred() check which does not make sense because
we only reference a struct pid, whereas a previous version did reference
a struct cred instead.
Cc: Christian Brauner <brauner@kernel.org>
Cc: Paul Moore <paul@paul-moore.com>
Reviewed-by: Günther Noack <gnoack3000@gmail.com>
Link: https://lore.kernel.org/r/20250410171725.1265860-1-mic@digikod.net
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|
|
landlock_put_hierarchy() can be called when an error occurs in
landlock_merge_ruleset() due to insufficient memory. In this case, the
domain's audit details might not have been allocated yet, which would
cause landlock_free_hierarchy_details() to print a warning (but still
safely handle this case).
We could keep the WARN_ON_ONCE(!hierarchy) but it's not worth it for
this kind of function, so let's remove it entirely.
Cc: Paul Moore <paul@paul-moore.com>
Reported-by: syzbot+8bca99e91de7e060e4ea@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20250331104709.897062-1-mic@digikod.net
Reviewed-by: Günther Noack <gnoack@google.com>
Signed-off-by: Mickaël Salaün <mic@digikod.net>
|