<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/rseq.c, branch v6.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-02-21T13:21:02Z</updated>
<entry>
<title>rseq: Fix rseq registration with CONFIG_DEBUG_RSEQ</title>
<updated>2025-02-21T13:21:02Z</updated>
<author>
<name>Michael Jeanson</name>
<email>mjeanson@efficios.com</email>
</author>
<published>2025-02-19T20:53:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dc0a241ceaf3b7df6f1a7658b020c92682b75bfc'/>
<id>urn:sha1:dc0a241ceaf3b7df6f1a7658b020c92682b75bfc</id>
<content type='text'>
With CONFIG_DEBUG_RSEQ=y, at rseq registration the read-only fields are
copied from user-space, if this copy fails the syscall returns -EFAULT
and the registration should not be activated - but it erroneously is.

Move the activation of the registration after the copy of the fields to
fix this bug.

Fixes: 7d5265ffcd8b ("rseq: Validate read-only fields under DEBUG_RSEQ config")
Signed-off-by: Michael Jeanson &lt;mjeanson@efficios.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Link: https://lore.kernel.org/r/20250219205330.324770-1-mjeanson@efficios.com
</content>
</entry>
<entry>
<title>rseq: Fix rseq unregistration regression</title>
<updated>2025-01-21T07:10:51Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2025-01-16T20:59:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=40724ecafccb1fb62b66264854e8c3ad394c8f3d'/>
<id>urn:sha1:40724ecafccb1fb62b66264854e8c3ad394c8f3d</id>
<content type='text'>
A logic inversion in rseq_reset_rseq_cpu_node_id() causes the rseq
unregistration to fail when rseq_validate_ro_fields() succeeds rather
than the opposite.

This affects both CONFIG_DEBUG_RSEQ=y and CONFIG_DEBUG_RSEQ=n.

Fixes: 7d5265ffcd8b ("rseq: Validate read-only fields under DEBUG_RSEQ config")
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20250116205956.836074-1-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Validate read-only fields under DEBUG_RSEQ config</title>
<updated>2024-12-10T14:07:06Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2024-11-12T15:28:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7d5265ffcd8b41da5e09066360540d6e0716e9cd'/>
<id>urn:sha1:7d5265ffcd8b41da5e09066360540d6e0716e9cd</id>
<content type='text'>
The rseq uapi requires cooperation between users of the rseq fields
to ensure that all libraries and applications using rseq within a
process do not interfere with each other.

This is especially important for fields which are meant to be read-only
from user-space, as documented in uapi/linux/rseq.h:

  - cpu_id_start,
  - cpu_id,
  - node_id,
  - mm_cid.

Storing to those fields from a user-space library prevents any sharing
of the rseq ABI with other libraries and applications, as other users
are not aware that the content of those fields has been altered by a
third-party library.

This is unfortunately the current behavior of tcmalloc: it purposefully
overlaps part of a cached value with the cpu_id_start upper bits to get
notified about preemption, because the kernel clears those upper bits
before returning to user-space. This behavior does not conform to the
rseq uapi header ABI.

This prevents tcmalloc from using rseq when rseq is registered by the
GNU C library 2.35+. It requires tcmalloc users to disable glibc rseq
registration with a glibc tunable, which is a sad state of affairs.

Considering that tcmalloc and the GNU C library are the two first
upstream projects using rseq, and that they are already incompatible due
to use of this hack, adding kernel-level validation of all read-only
fields content is necessary to ensure future users of rseq abide by the
rseq ABI requirements.

Validate that user-space does not corrupt the read-only fields and
conform to the rseq uapi header ABI when the kernel is built with
CONFIG_DEBUG_RSEQ=y. This is done by storing a copy of the read-only
fields in the task_struct, and validating the prior values present in
user-space before updating them. If the values do not match, print
a warning on the console (printk_ratelimited()).

This is a first step to identify misuses of the rseq ABI by printing
a warning on the console. After a giving some time to userspace to
correct its use of rseq, the plan is to eventually terminate offending
processes with SIGSEGV.

This change is expected to produce warnings for the upstream tcmalloc
implementation, but tcmalloc developers mentioned they were open to
adapt their implementation to kernel-level change.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://github.com/google/tcmalloc/issues/144
</content>
</entry>
<entry>
<title>rseq: Extend struct rseq with per-memory-map concurrency ID</title>
<updated>2022-12-27T11:52:12Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-11-22T20:39:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f7b01bb0b57f994a44ea6368536b59062b796381'/>
<id>urn:sha1:f7b01bb0b57f994a44ea6368536b59062b796381</id>
<content type='text'>
If a memory map has fewer threads than there are cores on the system, or
is limited to run on few cores concurrently through sched affinity or
cgroup cpusets, the concurrency IDs will be values close to 0, thus
allowing efficient use of user-space memory for per-cpu data structures.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20221122203932.231377-9-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Extend struct rseq with numa node id</title>
<updated>2022-12-27T11:52:10Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-11-22T20:39:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cbae6bac29a8c5cf2f1cb5c6bce35af00cec164b'/>
<id>urn:sha1:cbae6bac29a8c5cf2f1cb5c6bce35af00cec164b</id>
<content type='text'>
Adding the NUMA node id to struct rseq is a straightforward thing to do,
and a good way to figure out if anything in the user-space ecosystem
prevents extending struct rseq.

This NUMA node id field allows memory allocators such as tcmalloc to
take advantage of fast access to the current NUMA node id to perform
NUMA-aware memory allocation.

It can also be useful for implementing fast-paths for NUMA-aware
user-space mutexes.

It also allows implementing getcpu(2) purely in user-space.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20221122203932.231377-5-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Introduce extensible rseq ABI</title>
<updated>2022-12-27T11:52:10Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-11-22T20:39:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ee3e3ac05c2631ce1f12d88c9cc9a092f8fe947a'/>
<id>urn:sha1:ee3e3ac05c2631ce1f12d88c9cc9a092f8fe947a</id>
<content type='text'>
Introduce the extensible rseq ABI, where the feature size supported by
the kernel and the required alignment are communicated to user-space
through ELF auxiliary vectors.

This allows user-space to call rseq registration with a rseq_len of
either 32 bytes for the original struct rseq size (which includes
padding), or larger.

If rseq_len is larger than 32 bytes, then it must be large enough to
contain the feature size communicated to user-space through ELF
auxiliary vectors.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20221122203932.231377-4-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Use pr_warn_once() when deprecated/unknown ABI flags are encountered</title>
<updated>2022-11-14T08:58:32Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-11-02T13:06:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=448dca8c88755b768552e19bd1618be34ef6d1ff'/>
<id>urn:sha1:448dca8c88755b768552e19bd1618be34ef6d1ff</id>
<content type='text'>
These commits use WARN_ON_ONCE() and kill the offending processes when
deprecated and unknown flags are encountered:

commit c17a6ff93213 ("rseq: Kill process when unknown flags are encountered in ABI structures")
commit 0190e4198e47 ("rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags")

The WARN_ON_ONCE() triggered by userspace input prevents use of
Syzkaller to fuzz the rseq system call.

Replace this WARN_ON_ONCE() by pr_warn_once() messages which contain
actually useful information.

Reported-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Acked-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Link: https://lkml.kernel.org/r/20221102130635.7379-1-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Kill process when unknown flags are encountered in ABI structures</title>
<updated>2022-08-01T13:21:42Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-06-22T19:46:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c17a6ff9321355487d7d5ccaa7d406a0ea06b6c4'/>
<id>urn:sha1:c17a6ff9321355487d7d5ccaa7d406a0ea06b6c4</id>
<content type='text'>
rseq_abi()-&gt;flags and rseq_abi()-&gt;rseq_cs-&gt;flags 29 upper bits are
currently unused.

The current behavior when those bits are set is to ignore them. This is
not an ideal behavior, because when future features will start using
those flags, if user-space fails to correctly validate that the kernel
indeed supports those flags (e.g. with a new sys_rseq flags bit) before
using them, it may incorrectly assume that the kernel will handle those
flags way when in fact those will be silently ignored on older kernels.

Validating that unused flags bits are cleared will allow a smoother
transition when those flags will start to be used by allowing
applications to fail early, and obviously, when they attempt to use the
new flags on an older kernel that does not support them.

Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20220622194617.1155957-2-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Deprecate RSEQ_CS_FLAG_NO_RESTART_ON_* flags</title>
<updated>2022-08-01T13:21:29Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-06-22T19:46:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0190e4198e47fe99d002d72588f34fd62c9ab570'/>
<id>urn:sha1:0190e4198e47fe99d002d72588f34fd62c9ab570</id>
<content type='text'>
The pretty much unused RSEQ_CS_FLAG_NO_RESTART_ON_* flags introduce
complexity in rseq, and are subtly buggy [1]. Solving those issues
requires introducing additional complexity in the rseq implementation
for each supported architecture.

Considering that it complexifies the rseq ABI, I am proposing that we
deprecate those flags. [2]

So far there appears to be consensus from maintainers of user-space
projects impacted by this feature that its removal would be a welcome
simplification. [3]

The deprecation approach proposed here is to issue WARN_ON_ONCE() when
encountering those flags and kill the offending process with sigsegv.
This should allow us to quickly identify whether anyone yells at us for
removing this.

Link: https://lore.kernel.org/lkml/20220618182515.95831-1-minhquangbui99@gmail.com/ [1]
Link: https://lore.kernel.org/lkml/258546133.12151.1655739550814.JavaMail.zimbra@efficios.com/ [2]
Link: https://lore.kernel.org/lkml/87pmj1enjh.fsf@email.froward.int.ebiederm.org/ [3]
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/lkml/20220622194617.1155957-1-mathieu.desnoyers@efficios.com
</content>
</entry>
<entry>
<title>rseq: Remove broken uapi field layout on 32-bit little endian</title>
<updated>2022-02-02T12:11:34Z</updated>
<author>
<name>Mathieu Desnoyers</name>
<email>mathieu.desnoyers@efficios.com</email>
</author>
<published>2022-01-27T15:27:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bfdf4e6208051ed7165b2e92035b4bf11f43eb63'/>
<id>urn:sha1:bfdf4e6208051ed7165b2e92035b4bf11f43eb63</id>
<content type='text'>
The rseq rseq_cs.ptr.{ptr32,padding} uapi endianness handling is
entirely wrong on 32-bit little endian: a preprocessor logic mistake
wrongly uses the big endian field layout on 32-bit little endian
architectures.

Fortunately, those ptr32 accessors were never used within the kernel,
and only meant as a convenience for user-space.

Remove those and replace the whole rseq_cs union by a __u64 type, as
this is the only thing really needed to express the ABI. Document how
32-bit architectures are meant to interact with this field.

Fixes: ec9c82e03a74 ("rseq: uapi: Declare rseq_cs field as union, update includes")
Signed-off-by: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20220127152720.25898-1-mathieu.desnoyers@efficios.com
</content>
</entry>
</feed>
