<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/syscalls.c, branch v6.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-01-23T20:09:25Z</updated>
<entry>
<title>cpufreq/schedutil: Only bind threads if needed</title>
<updated>2025-01-23T20:09:25Z</updated>
<author>
<name>Christian Loehle</name>
<email>christian.loehle@arm.com</email>
</author>
<published>2025-01-20T10:09:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=93940fbdc46843cea58708bbe8dd225bd0f32e67'/>
<id>urn:sha1:93940fbdc46843cea58708bbe8dd225bd0f32e67</id>
<content type='text'>
Remove the unconditional binding of sugov kthreads to the affected CPUs
if the cpufreq driver indicates that updates can happen from any CPU.
This allows userspace to set affinities either to save power (waking up
bigger CPUs on HMP can be expensive) or to increase performance (by
letting the utilized CPUs run without preemption by the sugov kthread).
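
For illustration only, a minimal userspace sketch of what this unbinding
enables; the sugov kthread PID and the CPU number below are hypothetical
and would normally be looked up at runtime:

  #define _GNU_SOURCE
  #include &lt;sched.h&gt;
  #include &lt;stdio.h&gt;

  int main(void)
  {
          pid_t sugov_pid = 1234;         /* hypothetical PID of a sugov:N kthread */
          cpu_set_t mask;

          CPU_ZERO(&amp;mask);
          CPU_SET(0, &amp;mask);              /* e.g. keep the kthread on a little CPU */

          if (sched_setaffinity(sugov_pid, sizeof(mask), &amp;mask))
                  perror("sched_setaffinity");
          return 0;
  }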

Signed-off-by: Christian Loehle &lt;christian.loehle@arm.com&gt;
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Acked-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Acked-by: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Acked-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Link: https://patch.msgid.link/5a8deed4-7764-4729-a9d4-9520c25fa7e8@arm.com
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>sched/fair: Encapsulate set custom slice in a __setparam_fair() function</title>
<updated>2025-01-13T13:10:22Z</updated>
<author>
<name>Vincent Guittot</name>
<email>vincent.guittot@linaro.org</email>
</author>
<published>2025-01-11T09:14:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2cf9ac40073ddb6a70dd5ef94f1cf45501b696ed'/>
<id>urn:sha1:2cf9ac40073ddb6a70dd5ef94f1cf45501b696ed</id>
<content type='text'>
Similarly to dl, create a __setparam_fair() function to set the parameters
related to the fair class and move it into fair.c.

No functional changes expected.
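
A rough sketch of what the encapsulated helper might look like (field and
symbol names taken from current mainline fair-class code; the exact body in
this commit may differ):

  static void __setparam_fair(struct task_struct *p, const struct sched_attr *attr)
  {
          struct sched_entity *se = &amp;p-&gt;se;

          p-&gt;static_prio = NICE_TO_PRIO(attr-&gt;sched_nice);
          if (attr-&gt;sched_runtime) {
                  /* the user supplied a custom slice, in nanoseconds */
                  se-&gt;custom_slice = 1;
                  se-&gt;slice = clamp_t(u64, attr-&gt;sched_runtime,
                                      NSEC_PER_MSEC / 10, NSEC_PER_MSEC * 100);
          } else {
                  se-&gt;custom_slice = 0;
                  se-&gt;slice = sysctl_sched_base_slice;
          }
  }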

Signed-off-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Phil Auld &lt;pauld@redhat.com&gt;
Link: https://lore.kernel.org/r/20250110144656.484601-1-vincent.guittot@linaro.org
</content>
</entry>
<entry>
<title>sched: Fix race between yield_to() and try_to_wake_up()</title>
<updated>2025-01-13T13:10:22Z</updated>
<author>
<name>Tianchen Ding</name>
<email>dtcccc@linux.alibaba.com</email>
</author>
<published>2024-12-31T05:50:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5d808c78d97251af1d3a3e4f253e7d6c39fd871e'/>
<id>urn:sha1:5d808c78d97251af1d3a3e4f253e7d6c39fd871e</id>
<content type='text'>
We met a SCHED_WARN in set_next_buddy():
  __warn_printk
  set_next_buddy
  yield_to_task_fair
  yield_to
  kvm_vcpu_yield_to [kvm]
  ...

After a short dig, we found that the rq lock held by yield_to() may not
be the lock of the rq that the target task actually belongs to. There is
a race window against try_to_wake_up().

         CPU0                             target_task

                                        blocking on CPU1
   lock rq0 &amp; rq1
   double check task_rq == p_rq, ok
                                        woken to CPU2 (lock task_pi &amp; rq2)
                                        task_rq = rq2
   yield_to_task_fair (w/o lock rq2)

In this race window, yield_to() is operating on the task without holding
the correct lock. Fix this by taking the task's pi_lock first.
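
Roughly, the resulting locking order looks like this (simplified sketch,
not the literal diff):

  raw_spin_lock_irqsave(&amp;p-&gt;pi_lock, flags);  /* pins task_rq(p) vs. try_to_wake_up() */
  rq = this_rq();
  p_rq = task_rq(p);
  double_rq_lock(rq, p_rq);
  /* task_rq(p) can no longer change underneath us */
  yield_to_task_fair(rq, p);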

Fixes: d95f41220065 ("sched: Add yield_to(task, preempt) functionality")
Signed-off-by: Tianchen Ding &lt;dtcccc@linux.alibaba.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20241231055020.6521-1-dtcccc@linux.alibaba.com
</content>
</entry>
<entry>
<title>sched: fix warning in sched_setaffinity</title>
<updated>2024-12-02T11:01:27Z</updated>
<author>
<name>Josh Don</name>
<email>joshdon@google.com</email>
</author>
<published>2024-11-11T18:27:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=70ee7947a29029736a1a06c73a48ff37674a851b'/>
<id>urn:sha1:70ee7947a29029736a1a06c73a48ff37674a851b</id>
<content type='text'>
Commit 8f9ea86fdf99b added some logic to sched_setaffinity that included
a WARN when a per-task affinity assignment races with a cpuset update.

Specifically, we can have a race where a cpuset update results in the
task affinity no longer being a subset of the cpuset. That's fine; we
have a fallback to instead use the cpuset mask. However, we have a WARN
set up that will trigger if the cpuset mask has no overlap at all with
the requested task affinity. This shouldn't be a warning condition; it's
trivial to create this condition.

The warning can be reproduced with the following setup (a rough C sketch of
the reproducer follows the list):

- $PID inside a cpuset cgroup
- another thread repeatedly switching the cpuset cpus from 1-2 to just 1
- another thread repeatedly setting the $PID affinity (via taskset) to 2
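
The sketch below follows that recipe; the cgroup path and CPU numbers are
hypothetical and error handling is elided:

  #define _GNU_SOURCE
  #include &lt;fcntl.h&gt;
  #include &lt;pthread.h&gt;
  #include &lt;sched.h&gt;
  #include &lt;stdlib.h&gt;
  #include &lt;string.h&gt;
  #include &lt;unistd.h&gt;

  static void *toggle_cpuset(void *arg)
  {
          /* assumes $PID already sits in this (hypothetical) cpuset cgroup */
          const char *path = "/sys/fs/cgroup/test/cpuset.cpus";
          const char *vals[] = { "1-2", "1" };
          int i = 0, fd;

          for (;;) {
                  fd = open(path, O_WRONLY);
                  write(fd, vals[i], strlen(vals[i]));
                  close(fd);
                  i ^= 1;
          }
          return NULL;
  }

  int main(int argc, char **argv)
  {
          pid_t pid;
          cpu_set_t mask;
          pthread_t t;

          if (argc &lt; 2)
                  return 1;
          pid = atoi(argv[1]);            /* $PID from the list above */

          pthread_create(&amp;t, NULL, toggle_cpuset, NULL);

          CPU_ZERO(&amp;mask);
          CPU_SET(2, &amp;mask);              /* the CPU the cpuset keeps dropping */
          for (;;)
                  sched_setaffinity(pid, sizeof(mask), &amp;mask);
  }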

Fixes: 8f9ea86fdf99b ("sched: Always preserve the user requested cpumask")
Signed-off-by: Josh Don &lt;joshdon@google.com&gt;
Acked-and-tested-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Waiman Long &lt;longman@redhat.com&gt;
Tested-by: Madadi Vineeth Reddy &lt;vineethr@linux.ibm.com&gt;
Link: https://lkml.kernel.org/r/20241111182738.1832953-1-joshdon@google.com
</content>
</entry>
<entry>
<title>Merge tag 'sched-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2024-11-19T22:16:06Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-19T22:16:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3f020399e4f1c690ce87b4c472f75b1fc89e07d5'/>
<id>urn:sha1:3f020399e4f1c690ce87b4c472f75b1fc89e07d5</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:
 "Core facilities:

   - Add the "Lazy preemption" model (CONFIG_PREEMPT_LAZY=y), which
     optimizes fair-class preemption by delaying preemption requests to
     the tick boundary, while working as full preemption for
     RR/FIFO/DEADLINE classes. (Peter Zijlstra)
        - x86: Enable Lazy preemption (Peter Zijlstra)
        - riscv: Enable Lazy preemption (Jisheng Zhang)

   - Initialize idle tasks only once (Thomas Gleixner)

   - sched/ext: Remove sched_fork() hack (Thomas Gleixner)

  Fair scheduler:

   - Optimize the PLACE_LAG when se-&gt;vlag is zero (Huang Shijie)

  Idle loop:

   - Optimize the generic idle loop by removing unnecessary memory
     barrier (Zhongqiu Han)

  RSEQ:

   - Improve cache locality of RSEQ concurrency IDs for intermittent
     workloads (Mathieu Desnoyers)

  Waitqueues:

   - Make wake_up_{bit,var} less fragile (Neil Brown)

  PSI:

   - Pass enqueue/dequeue flags to psi callbacks directly (Johannes
     Weiner)

  Preparatory patches for proxy execution:

   - Add move_queued_task_locked helper (Connor O'Brien)

   - Consolidate pick_*_task to task_is_pushable helper (Connor O'Brien)

   - Split out __schedule() deactivate task logic into a helper (John
     Stultz)

   - Split scheduler and execution contexts (Peter Zijlstra)

   - Make mutex::wait_lock irq safe (Juri Lelli)

   - Expose __mutex_owner() (Juri Lelli)

   - Remove wakeups from under mutex::wait_lock (Peter Zijlstra)

  Misc fixes and cleanups:

   - Remove unused __HAVE_THREAD_FUNCTIONS hook support (David
     Disseldorp)

   - Update the comment for TIF_NEED_RESCHED_LAZY (Sebastian Andrzej
     Siewior)

   - Remove unused bit_wait_io_timeout (Dr. David Alan Gilbert)

   - remove the DOUBLE_TICK feature (Huang Shijie)

   - fix the comment for PREEMPT_SHORT (Huang Shijie)

   - Fix unused variable warning (Christian Loehle)

   - No PREEMPT_RT=y for all{yes,mod}config"

* tag 'sched-core-2024-11-18' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  sched, x86: Update the comment for TIF_NEED_RESCHED_LAZY.
  sched: No PREEMPT_RT=y for all{yes,mod}config
  riscv: add PREEMPT_LAZY support
  sched, x86: Enable Lazy preemption
  sched: Enable PREEMPT_DYNAMIC for PREEMPT_RT
  sched: Add Lazy preemption model
  sched: Add TIF_NEED_RESCHED_LAZY infrastructure
  sched/ext: Remove sched_fork() hack
  sched: Initialize idle tasks only once
  sched: psi: pass enqueue/dequeue flags to psi callbacks directly
  sched/uclamp: Fix unnused variable warning
  sched: Split scheduler and execution contexts
  sched: Split out __schedule() deactivate task logic into a helper
  sched: Consolidate pick_*_task to task_is_pushable helper
  sched: Add move_queued_task_locked helper
  locking/mutex: Expose __mutex_owner()
  locking/mutex: Make mutex::wait_lock irq safe
  locking/mutex: Remove wakeups from under mutex::wait_lock
  sched: Improve cache locality of RSEQ concurrency IDs for intermittent workloads
  sched: idle: Optimize the generic idle loop by removing needless memory barrier
  ...
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-11-18T18:50:09Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-18T18:50:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a5ca57479656f2562f164d650c6646debbe2f99b'/>
<id>urn:sha1:a5ca57479656f2562f164d650c6646debbe2f99b</id>
<content type='text'>
Pull copy_struct_to_user helper from Christian Brauner:
 "This adds a copy_struct_to_user() helper which is a companion helper
  to the already widely used copy_struct_from_user().

  It copies a struct from kernel space to userspace, in a way that
  guarantees backwards-compatibility for struct syscall arguments as
  long as future struct extensions are made such that all new fields are
  appended to the old struct, and zeroed-out new fields have the same
  meaning as the old struct.

  The first user is the sched_getattr() system call, but the new extensible
  pidfs ioctl will be ported to it as well"
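
For illustration, a hedged sketch of how a caller such as sched_getattr()
might use the helper (assuming a signature that mirrors
copy_struct_from_user(), with the ignored_trailing out-parameter left NULL
as mentioned further below):

  /* kattr is the kernel's full struct sched_attr, already filled in;
   * uattr/usize describe the userspace buffer, which may be smaller or
   * larger than the kernel's view of the struct */
  return copy_struct_to_user(uattr, usize, &amp;kattr, sizeof(kattr), NULL);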

* tag 'vfs-6.13.usercopy' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  sched_getattr: port to copy_struct_to_user
  uaccess: add copy_struct_to_user helper
</content>
</entry>
<entry>
<title>sched: Pass correct scheduling policy to __setscheduler_class</title>
<updated>2024-10-29T12:57:51Z</updated>
<author>
<name>Aboorva Devarajan</name>
<email>aboorvad@linux.ibm.com</email>
</author>
<published>2024-10-25T18:50:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5db91545ef8150c45a526675ef99e8998b648a41'/>
<id>urn:sha1:5db91545ef8150c45a526675ef99e8998b648a41</id>
<content type='text'>
Commit 98442f0ccd82 ("sched: Fix delayed_dequeue vs
switched_from_fair()") overlooked that __setscheduler_prio(), now
__setscheduler_class(), relies on p-&gt;policy for task_should_scx(), and
moved the call before __setscheduler_params() updates it, causing it to
use the old p-&gt;policy value.

Resolve this by changing task_should_scx() to take the policy itself
instead of a task pointer, such that __sched_setscheduler() can pass
in the updated policy.
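
Simplified sketch of the change in call shape (names are illustrative; the
actual call sites may differ):

  /* before: implicitly used p-&gt;policy, which __setscheduler_params() had
   * not updated yet at this point */
  next_class = __setscheduler_class(p, newprio);

  /* after: pass the policy being installed explicitly */
  next_class = __setscheduler_class(attr-&gt;sched_policy, newprio);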

Fixes: 98442f0ccd82 ("sched: Fix delayed_dequeue vs switched_from_fair()")
Signed-off-by: Aboorva Devarajan &lt;aboorvad@linux.ibm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_getattr: port to copy_struct_to_user</title>
<updated>2024-10-21T14:51:31Z</updated>
<author>
<name>Aleksa Sarai</name>
<email>cyphar@cyphar.com</email>
</author>
<published>2024-10-09T20:40:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=112cca098a7010c02a4d535a253af72e4e5bbd06'/>
<id>urn:sha1:112cca098a7010c02a4d535a253af72e4e5bbd06</id>
<content type='text'>
sched_getattr(2) doesn't care about trailing non-zero bytes in the
(ksize &gt; usize) case, so just use copy_struct_to_user() without checking
ignored_trailing.

Signed-off-by: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Link: https://lore.kernel.org/r/20241010-extensible-structs-check_fields-v3-2-d2833dfe6edd@cyphar.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Split scheduler and execution contexts</title>
<updated>2024-10-14T10:52:42Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-10-09T23:53:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=af0c8b2bf67b25756f27644936e74fd9a6273bd2'/>
<id>urn:sha1:af0c8b2bf67b25756f27644936e74fd9a6273bd2</id>
<content type='text'>
Let's define the "scheduling context" as all the scheduler state
in task_struct for the task chosen to run, which we'll call the
donor task, and the "execution context" as all state required to
actually run the task.

Currently both are intertwined in task_struct. We want to
logically split these such that we can use the scheduling
context of the donor task selected to be scheduled, but use
the execution context of a different task to actually be run.

To this purpose, introduce an rq-&gt;donor field that points to the
task_struct chosen from the runqueue by the scheduler and is used for
the scheduler state, and preserve rq-&gt;curr to indicate the execution
context of the task that will actually be run.

This patch introduces the donor field as a union with curr, so it
doesn't cause the contexts to be split yet, but adds the logic to
handle everything separately.
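
A minimal sketch of the transitional layout described above (surrounding
fields omitted; attributes are illustrative):

  struct rq {
          /* ... */
          union {
                  struct task_struct __rcu *donor;  /* scheduler context */
                  struct task_struct __rcu *curr;   /* execution context */
          };
          /* ... */
  };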

[add additional comments and update more sched_class code to use
 rq::proxy]
[jstultz: Rebased and resolved minor collisions, reworked to use
 accessors, tweaked update_curr_common to use rq_proxy fixing rt
 scheduling issues]

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Signed-off-by: Connor O'Brien &lt;connoro@google.com&gt;
Signed-off-by: John Stultz &lt;jstultz@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Metin Kaya &lt;metin.kaya@arm.com&gt;
Tested-by: K Prateek Nayak &lt;kprateek.nayak@amd.com&gt;
Tested-by: Metin Kaya &lt;metin.kaya@arm.com&gt;
Link: https://lore.kernel.org/r/20241009235352.1614323-8-jstultz@google.com
</content>
</entry>
<entry>
<title>sched: Fix delayed_dequeue vs switched_from_fair()</title>
<updated>2024-10-11T08:49:32Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-10-10T09:54:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=98442f0ccd828ac42e89281a815e9e7a97533822'/>
<id>urn:sha1:98442f0ccd828ac42e89281a815e9e7a97533822</id>
<content type='text'>
Commit 2e0199df252a ("sched/fair: Prepare exit/cleanup paths for delayed_dequeue")
and its follow-up fixes try to deal with a rather unfortunate
situation where a task is enqueued in a new class even though it
shouldn't have been, mostly because the existing -&gt;switched_to/from()
hooks are in the wrong place for this case.

This all led to Paul being able to trigger failures at something like
once per 10k CPU hours of RCU torture.

For now, do the ugly thing and move the code to the right place by
ignoring the switch hooks.

Note: Clean up the whole sched_class::switch*_{to,from}() thing.

Fixes: 2e0199df252a ("sched/fair: Prepare exit/cleanup paths for delayed_dequeue")
Reported-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20241003185037.GA5594@noisy.programming.kicks-ass.net
</content>
</entry>
</feed>
