<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/ext.c, branch v6.19</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.19</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.19'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2026-02-04T23:11:24Z</updated>
<entry>
<title>Merge tag 'sched_ext-for-6.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext</title>
<updated>2026-02-04T23:11:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-04T23:11:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3c7b4d1994f63d6fa3984d7d5ad06dbaad96f167'/>
<id>urn:sha1:3c7b4d1994f63d6fa3984d7d5ad06dbaad96f167</id>
<content type='text'>
Pull sched_ext fix from Tejun Heo:

 - Fix race where sched_class operations (sched_setscheduler() and
   friends) could be invoked on dead tasks after sched_ext_dead()
   already ran, causing invalid SCX task state transitions and NULL
   pointer dereferences.

   This was a regression from the cgroup exit ordering fix which
   moved sched_ext_free() to finish_task_switch().

* tag 'sched_ext-for-6.19-rc8-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext:
  sched_ext: Short-circuit sched_class operations on dead tasks
</content>
</entry>
<entry>
<title>sched_ext: Short-circuit sched_class operations on dead tasks</title>
<updated>2026-02-04T22:22:11Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2026-02-04T20:07:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0eca95cba2b7bf7b7b4f2fa90734a85fcaa72782'/>
<id>urn:sha1:0eca95cba2b7bf7b7b4f2fa90734a85fcaa72782</id>
<content type='text'>
7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free()
to finish_task_switch()") moved sched_ext_free() to finish_task_switch() and
renamed it to sched_ext_dead() to fix cgroup exit ordering issues. However,
this created a race window where certain sched_class ops may be invoked on
dead tasks leading to failures - e.g. sched_setscheduler() may try to switch a
task which finished sched_ext_dead() back into SCX triggering invalid SCX task
state transitions.

Add task_dead_and_done() which tests whether a task is TASK_DEAD and has
completed its final context switch, and use it to short-circuit sched_class
operations which may be called on dead tasks.

Fixes: 7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()")
Reported-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Link: http://lkml.kernel.org/r/20260202151341.796959-1-arighi@nvidia.com
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Audit MOVE vs balance_callbacks</title>
<updated>2026-01-15T20:57:53Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2026-01-15T08:17:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53439363c0a111f11625982b69c88ee2ce8608ec'/>
<id>urn:sha1:53439363c0a111f11625982b69c88ee2ce8608ec</id>
<content type='text'>
The {DE,EN}QUEUE_MOVE flag indicates a task is allowed to change
priority, which means there could be balance callbacks queued.

Therefore audit all MOVE users and make sure they do run balance
callbacks before dropping rq-lock.

Fixes: 6455ad5346c9 ("sched: Move sched_class::prio_changed() into the change pattern")
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Pierre Gondois &lt;pierre.gondois@arm.com&gt;
Tested-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Link: https://patch.msgid.link/20260114130528.GB831285@noisy.programming.kicks-ass.net
</content>
</entry>
<entry>
<title>sched_ext: Use the resched_cpu() to replace resched_curr() in the bypass_lb_node()</title>
<updated>2025-12-23T03:51:51Z</updated>
<author>
<name>Zqiang</name>
<email>qiang.zhang@linux.dev</email>
</author>
<published>2025-12-22T11:53:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ccaeeb585c7c2a0ac67ee1af9acb4d1411dc409e'/>
<id>urn:sha1:ccaeeb585c7c2a0ac67ee1af9acb4d1411dc409e</id>
<content type='text'>
On PREEMPT_RT kernels, scx_bypass_lb_timerfn() runs in the preemptible
per-CPU ktimer kthread context, so the following scenario can occur
(shown for x86):

       cpu1                          cpu2
                                 ktimer kthread:
                                 -&gt;scx_bypass_lb_timerfn
                                   -&gt;bypass_lb_node
                                     -&gt;for_each_cpu(cpu, resched_mask)

    migration/1:                       preempted by migration/2:
    multi_cpu_stop()                     multi_cpu_stop()
    -&gt;take_cpu_down()
      -&gt;__cpu_disable()
        -&gt;set cpu1 offline

                                       -&gt;rq1 = cpu_rq(cpu1)
                                       -&gt;resched_curr(rq1)
                                         -&gt;smp_send_reschedule(cpu1)
                                           -&gt;native_smp_send_reschedule(cpu1)
                                             -&gt;if (unlikely(cpu_is_offline(cpu))) {
                                                 WARN(1, "sched: Unexpected reschedule of offline CPU#%d!\n", cpu);
                                                 return;
                                               }

Therefore, replace resched_curr() with resched_cpu() in bypass_lb_node()
to avoid sending IPIs to offline CPUs.

Signed-off-by: Zqiang &lt;qiang.zhang@linux.dev&gt;
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_ext: Fix some comments in ext.c</title>
<updated>2025-12-19T23:11:22Z</updated>
<author>
<name>Zqiang</name>
<email>qiang.zhang@linux.dev</email>
</author>
<published>2025-12-19T09:34:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=12494e5e2aea17dac54c0356e53e40a31c2a31e4'/>
<id>urn:sha1:12494e5e2aea17dac54c0356e53e40a31c2a31e4</id>
<content type='text'>
Update references to balance_scx() in comments to balance_one().

Signed-off-by: Zqiang &lt;qiang.zhang@linux.dev&gt;
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_ext: fix uninitialized ret on alloc_percpu() failure</title>
<updated>2025-12-16T19:15:03Z</updated>
<author>
<name>Liang Jie</name>
<email>liangjie@lixiang.com</email>
</author>
<published>2025-12-16T09:39:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b0101ccb5b4641885f30fecc352ef891ed06e083'/>
<id>urn:sha1:b0101ccb5b4641885f30fecc352ef891ed06e083</id>
<content type='text'>
Smatch reported:

  kernel/sched/ext.c:5332 scx_alloc_and_add_sched() warn: passing zero to 'ERR_PTR'

In scx_alloc_and_add_sched(), the alloc_percpu() failure path jumps to
err_free_gdsqs without initializing @ret. That can lead to returning
ERR_PTR(0), which violates the ERR_PTR() convention and confuses
callers.

Set @ret to -ENOMEM before jumping to the error path when
alloc_percpu() fails.
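
As a rough illustration of why ERR_PTR(0) confuses callers, the toy Python
model below treats a pointer as a signed integer and follows the kernel
convention that error pointers encode errno values from -MAX_ERRNO to -1.
The helpers err_ptr(), is_err() and alloc_sched() are hypothetical stand-ins
written for this sketch; they are not kernel code, and ret = 0 only models
the zero case that Smatch flagged.

```python
import errno

MAX_ERRNO = 4095  # error pointers encode values from -MAX_ERRNO to -1


def err_ptr(error):
    """Model of ERR_PTR(): the errno value itself becomes the 'pointer'."""
    return error


def is_err(ptr):
    """Model of IS_ERR(): true only for values inside the error range."""
    return ptr in range(-MAX_ERRNO, 0)


def alloc_sched(percpu_alloc_ok, set_ret_on_failure):
    """Hypothetical allocation path with a shared error exit."""
    ret = 0  # models @ret never having been given an error code
    if not percpu_alloc_ok:
        if set_ret_on_failure:
            ret = -errno.ENOMEM  # the fix: record the error before jumping
        return err_ptr(ret)      # models the err_free_gdsqs exit
    return 0x1000                # arbitrary non-error "pointer" value


buggy = alloc_sched(percpu_alloc_ok=False, set_ret_on_failure=False)
fixed = alloc_sched(percpu_alloc_ok=False, set_ret_on_failure=True)
print(is_err(buggy), is_err(fixed))
```

In this model the unfixed path returns a value that is_err() does not
recognise as an error, so a caller that only checks for an error pointer
would treat the failed allocation as a success.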

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Closes: https://lore.kernel.org/r/202512141601.yAXDAeA9-lkp@intel.com/
Reported-by: Dan Carpenter &lt;error27@gmail.com&gt;
Fixes: c201ea1578d3 ("sched_ext: Move event_stats_cpu into scx_sched")
Signed-off-by: Liang Jie &lt;liangjie@lixiang.com&gt;
Reviewed-by: Emil Tsalapatis &lt;emil@etsalapatis.com&gt;
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_ext: Remove unused code in the do_pick_task_scx()</title>
<updated>2025-12-15T15:53:49Z</updated>
<author>
<name>Zqiang</name>
<email>qiang.zhang@linux.dev</email>
</author>
<published>2025-12-15T11:29:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bb27226f0d00588ac53be8825e021ae80aa43371'/>
<id>urn:sha1:bb27226f0d00588ac53be8825e021ae80aa43371</id>
<content type='text'>
The kick_idle variable is no longer used. Remove it and the associated
code in do_pick_task_scx().

Signed-off-by: Zqiang &lt;qiang.zhang@linux.dev&gt;
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Reviewed-by: Emil Tsalapatis &lt;emil@etsalapatis.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_ext: Fix missing post-enqueue handling in move_local_task_to_local_dsq()</title>
<updated>2025-12-12T16:26:42Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2025-12-12T01:45:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f5e1e5ec204da11fa87fdf006d451d80ce06e118'/>
<id>urn:sha1:f5e1e5ec204da11fa87fdf006d451d80ce06e118</id>
<content type='text'>
move_local_task_to_local_dsq() is used when moving a task from a non-local
DSQ to a local DSQ on the same CPU. It directly manipulates the local DSQ
without going through dispatch_enqueue() and was missing the post-enqueue
handling that triggers preemption when SCX_ENQ_PREEMPT is set or the idle
task is running.

The function is used by move_task_between_dsqs() which backs
scx_bpf_dsq_move() and may be called while the CPU is busy.

Add local_dsq_post_enq() call to move_local_task_to_local_dsq(). As the
dispatch path doesn't need post-enqueue handling, add SCX_RQ_IN_BALANCE
early exit to keep consume_dispatch_q() behavior unchanged and avoid
triggering unnecessary resched when scx_bpf_dsq_move() is used from the
dispatch path.
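
The decision described above can be sketched as a small truth function.
This is a toy Python model under stated assumptions, not the kernel
implementation: the function name and its boolean parameters are
hypothetical, and the real helper may do more than decide whether to
reschedule.

```python
def post_enq_wants_resched(enq_preempt, curr_is_idle, in_balance):
    """Return True when the modelled post-enqueue step would ask for a resched.

    enq_preempt  -- models SCX_ENQ_PREEMPT being set on the enqueue
    curr_is_idle -- models the idle task currently running on the CPU
    in_balance   -- models SCX_RQ_IN_BALANCE, i.e. the dispatch path
    """
    if in_balance:
        # Dispatch path: the task is about to be picked anyway, so skip.
        return False
    return enq_preempt or curr_is_idle


# Enumerate every combination to show the resulting behaviour.
for in_balance in (False, True):
    for enq_preempt in (False, True):
        for curr_is_idle in (False, True):
            print(in_balance, enq_preempt, curr_is_idle,
                  post_enq_wants_resched(enq_preempt, curr_is_idle, in_balance))
```

The early return mirrors the intent stated in the commit message: moves
issued from the dispatch path keep their previous behaviour, while moves
issued on a busy CPU can now trigger preemption.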

Fixes: 4c30f5ce4f7a ("sched_ext: Implement scx_bpf_dispatch[_vtime]_from_dsq()")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Reviewed-by: Emil Tsalapatis &lt;emil@etsalapatis.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_ext: Factor out local_dsq_post_enq() from dispatch_enqueue()</title>
<updated>2025-12-12T16:26:07Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2025-12-12T01:45:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=530b6637c79e728d58f1d9b66bd4acf4b735b86d'/>
<id>urn:sha1:530b6637c79e728d58f1d9b66bd4acf4b735b86d</id>
<content type='text'>
Factor out local_dsq_post_enq() which performs post-enqueue handling for
local DSQs - triggering resched_curr() if SCX_ENQ_PREEMPT is specified or if
the current CPU is idle. No functional change.

This will be used by the next patch to fix move_local_task_to_local_dsq().

Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Andrea Righi &lt;arighi@nvidia.com&gt;
Reviewed-by: Emil Tsalapatis &lt;emil@etsalapatis.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched_ext: Fix bypass depth leak on scx_enable() failure</title>
<updated>2025-12-11T16:27:35Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2025-12-09T21:04:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9f769637a93fac81689b80df6855f545839cf999'/>
<id>urn:sha1:9f769637a93fac81689b80df6855f545839cf999</id>
<content type='text'>
scx_enable() calls scx_bypass(true) to initialize in bypass mode and then
scx_bypass(false) on success to exit. If scx_enable() fails during task
initialization - e.g. scx_cgroup_init() or scx_init_task() returns an error -
it jumps to err_disable while bypass is still active. scx_disable_workfn()
then calls scx_bypass(true/false) for its own bypass, leaving the bypass depth
at 1 instead of 0. This causes the system to remain permanently in bypass mode
after a failed scx_enable().

Failures after task initialization is complete - e.g. scx_tryset_enable_state()
at the end - already call scx_bypass(false) before reaching the error path and
are not affected. This only affects a subset of failure modes.

Fix it by tracking whether scx_enable() called scx_bypass(true) in a bool and
having scx_disable_workfn() call an extra scx_bypass(false) to clear it. This
is a temporary measure as the bypass depth will be moved into the sched
instance, which will make this tracking unnecessary.
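
The leak can be reproduced with a toy counter. The Python sketch below is
only a model of the nesting described above; the class, its method names and
the exact point at which the extra scx_bypass(false) is issued are
assumptions made for illustration, not the kernel code.

```python
class BypassModel:
    """Toy model of a nested bypass depth counter."""

    def __init__(self):
        self.depth = 0
        self.enable_bypassed = False  # models the bool added by the fix

    def bypass(self, on):
        # Models scx_bypass(true) and scx_bypass(false) as +1 and -1.
        self.depth += 1 if on else -1

    def enable_fails_during_task_init(self):
        self.bypass(True)
        self.enable_bypassed = True
        # Failure here jumps to the error path without bypass(False).

    def disable_work(self, with_fix):
        self.bypass(True)
        if with_fix and self.enable_bypassed:
            self.bypass(False)  # extra call that clears the enable-side bypass
            self.enable_bypassed = False
        self.bypass(False)


def depth_after_failed_enable(with_fix):
    model = BypassModel()
    model.enable_fails_during_task_init()
    model.disable_work(with_fix)
    return model.depth


print(depth_after_failed_enable(False), depth_after_failed_enable(True))
```

Without the extra call the modelled depth ends at 1, which corresponds to the
system staying in bypass mode; with it the depth returns to 0.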

Fixes: 8c2090c504e9 ("sched_ext: Initialize in bypass mode")
Cc: stable@vger.kernel.org # v6.12+
Reported-by: Chris Mason &lt;clm@meta.com&gt;
Reviewed-by: Emil Tsalapatis &lt;emil@etsalapatis.com&gt;
Link: https://lore.kernel.org/stable/286e6f7787a81239e1ce2989b52391ce%40kernel.org
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
