linux - Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Age	Commit message (Collapse)	Author	Lines
2026-04-09	selftests/sched_ext: Fix wrong DSQ ID in peek_dsq error message	fangqiurong	-1/+1
	The error path after scx_bpf_create_dsq(real_dsq_id, ...) was reporting test_dsq_id instead of real_dsq_id in the error message, which would mislead debugging. Signed-off-by: fangqiurong <fangqiurong@kylinos.cn> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-04-08	selftests/sched_ext: Improve runner error reporting for invalid arguments	Cheng-Yang Chou	-0/+20
	Report an error for './runner foo' (positional arg instead of -t) and for './runner -t foo' when the filter matches no tests. Previously both cases produced no error output. Pre-scan the test list before the main loop so the error is reported immediately, avoiding spurious SKIP output from '-s' when no tests match. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-30	Merge branch 'for-7.0-fixes' into for-7.1	Tejun Heo	-0/+263
	Conflict in kernel/sched/ext.c init_sched_ext_class() between: 415cb193bb97 ("sched_ext: Fix SCX_KICK_WAIT deadlock by deferring wait to balance callback") which adds cpus_to_sync cpumask allocation, and: 84b1a0ea0b7c ("sched_ext: Implement scx_bpf_dsq_reenq() for user DSQs") 8c1b9453fde6 ("sched_ext: Convert deferred_reenq_locals from llist to regular list") which add deferred_reenq init code at the same location. Both are independent additions. Include both. Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-30	selftests/sched_ext: Add cyclic SCX_KICK_WAIT stress test	Tejun Heo	-0/+263
	Add a test that creates a 3-CPU kick_wait cycle (A->B->C->A). A BPF scheduler kicks the next CPU in the ring with SCX_KICK_WAIT on every enqueue while userspace workers generate continuous scheduling churn via sched_yield(). Without the preceding fix, this hangs the machine within seconds. Signed-off-by: Tejun Heo <tj@kernel.org> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Tested-by: Christian Loehle <christian.loehle@arm.com>
2026-03-26	Revert "selftests/sched_ext: Add tests for SCX_ENQ_IMMED and ↵	Tejun Heo	-601/+0
	scx_bpf_dsq_reenq()" This reverts commit c50dcf533149. The tests are superficial, likely AI-generated slop, and flaky. They don't add actual value and just churn the selftests. Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-24	selftests/sched_ext: Skip rt_stall on older kernels and list skipped tests	Cheng-Yang Chou	-1/+12
	rt_stall tests the ext DL server which was introduced in commit cd959a356205 ("sched_ext: Add a DL server for sched_ext tasks"). On older kernels that lack this feature, the test calls ksft_exit_fail() internally which terminates the entire runner process, preventing subsequent tests from running. Add a guard in setup() that checks for the ext_server field in struct rq via __COMPAT_struct_has_field() and returns SCX_TEST_SKIP if not present. Also print the names of skipped tests in the results summary, mirroring the existing behavior for failed tests. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-22	selftests/sched_ext: Add tests for SCX_ENQ_IMMED and scx_bpf_dsq_reenq()	zhidao su	-0/+601
	Add three selftests covering features introduced in v7.1: - dsq_reenq: Verify scx_bpf_dsq_reenq() on user DSQs triggers ops.enqueue() with SCX_ENQ_REENQ and SCX_TASK_REENQ_KFUNC in p->scx.flags. - enq_immed: Verify SCX_OPS_ALWAYS_ENQ_IMMED slow path where tasks dispatched to a busy CPU's local DSQ are re-enqueued through ops.enqueue() with SCX_TASK_REENQ_IMMED. - consume_immed: Verify SCX_ENQ_IMMED via the consume path using scx_bpf_dsq_move_to_local___v2() with explicit SCX_ENQ_IMMED. All three tests skip gracefully on kernels that predate the required features by checking availability via __COMPAT_has_ksym() / __COMPAT_read_enum() before loading. Signed-off-by: zhidao su <suzhidao@xiaomi.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-21	selftests/sched_ext: Return non-zero exit code on test failure	zhidao su	-1/+1
	runner.c always returned 0 regardless of test results. The kselftest framework (tools/testing/selftests/kselftest/runner.sh) invokes the runner binary and treats a non-zero exit code as a test failure; with the old code, failed sched_ext tests were silently hidden from the parent harness even though individual "not ok" TAP lines were emitted. Return 1 when at least one test failed, 0 when all tests passed or were skipped. Signed-off-by: zhidao su <suzhidao@xiaomi.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-17	selftests/sched_ext: Show failed test names in summary	Cheng-Yang Chou	-1/+7
	When tests fail, the runner only printed the failure count, making it hard to tell which tests failed without scrolling through output. Track failed test names in an array and print them after the summary so failures are immediately visible at the end of the run. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-14	sched_ext: Update selftests to drop ops.cpu_acquire/release()	Cheng-Yang Chou	-9/+12
	ops.cpu_acquire/release() are deprecated by commit a3f5d4822253 ("sched_ext: Allow scx_bpf_reenqueue_local() to be called from anywhere") in favor of handling CPU preemption via the sched_switch tracepoint. In the maximal selftest, replace the cpu_acquire/release stubs with a minimal sched_switch TP program. Attach all non-struct_ops programs (including the new TP) via maximal__attach() after disabling auto-attach for the maximal_ops struct_ops map, which is managed manually in run(). Apply the same fix to reload_loop, which also uses the maximal skeleton. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-14	sched_ext: Update demo schedulers and selftests to use ↵	Cheng-Yang Chou	-2/+4
	scx_bpf_task_set_dsq_vtime() Direct writes to p->scx.dsq_vtime are deprecated in favor of scx_bpf_task_set_dsq_vtime(). Update scx_simple, scx_flatcg, and select_cpu_vtime selftest to use the new kfunc with scale_by_task_weight_inverse(). Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-13	sched_ext/selftests: Fix incorrect include guard comments	Cheng-Yang Chou	-2/+2
	Fix two mismatched closing comments in header include guards: - util.h: closing comment says __SCX_TEST_H__ but the guard is __SCX_TEST_UTIL_H__ - exit_test.h: closing comment has a spurious '#' character before the guard name Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-13	selftests/sched_ext: Update scx_bpf_dsq_move_to_local() in kselftests	Andrea Righi	-9/+9
	After commit 860683763ebf ("sched_ext: Add enq_flags to scx_bpf_dsq_move_to_local()") some of the kselftests are failing to build: exit.bpf.c:44:34: error: too few arguments provided to function-like macro invocation 44 \| scx_bpf_dsq_move_to_local(DSQ_ID); Update the kselftests adding the new argument to scx_bpf_dsq_move_to_local(). Fixes: 860683763ebf ("sched_ext: Add enq_flags to scx_bpf_dsq_move_to_local()") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-13	selftests/sched_ext: Add missing error check for exit__load()	David Carlier	-1/+1
	exit__load(skel) was called without checking its return value. Every other test in the suite wraps the load call with SCX_FAIL_IF(). Add the missing check to be consistent with the rest of the test suite. Fixes: a5db7817af78 ("sched_ext: Add selftests") Signed-off-by: David Carlier <devnexen@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-11	sched_ext: Fix incomplete help text usage strings	Cheng-Yang Chou	-1/+1
	Several demo schedulers and the selftest runner had usage strings that omitted options which are actually supported: - scx_central: add missing [-v] - scx_pair: add missing [-v] - scx_qmap: add missing [-S] and [-H] - scx_userland: add missing [-v] - scx_sdt: remove [-f] which no longer exists - runner.c: add missing [-s], [-l], [-q]; drop [-h] which none of the other sched_ext tools list in their usage lines Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-06	Merge branch 'for-7.0-fixes' into for-7.1	Tejun Heo	-6/+11
	To prepare for hierarchical scheduling patchset which will cause multiple conflicts otherwise. Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-04	sched_ext/selftests: Fix format specifier and buffer length in file_write_long()	Cheng-Yang Chou	-2/+2
	Use %ld (not %lu) for signed long, and pass the actual string length returned by sprintf() to write_text() instead of sizeof(buf). Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-02	selftests/sched_ext: Fix peek_dsq.bpf.c compile error for clang 17	Zhao Mengmeng	-2/+2
	When compiling sched_ext selftests using clang 17.0.6, it raised compiler crash and build error: Error at line 68: Unsupport signed division for DAG: 0x55b2f9a60240: i64 = sdiv 0x55b2f9a609b0, Constant:i64<100>, peek_dsq.bpf.c:68:25 @[ peek_dsq.bpf.c:95:4 @[ peek_dsq.bpf.c:169:8 @[ peek _dsq.bpf.c:140:6 ] ] ]Please convert to unsigned div/mod After digging, it's not a compiler error, clang supported Signed division only when using -mcpu=v4, while we use -mcpu=v3 currently, the better way is to use unsigned div, see [1] for details. [1] https://github.com/llvm/llvm-project/issues/70433 Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-02	selftests/sched_ext: Add -fms-extensions to bpf build flags	Zhao Mengmeng	-0/+2
	Similar to commit 835a50753579 ("selftests/bpf: Add -fms-extensions to bpf build flags") and commit 639f58a0f480 ("bpftool: Fix build warnings due to MS extensions") Fix "declaration does not declare anything" warning by using -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-23	selftests/sched_ext: Add test to validate ops.dequeue() semantics	Andrea Righi	-0/+664
	Add a new kselftest to validate that the new ops.dequeue() semantics work correctly for all task lifecycle scenarios, including the distinction between terminal DSQs (where BPF scheduler is done with the task), user DSQs (where BPF scheduler manages the task lifecycle) and BPF data structures, regardless of which event performs the dispatch. The test validates the following scenarios: - From ops.select_cpu(): - scenario 0 (local DSQ): tasks dispatched to the local DSQ bypass the BPF scheduler entirely; they never enter BPF custody, so ops.dequeue() is not called, - scenario 1 (global DSQ): tasks dispatched to SCX_DSQ_GLOBAL also bypass the BPF scheduler, like the local DSQ; ops.dequeue() is not called, - scenario 2 (user DSQ): tasks dispatched to user DSQs from ops.select_cpu(): tasks enter BPF scheduler's custody with full enqueue/dequeue lifecycle tracking and state machine validation, expects 1:1 enqueue/dequeue pairing, - From ops.enqueue(): - scenario 3 (local DSQ): same behavior as scenario 0, - scenario 4 (global DSQ): same behavior as scenario 1, - scenario 5 (user DSQ): same behavior as scenario 2, - scenario 6 (BPF internal queue): tasks are stored in a BPF queue from ops.enqueue() and consumed from ops.dispatch(); similarly to scenario 5, tasks enter BPF scheduler's custody with full lifecycle tracking and 1:1 enqueue/dequeue validation. This verifies that: - terminal DSQ dispatch (local, global) don't trigger ops.dequeue(), - tasks dispatched to user DSQs, either from ops.select_cpu() or ops.enqueue(), enter BPF scheduler's custody and have exact 1:1 enqueue/dequeue pairing, - tasks stored to internal BPF data structures from ops.enqueue() enter BPF scheduler's custody and have exact 1:1 enqueue/dequeue pairing, - dispatch dequeues have no flags (normal workflow), - property change dequeues have the %SCX_DEQ_SCHED_CHANGE flag set, - no duplicate enqueues or invalid state transitions are happening. Cc: Tejun Heo <tj@kernel.org> Cc: Emil Tsalapatis <emil@etsalapatis.com> Cc: Kuba Piecuch <jpiecuch@google.com> Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-23	selftests/sched_ext: Remove duplicated unistd.h include in rt_stall.c	Cheng-Yang Chou	-1/+0
	The header <unistd.h> is included twice in rt_stall.c. Remove the redundant inclusion to clean up the code. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-23	selftests/sched_ext: Fix unused-result warning for read()	Cheng-Yang Chou	-1/+2
	The read() call in run_test() triggers a warn_unused_result compiler warning, which breaks the build under -Werror. Check the return value of read() and exit the child process on failure to satisfy the compiler and handle pipe read errors. Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-23	selftests/sched_ext: Abort test loop on signal	Cheng-Yang Chou	-0/+3
	The runner sets exit_req on SIGINT/SIGTERM but ignores it during the main loop. This prevents users from cleanly interrupting a test run. Check exit_req each iteration to safely break out on exit signals. Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-13	selftests/sched_ext: Fix rt_stall flaky failure	Ihor Solodrai	-0/+49
	The rt_stall test measures the runtime ratio between an EXT and an RT task pinned to the same CPU, verifying that the deadline server prevents RT tasks from starving SCHED_EXT tasks. It expects the EXT task to get at least 4% of CPU time. The test is flaky because sched_stress_test() calls sleep(RUN_TIME) immediately after fork(), without waiting for the RT child to complete its setup (set_affinity + set_sched). If the RT child experiences scheduling latency before completing setup, that delay eats into the measurement window: the RT child runs for less than RUN_TIME seconds, and the EXT task's measured ratio drops below the 4% threshold. For example, in the failing CI run [1]: EXT=0.140s RT=4.750s total=4.890s (expected ~5.0s) ratio=2.86% < 4% → FAIL The 110ms gap (5.0 - 4.89) corresponds to the RT child's setup time being counted inside the measurement window, during which fewer deadline server ticks fire for the EXT task. Fix by using pipes to synchronize: each child signals the parent after completing its setup, and the parent waits for both signals before starting sleep(RUN_TIME). This ensures the measurement window only counts time when both tasks are fully configured and competing. [1] https://github.com/kernel-patches/bpf/actions/runs/21961895809/job/63442490449 Fixes: be621a76341c ("selftests/sched_ext: Add test for sched_ext dl_server") Assisted-by: claude-opus-4-6-v1 Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-02-11	Merge tag 'sched_ext-for-6.20' of ↵	Linus Torvalds	-11/+23
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Move C example schedulers back from the external scx repo to tools/sched_ext as the authoritative source. scx_userland and scx_pair are returning while scx_sdt (BPF arena-based task data management) is new. These schedulers will be dropped from the external repo. - Improve error reporting by adding scx_bpf_error() calls when DSQ creation fails across all in-tree schedulers - Avoid redundant irq_work_queue() calls in destroy_dsq() by only queueing when llist_add() indicates an empty list - Fix flaky init_enable_count selftest by properly synchronizing pre-forked children using a pipe instead of sleep() * tag 'sched_ext-for-6.20' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: selftests/sched_ext: Fix init_enable_count flakiness tools/sched_ext: Fix data header access during free in scx_sdt tools/sched_ext: Add error logging for dsq creation failures in remaining schedulers tools/sched_ext: add arena based scheduler tools/sched_ext: add scx_pair scheduler tools/sched_ext: add scx_userland scheduler sched_ext: Add error logging for dsq creation failures sched_ext: Avoid multiple irq_work_queue() calls in destroy_dsq()
2026-02-03	selftests/sched_ext: Add test for DL server total_bw consistency	Joel Fernandes	-0/+282
	Add a new kselftest to verify that the total_bw value in /sys/kernel/debug/sched/debug remains consistent across all CPUs under different sched_ext BPF program states: 1. Before a BPF scheduler is loaded 2. While a BPF scheduler is loaded and active 3. After a BPF scheduler is unloaded The test runs CPU stress threads to ensure DL server bandwidth values stabilize before checking consistency. This helps catch potential issues with DL server bandwidth accounting during sched_ext transitions. Co-developed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-8-arighi@nvidia.com
2026-02-03	selftests/sched_ext: Add test for sched_ext dl_server	Andrea Righi	-0/+264
	Add a selftest to validate the correct behavior of the deadline server for the ext_sched_class. Co-developed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Tested-by: Christian Loehle <christian.loehle@arm.com> Link: https://patch.msgid.link/20260126100050.3854740-7-arighi@nvidia.com
2026-02-02	selftests/sched_ext: Fix init_enable_count flakiness	Tejun Heo	-11/+23
	The init_enable_count test is flaky. The test forks 1024 children before attaching the scheduler to verify that existing tasks get ops.init_task() called. The children were using sleep(1) before exiting. 7900aa699c34 ("sched_ext: Fix cgroup exit ordering by moving sched_ext_free() to finish_task_switch()") changed when tasks are removed from scx_tasks - previously when the task_struct was freed, now immediately in finish_task_switch() when the task dies. Before the commit, pre-forked children would linger on scx_tasks until freed regardless of when they exited, so the scheduler would always see them during iteration. The sleep(1) was unnecessary. After the commit, children are removed as soon as they die. The sleep(1) masks the problem in most cases but the test becomes flaky depending on timing. Fix by synchronizing properly using a pipe. All children block on read() and the parent signals them to exit by closing the write end after attaching the scheduler. The children are auto-reaped so there's no need to wait on them. Reported-by: Ihor Solodrai <ihor.solodrai@linux.dev> Cc: David Vernet <void@manifault.com> Cc: Andrea Righi <arighi@nvidia.com> Cc: Changwoo Min <changwoo@igalia.com> Cc: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-12-12	selftests/sched_ext: flush stdout before test to avoid log spam	Emil Tsalapatis	-0/+8
	The sched_ext selftests runner runs each test in the same process, with each test possibly forking multiple times. When the main runner has not flushed its stdout, the children inherit the buffered output for previous tests and emit it during exit. This causes log spam. Make sure stdout/stderr is fully flushed before each test. Cc: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-10-15	sched_ext: Add a selftest for scx_bpf_dsq_peek	Ryan Newton	-0/+476
	This commit adds two tests. The first is the most basic unit test: make sure an empty queue peeks as empty, and when we put one element in the queue, make sure peek returns that element. However, even this simple test is a little complicated by the different behavior of scx_bpf_dsq_insert in different calling contexts: - insert is for direct dispatch in enqueue - insert is delayed when called from select_cpu In this case we split the insert and the peek that verifies the result between enqueue/dispatch. Note: An alternative would be to call `scx_bpf_dsq_move_to_local` on an empty queue, which in turn calls `flush_dispatch_buf`, in order to flush the buffered insert. Unfortunately, this is not viable within the enqueue path, as it attempts a voluntary context switch within an RCU read-side critical section. The second test is a stress test that performs many peeks on all DSQs and records the observed tasks. Signed-off-by: Ryan Newton <newton@meta.com> Reviewed-by: Christian Loehle <christian.loehle@arm.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-08-11	selftests/sched_ext: Remove duplicate sched.h header	Jiapeng Chong	-1/+0
	./tools/testing/selftests/sched_ext/hotplug.c: sched.h is included more than once. Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=22941 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Acked-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-07-31	Merge tag 'sched_ext-for-6.17' of ↵	Linus Torvalds	-0/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Add support for cgroup "cpu.max" interface - Code organization cleanup so that ext_idle.c doesn't depend on the source-file-inclusion build method of sched/ - Drop UP paths in accordance with sched core changes - Documentation and other misc changes * tag 'sched_ext-for-6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Fix scx_bpf_reenqueue_local() reference sched_ext: Drop kfuncs marked for removal in 6.15 sched_ext, rcu: Eject BPF scheduler on RCU CPU stall panic kernel/sched/ext.c: fix typo "occured" -> "occurred" in comments sched_ext: Add support for cgroup bandwidth control interface sched_ext, sched/core: Factor out struct scx_task_group sched_ext: Return NULL in llc_span sched_ext: Always use SMP versions in kernel/sched/ext_idle.h sched_ext: Always use SMP versions in kernel/sched/ext_idle.c sched_ext: Always use SMP versions in kernel/sched/ext.h sched_ext: Always use SMP versions in kernel/sched/ext.c sched_ext: Documentation: Clarify time slice handling in task lifecycle sched_ext: Make scx_locked_rq() inline sched_ext: Make scx_rq_bypassing() inline sched_ext: idle: Make local functions static in ext_idle.c sched_ext: idle: Remove unnecessary ifdef in scx_bpf_cpu_node()
2025-07-03	selftests/sched_ext: Fix exit selftest hang on UP	Andrea Righi	-0/+8
	On single-CPU systems, ops.select_cpu() is never called, causing the EXIT_SELECT_CPU test case to wait indefinitely. Avoid the stall by skipping this specific sub-test when only one CPU is available. Reported-by: Phil Auld <pauld@redhat.com> Fixes: a5db7817af780 ("sched_ext: Add selftests") Signed-off-by: Andrea Righi <arighi@nvidia.com> Reviewed-by: Phil Auld <pauld@redhat.com> Tested-by: Phil Auld <pauld@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-06-20	sched_ext: Add support for cgroup bandwidth control interface	Tejun Heo	-0/+5
	From 077814f57f8acce13f91dc34bbd2b7e4911fbf25 Mon Sep 17 00:00:00 2001 From: Tejun Heo <tj@kernel.org> Date: Fri, 13 Jun 2025 15:06:47 -1000 - Add CONFIG_GROUP_SCHED_BANDWIDTH which is selected by both CONFIG_CFS_BANDWIDTH and EXT_GROUP_SCHED. - Put bandwidth control interface files for both cgroup v1 and v2 under CONFIG_GROUP_SCHED_BANDWIDTH. - Update tg_bandwidth() to fetch configuration parameters from fair if CONFIG_CFS_BANDWIDTH, SCX otherwise. - Update tg_set_bandwidth() to update the parameters for both fair and SCX. - Add bandwidth control parameters to struct scx_cgroup_init_args. - Add sched_ext_ops.cgroup_set_bandwidth() which is invoked on bandwidth control parameter updates. - Update scx_qmap and maximal selftest to test the new feature. Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-21	selftests/sched_ext: Update test enq_select_cpu_fails	Andrea Righi	-105/+163
	With commit 08699d20467b6 ("sched_ext: idle: Consolidate default idle CPU selection kfuncs") allowing scx_bpf_select_cpu_dfl() to be invoked from multiple contexts, update the test to validate that the kfunc behaves correctly when used from ops.enqueue() and via BPF test_run. Additionally, rename the test to enq_select_cpu, dropping "fails" from the name, as the logic has now been inverted. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-05-20	selftests/sched_ext: Add test for scx_bpf_select_cpu_and() via test_run	Andrea Righi	-0/+50
	Update the allowed_cpus selftest to include a check to validate the behavior of scx_bpf_select_cpu_and() when invoked via a BPF test_run call. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-04-07	selftests/sched_ext: Add test for scx_bpf_select_cpu_and()	Andrea Righi	-0/+179
	Add a selftest to validate the behavior of the built-in idle CPU selection policy applied to a subset of allowed CPUs, using scx_bpf_select_cpu_and(). Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-03-24	Merge tag 'sched-core-2025-03-22' of ↵	Linus Torvalds	-1/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Core & fair scheduler changes: - Cancel the slice protection of the idle entity (Zihan Zhou) - Reduce the default slice to avoid tasks getting an extra tick (Zihan Zhou) - Force propagating min_slice of cfs_rq when {en,de}queue tasks (Tianchen Ding) - Refactor can_migrate_task() to elimate looping (I Hsin Cheng) - Add unlikey branch hints to several system calls (Colin Ian King) - Optimize current_clr_polling() on certain architectures (Yujun Dong) Deadline scheduler: (Juri Lelli) - Remove redundant dl_clear_root_domain call - Move dl_rebuild_rd_accounting to cpuset.h Uclamp: - Use the uclamp_is_used() helper instead of open-coding it (Xuewen Yan) - Optimize sched_uclamp_used static key enabling (Xuewen Yan) Scheduler topology support: (Juri Lelli) - Ignore special tasks when rebuilding domains - Add wrappers for sched_domains_mutex - Generalize unique visiting of root domains - Rebuild root domain accounting after every update - Remove partition_and_rebuild_sched_domains - Stop exposing partition_sched_domains_locked RSEQ: (Michael Jeanson) - Update kernel fields in lockstep with CONFIG_DEBUG_RSEQ=y - Fix segfault on registration when rseq_cs is non-zero - selftests: Add rseq syscall errors test - selftests: Ensure the rseq ABI TLS is actually 1024 bytes Membarriers: - Fix redundant load of membarrier_state (Nysal Jan K.A.) Scheduler debugging: - Introduce and use preempt_model_str() (Sebastian Andrzej Siewior) - Make CONFIG_SCHED_DEBUG unconditional (Ingo Molnar) Fixes and cleanups: - Always save/restore x86 TSC sched_clock() on suspend/resume (Guilherme G. Piccoli) - Misc fixes and cleanups (Thorsten Blum, Juri Lelli, Sebastian Andrzej Siewior)" * tag 'sched-core-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (40 commits) cpuidle, sched: Use smp_mb__after_atomic() in current_clr_polling() sched/debug: Remove CONFIG_SCHED_DEBUG sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files sched/debug, Documentation: Remove (most) CONFIG_SCHED_DEBUG references from documentation sched/debug: Make CONFIG_SCHED_DEBUG functionality unconditional sched/debug: Make 'const_debug' tunables unconditional __read_mostly sched/debug: Change SCHED_WARN_ON() to WARN_ON_ONCE() rseq/selftests: Fix namespace collision with rseq UAPI header include/{topology,cpuset}: Move dl_rebuild_rd_accounting to cpuset.h sched/topology: Stop exposing partition_sched_domains_locked cgroup/cpuset: Remove partition_and_rebuild_sched_domains sched/topology: Remove redundant dl_clear_root_domain call sched/deadline: Rebuild root domain accounting after every update sched/deadline: Generalize unique visiting of root domains sched/topology: Wrappers for sched_domains_mutex sched/deadline: Ignore special tasks when rebuilding domains tracing: Use preempt_model_str() xtensa: Rely on generic printing of preemption model x86: Rely on generic printing of preemption model s390: Rely on generic printing of preemption model ...
2025-03-19	sched/debug: Remove CONFIG_SCHED_DEBUG from self-test config files	Ingo Molnar	-1/+0
	We leave most of the defconfigs alone (there's over 70 of them), but let's remove CONFIG_SCHED_DEBUG from the scheduler self-test Kconfig files. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Ben Segall <bsegall@google.com> Cc: Mel Gorman <mgorman@suse.de> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/Z9szt3MpQmQ56TRd@gmail.com
2025-03-03	sched_ext: Merge branch 'for-6.14-fixes' into for-6.15	Tejun Heo	-2/+2
	Pull for-6.14-fixes to receive: 9360dfe4cbd6 ("sched_ext: Validate prev_cpu in scx_bpf_select_cpu_dfl()") which conflicts with: 337d1b354a29 ("sched_ext: Move built-in idle CPU selection policy to a separate file") Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-26	selftests/sched_ext: Add NUMA-aware scheduler test	Andrea Righi	-0/+160
	Add a selftest to validate the behavior of the NUMA-aware scheduler functionalities, including idle CPU selection within nodes, per-node DSQs and CPU to node mapping. Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-02-13	sched_ext: selftests: Fix grammar in tests description	Devaansh Kumar	-2/+2
	Fixed grammar for a few tests of sched_ext. Signed-off-by: Devaansh Kumar <devaanshk840@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-27	sched_ext: selftests/dsp_local_on: Fix selftest on UP systems	Andrea Righi	-1/+1
	In UP systems p->migration_disabled is not available. Fix this by using the portable helper is_migration_disabled(p). Fixes: e9fe182772dc ("sched_ext: selftests/dsp_local_on: Fix sporadic failures") Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-24	sched_ext: selftests/dsp_local_on: Fix sporadic failures	Tejun Heo	-1/+1
	dsp_local_on has several incorrect assumptions, one of which is that p->nr_cpus_allowed always tracks p->cpus_ptr. This is not true when a task is scheduled out while migration is disabled - p->cpus_ptr is temporarily overridden to the previous CPU while p->nr_cpus_allowed remains unchanged. This led to sporadic test faliures when dsp_local_on_dispatch() tries to put a migration disabled task to a different CPU. Fix it by keeping the previous CPU when migration is disabled. There are SCX schedulers that make use of p->nr_cpus_allowed. They should also implement explicit handling for p->migration_disabled. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Cc: Andrea Righi <arighi@nvidia.com> Cc: Changwoo Min <changwoo@igalia.com>
2025-01-24	selftests/sched_ext: Fix enum resolution	Andrea Righi	-67/+88
	All scx enums are now automatically generated from vmlinux.h and they must be initialized using the SCX_ENUM_INIT() macro. Fix the scx selftests to use this macro to properly initialize these values. Fixes: 8da7bf2cee27 ("tools/sched_ext: Receive updates from SCX repo") Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Closes: https://lore.kernel.org/all/Z2tNK2oFDX1OPp8C@slm.duckdns.org/ Signed-off-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2025-01-23	Merge tag 'sched_ext-for-6.14' of ↵	Linus Torvalds	-2/+13
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - scx_bpf_now() added so that BPF scheduler can access the cached timestamp in struct rq to avoid reading TSC multiple times within a locked scheduling operation. - Minor updates to the built-in idle CPU selection logic. - tool/sched_ext updates and other misc changes. * tag 'sched_ext-for-6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: fix kernel-doc warnings sched_ext: Use time helpers in BPF schedulers sched_ext: Replace bpf_ktime_get_ns() to scx_bpf_now() sched_ext: Add time helpers for BPF schedulers sched_ext: Add scx_bpf_now() for BPF scheduler sched_ext: Implement scx_bpf_now() sched_ext: Relocate scx_enabled() related code sched_ext: Add option -l in selftest runner to list all available tests sched_ext: Include remaining task time slice in error state dump sched_ext: update scx_bpf_dsq_insert() doc for SCX_DSQ_LOCAL_ON sched_ext: idle: small CPU iteration refactoring sched_ext: idle: introduce check_builtin_idle_enabled() helper sched_ext: idle: clarify comments sched_ext: idle: use assign_cpu() to update the idle cpumask sched_ext: Use str_enabled_disabled() helper in update_selcpu_topology() sched_ext: Use sizeof_field for key_len in dsq_hash_params tools/sched_ext: Receive updates from SCX repo sched_ext: Use the NUMA scheduling domain for NUMA optimizations
2025-01-07	sched_ext: Add option -l in selftest runner to list all available tests	Shizhao Chen	-2/+13
	The selftest runner currently allows selecting tests via the -t option. This patch adds a new -l option that lists all available tests, providing users with an overview of the tests they can choose from. This enhancement is especially useful for scripting and automation purposes, making it easier to discover and run tests. Signed-off-by: Shizhao Chen <shichen@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-12-24	sched_ext: Fix dsq_local_on selftest	Tejun Heo	-3/+7
	The dsp_local_on selftest expects the scheduler to fail by trying to schedule an e.g. CPU-affine task to the wrong CPU. However, this isn't guaranteed to happen in the 1 second window that the test is running. Besides, it's odd to have this particular exception path tested when there are no other tests that verify that the interface is working at all - e.g. the test would pass if dsp_local_on interface is completely broken and fails on any attempt. Flip the test so that it verifies that the feature works. While at it, fix a typo in the info message. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Ihor Solodrai <ihor.solodrai@pm.me> Link: http://lkml.kernel.org/r/Z1n9v7Z6iNJ-wKmq@slm.duckdns.org Signed-off-by: Tejun Heo <tj@kernel.org>
2024-12-10	scx: Fix maximal BPF selftest prog	David Vernet	-3/+5
	maximal.bpf.c is still dispatching to and consuming from SCX_DSQ_GLOBAL. Let's have it use its own DSQ to avoid any runtime errors. Signed-off-by: David Vernet <void@manifault.com> Tested-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2024-12-04	selftests/sched_ext: fix build after renames in sched_ext API	Ihor Solodrai	-19/+19
	The selftests are falining to build on current tip of bpf-next and sched_ext [1]. This has broken BPF CI [2] after merge from upstream. Use appropriate function names in the selftests according to the recent changes in the sched_ext API [3]. [1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?id=fc39fb56917bb3cb53e99560ca3612a84456ada2 [2] https://github.com/kernel-patches/bpf/actions/runs/11959327258/job/33340923745 [3] https://lore.kernel.org/all/20241109194853.580310-1-tj@kernel.org/ Signed-off-by: Ihor Solodrai <ihor.solodrai@pm.me> Acked-by: Andrea Righi <arighi@nvidia.com> Acked-by: David Vernet <void@manifault.com> Signed-off-by: Tejun Heo <tj@kernel.org>