linux - Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Age	Commit message (Collapse)	Author	Lines
2026-03-05	refscale: Update due to x86 not supporting none/voluntary preemption	Paul E. McKenney	-2/+4
	As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such refscale scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/20260303235903.1967409-3-paulmck@kernel.org
2026-03-05	rcuscale: Update due to x86 not supporting none/voluntary preemption	Paul E. McKenney	-2/+4
	As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such rcuscale scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/20260303235903.1967409-2-paulmck@kernel.org
2026-03-05	rcutorture: Update due to x86 not supporting none/voluntary preemption	Paul E. McKenney	-13/+29
	As of v7.0-rc1, architectures that support preemption, including x86 and arm64, no longer support CONFIG_PREEMPT_NONE or CONFIG_PREEMPT_VOLUNTARY. Attempting to build kernels with these two Kconfig options results in .config errors. This commit therefore switches such rcutorture scenarios to CONFIG_PREEMPT_LAZY. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Signed-off-by: Boqun Feng <boqun@kernel.org> Link: https://patch.msgid.link/bfe89f6c-3b63-40c6-aa6d-5f523e3e9a31@paulmck-laptop
2026-03-05	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	-316/+1621
	Cross-merge networking fixes after downstream PR (net-7.0-rc3). No conflicts. Adjacent changes: net/netfilter/nft_set_rbtree.c fb7fb4016300 ("netfilter: nf_tables: clone set on flush only") 3aea466a4399 ("netfilter: nft_set_rbtree: don't disable bh when acquiring tree lock") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05	Merge tag 'net-7.0-rc3' of ↵	Linus Torvalds	-55/+624
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, netfilter and wireless. Current release - new code bugs: - sched: cake: fixup cake_mq rate adjustment for diffserv config - wifi: fix missing ieee80211_eml_params member initialization Previous releases - regressions: - tcp: give up on stronger sk_rcvbuf checks (for now) Previous releases - always broken: - net: fix rcu_tasks stall in threaded busypoll - sched: - fq: clear q->band_pkt_count[] in fq_reset() - only allow act_ct to bind to clsact/ingress qdiscs and shared blocks - bridge: check relevant per-VLAN options in VLAN range grouping - xsk: fix fragment node deletion to prevent buffer leak Misc: - spring cleanup of inactive maintainers" * tag 'net-7.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (138 commits) xdp: produce a warning when calculated tailroom is negative net: enetc: use truesize as XDP RxQ info frag_size libeth, idpf: use truesize as XDP RxQ info frag_size i40e: use xdp.frame_sz as XDP RxQ info frag_size i40e: fix registering XDP RxQ info ice: change XDP RxQ frag_size from DMA write length to xdp.frame_sz ice: fix rxq info registering in mbuf packets xsk: introduce helper to determine rxq->frag_size xdp: use modulo operation to calculate XDP frag tailroom selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour net/sched: act_ife: Fix metalist update behavior selftests: net: add test for IPv4 route with loopback IPv6 nexthop net: ipv6: fix panic when IPv4 route references loopback IPv6 nexthop net: vxlan: fix nd_tbl NULL dereference when IPv6 is disabled net: bridge: fix nd_tbl NULL dereference when IPv6 is disabled MAINTAINERS: remove Thomas Falcon from IBM ibmvnic MAINTAINERS: remove Claudiu Manoil and Alexandre Belloni from Ocelot switch MAINTAINERS: replace Taras Chornyi with Elad Nachman for Marvell Prestera MAINTAINERS: remove Jonathan Lemon from OpenCompute PTP MAINTAINERS: replace Clark Wang with Frank Li for Freescale FEC ...
2026-03-05	selftests/tc-testing: Add tests exercising act_ife metalist replace behaviour	Victor Nogueira	-0/+99
	Add 2 test cases to exercise fix in act_ife's internal metalist behaviour. - Update decode ife action into encode with tcindex metadata - Update decode ife action into encode with multiple metadata Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20260304140603.76500-2-jhs@mojatatu.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05	selftests: net: add test for IPv4 route with loopback IPv6 nexthop	Jiayuan Chen	-0/+11
	Add a regression test for a kernel panic that occurs when an IPv4 route references an IPv6 nexthop object created on the loopback device. The test creates an IPv6 nexthop on lo, binds an IPv4 route to it, then triggers a route lookup via ping to verify the kernel does not crash. ./fib_nexthops.sh Tests passed: 249 Tests failed: 0 Reviewed-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20260304113817.294966-3-jiayuan.chen@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05	selftests: net: tun: don't abort XFAIL cases	Sun Jian	-6/+6
	The tun UDP tunnel GSO fixture contains XFAIL-marked variants intended to exercise failure paths (e.g. EMSGSIZE / "Message too long"). Using ASSERT_EQ() in these tests aborts the subtest, which prevents the harness from classifying them as XFAIL and can make the overall net: tun test fail. Switch the relevant ASSERT_EQ() checks to EXPECT_EQ() so the subtests continue running and the failures are correctly reported and accounted as XFAIL where applicable. Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Link: https://patch.msgid.link/20260225111451.347923-2-sun.jian.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-05	selftests/harness: order TEST_F and XFAIL_ADD constructors	Sun Jian	-2/+5
	TEST_F() allocates and registers its struct __test_metadata via mmap() inside its constructor, and only then assigns the _##fixture_##test##_object pointer. XFAIL_ADD() runs in a constructor too and reads _##fixture_##test##_object to initialize xfail->test. If XFAIL_ADD runs first, xfail->test can be NULL and the expected failure will be reported as FAIL. Use constructor priorities to ensure TEST_F registration runs before XFAIL_ADD, without adding extra state or runtime lookups. Fixes: 2709473c9386 ("selftests: kselftest_harness: support using xfail") Signed-off-by: Sun Jian <sun.jian.kdev@gmail.com> Link: https://patch.msgid.link/20260225111451.347923-1-sun.jian.kdev@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	selftests: drv-net: update the README	Jakub Kicinski	-2/+88
	I have added some instructions for driver authors on the NIPA wiki: https://github.com/linux-netdev/nipa/wiki/Guidance-for-test-authors last year. Given the increasingly common use of LLMs let's add those in tree as well. Hopefully this will decrease the number of review comments we have to give to AI-assisted noobs. While at it sync the overall instructions with what's on the GitHub as well. Link: https://patch.msgid.link/20260303213626.2320308-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	selftests: drv-net: rss: Fix error calculation in test_hitless_key_update	Dimitri Daskalakis	-1/+1
	This test verifies there are no errors when a devices RSS key is updated while traffic is flowing. The current check is a no-op since the last sample was subtracted from itself. Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Link: https://patch.msgid.link/20260303202258.1595661-1-dimitri.daskalakis1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	selftests: mptcp: join: check removing signal+subflow endp	Matthieu Baerts (NGI0)	-0/+13
	This validates the previous commit: endpoints with both the signal and subflow flags should always be marked as used even if it was not possible to create new subflows due to the MPTCP PM limits. For this test, an extra endpoint is created with both the signal and the subflow flags, and limits are set not to create extra subflows. In this case, an ADD_ADDR is sent, but no subflows are created. Still, the local endpoint is marked as used, and no warning is fired when removing the endpoint, after having sent a RM_ADDR. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 85df533a787b ("mptcp: pm: do not ignore 'subflow' if 'signal' flag is also set") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-5-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	selftests: mptcp: join: check RM_ADDR not sent over same subflow	Matthieu Baerts (NGI0)	-0/+36
	This validates the previous commit: RM_ADDR were sent over the first found active subflow which could be the same as the one being removed. It is more likely to loose this notification. For this check, RM_ADDR are explicitly dropped when trying to send them over the initial subflow, when removing the endpoint attached to it. If it is dropped, the test will complain because some RM_ADDR have not been received. Note that only the RM_ADDR are dropped, to allow the linked subflow to be quickly and cleanly closed. To only drop those RM_ADDR, a cBPF byte code is used. If the IPTables commands fail, that's OK, the tests will continue to pass, but not validate this part. This can be ignored: another subtest fully depends on such command, and will be marked as skipped. The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID. Fixes: 8dd5efb1f91b ("mptcp: send ack for rm_addr") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau <martineau@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-3-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	selftests: mptcp: more stable simult_flows tests	Paolo Abeni	-4/+7
	By default, the netem qdisc can keep up to 1000 packets under its belly to deal with the configured rate and delay. The simult flows test-case simulates very low speed links, to avoid problems due to slow CPUs and the TCP stack tend to transmit at a slightly higher rate than the (virtual) link constraints. All the above causes a relatively large amount of packets being enqueued in the netem qdiscs - the longer the transfer, the longer the queue - producing increasingly high TCP RTT samples and consequently increasingly larger receive buffer size due to DRS. When the receive buffer size becomes considerably larger than the needed size, the tests results can flake, i.e. because minimal inaccuracy in the pacing rate can lead to a single subflow usage towards the end of the connection for a considerable amount of data. Address the issue explicitly setting netem limits suitable for the configured link speeds and unflake all the affected tests. Fixes: 1a418cb8e888 ("mptcp: simult flow self-tests") Cc: stable@vger.kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20260303-net-mptcp-misc-fixes-7-0-rc2-v1-1-4b5462b6f016@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	selftests/bpf: Add selftests for the invocation of bpf_lwt_xmit_push_encap	Feng Yang	-0/+31
	Calling bpf_lwt_xmit_push_encap will not cause a crash when dst is missing. Signed-off-by: Feng Yang <yangfeng@kylinos.cn> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260304094429.168521-3-yangfeng59949@163.com
2026-03-04	KVM: selftests: Add a test for L2 clearing EFER.SVME without intercept	Yosry Ahmed	-0/+56
	Add a test that verifies KVM's newly introduced behavior of synthesizing a triple fault in L1 if L2 clears EFER.SVME without an L1 interception (which is architecturally undefined). Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20260209195142.2554532-3-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: selftest: Add a selftest for VMRUN/#VMEXIT with unmappable vmcb12	Yosry Ahmed	-0/+99
	Add a test that verifies that KVM correctly injects a #GP for nested VMRUN and a shutdown for nested #VMEXIT, if the GPA of vmcb12 cannot be mapped. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-27-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: SVM: Rename vmcb->virt_ext to vmcb->misc_ctl2	Yosry Ahmed	-14/+14
	'virt' is confusing in the VMCB because it is relative and ambiguous. The 'virt_ext' field includes bits for LBR virtualization and VMSAVE/VMLOAD virtualization, so it's just another miscellaneous control field. Name it as such. While at it, move the definitions of the bits below those for 'misc_ctl' and rename them for consistency. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-20-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: SVM: Rename vmcb->nested_ctl to vmcb->misc_ctl	Sean Christopherson	-4/+4
	The 'nested_ctl' field is misnamed. Although the first bit is for nested paging, the other defined bits are for SEV/SEV-ES. Other bits in the same field according to the APM (but not defined by KVM) include "Guest Mode Execution Trap", "Enable INVLPGB/TLBSYNC", and other control bits unrelated to 'nested'. There is nothing common among these bits, so just name the field misc_ctl. Also rename the flags accordingly. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-19-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: selftests: Add a test for LBR save/restore (ft. nested)	Yosry Ahmed	-0/+151
	Add a selftest exercising save/restore with usage of LBRs in both L1 and L2, and making sure all LBRs remain intact. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260303003421.2185681-5-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	Merge tag 'vfs-7.0-rc3.fixes' of ↵	Linus Torvalds	-10/+15
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs fixes from Christian Brauner: - kthread: consolidate kthread exit paths to prevent use-after-free - iomap: - don't mark folio uptodate if read IO has bytes pending - don't report direct-io retries to fserror - reject delalloc mappings during writeback - ns: tighten visibility checks - netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence * tag 'vfs-7.0-rc3.fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: reject delalloc mappings during writeback iomap: don't mark folio uptodate if read IO has bytes pending selftests: fix mntns iteration selftests nstree: tighten permission checks for listing nsfs: tighten permission checks for handle opening nsfs: tighten permission checks for ns iteration ioctls netfs: Fix unbuffered/DIO writes to dispatch subrequests in strict sequence kthread: consolidate kthread exit paths to prevent use-after-free iomap: don't report direct-io retries to fserror
2026-03-04	KVM: selftests: Test MADV_COLLAPSE on guest_memfd	Ackerley Tng	-3/+67
	guest_memfd only supports PAGE_SIZE pages, and khugepaged or MADV_COLLAPSE collapsing pages may result in private memory regions being mapped into host page tables. Add test to verify that MADV_COLLAPSE fails on guest_memfd folios, and any subsequent usage of guest_memfd memory faults in PAGE_SIZE folios. Running this test should not result in any memory failure logs or kernel WARNings. This selftest was added as a result of a syzbot-reported issue where khugepaged operating on guest_memfd memory with MADV_HUGEPAGE caused the collapse of folios, which then subsequently resulted in a WARNing. Link: https://syzkaller.appspot.com/bug?extid=33a04338019ac7e43a44 Suggested-by: David Hildenbrand <david@kernel.org> Signed-off-by: Ackerley Tng <ackerleytng@google.com> Link: https://patch.msgid.link/8048d04f150326d1e2231318aa9f1b3fce3e2e2c.1771630983.git.ackerleytng@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	KVM: selftests: Wrap madvise() to assert success	Ackerley Tng	-0/+1
	Extend kvm_syscalls.h to wrap madvise() to assert success. This will be used in the next patch. Signed-off-by: Ackerley Tng <ackerleytng@google.com> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Link: https://patch.msgid.link/455483ca29a3a3042efee0cf3bbd0e2548cbeb1c.1771630983.git.ackerleytng@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2026-03-04	sched_ext/selftests: Fix format specifier and buffer length in file_write_long()	Cheng-Yang Chou	-2/+2
	Use %ld (not %lu) for signed long, and pass the actual string length returned by sprintf() to write_text() instead of sizeof(buf). Signed-off-by: Cheng-Yang Chou <yphbchou0911@gmail.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-04	selftests: tc-testing: fix list_categories() crash on list type	Naveen Anandhan	-3/+7
	list_categories() builds a set directly from the 'category' field of each test case. Since 'category' is a list, set(map(...)) attempts to insert lists into a set, which raises: TypeError: unhashable type: 'list' Flatten category lists and collect unique category names using set.update() instead. Signed-off-by: Naveen Anandhan <mr.navi8680@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2026-03-03	selftests: net: add macvlan multicast test for shared source MAC	Kibaek Yoo	-0/+94
	Add a selftest that verifies multicast delivery to a macvlan bridge port when the source MAC of the incoming frame matches the macvlan's own MAC address. This scenario occurs with protocols like VRRP where multiple hosts share the same virtual MAC address. Without the corresponding kernel change, macvlan bridge mode does not handle this case and the multicast frame is not delivered. Signed-off-by: Kibaek Yoo <psykibaek@gmail.com> Link: https://patch.msgid.link/20260228071613.4360-2-psykibaek@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-03	selftests: netconsole: print diagnostic on busywait timeout in netcons_basic	Breno Leitao	-1/+5
	The script uses set -euo pipefail, so when busywait times out waiting for the netconsole message to arrive, it returns 1 and the script exits immediately without printing any error message. As reported by Jakub, this makes failures hard to diagnose since the test reports exit=1 with no explanation. Handle the busywait failure explicitly so that a FAIL message is printed before exiting. This is how it looks like now: Running with target mode: basic (ipv6) [ 167.452561] netconsole selftest: netcons_QdMay FAIL: Timed out waiting (20000 ms) for netconsole message in /tmp/netcons_QdMay The remaining silent failures under set -e can only happen during the setup phase (netdevsim creation, interface configuration, configfs writes). So, it is not expected to have any silent failure once the test starts. Note that this issue might be less frequent now, since commit a68a9bd086c28 ("selftests: netconsole: Increase port listening timeout") increased the timeout that _might_ have been the root cause of these random failures in NIPA. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20260302-netconsole_test_verbose-v1-1-b1be5d30cd7d@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-03	Merge tag 'cgroup-for-7.0-rc2-fixes' of ↵	Linus Torvalds	-110/+114
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup fixes from Tejun Heo: - Fix circular locking dependency in cpuset partition code by deferring housekeeping_update() calls to a workqueue instead of calling them directly under cpus_read_lock - Fix null-ptr-deref in rebuild_sched_domains_cpuslocked() when generate_sched_domains() returns NULL due to kmalloc failure - Fix incorrect cpuset behavior for effective_xcpus in partition_xcpus_del() and cpuset_update_tasks_cpumask() in update_cpumasks_hier() - Fix race between task migration and cgroup iteration * tag 'cgroup-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: cgroup/cpuset: fix null-ptr-deref in rebuild_sched_domains_cpuslocked cgroup/cpuset: Call housekeeping_update() without holding cpus_read_lock cgroup/cpuset: Defer housekeeping_update() calls from CPU hotplug to workqueue cgroup/cpuset: Move housekeeping_update()/rebuild_sched_domains() together kselftest/cgroup: Simplify test_cpuset_prs.sh by removing "S+" command cgroup/cpuset: Set isolated_cpus_updating only if isolated_cpus is changed cgroup/cpuset: Clarify exclusion rules for cpuset internal variables cgroup/cpuset: Fix incorrect use of cpuset_update_tasks_cpumask() in update_cpumasks_hier() cgroup/cpuset: Fix incorrect change to effective_xcpus in partition_xcpus_del() cgroup: fix race between task migration and iteration
2026-03-03	Merge tag 'sched_ext-for-7.0-rc2-fixes' of ↵	Linus Torvalds	-4/+9
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext fixes from Tejun Heo: - Fix starvation of scx_enable() under fair-class saturation by offloading the enable path to an RT kthread - Fix out-of-bounds access in idle mask initialization on systems with non-contiguous NUMA node IDs - Fix a preemption window during scheduler exit and a refcount underflow in cgroup init error path - Fix SCX_EFLAG_INITIALIZED being a no-op flag - Add READ_ONCE() annotations for KCSAN-clean lockless accesses and replace naked scx_root dereferences with container_of() in kobject callbacks - Tooling and selftest fixes: compilation issues with clang 17, strtoul() misuse, unused options cleanup, and Kconfig sync * tag 'sched_ext-for-7.0-rc2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: sched_ext: Fix starvation of scx_enable() under fair-class saturation sched_ext: Remove redundant css_put() in scx_cgroup_init() selftests/sched_ext: Fix peek_dsq.bpf.c compile error for clang 17 selftests/sched_ext: Add -fms-extensions to bpf build flags tools/sched_ext: Add -fms-extensions to bpf build flags sched_ext: Use READ_ONCE() for plain reads of scx_watchdog_timeout sched_ext: Replace naked scx_root dereferences in kobject callbacks sched_ext: Use READ_ONCE() for the read side of dsq->nr update tools/sched_ext: fix strtoul() misuse in scx_hotplug_seq() sched_ext: Fix SCX_EFLAG_INITIALIZED being a no-op flag sched_ext: Fix out-of-bounds access in scx_idle_init_masks() sched_ext: Disable preemption between scx_claim_exit() and kicking helper work tools/sched_ext: Add Kconfig to sync with upstream tools/sched_ext: Sync README.md Kconfig with upstream scx selftests/sched_ext: Remove duplicated unistd.h include in rt_stall.c tools/sched_ext: scx_sdt: Remove unused '-f' option tools/sched_ext: scx_central: Remove unused '-p' option selftests/sched_ext: Fix unused-result warning for read() selftests/sched_ext: Abort test loop on signal
2026-03-03	selftests/bpf: Split module_attach into subtests	Viktor Malik	-86/+148
	The test verifies attachment to various hooks in a kernel module, however, everything is flattened into a single test. This makes it impossible to run or skip test cases selectively. Isolate each BPF program into a separate subtest. This is done by disabling auto-loading of programs and loading and testing each program separately. At the same time, modernize the test to use ASSERT* instead of CHECK and replace `return` by `goto cleanup` where necessary. Signed-off-by: Viktor Malik <vmalik@redhat.com> Link: https://lore.kernel.org/r/20260225120904.1529112-1-vmalik@redhat.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests: bpf: Add tests for void global subprogs	Emil Tsalapatis	-3/+111
	Add additional testing for void global functions. The tests ensure that calls to void global functions properly keep R0 invalid. Also make sure that exception callbacks still require a return value. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260228184759.108145-6-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	bpf: Allow void global functions in the verifier	Emil Tsalapatis	-2/+2
	Global subprogs are currently not allowed to return void. Adjust verifier logic to allow global functions with a void return type. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260228184759.108145-5-emil@etsalapatis.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: drop test_bpftool.sh	Alexis Lothoré (eBPF Foundation)	-188/+1
	The test_bpftool.sh script runs a python unittest script checking bpftool json output on different commands. As part of the ongoing effort to get rid of any standalone test, this script should either be converted to test_progs or removed. As validating bpftool json output does not bring much value to the test base (and because it would need test_progs to bring in a json parser), remove the standalone test script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Acked-by: Quentin Monnet <qmo@kernel.org> Link: https://lore.kernel.org/r/20260227-bpftool_feature-v1-1-a25860fd52fb@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: Add tests for ctx fixed offset support	Kumar Kartikeya Dwivedi	-0/+76
	Add tests to ensure PTR_TO_CTX supports fixed offsets for program types that don't rewrite accesses to it. Ensure that variable offsets and negative offsets are still rejected. An extra test also checks writing into ctx with modified offset for syscall progs. Other program types do not support writes (notably, writable tracepoints offer a pointer for a writable buffer through ctx, but don't allow writing to the ctx itself). Before the fix made in the previous commit, these tests do not succeed, except the ones testing for failures regardless of the change. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Link: https://lore.kernel.org/r/20260227005725.1247305-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	bpf, arm64: Use ORR-based MOV for general-purpose registers	Puranjay Mohan	-4/+4
	The A64_MOV macro unconditionally uses ADD Rd, Rn, #0 to implement register moves. While functionally correct, this is not the canonical encoding when both operands are general-purpose registers. On AArch64, MOV has two aliases depending on the operand registers: - MOV <Xd\|SP>, <Xn\|SP> → ADD <Xd\|SP>, <Xn\|SP>, #0 - MOV <Xd>, <Xn> → ORR <Xd>, XZR, <Xn> The ADD form is required when the stack pointer is involved (as ORR does not accept SP), while the ORR form is the preferred encoding for general-purpose registers. The ORR encoding is also measurably faster on modern microarchitectures. A microbenchmark [1] comparing dependent chains of MOV (ORR) vs ADD #0 on an ARM Neoverse-V2 (72-core, 3.4 GHz) shows: === mov (ORR Xd, XZR, Xn) === run1 cycles/op=0.749859456 run2 cycles/op=0.749991250 run3 cycles/op=0.749601847 avg cycles/op=0.749817518 === add0 (ADD Xd, Xn, #0) === run1 cycles/op=1.004777689 run2 cycles/op=1.004558266 run3 cycles/op=1.004806559 avg cycles/op=1.004714171 The ORR form completes in ~0.75 cycles/op vs ~1.00 cycles/op for ADD #0, a ~25% improvement. This is likely because the CPU's register renaming hardware can eliminate ORR-based moves, while ADD #0 must go through the ALU pipeline. Update A64_MOV to select the appropriate encoding at JIT time: use ADD when either register is A64_SP, and ORR (via aarch64_insn_gen_move_reg()) otherwise. Update verifier_private_stack selftests to expect "mov x7, x0" instead of "add x7, x0, #0x0" in the JITed instruction checks, matching the new ORR-based encoding. [1] https://github.com/puranjaymohan/scripts/blob/main/arm64/bench/run_mov_vs_add0.sh Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20260225134339.2723288-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: Add usdt trigger bench	Jiri Olsa	-2/+76
	Adding usdt trigger bench for usdt: trig-usdt-nop - usdt on top of nop1 instruction trig-usdt-nop5 - usdt on top of nop1/nop5 combo Adding it to benchs/run_bench_uprobes.sh script. Example run on x86_64 kernel with uprobe syscall: # ./benchs/run_bench_uprobes.sh usermode-count : 152.507 ± 0.098M/s syscall-count : 14.309 ± 0.093M/s uprobe-nop : 3.190 ± 0.012M/s uprobe-push : 3.057 ± 0.004M/s uprobe-ret : 1.095 ± 0.009M/s uprobe-nop5 : 7.305 ± 0.034M/s uretprobe-nop : 2.175 ± 0.005M/s uretprobe-push : 2.109 ± 0.003M/s uretprobe-ret : 0.945 ± 0.002M/s uretprobe-nop5 : 3.530 ± 0.006M/s usdt-nop : 3.235 ± 0.008M/s <-- added usdt-nop5 : 7.511 ± 0.045M/s <-- added Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260224103915.1369690-6-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: Add test for checking correct nop of optimized usdt	Jiri Olsa	-1/+142
	Adding test that attaches bpf program on usdt probe in 2 scenarios; - attach program on top of usdt_1, which is single nop instruction, so the probe stays on nop instruction and is not optimized. - attach program on top of usdt_2 which is probe defined on top of nop,nop5 combo, so the probe is placed on top of nop5 and is optimized. Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260224103915.1369690-5-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: Emit nop,nop5 instructions combo for x86_64 arch	Jiri Olsa	-0/+2
	Syncing latest usdt.h change [1]. Now that we have nop5 optimization support in kernel, let's emit nop,nop5 for usdt probe. We leave it up to the library to use desirable nop instruction. [1] https://github.com/libbpf/usdt/commit/c9865d158984fb2b73e3cbbdcdfb4f583ad36a73 Signed-off-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20260224103915.1369690-4-jolsa@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: factor out get_func_* tests for fsession	Menglong Dong	-64/+108
	The fsession is already supported by x86_64, arm64, riscv and s390, so we don't need to disable it in the compile time according to the architecture. Factor out the testings for it. Therefore, the testing can be disabled for the architecture that doesn't support it manually. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Acked-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20260224092208.1395085-4-dongml2@chinatelecom.cn Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	bpf/s390: Implement get_preempt_count()	Ilya Leoshkevich	-0/+7
	exe_ctx test fails on s390, because get_preempt_count() is not implemented and its fallback path always returns 0. Implement it using the new bpf_get_lowcore() kfunc. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20260217160813.100855-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	s390: Introduce bpf_get_lowcore() kfunc	Ilya Leoshkevich	-0/+4
	Implementing BPF version of preempt_count() requires accessing lowcore from BPF. Since lowcore can be relocated, open-coding (struct lowcore *)0 does not work, so add a kfunc. Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com> Link: https://lore.kernel.org/r/20260217160813.100855-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-03-03	selftests/bpf: add test for xdp_bonding xmit_hash_policy compat	Jiayuan Chen	-0/+58
	Add a selftest to verify that changing xmit_hash_policy to vlan+srcmac is rejected when a native XDP program is loaded on a bond in 802.3ad mode. Without the fix in bond_option_xmit_hash_policy_set(), the change succeeds silently, creating an inconsistent state that triggers a kernel WARNING in dev_xdp_uninstall() when the bond is torn down. The test attaches native XDP to a bond0 (802.3ad, layer2+3), then attempts to switch xmit_hash_policy to vlan+srcmac and asserts the operation fails. It also verifies the change succeeds after XDP is detached, confirming the rejection is specific to the XDP-loaded state. Signed-off-by: Jiayuan Chen <jiayuan.chen@shopee.com> Link: https://patch.msgid.link/20260226080306.98766-3-jiayuan.chen@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-03-02	selftests/sched_ext: Fix peek_dsq.bpf.c compile error for clang 17	Zhao Mengmeng	-2/+2
	When compiling sched_ext selftests using clang 17.0.6, it raised compiler crash and build error: Error at line 68: Unsupport signed division for DAG: 0x55b2f9a60240: i64 = sdiv 0x55b2f9a609b0, Constant:i64<100>, peek_dsq.bpf.c:68:25 @[ peek_dsq.bpf.c:95:4 @[ peek_dsq.bpf.c:169:8 @[ peek _dsq.bpf.c:140:6 ] ] ]Please convert to unsigned div/mod After digging, it's not a compiler error, clang supported Signed division only when using -mcpu=v4, while we use -mcpu=v3 currently, the better way is to use unsigned div, see [1] for details. [1] https://github.com/llvm/llvm-project/issues/70433 Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-02	selftests/sched_ext: Add -fms-extensions to bpf build flags	Zhao Mengmeng	-0/+2
	Similar to commit 835a50753579 ("selftests/bpf: Add -fms-extensions to bpf build flags") and commit 639f58a0f480 ("bpftool: Fix build warnings due to MS extensions") Fix "declaration does not declare anything" warning by using -fms-extensions and -Wno-microsoft-anon-tag flags to build bpf programs that #include "vmlinux.h" Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Reviewed-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Tejun Heo <tj@kernel.org>
2026-03-02	selftest: net: Add basic functionality tests for ipmr.	Kuniyuki Iwashima	-0/+460
	The new test exercise paths, where RTNL is needed, to catch lockdep splat: setsockopt MRT_INIT / MRT_DONE MRT_ADD_VIF / MRT_DEL_VIF MRT_ADD_MFC / MRT_DEL_MFC / MRT_ADD_MFC_PROXY / MRT_DEL_MFC_PROXY MRT_TABLE MRT_FLUSH rtnetlink RTM_NEWROUTE RTM_DELROUTE NETDEV_UNREGISTER I will extend this to cover IPv6 setsockopt() later. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260228221800.1082070-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-02	selftests/net: packetdrill: restore tcp_rcv_big_endseq.pkt	Simon Baatz	-0/+44
	Commit 1cc93c48b5d7 ("selftests/net: packetdrill: remove tests for tcp_rcv_*big") removed the test for the reverted commit 1d2fbaad7cd8 ("tcp: stronger sk_rcvbuf checks") but also the one for commit 9ca48d616ed7 ("tcp: do not accept packets beyond window"). Restore the test with the necessary adaptation: expect a delayed ACK instead of an immediate one, since tcp_can_ingest() does not fail anymore for the last data packet. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Link: https://patch.msgid.link/20260301-tcp_rcv_big_endseq-v1-1-86ab7415ab58@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-02	selftests: drv-net: iou-zcrx: allocate hugepages for large chunks test	Jakub Kicinski	-1/+9
	The large chunks test needs 2MB hugepages for its mmap allocation, but the test system may not have any pre-allocated. Ensure at least 64 hugepages are available before running the test, and restore the original value on cleanup. While at it strip the stdout, it has a trailing new line. Before: ok 5 iou-zcrx.test_zcrx_large_chunks # SKIP Can't allocate huge pages Link: https://patch.msgid.link/20260227171305.2848240-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-02	selftests: drv-net: iou-zcrx: rework large chunks test to use common setup	Jakub Kicinski	-25/+6
	Commit a32bb32d0193 ("selftests: iou-zcrx: test large chunk sizes") and commit de7c600e2d5b ("selftests/net: parametrise iou-zcrx.py with ksft_variants") landed at similar time. The large chunks test was actually not included in the list of tests, so it never run. We haven't noticed that it uses the old-style helpers (_get_combined_channels, _get_current_settings, _set_flow_rule) that were removed by the other commit. Rework test_zcrx_large_chunks to reuse the single() setup function and add it to the ksft_run cases list so it actually gets executed. Link: https://patch.msgid.link/20260227171305.2848240-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-02	selftests: drv-net: iou-zcrx: wait for memory provider cleanup	Jakub Kicinski	-1/+17
	io_uring defers zcrx context teardown to the iou_exit workqueue. # ps aux \| grep iou ... 07:58 0:00 [kworker/u19:0-iou_exit] ... 07:58 0:00 [kworker/u18:2-iou_exit] When the test's receiver process exits, bkg() returns but the memory provider may still be attached to the rx queue. The subsequent defer() that restores tcp-data-split then fails: # Exception while handling defer / cleanup (callback 3 of 3)! # Defer Exception\| net.ynl.pyynl.lib.ynl.NlError: Netlink error: can't disable tcp-data-split while device has memory provider enabled: Invalid argument not ok 1 iou-zcrx.test_zcrx.single Add a helper that polls netdev queue-get until no rx queue reports the io-uring memory provider attribute. Register it as a defer() just before tcp-data-split is restored as a "barrier". Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Link: https://patch.msgid.link/20260227171305.2848240-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-02	KVM: selftests: Extend state_test to check next_rip	Yosry Ahmed	-0/+11
	Similar to vGIF, extend state_test to make sure that next_rip is saved correctly in nested state. GUEST_SYNC() in L2 causes IO emulation by KVM, which advances the RIP to the value of next_rip. Hence, if next_rip is saved correctly, its value should match the saved RIP value. Signed-off-by: Yosry Ahmed <yosry@kernel.org> Link: https://patch.msgid.link/20260225005950.3739782-5-yosry@kernel.org Signed-off-by: Sean Christopherson <seanjc@google.com>