summaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)AuthorLines
2026-02-10Merge tag 'timers-core-2026-02-09' of ↵Linus Torvalds-37/+29
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: - Inline timecounter_cyc2time() as that is now used in the networking hotpath. Inlining it significantly improves performance. - Optimize the tick dependency check in case that the tracepoint is disabled, which improves the hotpath performance in the tick management code, which is a hotpath on transitions in and out of idle. - The usual cleanups and improvements * tag 'timers-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: time/kunit: Document handling of negative years of is_leap() tick/nohz: Optimize check_tick_dependency() with early return time/sched_clock: Use ACCESS_PRIVATE() to evaluate hrtimer::function hrtimer: Drop _tv64() helpers hrtimer: Remove public definition of HIGH_RES_NSEC hrtimer: Remove unused resolution constants time/timecounter: Inline timecounter_cyc2time()
2026-02-10Merge tag 'irq-msi-2026-02-09' of ↵Linus Torvalds-8/+40
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull MSI updates from Thomas Gleixner: "Updates for the [PCI] MSI subsystem: - Add interrupt redirection infrastructure Some PCI controllers use a single demultiplexing interrupt for the MSI interrupts of subordinate devices. This prevents setting the interrupt affinity of device interrupts, which causes device interrupts to be delivered to a single CPU. That obviously is counterproductive for multi-queue devices and interrupt balancing. To work around this limitation the new infrastructure installs a dummy irq_set_affinity() callback which captures the affinity mask and picks a redirection target CPU out of the mask. When the PCI controller demultiplexes the interrupts it invokes a new handling function in the core, which either runs the interrupt handler in the context of the target CPU or delegates it to irq_work on the target CPU. - Utilize the interrupt redirection mechanism in the PCI DWC host controller driver. This allows affinity control for the subordinate device MSI interrupts instead of being randomly executed on the CPU which runs the demultiplex handler. - Replace the binary 64-bit MSI flag with a DMA mask Some PCI devices have PCI_MSI_FLAGS_64BIT in the MSI capability, but implement less than 64 address bits. This breaks on platforms where such a device is assigned an MSI address higher than what's supported. With the binary 64-bit flag there is no other choice than disabling 64-bit MSI support which leaves the device disfunctional. By using a DMA mask the address limit of a device can be described correctly which provides support for the above scenario. - Make use of the DMA mask based address limit in the hda/intel and radeon drivers to enable them on affected platforms - The usual small cleanups and improvements" * tag 'irq-msi-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: ALSA: hda/intel: Make MSI address limit based on the device DMA limit drm/radeon: Make MSI address limit based on the device DMA limit PCI/MSI: Check the device specific address mask in msi_verify_entries() PCI/MSI: Convert the boolean no_64bit_msi flag to a DMA address mask genirq/redirect: Prevent writing MSI message on affinity change PCI/MSI: Unmap MSI-X region on error genirq: Update effective affinity for redirected interrupts PCI: dwc: Enable MSI affinity support PCI: dwc: Code cleanup genirq: Add interrupt redirection infrastructure genirq/msi: Correct kernel-doc in <linux/msi.h>
2026-02-10Merge tag 'irq-drivers-2026-02-09' of ↵Linus Torvalds-0/+23
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq chip driver updates from Thomas Gleixner: - Add support for the Renesas RZ/V2N SoC - Add a new driver for the Renesas RZ/[TN]2H SoCs - Preserve the register state of the RISCV APLIC interrupt controller accross suspend/resume - Reinitialize the RISCV IMSIC registers after suspend/resume - Make the various Loongson interrupt chip drivers 32/64-bit aware - Handle the number of hardware interrupts in the SIFIVE PLIC driver correctly The hardware interrupt 0 is reserved which resulted in inconsistent accounting. That went unnoticed as the off by one is only noticable when the number of device interrupts is a multiple of 32 - The usual device tree updates, cleanups and improvements all over the place * tag 'irq-drivers-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) irqchip/gic-v5: Fix spelling mistake "ouside" -> "outside" dt-bindings: interrupt-controller: sifive,plic: Clarify the riscv,ndev meaning in PLIC irqchip/sifive-plic: Handle number of hardware interrupts correctly irqchip/aspeed-scu-ic: Remove unused variable mask irqchip/ti-sci-intr: Allow parsing interrupt-types per-line dt-bindings: interrupt-controller: ti,sci-intr: Per-line interrupt-types irqchip/renesas-rzv2h: Add suspend/resume support irqchip/aslint-sswi: Fix error check of of_io_request_and_map() result irqchip: Allow LoongArch irqchip drivers on both 32BIT/64BIT irqchip/loongson-pch-pic: Adjust irqchip driver for 32BIT/64BIT irqchip/loongson-pch-msi: Adjust irqchip driver for 32BIT/64BIT irqchip/loongson-htvec: Adjust irqchip driver for 32BIT/64BIT irqchip/loongson-eiointc: Adjust irqchip driver for 32BIT/64BIT irqchip/loongson-liointc: Adjust irqchip driver for 32BIT/64BIT irqchip/loongarch-avec: Adjust irqchip driver for 32BIT/64BIT irqchip/riscv-aplic: Preserve APLIC states across suspend/resume irqchip/riscv-imsic: Add a CPU pm notifier to restore the IMSIC on exit arm64: dts: renesas: r9a09g087: Add ICU support arm64: dts: renesas: r9a09g077: Add ICU support irqchip: Add RZ/{T2H,N2H} Interrupt Controller (ICU) driver ...
2026-02-10Merge tag 'irq-core-2026-02-09' of ↵Linus Torvalds-23/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq core updates from Thomas Gleixner: "Updates for the interrupt core subsystem: - Remove the interrupt timing infrastructure This was added seven years ago to be used for power management purposes, but that integration never happened. - Clean up the remaining setup_percpu_irq() users The memory allocator is available when interrupts can be requested so there is not need for static irq_action. Move the remaining users to request_percpu_irq() and delete the historical cruft. - Warn when interrupt flag inconsistencies are detected in request*_irq(). Inconsistent flags can lead to hard to diagnose malfunction. The fallout of this new warning has been addressed in next and the fixes are coming in via the maintainer trees and the tip irq/cleanup pull requests. - Invoke affinity notifier when CPU hotplug breaks affinity Otherwise the code using the notifier misses the affinity change and operates on stale information. - The usual cleanups and improvements" * tag 'irq-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: genirq/proc: Replace snprintf with strscpy in register_handler_proc genirq/cpuhotplug: Notify about affinity changes breaking the affinity mask genirq: Move clear of kstat_irqs to free_desc() genirq: Warn about using IRQF_ONESHOT without a threaded handler irqdomain: Fix up const problem in irq_domain_set_name() genirq: Remove setup_percpu_irq() clocksource/drivers/mips-gic-timer: Move GIC timer to request_percpu_irq() MIPS: Move IP27 timer to request_percpu_irq() MIPS: Move IP30 timer to request_percpu_irq() genirq: Remove __request_percpu_irq() helper genirq: Remove IRQ timing tracking infrastructure
2026-02-10Merge tag 'irq-cleanups-2026-02-09' of ↵Linus Torvalds-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq cleanups from Thomas Gleixner: "A series of treewide cleanups to ensure interrupt request consistency. - Add the missing IRQF_COND_ONESHOT flag to devm_request_irq() This is inconsistent vs request_irq() and causes the same issues which where addressed with the introduction of this flag - Cleanup IRQF_ONESHOT and IRQF_NO_THREAD usage Quite some drivers have inconsistent interrupt request flags related to interrupt threading namely IRQF_ONESHOT and IRQF_NO_THREAD. This leads to warnings and/or malfunction when forced interrupt threading is enabled. - Remove stub primary (hard interrupt) handlers A bunch of drivers implement a stub primary (hard interrupt) handler which just returns IRQ_WAKE_THREAD. The same functionality is provided by the core code when the primary handler argument of request_thread_irq() is set to NULL" * tag 'irq-cleanups-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: media: pci: mg4b: Use IRQF_NO_THREAD mfd: wm8350-core: Use IRQF_ONESHOT thermal/qcom/lmh: Replace IRQF_ONESHOT with IRQF_NO_THREAD rtc: amlogic-a4: Remove IRQF_ONESHOT usb: typec: fusb302: Remove IRQF_ONESHOT EDAC/altera: Remove IRQF_ONESHOT char: tpm: cr50: Remove IRQF_ONESHOT ARM: versatile: Remove IRQF_ONESHOT scsi: efct: Use IRQF_ONESHOT and default primary handler Bluetooth: btintel_pcie: Use IRQF_ONESHOT and default primary handler bus: fsl-mc: Use default primary handler mailbox: bcm-ferxrm-mailbox: Use default primary handler iommu/amd: Use core's primary handler and set IRQF_ONESHOT platform/x86: int0002: Remove IRQF_ONESHOT from request_irq() genirq: Set IRQF_COND_ONESHOT in devm_request_irq().
2026-02-10Merge tag 'sched-core-2026-02-09' of ↵Linus Torvalds-50/+458
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull scheduler updates from Ingo Molnar: "Scheduler Kconfig space updates: - Further consolidate configurable preemption modes (Peter Zijlstra) Reduce the number of architectures that are allowed to offer PREEMPT_NONE and PREEMPT_VOLUNTARY, reducing the number of preemption models from four to just two: 'full' and 'lazy' on up-to-date architectures (arm64, loongarch, powerpc, riscv, s390, x86). None and voluntary are only available as legacy features on platforms that don't implement lazy preemption yet, or which don't even support preemption. The goal is to eventually remove cond_resched() and voluntary preemption altogether. RSEQ based 'scheduler time slice extension' support (Thomas Gleixner and Peter Zijlstra): This allows a thread to request a time slice extension when it enters a critical section to avoid contention on a resource when the thread is scheduled out inside of the critical section. - Add fields and constants for time slice extension - Provide static branch for time slice extensions - Add statistics for time slice extensions - Add prctl() to enable time slice extensions - Implement sys_rseq_slice_yield() - Implement syscall entry work for time slice extensions - Implement time slice extension enforcement timer - Reset slice extension when scheduled - Implement rseq_grant_slice_extension() - entry: Hook up rseq time slice extension - selftests: Implement time slice extension test - Allow registering RSEQ with slice extension - Move slice_ext_nsec to debugfs - Lower default slice extension - selftests/rseq: Add rseq slice histogram script Scheduler performance/scalability improvements: - Update rq->avg_idle when a task is moved to an idle CPU, which improves the scalability of various workloads (Shubhang Kaushik) - Reorder fields in 'struct rq' for better caching (Blake Jones) - Fair scheduler SMP NOHZ balancing code speedups (Shrikanth Hegde): - Move checking for nohz cpus after time check - Change likelyhood of nohz.nr_cpus - Remove nohz.nr_cpus and use weight of cpumask instead - Avoid false sharing for sched_clock_irqtime (Wangyang Guo) - Cleanups (Yury Norov): - Drop useless cpumask_empty() in find_energy_efficient_cpu() - Simplify task_numa_find_cpu() - Use cpumask_weight_and() in sched_balance_find_dst_group() DL scheduler updates: - Add a deadline server for sched_ext tasks (by Andrea Righi and Joel Fernandes, with fixes by Peter Zijlstra) RT scheduler updates: - Skip currently executing CPU in rto_next_cpu() (Chen Jinghuang) Entry code updates and performance improvements (Jinjie Ruan) This is part of the scheduler tree in this cycle due to inter- dependencies with the RSEQ based time slice extension work: - Remove unused syscall argument from syscall_trace_enter() - Rework syscall_exit_to_user_mode_work() for architecture reuse - Add arch_ptrace_report_syscall_entry/exit() - Inline syscall_exit_work() and syscall_trace_enter() Scheduler core updates (Peter Zijlstra): - Rework sched_class::wakeup_preempt() and rq_modified_*() - Avoid rq->lock bouncing in sched_balance_newidle() - Rename rcu_dereference_check_sched_domain() => rcu_dereference_sched_domain() - <linux/compiler_types.h>: Add the __signed_scalar_typeof() helper Fair scheduler updates/refactoring (Peter Zijlstra and Ingo Molnar): - Fold the sched_avg update - Change rcu_dereference_check_sched_domain() to rcu-sched - Switch to rcu_dereference_all() - Remove superfluous rcu_read_lock() - Limit hrtick work - Join two #ifdef CONFIG_FAIR_GROUP_SCHED blocks - Clean up comments in 'struct cfs_rq' - Separate se->vlag from se->vprot - Rename cfs_rq::avg_load to cfs_rq::sum_weight - Rename cfs_rq::avg_vruntime to ::sum_w_vruntime & helper functions - Introduce and use the vruntime_cmp() and vruntime_op() wrappers for wrapped-signed aritmetics - Sort out 'blocked_load*' namespace noise Scheduler debugging code updates: - Export hidden tracepoints to modules (Gabriele Monaco) - Convert copy_from_user() + kstrtouint() to kstrtouint_from_user() (Fushuai Wang) - Add assertions to QUEUE_CLASS (Peter Zijlstra) - hrtimer: Fix tracing oddity (Thomas Gleixner) Misc fixes and cleanups: - Re-evaluate scheduling when migrating queued tasks out of throttled cgroups (Zicheng Qu) - Remove task_struct->faults_disabled_mapping (Christoph Hellwig) - Fix math notation errors in avg_vruntime comment (Zhan Xusheng) - sched/cpufreq: Use %pe format for PTR_ERR() printing (zenghongling)" * tag 'sched-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits) sched: Re-evaluate scheduling when migrating queued tasks out of throttled cgroups sched/cpufreq: Use %pe format for PTR_ERR() printing sched/rt: Skip currently executing CPU in rto_next_cpu() sched/clock: Avoid false sharing for sched_clock_irqtime selftests/sched_ext: Add test for DL server total_bw consistency selftests/sched_ext: Add test for sched_ext dl_server sched/debug: Fix dl_server (re)start conditions sched/debug: Add support to change sched_ext server params sched_ext: Add a DL server for sched_ext tasks sched/debug: Stop and start server based on if it was active sched/debug: Fix updating of ppos on server write ops sched/deadline: Clear the defer params entry: Inline syscall_exit_work() and syscall_trace_enter() entry: Add arch_ptrace_report_syscall_entry/exit() entry: Rework syscall_exit_to_user_mode_work() for architecture reuse entry: Remove unused syscall argument from syscall_trace_enter() sched: remove task_struct->faults_disabled_mapping sched: Update rq->avg_idle when a task is moved to an idle CPU selftests/rseq: Add rseq slice histogram script hrtimer: Fix trace oddity ...
2026-02-10Merge tag 'locking-core-2026-02-08' of ↵Linus Torvalds-405/+1264
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: "Lock debugging: - Implement compiler-driven static analysis locking context checking, using the upcoming Clang 22 compiler's context analysis features (Marco Elver) We removed Sparse context analysis support, because prior to removal even a defconfig kernel produced 1,700+ context tracking Sparse warnings, the overwhelming majority of which are false positives. On an allmodconfig kernel the number of false positive context tracking Sparse warnings grows to over 5,200... On the plus side of the balance actual locking bugs found by Sparse context analysis is also rather ... sparse: I found only 3 such commits in the last 3 years. So the rate of false positives and the maintenance overhead is rather high and there appears to be no active policy in place to achieve a zero-warnings baseline to move the annotations & fixers to developers who introduce new code. Clang context analysis is more complete and more aggressive in trying to find bugs, at least in principle. Plus it has a different model to enabling it: it's enabled subsystem by subsystem, which results in zero warnings on all relevant kernel builds (as far as our testing managed to cover it). Which allowed us to enable it by default, similar to other compiler warnings, with the expectation that there are no warnings going forward. This enforces a zero-warnings baseline on clang-22+ builds (Which are still limited in distribution, admittedly) Hopefully the Clang approach can lead to a more maintainable zero-warnings status quo and policy, with more and more subsystems and drivers enabling the feature. Context tracking can be enabled for all kernel code via WARN_CONTEXT_ANALYSIS_ALL=y (default disabled), but this will generate a lot of false positives. ( Having said that, Sparse support could still be added back, if anyone is interested - the removal patch is still relatively straightforward to revert at this stage. ) Rust integration updates: (Alice Ryhl, Fujita Tomonori, Boqun Feng) - Add support for Atomic<i8/i16/bool> and replace most Rust native AtomicBool usages with Atomic<bool> - Clean up LockClassKey and improve its documentation - Add missing Send and Sync trait implementation for SetOnce - Make ARef Unpin as it is supposed to be - Add __rust_helper to a few Rust helpers as a preparation for helper LTO - Inline various lock related functions to avoid additional function calls WW mutexes: - Extend ww_mutex tests and other test-ww_mutex updates (John Stultz) Misc fixes and cleanups: - rcu: Mark lockdep_assert_rcu_helper() __always_inline (Arnd Bergmann) - locking/local_lock: Include more missing headers (Peter Zijlstra) - seqlock: fix scoped_seqlock_read kernel-doc (Randy Dunlap) - rust: sync: Replace `kernel::c_str!` with C-Strings (Tamir Duberstein)" * tag 'locking-core-2026-02-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (90 commits) locking/rwlock: Fix write_trylock_irqsave() with CONFIG_INLINE_WRITE_TRYLOCK rcu: Mark lockdep_assert_rcu_helper() __always_inline compiler-context-analysis: Remove __assume_ctx_lock from initializers tomoyo: Use scoped init guard crypto: Use scoped init guard kcov: Use scoped init guard compiler-context-analysis: Introduce scoped init guards cleanup: Make __DEFINE_LOCK_GUARD handle commas in initializers seqlock: fix scoped_seqlock_read kernel-doc tools: Update context analysis macros in compiler_types.h rust: sync: Replace `kernel::c_str!` with C-Strings rust: sync: Inline various lock related methods rust: helpers: Move #define __rust_helper out of atomic.c rust: wait: Add __rust_helper to helpers rust: time: Add __rust_helper to helpers rust: task: Add __rust_helper to helpers rust: sync: Add __rust_helper to helpers rust: refcount: Add __rust_helper to helpers rust: rcu: Add __rust_helper to helpers rust: processor: Add __rust_helper to helpers ...
2026-02-10Merge tag 'perf-core-2026-02-09' of ↵Linus Torvalds-11/+71
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance event updates from Ingo Molnar: "x86 PMU driver updates: - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs (Dapeng Mi) Compared to previous iterations of the Intel PMU code, there's been a lot of changes, which center around three main areas: - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the Off-Core Response (OCR) facility - New PEBS data source encoding layout - Support the new "RDPMC user disable" feature - Likewise, a large series adds uncore PMU support for Intel Diamond Rapids (DMR) CPUs (Zide Chen) This centers around these four main areas: - DMR may have two Integrated I/O and Memory Hub (IMH) dies, separate from the compute tile (CBB) dies. Each CBB and each IMH die has its own discovery domain. - Unlike prior CPUs that retrieve the global discovery table portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON discovery and MSR for CBB PMON discovery. - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6. - IIO free-running counters in DMR are MMIO-based, unlike SPR. - Also add support for Add missing PMON units for Intel Panther Lake, and support Nova Lake (NVL), which largely maps to Panther Lake. (Zide Chen) - KVM integration: Add support for mediated vPMUs (by Kan Liang and Sean Christopherson, with fixes and cleanups by Peter Zijlstra, Sandipan Das and Mingwei Zhang) - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs, which are a low-power variant of Panther Lake (Zide Chen) - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU (aka MaxLinear Lightning Mountain), which maps to the existing Airmont code (Martin Schiller) Performance enhancements: - Speed up kexec shutdown by avoiding unnecessary cross CPU calls (Jan H. Schönherr) - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim) User-space stack unwinding support: - Various cleanups and refactorings in preparation to generalize the unwinding code for other architectures (Jens Remus) Uprobes updates: - Transition from kmap_atomic to kmap_local_page (Keke Ming) - Fix incorrect lockdep condition in filter_chain() (Breno Leitao) - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov) Misc fixes and cleanups: - s390: Remove kvm_types.h from Kbuild (Randy Dunlap) - x86/intel/uncore: Convert comma to semicolon (Chen Ni) - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman) - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)" * tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) s390: remove kvm_types.h from Kbuild uprobes: Fix incorrect lockdep condition in filter_chain() x86/ibs: Fix typo in dc_l2tlb_miss comment x86/uprobes: Fix XOL allocation failure for 32-bit tasks perf/x86/intel/uncore: Convert comma to semicolon perf/x86/intel: Add support for rdpmc user disable feature perf/x86: Use macros to replace magic numbers in attr_rdpmc perf/x86/intel: Add core PMU support for Novalake perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL perf/x86/intel: Add core PMU support for DMR perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL perf/core: Fix slow perf_event_task_exit() with LBR callstacks perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls uprobes: use kmap_local_page() for temporary page mappings arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() perf/x86/intel/uncore: Add Nova Lake support ...
2026-02-10Merge tag 'bitmap-for-6.20' of https://github.com/norov/linuxLinus Torvalds-2/+3
Pull bitmap updates from Yury Norov: - more rust helpers (Alice) - more bitops tests (Ryota) - FIND_NTH_BIT() uninitialized variable fix (Lee Yongjun) - random cleanups (Andy, H. Peter) * tag 'bitmap-for-6.20' of https://github.com/norov/linux: lib/tests: extend KUnit test for bitops with more cases bitops: Add more files to the MAINTAINERS lib/find_bit: fix uninitialized variable use in FIND_NTH_BIT lib/tests: add KUnit test for bitops rust: cpumask: add __rust_helper to helpers rust: bitops: add __rust_helper to helpers rust: bitmap: add __rust_helper to helpers linux/bitfield.h: replace __auto_type with auto
2026-02-10Merge tag 'bpf-next-7.0' of ↵Linus Torvalds-32/+393
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support associating BPF program with struct_ops (Amery Hung) - Switch BPF local storage to rqspinlock and remove recursion detection counters which were causing false positives (Amery Hung) - Fix live registers marking for indirect jumps (Anton Protopopov) - Introduce execution context detection BPF helpers (Changwoo Min) - Improve verifier precision for 32bit sign extension pattern (Cupertino Miranda) - Optimize BTF type lookup by sorting vmlinux BTF and doing binary search (Donglin Peng) - Allow states pruning for misc/invalid slots in iterator loops (Eduard Zingerman) - In preparation for ASAN support in BPF arenas teach libbpf to move global BPF variables to the end of the region and enable arena kfuncs while holding locks (Emil Tsalapatis) - Introduce support for implicit arguments in kfuncs and migrate a number of them to new API. This is a prerequisite for cgroup sub-schedulers in sched-ext (Ihor Solodrai) - Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen) - Fix ORC stack unwind from kprobe_multi (Jiri Olsa) - Speed up fentry attach by using single ftrace direct ops in BPF trampolines (Jiri Olsa) - Require frozen map for calculating map hash (KP Singh) - Fix lock entry creation in TAS fallback in rqspinlock (Kumar Kartikeya Dwivedi) - Allow user space to select cpu in lookup/update operations on per-cpu array and hash maps (Leon Hwang) - Make kfuncs return trusted pointers by default (Matt Bobrowski) - Introduce "fsession" support where single BPF program is executed upon entry and exit from traced kernel function (Menglong Dong) - Allow bpf_timer and bpf_wq use in all programs types (Mykyta Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei Starovoitov) - Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their definition across the tree (Puranjay Mohan) - Allow BPF arena calls from non-sleepable context (Puranjay Mohan) - Improve register id comparison logic in the verifier and extend linked registers with negative offsets (Puranjay Mohan) - In preparation for BPF-OOM introduce kfuncs to access memcg events (Roman Gushchin) - Use CFI compatible destructor kfunc type (Sami Tolvanen) - Add bitwise tracking for BPF_END in the verifier (Tianci Cao) - Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou Tang) - Make BPF selftests work with 64k page size (Yonghong Song) * tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits) selftests/bpf: Fix outdated test on storage->smap selftests/bpf: Choose another percpu variable in bpf for btf_dump test selftests/bpf: Remove test_task_storage_map_stress_lookup selftests/bpf: Update task_local_storage/task_storage_nodeadlock test selftests/bpf: Update task_local_storage/recursion test selftests/bpf: Update sk_storage_omem_uncharge test bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} bpf: Support lockless unlink when freeing map or local storage bpf: Prepare for bpf_selem_unlink_nofail() bpf: Remove unused percpu counter from bpf_local_storage_map_free bpf: Remove cgroup local storage percpu counter bpf: Remove task local storage percpu counter bpf: Change local_storage->lock and b->lock to rqspinlock bpf: Convert bpf_selem_unlink to failable bpf: Convert bpf_selem_link_map to failable bpf: Convert bpf_selem_unlink_map to failable bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage selftests/xsk: fix number of Tx frags in invalid packet selftests/xsk: properly handle batch ending in the middle of a packet bpf: Prevent reentrance into call_rcu_tasks_trace() ...
2026-02-10Merge tag 'modules-7.0-rc1' of ↵Linus Torvalds-14/+12
git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux Pull module updates from Sami Tolvanen: "Module signing: - Remove SHA-1 support for signing modules. SHA-1 is no longer considered secure for signatures due to vulnerabilities that can lead to hash collisions. None of the major distributions use SHA-1 anymore, and the kernel has defaulted to SHA-512 since v6.11. Note that loading SHA-1 signed modules is still supported. - Update scripts/sign-file to use only the OpenSSL CMS API for signing. As SHA-1 support is gone, we can drop the legacy PKCS#7 API which was limited to SHA-1. This also cleans up support for legacy OpenSSL versions. Cleanups and fixes: - Use system_dfl_wq instead of the per-cpu system_wq following the ongoing workqueue API refactoring. - Avoid open-coded kvrealloc() in module decompression logic by using the standard helper. - Improve section annotations by replacing the custom __modinit with __init_or_module and removing several unused __INIT*_OR_MODULE macros. - Fix kernel-doc warnings in include/linux/moduleparam.h. - Ensure set_module_sig_enforced is only declared when module signing is enabled. - Fix gendwarfksyms build failures on 32-bit hosts. MAINTAINERS: - Update the module subsystem entry to reflect the maintainer rotation and update the git repository link" * tag 'modules-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: modules: moduleparam.h: fix kernel-doc comments module: Only declare set_module_sig_enforced when CONFIG_MODULE_SIG=y module/decompress: Avoid open-coded kvrealloc() gendwarfksyms: Fix build on 32-bit hosts sign-file: Use only the OpenSSL CMS API for signing module: Remove SHA-1 support for module signing module: replace use of system_wq with system_dfl_wq params: Replace __modinit with __init_or_module module: Remove unused __INIT*_OR_MODULE macros MAINTAINERS: Update module subsystem maintainers and repository
2026-02-10Merge tag 'keys-next-20260206' of ↵Linus Torvalds-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs Pull keys update from David Howells: "This adds support for ML-DSA signatures in X.509 certificates and PKCS#7/CMS messages, thereby allowing this algorithm to be used for signing modules, kexec'able binaries, wifi regulatory data, etc.. This requires OpenSSL-3.5 at a minimum and preferably OpenSSL-4 (so that it can avoid the use of CMS signedAttrs - but that version is not cut yet). certs/Kconfig does a check to hide the signing options if OpenSSL does not list the algorithm as being available" * tag 'keys-next-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs: pkcs7: Change a pr_warn() to pr_warn_once() pkcs7: Allow authenticatedAttributes for ML-DSA modsign: Enable ML-DSA module signing pkcs7, x509: Add ML-DSA support pkcs7: Allow the signing algo to do whatever digestion it wants itself pkcs7, x509: Rename ->digest to ->m x509: Separately calculate sha256 for blacklist crypto: Add ML-DSA crypto_sig support
2026-02-10Merge tag 'kmalloc_obj-v7.0-rc1' of ↵Linus Torvalds-0/+179
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull kmalloc_obj updates from Kees Cook: "Introduce the kmalloc_obj* family of APIs for switching to type-based kmalloc allocations, away from purely size-based allocations. Discussed on lkml, with you, and at Linux Plumbers. It's been in -next for the entire dev cycle. Before the merge window closes, I'd like to send the treewide change (generated from the Coccinelle script included here), which mechanically converts almost 20k callsites from kmalloc* to kmalloc_obj*: 8007 files changed, 19980 insertions(+), 20838 deletions(-) This change needed fixes for mismatched types (since now the return type from allocations is a pointer to the requested type, not "void *"), and I've been fixing these over the last 4 releases. These fixes have mostly been trivial mismatches with const qualifiers or accidentally identical sizes (e.g. same object size: "struct kvec" vs "struct iovec", or differing pointers to pointers), but I did catch one case of too-small allocation. Summary: - Introduce kmalloc_obj*() family of type-based allocator APIs - checkpatch: Suggest kmalloc_obj family for sizeof allocations - coccinelle: Add kmalloc_objs conversion script" * tag 'kmalloc_obj-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: coccinelle: Add kmalloc_objs conversion script slab: Introduce kmalloc_flex() and family compiler_types: Introduce __flex_counter() and family checkpatch: Suggest kmalloc_obj family for sizeof allocations slab: Introduce kmalloc_obj() and family
2026-02-10Merge tag 'hardening-v7.0-rc1' of ↵Linus Torvalds-8/+26
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: "Mostly small cleanups and various scattered annotations and flex array warning fixes that we reviewed by unlanded in other trees. Introduces new annotation for expanding counted_by to pointer members, now that compiler behavior between GCC and Clang has been normalized. - Various missed __counted_by annotations (Thorsten Blum) - Various missed -Wflex-array-member-not-at-end fixes (Gustavo A. R. Silva) - Avoid leftover tempfiles for interrupted compile-time FORTIFY tests (Nicolas Schier) - Remove non-existant CONFIG_UBSAN_REPORT_FULL from docs (Stefan Wiehler) - fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines (David Laight) - Add __counted_by_ptr attribute, tests, and first user (Bill Wendling, Kees Cook) - Update MAINTAINERS file to make hardening section not include pstore" * tag 'hardening-v7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: MAINTAINERS: pstore: Remove L: entry nfp: tls: Avoid -Wflex-array-member-not-at-end warnings carl9170: Avoid -Wflex-array-member-not-at-end warning coredump: Use __counted_by_ptr for struct core_name::corename lkdtm/bugs: Add __counted_by_ptr() test PTR_BOUNDS compiler_types.h: Attributes: Add __counted_by_ptr macro fortify: Cleanup temp file also on non-successful exit fortify: Rename temporary file to match ignore pattern fortify: Use C arithmetic not FIELD_xxx() in FORTIFY_REASON defines ecryptfs: Annotate struct ecryptfs_message with __counted_by fs/xattr: Annotate struct simple_xattr with __counted_by crypto: af_alg - Annotate struct af_alg_iv with __counted_by Kconfig.ubsan: Remove CONFIG_UBSAN_REPORT_FULL from documentation drm/nouveau: fifo: Avoid -Wflex-array-member-not-at-end warning
2026-02-10Merge tag 'v7.0-p1' of ↵Linus Torvalds-16/+141
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto update from Herbert Xu: "API: - Fix race condition in hwrng core by using RCU Algorithms: - Allow authenc(sha224,rfc3686) in fips mode - Add test vectors for authenc(hmac(sha384),cbc(aes)) - Add test vectors for authenc(hmac(sha224),cbc(aes)) - Add test vectors for authenc(hmac(md5),cbc(des3_ede)) - Add lz4 support in hisi_zip - Only allow clear key use during self-test in s390/{phmac,paes} Drivers: - Set rng quality to 900 in airoha - Add gcm(aes) support for AMD/Xilinx Versal device - Allow tfms to share device in hisilicon/trng" * tag 'v7.0-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (100 commits) crypto: img-hash - Use unregister_ahashes in img_{un}register_algs crypto: testmgr - Add test vectors for authenc(hmac(md5),cbc(des3_ede)) crypto: cesa - Simplify return statement in mv_cesa_dequeue_req_locked crypto: testmgr - Add test vectors for authenc(hmac(sha224),cbc(aes)) crypto: testmgr - Add test vectors for authenc(hmac(sha384),cbc(aes)) hwrng: core - use RCU and work_struct to fix race condition crypto: starfive - Fix memory leak in starfive_aes_aead_do_one_req() crypto: xilinx - Fix inconsistant indentation crypto: rng - Use unregister_rngs in register_rngs crypto: atmel - Use unregister_{aeads,ahashes,skciphers} hwrng: optee - simplify OP-TEE context match crypto: ccp - Add sysfs attribute for boot integrity dt-bindings: crypto: atmel,at91sam9g46-sha: add microchip,lan9691-sha dt-bindings: crypto: atmel,at91sam9g46-aes: add microchip,lan9691-aes dt-bindings: crypto: qcom,inline-crypto-engine: document the Milos ICE crypto: caam - fix netdev memory leak in dpaa2_caam_probe crypto: hisilicon/qm - increase wait time for mailbox crypto: hisilicon/qm - obtain the mailbox configuration at one time crypto: hisilicon/qm - remove unnecessary code in qm_mb_write() crypto: hisilicon/qm - move the barrier before writing to the mailbox register ...
2026-02-10Merge tag 'libcrypto-for-linus' of ↵Linus Torvalds-95/+375
git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux Pull crypto library updates from Eric Biggers: - Add support for verifying ML-DSA signatures. ML-DSA (Module-Lattice-Based Digital Signature Algorithm) is a recently-standardized post-quantum (quantum-resistant) signature algorithm. It was known as Dilithium pre-standardization. The first use case in the kernel will be module signing. But there are also other users of RSA and ECDSA signatures in the kernel that might want to upgrade to ML-DSA eventually. - Improve the AES library: - Make the AES key expansion and single block encryption and decryption functions use the architecture-optimized AES code. Enable these optimizations by default. - Support preparing an AES key for encryption-only, using about half as much memory as a bidirectional key. - Replace the existing two generic implementations of AES with a single one. - Simplify how Adiantum message hashing is implemented. Remove the "nhpoly1305" crypto_shash in favor of direct lib/crypto/ support for NH hashing, and enable optimizations by default. * tag 'libcrypto-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (53 commits) lib/crypto: mldsa: Clarify the documentation for mldsa_verify() slightly lib/crypto: aes: Drop 'volatile' from aes_sbox and aes_inv_sbox lib/crypto: aes: Remove old AES en/decryption functions lib/crypto: aesgcm: Use new AES library API lib/crypto: aescfb: Use new AES library API crypto: omap - Use new AES library API crypto: inside-secure - Use new AES library API crypto: drbg - Use new AES library API crypto: crypto4xx - Use new AES library API crypto: chelsio - Use new AES library API crypto: ccp - Use new AES library API crypto: x86/aes-gcm - Use new AES library API crypto: arm64/ghash - Use new AES library API crypto: arm/ghash - Use new AES library API staging: rtl8723bs: core: Use new AES library API net: phy: mscc: macsec: Use new AES library API chelsio: Use new AES library API Bluetooth: SMP: Use new AES library API crypto: x86/aes - Remove the superseded AES-NI crypto_cipher lib/crypto: x86/aes: Add AES-NI optimization ...
2026-02-10net: dsa: eliminate local type for tc policersVladimir Oltean-22/+18
David Yang is saying that struct flow_action_entry in include/net/flow_offload.h has gained new fields and DSA's struct dsa_mall_policer_tc_entry, derived from that, isn't keeping up. This structure is passed to drivers and they are completely oblivious to the values of fields they don't see. This has happened before, and almost always the solution was to make the DSA layer thinner and use the upstream data structures. Here, the reason why we didn't do that is because struct flow_action_entry :: police is an anonymous structure. That is easily enough fixable, just name those fields "struct flow_action_police" and reference them from DSA. Make the according transformations to the two users (sja1105 and felix): "rate_bytes_per_sec" -> "rate_bytes_ps". Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Co-developed-by: David Yang <mmyangfl@gmail.com> Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20260206075427.44733-1-mmyangfl@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-10pid: introduce task_ppid_vnr() helperOleg Nesterov-0/+5
Cosmetic change. Unlike all other similar helpers task_ppid_nr_ns() doesn't have a _vnr() version; add one for consistency. Signed-off-by: Oleg Nesterov <oleg@redhat.com> Link: https://patch.msgid.link/20251015123633.GB9456@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-10mm/slab: drop the OBJEXTS_NOSPIN_ALLOC flag from enum objext_flagsHarry Yoo-2/+1
OBJEXTS_NOSPIN_ALLOC was used to remember whether a slabobj_ext vector was allocated via kmalloc_nolock(), so that free_slab_obj_exts() could call kfree_nolock() instead of kfree(). Now that kfree() supports freeing kmalloc_nolock() objects, this flag is no longer needed. Instead, pass the allow_spin parameter down to free_slab_obj_exts() to determine whether kfree_nolock() or kfree() should be called in the free path, and free one bit in enum objext_flags. Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Harry Yoo <harry.yoo@oracle.com> Reviewed-by: Hao Li <hao.li@linux.dev> Link: https://patch.msgid.link/20260210044642.139482-3-harry.yoo@oracle.com Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2026-02-10mm/slab: allow freeing kmalloc_nolock()'d objects using kfree[_rcu]()Harry Yoo-2/+2
Slab objects that are allocated with kmalloc_nolock() must be freed using kfree_nolock() because only a subset of alloc hooks are called, since kmalloc_nolock() can't spin on a lock during allocation. This imposes a limitation: such objects cannot be freed with kfree_rcu(), forcing users to work around this limitation by calling call_rcu() with a callback that frees the object using kfree_nolock(). Remove this limitation by teaching kmemleak to gracefully ignore cases when kmemleak_free() or kmemleak_ignore() is called without a prior kmemleak_alloc(). Unlike kmemleak, kfence already handles this case, because, due to its design, only a subset of allocations are served from kfence. With this change, kfree() and kfree_rcu() can be used to free objects that are allocated using kmalloc_nolock(). Suggested-by: Alexei Starovoitov <ast@kernel.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Harry Yoo <harry.yoo@oracle.com> Link: https://patch.msgid.link/20260210044642.139482-2-harry.yoo@oracle.com Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
2026-02-10pid: reorder fields in pid_namespace to reduce false sharingMateusz Guzik-7/+7
alloc_pid() loads pid_cachep, level and pid_max prior to taking the lock. It dirties idr and pid_allocated with the lock. Some of these fields share the cacheline as is, split them up. No change in the size of the struct. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20260120204820.1497002-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-10pidfs: convert rb-tree to rhashtableChristian Brauner-3/+4
Mateusz reported performance penalties [1] during task creation because pidfs uses pidmap_lock to add elements into the rbtree. Switch to an rhashtable to have separate fine-grained locking and to decouple from pidmap_lock moving all heavy manipulations outside of it. Convert the pidfs inode-to-pid mapping from an rb-tree with seqcount protection to an rhashtable. This removes the global pidmap_lock contention from pidfs_ino_get_pid() lookups and allows the hashtable insert to happen outside the pidmap_lock. pidfs_add_pid() is split. pidfs_prepare_pid() allocates inode number and initializes pid fields and is called inside pidmap_lock. pidfs_add_pid() inserts pid into rhashtable and is called outside pidmap_lock. Insertion into the rhashtable can fail and memory allocation may happen so we need to drop the spinlock. To guard against accidently opening an already reaped task pidfs_ino_get_pid() uses additional checks beyond pid_vnr(). If pid->attr is PIDFS_PID_DEAD or NULL the pid either never had a pidfd or it already went through pidfs_exit() aka the process as already reaped. If pid->attr is valid check PIDFS_ATTR_BIT_EXIT to figure out whether the task has exited. This slightly changes visibility semantics: pidfd creation is denied after pidfs_exit() runs, which is just before the pid number is removed from the via free_pid(). That should not be an issue though. Link: https://lore.kernel.org/20251206131955.780557-1-mjguzik@gmail.com [1] Link: https://patch.msgid.link/20260120-work-pidfs-rhashtable-v2-1-d593c4d0f576@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2026-02-10KVM: s390: Use guest address to mark guest page dirtyClaudio Imbrenda-0/+2
Stop using the userspace address to mark the guest page dirty. mark_page_dirty() expects a guest frame number, but was being passed a host virtual frame number. When slot == NULL, mark_page_dirty_in_slot() does nothing and does not complain. This means that in some circumstances the dirtiness of the guest page might have been lost. Fix by adding two fields in struct kvm_s390_adapter_int to keep the guest addressses, and use those for mark_page_dirty(). Fixes: f65470661f36 ("KVM: s390/interrupt: do not pin adapter interrupt pages") Reviewed-by: Steffen Eiden <seiden@linux.ibm.com> Reviewed-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Christoph Schlameuss <schlameuss@linux.ibm.com> Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
2026-02-10Merge branch 'slab/for-7.0/sheaves' into slab/for-nextVlastimil Babka-6/+0
Merge series "slab: replace cpu (partial) slabs with sheaves". The percpu sheaves caching layer was introduced as opt-in but the goal was to eventually move all caches to them. This is the next step, enabling sheaves for all caches (except the two bootstrap ones) and then removing the per cpu (partial) slabs and lots of associated code. Besides the lower locking overhead and much more likely fastpath when freeing, this removes the rather complicated code related to the cpu slab lockless fastpaths (using this_cpu_try_cmpxchg128/64) and all its complications for PREEMPT_RT or kmalloc_nolock(). The lockless slab freelist+counters update operation using try_cmpxchg128/64 remains and is crucial for freeing remote NUMA objects and to allow flushing objects from sheaves to slabs mostly without the node list_lock. Link: https://lore.kernel.org/all/20260123-sheaves-for-all-v4-0-041323d506f7@suse.cz/
2026-02-09Merge tag 'docs-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linuxLinus Torvalds-1/+16
Pull documentation updates from Jonathan Corbet: "A slightly calmer cycle for docs this time around, though there is still a fair amount going on, including: - Some signs of life on the long-moribund Japanese translation - Documentation on policies around the use of generative tools for patch submissions, and a separate document intended for consumption by generative tools - The completion of the move of the documentation tools to tools/docs. For now we're leaving a /scripts/kernel-doc symlink behind to avoid breaking scripts - Ongoing build-system work includes the incorporation of documentation in Python code, better support for documenting variables, and lots of improvements and fixes - Automatic linking of man-page references -- cat(1), for example -- to the online pages in the HTML build ...and the usual array of typo fixes and such" * tag 'docs-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/docs/linux: (107 commits) doc: development-process: add notice on testing tools: sphinx-build-wrapper: improve its help message docs: sphinx-build-wrapper: allow -v override -q docs: kdoc: Fix pdfdocs build for tools docs: ja_JP: process: translate 'Obtain a current source tree' docs: fix 're-use' -> 'reuse' in documentation docs: ioctl-number: fix a typo in ioctl-number.rst docs: filesystems: ensure proc pid substitutable is complete docs: automarkup.py: Skip common English words as C identifiers Documentation: use a source-read extension for the index link boilerplate docs: parse_features: make documentation more consistent docs: add parse_features module documentation docs: jobserver: do some documentation improvements docs: add jobserver module documentation docs: kabi: helpers: add documentation for each "enum" value docs: kabi: helpers: add helper for debug bits 7 and 8 docs: kabi: system_symbols: end docstring phrases with a dot docs: python: abi_regex: do some improvements at documentation docs: python: abi_parser: do some improvements at documentation docs: add kabi modules documentation ...
2026-02-09Merge tag 'efi-next-for-v7.0' of ↵Linus Torvalds-15/+23
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. * tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Support EDID information sysfb: Move edid_info into sysfb_primary_display sysfb: Pass sysfb_primary_display to devices sysfb: Replace screen_info with sysfb_primary_display sysfb: Add struct sysfb_display_info efi: sysfb_efi: Reduce number of references to global screen_info efi: earlycon: Reduce number of references to global screen_info efi: sysfb_efi: Fix efidrmfb and simpledrmfb on Valve Steam Deck efi: sysfb_efi: Convert swap width and height quirk to a callback efi: sysfb_efi: Fix lfb_linelength calculation when applying quirks efi: sysfb_efi: Replace open coded swap with the macro
2026-02-09Merge tag 'for-linus-7.0-rc1-tag' of ↵Linus Torvalds-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen updates from Juergen Gross: - fix running as Xen PVH guest in 32-bit mode without PAE - fix PV device handling for suspend/resume when running as a Xen guest - clean up workqueue usage - fix the Xen balloon driver for PVH dom0 - introduce the possibility to use hypercalls for console messages in unprivileged guests - enable Xen dom0 use of virtio devices in nested virtualization setups - simplify the xen-mcelog driver * tag 'for-linus-7.0-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xenbus: Rename helpers to freeze/thaw/restore xenbus: Use .freeze/.thaw to handle xenbus devices xen/mcelog: simplify MCE_GETCLEAR_FLAGS using xchg() xen/balloon: improve accuracy of initial balloon target for dom0 Partial revert "x86/xen: fix balloon target initialization for PVH dom0" xen: introduce xen_console_io option xen/virtio: Don't use grant-dma-ops when running as Dom0 x86/xen/pvh: Enable PAE mode for 32-bit guest only when CONFIG_X86_PAE is set xen: privcmd: WQ_PERCPU added to alloc_workqueue users xen/events: replace use of system_wq with system_percpu_wq
2026-02-09Merge tag 'arm64-upstream' of ↵Linus Torvalds-1/+7
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Will Deacon: "There's a little less than normal, probably due to LPC & Christmas/New Year meaning that a few series weren't quite ready or reviewed in time. It's still useful across the board, despite the only real feature being support for the LS64 feature enabling 64-byte atomic accesses to endpoints that support it. ACPI: - Add interrupt signalling support to the AGDI handler - Add Catalin and myself to the arm64 ACPI MAINTAINERS entry CPU features: - Drop Kconfig options for PAN and LSE (these are detected at runtime) - Add support for 64-byte single-copy atomic instructions (LS64/LS64V) - Reduce MTE overhead when executing in the kernel on Ampere CPUs - Ensure POR_EL0 value exposed via ptrace is up-to-date - Fix error handling on GCS allocation failure CPU frequency: - Add CPU hotplug support to the FIE setup in the AMU driver Entry code: - Minor optimisations and cleanups to the syscall entry path - Preparatory rework for moving to the generic syscall entry code Hardware errata: - Work around Spectre-BHB on TSV110 processors - Work around broken CMO propagation on some systems with the SI-L1 interconnect Miscellaneous: - Disable branch profiling for arch/arm64/ to avoid issues with noinstr - Minor fixes and cleanups (kexec + ubsan, WARN_ONCE() instead of WARN_ON(), reduction of boolean expression) - Fix custom __READ_ONCE() implementation for LTO builds when operating on non-atomic types Perf and PMUs: - Support for CMN-600AE - Be stricter about supported hardware in the CMN driver - Support for DSU-110 and DSU-120 - Support for the cycles event in the DSU driver (alongside the dedicated cycles counter) - Use IRQF_NO_THREAD instead of IRQF_ONESHOT in the cxlpmu driver - Use !bitmap_empty() as a faster alternative to bitmap_weight() - Fix SPE error handling when failing to resume profiling Selftests: - Add support for the FORCE_TARGETS option to the arm64 kselftests - Avoid nolibc-specific my_syscall() function - Add basic test for the LS64 HWCAP - Extend fp-pidbench to cover additional workload patterns" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (43 commits) perf/arm-cmn: Reject unsupported hardware configurations perf: arm_spe: Properly set hw.state on failures arm64/gcs: Fix error handling in arch_set_shadow_stack_status() arm64: Fix non-atomic __READ_ONCE() with CONFIG_LTO=y arm64: poe: fix stale POR_EL0 values for ptrace kselftest/arm64: Raise default number of loops in fp-pidbench kselftest/arm64: Add a no-SVE loop after SVE in fp-pidbench perf/cxlpmu: Replace IRQF_ONESHOT with IRQF_NO_THREAD arm64: mte: Set TCMA1 whenever MTE is present in the kernel arm64/ptrace: Return early for ptrace_report_syscall_entry() error arm64/ptrace: Split report_syscall() arm64: Remove unused _TIF_WORK_MASK kselftest/arm64: Add missing file in .gitignore arm64: errata: Workaround for SI L1 downstream coherency issue kselftest/arm64: Add HWCAP test for FEAT_LS64 arm64: Add support for FEAT_{LS64, LS64_V} KVM: arm64: Enable FEAT_{LS64, LS64_V} in the supported guest arm64: Provide basic EL2 setup for FEAT_{LS64, LS64_V} usage at EL0/1 KVM: arm64: Handle DABT caused by LS64* instructions on unsupported memory KVM: arm64: Add documentation for KVM_EXIT_ARM_LDST64B ...
2026-02-09Merge tag 'm68k-for-v7.0-tag1' of ↵Linus Torvalds-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k Pull m68k updates from Geert Uytterhoeven: - Add missing put_device() in the NuBus driver - Replace vsprintf() with vsnprintf() on Sun-3 * tag 'm68k-for-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k: m68k: sun3: Replace vsprintf() with bounded vsnprintf() nubus: Call put_device() in bus initialization error path
2026-02-09Merge tag 'kthread-for-7.0' of ↵Linus Torvalds-12/+28
git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks Pull kthread updates from Frederic Weisbecker: "The kthread code provides an infrastructure which manages the preferred affinity of unbound kthreads (node or custom cpumask) against housekeeping (CPU isolation) constraints and CPU hotplug events. One crucial missing piece is the handling of cpuset: when an isolated partition is created, deleted, or its CPUs updated, all the unbound kthreads in the top cpuset become indifferently affine to _all_ the non-isolated CPUs, possibly breaking their preferred affinity along the way. Solve this with performing the kthreads affinity update from cpuset to the kthreads consolidated relevant code instead so that preferred affinities are honoured and applied against the updated cpuset isolated partitions. The dispatch of the new isolated cpumasks to timers, workqueues and kthreads is performed by housekeeping, as per the nice Tejun's suggestion. As a welcome side effect, HK_TYPE_DOMAIN then integrates both the set from boot defined domain isolation (through isolcpus=) and cpuset isolated partitions. Housekeeping cpumasks are now modifiable with a specific RCU based synchronization. A big step toward making nohz_full= also mutable through cpuset in the future" * tag 'kthread-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks: (33 commits) doc: Add housekeeping documentation kthread: Document kthread_affine_preferred() kthread: Comment on the purpose and placement of kthread_affine_node() call kthread: Honour kthreads preferred affinity after cpuset changes sched/arm64: Move fallback task cpumask to HK_TYPE_DOMAIN sched: Switch the fallback task allowed cpumask to HK_TYPE_DOMAIN kthread: Rely on HK_TYPE_DOMAIN for preferred affinity management kthread: Include kthreadd to the managed affinity list kthread: Include unbound kthreads in the managed affinity list kthread: Refine naming of affinity related fields PCI: Remove superfluous HK_TYPE_WQ check sched/isolation: Remove HK_TYPE_TICK test from cpu_is_isolated() cpuset: Remove cpuset_cpu_is_isolated() timers/migration: Remove superfluous cpuset isolation test cpuset: Propagate cpuset isolation update to timers through housekeeping cpuset: Propagate cpuset isolation update to workqueue through housekeeping PCI: Flush PCI probe workqueue on cpuset isolated partition change sched/isolation: Flush vmstat workqueues on cpuset isolated partition change sched/isolation: Flush memcg workqueues on cpuset isolated partition change cpuset: Update HK_TYPE_DOMAIN cpumask from cpuset ...
2026-02-09Merge tag 'thermal-6.20-rc1' of ↵Linus Torvalds-0/+29
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These add support for "slow" (long-term trend) workload type hints to the Intel int340x thermal driver and selftests (and enable it for Panther Lake), add support for MT8196 along with DT bindings and for MT7987 to the Mediatek LVTS thermal driver, add support for RZ/T2H and RZ/N2H along with DT bindings to the Renesas rzg3e thermal driver, add support for the Panther Lake, Wildcat Lake and Nova Lake processors to the intel_tcc_cooling driver, fix bugs, make some cosmetic changes including code cleanups and library function substitutions, and update documentation. Specifics: - Add Panther Lake, Wildcat Lake and Nova Lake processor IDs to the list of supported processors in the intel_tcc_cooling thermal driver (Srinivas Pandruvada) - Drop unnecessary explicit driver data clearing on removal from the intel_pch_thermal driver (Kaushlendra Kumar) - Add support for "slow" workload type hints to the int340x processor_thermal driver and enable it on the Panther Lake platform (Srinivas Pandruvada) - Use sysfs_emit{_at}() in sysfs show functions in Intel thermal drivers (Thorsten Blum) - Update the x86_pkg_temp_thermal driver to handle THERMAL_TEMP_INVALID that can be passed to it via sysfs as expected (Rafael Wysocki) - Drop a redundant local variable from the intel_tcc_cooling thermal driver and fix a kerneldoc comment typo in the TCC library (Sumeet Pawnikar) - Fix CFLAGS and LDFLAGS in the pkg-config libthermal template (Romain Gantois) - Support multiple temp to raw conversion functions in the Mediatek LVTS thermal driver and add MT8196 and MT6991 support to it (Laura Nao) - Add Mediatek LVTS driver support for MT7987 (Frank Wunderlich) - Use the existing HZ_PER_MHZ macro on STM32 (Andy Shevchenko) - Use the existing clamp() macro on BCM2835 (Thorsten Blum) - Make the reset line optional in order to support new Renesas SoCs where it is not available and add support for RZ/T2H and RZ/N2H to the rzg3e thermal driver (Cosmin Tanislav) - Document RZ/V2N TSU in the r9a09g047-tsu DT bindings (Ovidiu Panait) - Fix all kernel-doc warnings in the internal thermal core header file (Randy Dunlap) - Fix a device node reference leak in thermal_of_cm_lookup() (Felix Gu) - Replace some old-style library function calls with ones that are currently recommended in several places in the thermal core and debugfs code (Sumeet Pawnikar, Thorsten Blum) * tag 'thermal-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (34 commits) drivers: thermal: intel: tcc_cooling: Drop redundant local variable thermal/of: Fix reference leak in thermal_of_cm_lookup() thermal: core: thermal_core.h: fix all kernel-doc warnings thermal: intel: x86_pkg_temp_thermal: Handle invalid temperature thermal: renesas: rzg3e: add support for RZ/T2H and RZ/N2H dt-bindings: thermal: r9a09g047-tsu: document RZ/T2H and RZ/N2H thermal: renesas: rzg3e: make calibration value retrieval per-chip thermal: renesas: rzg3e: make min and max temperature per-chip thermal: renesas: rzg3e: make reset optional dt-bindings: thermal: r9a09g047-tsu: Document RZ/V2N TSU thermal/drivers/broadcom: Use clamp to simplify bcm2835_thermal_temp2adc thermal/drivers/stm32: Use predefined HZ_PER_MHZ instead of a custom one thermal/drivers/mediatek/lvts_thermal: Add mt7987 support dt-bindings: thermal: mediatek: Add LVTS thermal controller definition for MT7987 dt-bindings: nvmem: mediatek: efuse: Add support for MT8196 thermal/drivers/mediatek/lvts_thermal: Add MT8196 support thermal/drivers/mediatek/lvts: Support MSR offset for 16-bit calibration data thermal/drivers/mediatek/lvts: Add support for ATP mode thermal/drivers/mediatek/lvts: Add lvts_temp_to_raw variant thermal/drivers/mediatek/lvts: Add platform ops to support alternative conversion logic ...
2026-02-09Merge tag 'pm-6.20-rc1' of ↵Linus Torvalds-2/+9
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "By the number of commits, cpufreq is the leading party (again) and the most visible change there is the removal of the omap-cpufreq driver that has not been used for a long time (good riddance). There are also quite a few changes in the cppc_cpufreq driver, mostly related to fixing its frequency invariance engine in the case when the CPPC registers used by it are not in PCC. In addition to that, support for AM62L3 is added to the ti-cpufreq driver and the cpufreq-dt-platdev list is updated for some platforms. The remaining cpufreq changes are assorted fixes and cleanups. Next up is cpuidle and the changes there are dominated by intel_idle driver updates, mostly related to the new command line facility allowing users to adjust the list of C-states used by the driver. There are also a few updates of cpuidle governors, including two menu governor fixes and some refinements of the teo governor, and a MAINTAINERS update adding Christian Loehle as a cpuidle reviewer. [Thanks for stepping up Christian!] The most significant update related to system suspend and hibernation is the one to stop freezing the PM runtime workqueue during system PM transitions which allows some deadlocks to be avoided. There is also a fix for possible concurrent bit field updates in the core device suspend code and a few other minor fixes. Apart from the above, several drivers are updated to discard the return value of pm_runtime_put() which is going to be converted to a void function as soon as everybody stops using its return value, PL4 support for Ice Lake is added to the Intel RAPL power capping driver, and there are assorted cleanups, documentation fixes, and some cpupower utility improvements. Specifics: - Remove the unused omap-cpufreq driver (Andreas Kemnade) - Optimize error handling code in cpufreq_boost_trigger_state() and make cpufreq_boost_trigger_state() return -EOPNOTSUPP if no policy supports boost (Lifeng Zheng) - Update cpufreq-dt-platdev list for tegra, qcom, TI (Aaron Kling, Dhruva Gole, and Konrad Dybcio) - Minor improvements to the cpufreq and cpumask rust implementation (Alexandre Courbot, Alice Ryhl, Tamir Duberstein, and Yilin Chen) - Add support for AM62L3 SoC to the ti-cpufreq driver (Dhruva Gole) - Update arch_freq_scale in the CPPC cpufreq driver's frequency invariance engine (FIE) in scheduler ticks if the related CPPC registers are not in PCC (Jie Zhan) - Assorted minor cleanups and improvements in ARM cpufreq drivers (Juan Martinez, Felix Gu, Luca Weiss, and Sergey Shtylyov) - Add generic helpers for sysfs show/store to cppc_cpufreq (Sumit Gupta) - Make the scaling_setspeed cpufreq sysfs attribute return the actual requested frequency to avoid confusion (Pengjie Zhang) - Simplify the idle CPU time granularity test in the ondemand cpufreq governor (Frederic Weisbecker) - Enable asym capacity in intel_pstate only when CPU SMT is not possible (Yaxiong Tian) - Update the description of rate_limit_us default value in cpufreq documentation (Yaxiong Tian) - Add a command line option to adjust the C-states table in the intel_idle driver, remove the 'preferred_cstates' module parameter from it, add C-states validation to it and clean it up (Artem Bityutskiy) - Make the menu cpuidle governor always check the time till the closest timer event when the scheduler tick has been stopped to prevent it from mistakenly selecting the deepest available idle state (Rafael Wysocki) - Update the teo cpuidle governor to avoid making suboptimal decisions in certain corner cases and generally improve idle state selection accuracy (Rafael Wysocki) - Remove an unlikely() annotation on the early-return condition in menu_select() that leads to branch misprediction 100% of the time on systems with only 1 idle state enabled, like ARM64 servers (Breno Leitao) - Add Christian Loehle to MAINTAINERS as a cpuidle reviewer (Christian Loehle) - Stop flagging the PM runtime workqueue as freezable to avoid system suspend and resume deadlocks in subsystems that assume asynchronous runtime PM to work during system-wide PM transitions (Rafael Wysocki) - Drop redundant NULL pointer checks before acomp_request_free() from the hibernation code handling image saving (Rafael Wysocki) - Update wakeup_sources_walk_start() to handle empty lists of wakeup sources as appropriate (Samuel Wu) - Make dev_pm_clear_wake_irq() check the power.wakeirq value under power.lock to avoid race conditions (Gui-Dong Han) - Avoid bit field races related to power.work_in_progress in the core device suspend code (Xuewen Yan) - Make several drivers discard pm_runtime_put() return value in preparation for converting that function to a void one (Rafael Wysocki) - Add PL4 support for Ice Lake to the Intel RAPL power capping driver (Daniel Tang) - Replace sprintf() with sysfs_emit() in power capping sysfs show functions (Sumeet Pawnikar) - Make dev_pm_opp_get_level() return value match the documentation after a previous update of the latter (Aleks Todorov) - Use scoped for each OF child loop in the OPP code (Krzysztof Kozlowski) - Fix a bug in an example code snippet and correct typos in the energy model management documentation (Patrick Little) - Fix miscellaneous problems in cpupower (Kaushlendra Kumar): * idle_monitor: Fix incorrect value logged after stop * Fix inverted APERF capability check * Use strcspn() to strip trailing newline * Reset errno before strtoull() * Show C0 in idle-info dump - Improve cpupower installation procedure by making the systemd step optional and allowing users to disable the installation of systemd's unit file (João Marcos Costa)" * tag 'pm-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (65 commits) PM: sleep: core: Avoid bit field races related to work_in_progress PM: sleep: wakeirq: harden dev_pm_clear_wake_irq() against races cpufreq: Documentation: Update description of rate_limit_us default value cpufreq: intel_pstate: Enable asym capacity only when CPU SMT is not possible PM: wakeup: Handle empty list in wakeup_sources_walk_start() PM: EM: Documentation: Fix bug in example code snippet Documentation: Fix typos in energy model documentation cpuidle: governors: teo: Refine intercepts-based idle state lookup cpuidle: governors: teo: Adjust the classification of wakeup events cpufreq: ondemand: Simplify idle cputime granularity test cpufreq: userspace: make scaling_setspeed return the actual requested frequency PM: hibernate: Drop NULL pointer checks before acomp_request_free() cpufreq: CPPC: Add generic helpers for sysfs show/store cpufreq: scmi: Fix device_node reference leak in scmi_cpu_domain_id() cpufreq: ti-cpufreq: add support for AM62L3 SoC cpufreq: dt-platdev: Add ti,am62l3 to blocklist cpufreq/amd-pstate: Add comment explaining nominal_perf usage for performance policy cpufreq: scmi: correct SCMI explanation cpufreq: dt-platdev: Block the driver from probing on more QC platforms rust: cpumask: rename methods of Cpumask for clarity and consistency ...
2026-02-09Merge tag 'acpi-6.20-rc1' of ↵Linus Torvalds-44/+539
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "This one is significantly larger than previous ACPI support pull requests because several significant updates have coincided in it. First, there is a routine ACPICA code update, to upstream version 20251212, but this time it covers new ACPI 6.6 material that has not been covered yet. Among other things, it includes definitions of a few new ACPI tables and updates of some others, like the GICv5 MADT structures and ARM IORT IWB node definitions that are used for adding GICv5 ACPI probing on ARM (that technically is IRQ subsystem material, but it depends on the ACPICA changes, so it is included here). The latter alone adds a few hundred lines of new code. Second, there is an update of ACPI _OSC handling including a fix that prevents failures from occurring in some corner cases due to careless handling of _OSC error bits. On top of that, the "system resource" ACPI device objects with the PNP0C01 and PNP0C02 are now going to be handled by the ACPI core device enumeration code instead of handing them over to the legacy PNP system driver which causes device enumeration issues to occur. Some of those issues have been worked around in device drivers and elsewhere and those workarounds should not be necessary any more, so they are going away. Moreover, the time has come to convert all "core ACPI" device drivers that were still using struct acpi_driver objects for device binding into proper platform drivers that use struct platform_driver for this purpose. These updates are accompanied by some requisite core ACPI device enumeration code changes. Next, there are ACPI APEI updates, including changes to avoid excess overhead in the NMI handler and in SEA on the ARM side, changes to unify ACPI-based HW error tracing and logging, and changes to prevent APEI code from reaching out of its allocated memory. There are also some ACPI power management updates, mostly related to the ACPI cpuidle support in the processor driver, suspend-to-idle handling on systems with ACPI support and to ACPI PM of devices. In addition to the above, bugs are fixed and the code is cleaned up in assorted places all over. Specifics: - Update the ACPICA code in the kernel to upstream version 20251212 which includes the following changes: * Add support for new ACPI table DTPR (Michal Camacho Romero) * Release objects with acpi_ut_delete_object_desc() (Zilin Guan) * Add UUIDs for Microsoft fan extensions and UUIDs associated with TPM 2.0 devices (Armin Wolf) * Fix NULL pointer dereference in acpi_ev_address_space_dispatch() (Alexey Simakov) * Add KEYP ACPI table definition (Dave Jiang) * Add support for the Microsoft display mux _OSI string (Armin Wolf) * Add definitions for the IOVT ACPI table (Xianglai Li) * Abort AML bytecode execution on AML_FATAL_OP (Armin Wolf) * Include all fields in subtable type1 for PPTT (Ben Horgan) * Add GICv5 MADT structures and Arm IORT IWB node definitions (Jose Marinho) * Update Parameter Block structure for RAS2 and add a new flag in Memory Affinity Structure for SRAT (Pawel Chmielewski) * Add _VDM (Voltage Domain) object (Pawel Chmielewski) - Add support for GICv5 ACPI probing on ARM which is based on the GICv5 MADT structures and ARM IORT IWB node definitions recently added to ACPICA (Lorenzo Pieralisi) - Rework ACPI PM notification setup for PCI root buses and modify the ACPI PM setup for devices to register wakeup source objects under physical (that is, PCI, platform, etc.) devices instead of doing that under their ACPI companions (Rafael Wysocki) - Adjust debug messages regarding postponed ACPI PM printed during system resume to be more accurate (Rafael Wysocki) - Remove dead code from lps0_device_attach() (Gergo Koteles) - Start to invoke Microsoft Function 9 (Turn On Display) of the Low- Power S0 Idle (LPS0) _DSM in the suspend-to-idle resume flow on systems with ACPI LPS0 support to address a functional issue on Lenovo Yoga Slim 7i Aura (15ILL9), where system fans and keyboard backlights fail to resume after suspend (Jakob Riemenschneider) - Add sysfs attribute cid for exposing _CID lists under ACPI device objects (Rafael Wysocki) - Replace sprintf() with sysfs_emit() in all of the core ACPI sysfs interface code (Sumeet Pawnikar) - Use acpi_get_local_u64_address() in the code implementing ACPI support for PCI to evaluate _ADR instead of evaluating that object directly (Andy Shevchenko) - Add JWIPC JVC9100 to irq1_level_low_skip_override[] to unbreak serial IRQs on that system (Ai Chao) - Fix handling of _OSC errors in acpi_run_osc() to avoid failures on systems where _OSC error bits are set even though the _OSC return buffer contains acknowledged feature bits (Rafael Wysocki) - Clean up and rearrange \_SB._OSC handling for general platform features and USB4 features to avoid code duplication and unnecessary memory management overhead (Rafael Wysocki) - Make the ACPI core device enumeration code handle PNP0C01 and PNP0C02 ("system resource") device objects directly instead of letting the legacy PNP system driver handle them to avoid device enumeration issues on systems where PNP0C02 is present in the _CID list under ACPI device objects with a _HID matching a proper device driver in Linux (Rafael Wysocki) - Drop workarounds for the known device enumeration issues related to _CID lists containing PNP0C02 (Rafael Wysocki) - Drop outdated comment regarding removed function in the ACPI-based device enumeration code (Julia Lawall) - Make PRP0001 device matching work as expected for ACPI device objects using it as a _HID for board development and similar purposes (Kartik Rajput) - Use async schedule function in acpi_scan_clear_dep_fn() to avoid races with user space initialization on some systems (Yicong Yang) - Add a piece of documentation explaining why binding drivers directly to ACPI device objects is not a good idea in general and why it is desirable to convert drivers doing so into proper platform drivers that use struct platform_driver for device binding (Rafael Wysocki) - Convert multiple "core ACPI" drivers, including the NFIT ACPI device driver, the generic ACPI button drivers, the generic ACPI thermal zone driver, the ACPI hardware event device (HED) driver, the ACPI EC driver, the ACPI SMBUS HC driver, the ACPI Smart Battery Subsystem (SBS) driver, and the ACPI backlight (video) driver to proper platform drivers that use struct platform_driver for device binding (Rafael Wysocki) - Use acpi_get_local_u64_address() in the ACPI backlight (video) driver to evaluate _ADR instead of evaluating that object directly (Andy Shevchenko) - Convert the generic ACPI battery driver to a proper platform driver using struct platform_driver for device binding (Rafael Wysocki) - Fix incorrect charging status when current is zero in the generic ACPI battery driver (Ata İlhan Köktürk) - Use LIST_HEAD() for initializing a stack-allocated list in the generic ACPI watchdog device driver (Can Peng) - Rework the ACPI idle driver initialization to register it directly from the common initialization code instead of doing that from a CPU hotplug "online" callback and clean it up (Huisong Li, Rafael Wysocki) - Fix a possible NULL pointer dereference in acpi_processor_errata_piix4() (Tuo Li) - Make read-only array non_mmio_desc[] static const (Colin Ian King) - Prevent the APEI GHES support code on ARM from accessing memory out of bounds or going past the ARM processor CPER record buffer (Mauro Carvalho Chehab) - Prevent cper_print_fw_err() from dumping the entire memory on systems with defective firmware (Mauro Carvalho Chehab) - Improve ghes_notify_nmi() status check to avoid unnecessary overhead in the NMI handler by carrying out all of the requisite preparations and the NMI registration time (Tony Luck) - Refactor the GHES driver by extracting common functionality into reusable helper functions to reduce code duplication and improve the ghes_notify_sea() status check in analogy with the previous ghes_notify_nmi() status check improvement (Shuai Xue) - Make ELOG and GHES log and trace consistently and support the CPER CXL protocol analogously (Fabio De Francesco) - Disable KASAN instrumentation in the APEI GHES driver when compile testing with clang < 18 (Nathan Chancellor) - Let ghes_edac be the preferred driver to load on __ZX__ and _BYO_ systems by extending the platform detection list in the APEI GHES driver (Tony W Wang-oc) - Clean up cppc_perf_caps and cppc_perf_ctrls structs and rename EPP constants for clarity in the ACPI CPPC library (Sumit Gupta)" * tag 'acpi-6.20-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (117 commits) ACPI: battery: fix incorrect charging status when current is zero ACPI: scan: Use async schedule function in acpi_scan_clear_dep_fn() ACPI: x86: s2idle: Invoke Microsoft _DSM Function 9 (Turn On Display) ACPI: APEI: GHES: Add ghes_edac support for __ZX__ and _BYO_ systems ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18 ACPI: sysfs: Replace sprintf() with sysfs_emit() ACPI: CPPC: Rename EPP constants for clarity ACPI: CPPC: Clean up cppc_perf_caps and cppc_perf_ctrls structs ACPI: processor: idle: Rework the handling of acpi_processor_ffh_lpi_probe() ACPI: processor: idle: Convert acpi_processor_setup_cpuidle_dev() to void ACPI: processor: idle: Convert acpi_processor_setup_cpuidle_states() to void irqchip/gic-v5: Add ACPI IWB probing irqchip/gic-v5: Add ACPI ITS probing irqchip/gic-v5: Add ACPI IRS probing irqchip/gic-v5: Split IRS probing into OF and generic portions PCI/MSI: Make the pci_msi_map_rid_ctlr_node() interface firmware agnostic irqdomain: Add parent field to struct irqchip_fwid ACPI: PCI: simplify code with acpi_get_local_u64_address() ACPI: video: simplify code with acpi_get_local_u64_address() ACPI: PM: Adjust messages regarding postponed ACPI PM ...
2026-02-09Merge tag 'for-7.0/block-stable-pages-20260206' of ↵Linus Torvalds-1/+40
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull bounce buffer dio for stable pages from Jens Axboe: "This adds support for bounce buffering of dio for stable pages. This was all done by Christoph. In his words: This series tries to address the problem that under I/O pages can be modified during direct I/O, even when the device or file system require stable pages during I/O to calculate checksums, parity or data operations. It does so by adding block layer helpers to bounce buffer an iov_iter into a bio, then wires that up in iomap and ultimately XFS. The reason that the file system even needs to know about it, is because reads need a user context to copy the data back, and the infrastructure to defer ioends to a workqueue currently sits in XFS. I'm going to look into moving that into ioend and enabling it for other file systems. Additionally btrfs already has it's own infrastructure for this, and actually an urgent need to bounce buffer, so this should be useful there and could be wire up easily. In fact the idea comes from patches by Qu that did this in btrfs. This patch fixes all but one xfstests failures on T10 PI capable devices (generic/095 seems to have issues with a mix of mmap and splice still, I'm looking into that separately), and make qemu VMs running Windows, or Linux with swap enabled fine on an XFS file on a device using PI. Performance numbers on my (not exactly state of the art) NVMe PI test setup: Sequential reads using io_uring, QD=16. Bandwidth and CPU usage (usr/sys): | size | zero copy | bounce | +------+--------------------------+--------------------------+ | 4k | 1316MiB/s (12.65/55.40%) | 1081MiB/s (11.76/49.78%) | | 64K | 3370MiB/s ( 5.46/18.20%) | 3365MiB/s ( 4.47/15.68%) | | 1M | 3401MiB/s ( 0.76/23.05%) | 3400MiB/s ( 0.80/09.06%) | +------+--------------------------+--------------------------+ Sequential writes using io_uring, QD=16. Bandwidth and CPU usage (usr/sys): | size | zero copy | bounce | +------+--------------------------+--------------------------+ | 4k | 882MiB/s (11.83/33.88%) | 750MiB/s (10.53/34.08%) | | 64K | 2009MiB/s ( 7.33/15.80%) | 2007MiB/s ( 7.47/24.71%) | | 1M | 1992MiB/s ( 7.26/ 9.13%) | 1992MiB/s ( 9.21/19.11%) | +------+--------------------------+--------------------------+ Note that the 64k read numbers look really odd to me for the baseline zero copy case, but are reproducible over many repeated runs. The bounce read numbers should further improve when moving the PI validation to the file system and removing the double context switch, which I have patches for that will sent out soon" * tag 'for-7.0/block-stable-pages-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: xfs: use bounce buffering direct I/O when the device requires stable pages iomap: add a flag to bounce buffer direct I/O iomap: support ioends for direct reads iomap: rename IOMAP_DIO_DIRTY to IOMAP_DIO_USER_BACKED iomap: free the bio before completing the dio iomap: share code between iomap_dio_bio_end_io and iomap_finish_ioend_direct iomap: split out the per-bio logic from iomap_dio_bio_iter iomap: simplify iomap_dio_bio_iter iomap: fix submission side handling of completion side errors block: add helpers to bounce buffer an iov_iter into bios block: remove bio_release_page iov_iter: extract a iov_iter_extract_bvecs helper from bio code block: open code bio_add_page and fix handling of mismatching P2P ranges block: refactor get_contig_folio_len block: add a BIO_MAX_SIZE constant and use it
2026-02-09Merge tag 'for-7.0/block-20260206' of ↵Linus Torvalds-23/+181
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Support for batch request processing for ublk, improving the efficiency of the kernel/ublk server communication. This can yield nice 7-12% performance improvements - Support for integrity data for ublk - Various other ublk improvements and additions, including a ton of selftests additions and updated - Move the handling of blk-crypto software fallback from below the block layer to above it. This reduces the complexity of dealing with bio splitting - Series fixing a number of potential deadlocks in blk-mq related to the queue usage counter and writeback throttling and rq-qos debugfs handling - Add an async_depth queue attribute, to resolve a performance regression that's been around for a qhilw related to the scheduler depth handling - Only use task_work for IOPOLL completions on NVMe, if it is necessary to do so. An earlier fix for an issue resulted in all these completions being punted to task_work, to guarantee that completions were only run for a given io_uring ring when it was local to that ring. With the new changes, we can detect if it's necessary to use task_work or not, and avoid it if possible. - rnbd fixes: - Fix refcount underflow in device unmap path - Handle PREFLUSH and NOUNMAP flags properly in protocol - Fix server-side bi_size for special IOs - Zero response buffer before use - Fix trace format for flags - Add .release to rnbd_dev_ktype - MD pull requests via Yu Kuai - Fix raid5_run() to return error when log_init() fails - Fix IO hang with degraded array with llbitmap - Fix percpu_ref not resurrected on suspend timeout in llbitmap - Fix GPF in write_page caused by resize race - Fix NULL pointer dereference in process_metadata_update - Fix hang when stopping arrays with metadata through dm-raid - Fix any_working flag handling in raid10_sync_request - Refactor sync/recovery code path, improve error handling for badblocks, and remove unused recovery_disabled field - Consolidate mddev boolean fields into mddev_flags - Use mempool to allocate stripe_request_ctx and make sure max_sectors is not less than io_opt in raid5 - Fix return value of mddev_trylock - Fix memory leak in raid1_run() - Add Li Nan as mdraid reviewer - Move phys_vec definitions to the kernel types, mostly in preparation for some VFIO and RDMA changes - Improve the speed for secure erase for some devices - Various little rust updates - Various other minor fixes, improvements, and cleanups * tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) blk-mq: ABI/sysfs-block: fix docs build warnings selftests: ublk: organize test directories by test ID block: decouple secure erase size limit from discard size limit block: remove redundant kill_bdev() call in set_blocksize() blk-mq: add documentation for new queue attribute async_dpeth block, bfq: convert to use request_queue->async_depth mq-deadline: covert to use request_queue->async_depth kyber: covert to use request_queue->async_depth blk-mq: add a new queue sysfs attribute async_depth blk-mq: factor out a helper blk_mq_limit_depth() blk-mq-sched: unify elevators checking for async requests block: convert nr_requests to unsigned int block: don't use strcpy to copy blockdev name blk-mq-debugfs: warn about possible deadlock blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs() blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos() blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static blk-rq-qos: fix possible debugfs_mutex deadlock blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter ...
2026-02-09Merge tag 'io_uring-bpf-restrictions.4-20260206' of ↵Linus Torvalds-1/+99
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring bpf filters from Jens Axboe: "This adds support for both cBPF filters for io_uring, as well as task inherited restrictions and filters. seccomp and io_uring don't play along nicely, as most of the interesting data to filter on resides somewhat out-of-band, in the submission queue ring. As a result, things like containers and systemd that apply seccomp filters, can't filter io_uring operations. That leaves them with just one choice if filtering is critical - filter the actual io_uring_setup(2) system call to simply disallow io_uring. That's rather unfortunate, and has limited us because of it. io_uring already has some filtering support. It requires the ring to be setup in a disabled state, and then a filter set can be applied. This filter set is completely bi-modal - an opcode is either enabled or it's not. Once a filter set is registered, the ring can be enabled. This is very restrictive, and it's not useful at all to systemd or containers which really want both broader and more specific control. This first adds support for cBPF filters for opcodes, which enables tighter control over what exactly a specific opcode may do. As examples, specific support is added for IORING_OP_OPENAT/OPENAT2, allowing filtering on resolve flags. And another example is added for IORING_OP_SOCKET, allowing filtering on domain/type/protocol. These are both common use cases. cBPF was chosen rather than eBPF, because the latter is often restricted in containers as well. These filters are run post the init phase of the request, which allows filters to even dip into data that is being passed in struct in user memory, as the init side of requests make that data stable by bringing it into the kernel. This allows filtering without needing to copy this data twice, or have filters etc know about the exact layout of the user data. The filters get the already copied and sanitized data passed. On top of that support is added for per-task filters, meaning that any ring created with a task that has a per-task filter will get those filters applied when it's created. These filters are inherited across fork as well. Once a filter has been registered, any further added filters may only further restrict what operations are permitted. Filters cannot change the return value of an operation, they can only permit or deny it based on the contents" * tag 'io_uring-bpf-restrictions.4-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: io_uring: allow registration of per-task restrictions io_uring: add task fork hook io_uring/bpf_filter: add ref counts to struct io_bpf_filter io_uring/bpf_filter: cache lookup table in ctx->bpf_filters io_uring/bpf_filter: allow filtering on contents of struct open_how io_uring/net: allow filtering on IORING_OP_SOCKET data io_uring: add support for BPF filtering for opcode restrictions
2026-02-09Merge tag 'for-7.0/io_uring-20260206' of ↵Linus Torvalds-10/+31
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull io_uring updates from Jens Axboe: - Clean up the IORING_SETUP_R_DISABLED and submitter task checking, mostly just in preparation for relaxing the locking for SINGLE_ISSUER in the future. - Improve IOPOLL by using a doubly linked list to manage completions. Previously it was singly listed, which meant that to complete request N in the chain 0..N-1 had to have completed first. With a doubly linked list we can complete whatever request completes in that order, rather than need to wait for a consecutive range to be available. This reduces latencies. - Improve the restriction setup and checking. Mostly in preparation for adding further features on top of that. Coming in a separate pull request. - Split out task_work and wait handling into separate files. These are mostly nicely abstracted already, but still remained in the io_uring.c file which is on the larger side. - Use GFP_KERNEL_ACCOUNT in a few more spots, where appropriate. - Ensure even the idle io-wq worker exits if a task no longer has any rings open. - Add support for a non-circular submission queue. By default, the SQ ring keeps moving around, even if only a few entries are used for each submission. This can be wasteful in terms of cachelines. If IORING_SETUP_SQ_REWIND is set for the ring when created, each submission will start at offset 0 instead of where we last left off doing submissions. - Various little cleanups * tag 'for-7.0/io_uring-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (30 commits) io_uring/kbuf: fix memory leak if io_buffer_add_list fails io_uring: Add SPDX id lines to remaining source files io_uring: allow io-wq workers to exit when unused io_uring/io-wq: add exit-on-idle state io_uring/net: don't continue send bundle if poll was required for retry io_uring/rsrc: use GFP_KERNEL_ACCOUNT consistently io_uring/futex: use GFP_KERNEL_ACCOUNT for futex data allocation io_uring/io-wq: handle !sysctl_hung_task_timeout_secs io_uring: fix bad indentation for setup flags if statement io_uring/rsrc: take unsigned index in io_rsrc_node_lookup() io_uring: introduce non-circular SQ io_uring: split out CQ waiting code into wait.c io_uring: split out task work code into tw.c io_uring/io-wq: don't trigger hung task for syzbot craziness io_uring: add IO_URING_EXIT_WAIT_MAX definition io_uring/sync: validate passed in offset io_uring/eventfd: remove unused ctx->evfd_last_cq_tail member io_uring/timeout: annotate data race in io_flush_timeouts() io_uring/uring_cmd: explicitly disallow cancelations for IOPOLL io_uring: fix IOPOLL with passthrough I/O ...
2026-02-09Merge tag 'pull-filename' of ↵Linus Torvalds-25/+30
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs 'struct filename' updates from Al Viro: "[Mostly] sanitize struct filename handling" * tag 'pull-filename' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (68 commits) sysfs(2): fs_index() argument is _not_ a pathname alpha: switch osf_mount() to strndup_user() ksmbd: use CLASS(filename_kernel) mqueue: switch to CLASS(filename) user_statfs(): switch to CLASS(filename) statx: switch to CLASS(filename_maybe_null) quotactl_block(): switch to CLASS(filename) chroot(2): switch to CLASS(filename) move_mount(2): switch to CLASS(filename_maybe_null) namei.c: switch user pathname imports to CLASS(filename{,_flags}) namei.c: convert getname_kernel() callers to CLASS(filename_kernel) do_f{chmod,chown,access}at(): use CLASS(filename_uflags) do_readlinkat(): switch to CLASS(filename_flags) do_sys_truncate(): switch to CLASS(filename) do_utimes_path(): switch to CLASS(filename_uflags) chdir(2): unspaghettify a bit... do_fchownat(): unspaghettify a bit... fspick(2): use CLASS(filename_flags) name_to_handle_at(): use CLASS(filename_uflags) vfs_open_tree(): use CLASS(filename_uflags) ...
2026-02-09Merge tag 'xfs-merge-7.0' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds-0/+1
Pull xfs updates from Carlos Maiolino: "This contains several improvements to zoned device support, performance improvements for the parent pointers, and a new health monitoring feature. There are some improvements in the journaling code too but no behavior change expected. Last but not least, some code refactoring and bug fixes are also included in this series" * tag 'xfs-merge-7.0' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (67 commits) xfs: add sysfs stats for zoned GC xfs: give the defer_relog stat a xs_ prefix xfs: add zone reset error injection xfs: refactor zone reset handling xfs: don't mark all discard issued by zoned GC as sync xfs: allow setting errortags at mount time xfs: use WRITE_ONCE/READ_ONCE for m_errortag xfs: move the guts of XFS_ERRORTAG_DELAY out of line xfs: don't validate error tags in the I/O path xfs: allocate m_errortag early xfs: fix the errno sign for the xfs_errortag_{add,clearall} stubs xfs: validate log record version against superblock log version xfs: fix spacing style issues in xfs_alloc.c xfs: remove xfs_zone_gc_space_available xfs: use a seprate member to track space availabe in the GC scatch buffer xfs: check for deleted cursors when revalidating two btrees xfs: fix UAF in xchk_btree_check_block_owner xfs: check return value of xchk_scrub_create_subord xfs: only call xf{array,blob}_destroy if we have a valid pointer xfs: get rid of the xchk_xfile_*_descr calls ...
2026-02-09Merge tag 'erofs-for-7.0-rc1' of ↵Linus Torvalds-5/+5
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: "In this cycle, inode page cache sharing among filesystems on the same machine is now supported, which is particularly useful for high-density hosts running tens of thousands of containers. In addition, we fully isolate the EROFS core on-disk format from other optional encoded layouts since the core on-disk part is designed to be simple, effective, and secure. Users can use the core format to build unique golden immutable images and import their filesystem trees directly from raw block devices via DMA, page-mapped DAX devices, and/or file-backed mounts without having to worry about unnecessary intrinsic consistency issues found in other generic filesystems by design. However, the full vision is still working in progress and will spend more time to achieve final goals. There are other improvements and bug fixes as usual, as listed below: - Support inode page cache sharing among filesystems - Formally separate optional encoded (aka compressed) inode layouts (and the implementations) from the EROFS core on-disk aligned plain format for future zero-trust security usage - Improve performance by caching the fact that an inode does not have a POSIX ACL - Improve LZ4 decompression error reporting - Enable LZMA by default and promote DEFLATE and Zstandard algorithms out of EXPERIMENTAL status - Switch to inode_set_cached_link() to cache symlink lengths - random bugfixes and minor cleanups" * tag 'erofs-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: (31 commits) erofs: fix UAF issue for file-backed mounts w/ directio option erofs: update compression algorithm status erofs: fix inline data read failure for ztailpacking pclusters erofs: avoid some unnecessary #ifdefs erofs: handle end of filesystem properly for file-backed mounts erofs: separate plain and compressed filesystems formally erofs: use inode_set_cached_link() erofs: mark inodes without acls in erofs_read_inode() erofs: implement .fadvise for page cache share erofs: support compressed inodes for page cache share erofs: support unencoded inodes for page cache share erofs: pass inode to trace_erofs_read_folio erofs: introduce the page cache share feature erofs: using domain_id in the safer way erofs: add erofs_inode_set_aops helper to set the aops erofs: support user-defined fingerprint name erofs: decouple `struct erofs_anon_fs_type` fs: Export alloc_empty_backing_file erofs: tidy up erofs_init_inode_xattrs() erofs: add missing documentation about `directio` mount option ...
2026-02-09Merge tag 'nilfs2-v7.0-tag1' of ↵Linus Torvalds-68/+99
git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2 Pull nilfs2 updates from Viacheslav Dubeyko: - Fix potential block overflow that cause system hang When executing the FITRIM command, an underflow can occur in the calculation of nblocks. This ultimately leads to the block layer function __blkdev_issue_discard() taking an excessively long time to process the bio chain, and the ns_segctor_sem lock remains held for a long period. This prevents other tasks from acquiring the ns_segctor_sem lock, resulting in a hang reported by syzbot (Edward Adam Davis) - Fix missing struct keywords in nilfs2_api.h kernel-doc (Ryusuke Konishi) - Convert nilfs_super_block to kernel-doc Eliminate 40+ kernel-doc warnings in nilfs2_ondisk.h by converting all of the struct member comments to kernel-doc comments (Randy Dunlap) * tag 'nilfs2-v7.0-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/nilfs2: nilfs2: fix missing struct keywords in nilfs2_api.h kernel-doc nilfs2: convert nilfs_super_block to kernel-doc nilfs2: Fix potential block overflow that cause system hang
2026-02-09Merge tag 'for-6.20-tag' of ↵Linus Torvalds-2/+33
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "User visible changes, feature updates: - when using block size > page size, enable direct IO - fallback to buffered IO if the data profile has duplication, workaround to avoid checksum mismatches on block group profiles with redundancy, real direct IO is possible on single or RAID0 - redo export of zoned statistics, moved from sysfs to /proc/pid/mountstats due to size limitations of the former Experimental features: - remove offload checksum tunable, intended to find best way to do it but since we've switched to offload to thread for everything we don't need it anymore - initial support for remap-tree feature, a translation layer of logical block addresses that allow changes without moving/rewriting blocks to do eg. relocation, or other changes that require COW Notable fixes: - automatic removal of accidentally leftover chunks when free-space-tree is enabled since mkfs.btrfs v6.16.1 - zoned mode: - do not try to append to conventional zones when RAID is mixing zoned and conventional drives - fixup write pointers when mixing zoned and conventional on DUP/RAID* profiles - when using squota, relax deletion rules for qgroups with 0 members to allow easier recovery from accounting bugs, also add more checks to detect bad accounting - fix periodic reclaim scanning, properly check boundary conditions not to trigger it unexpectedly or miss the time to run it - trim: - continue after first error - change reporting to the first detected error - add more cancellation points - reduce contention of big device lock that can block other operations when there's lots of trimmed space - when chunk allocation is forced (needs experimental build) fix transaction abort when unexpected space layout is detected Core: - switch to crypto library API for checksumming, removed module dependencies, pointer indirections, etc. - error handling improvements - adjust how and where transaction commit or abort are done and are maybe not necessary - minor compression optimization to skip single block ranges - improve how compression folios are handled - new and updated selftests - cleanups, refactoring: - auto-freeing and other automatic variable cleanup conversion - structure size optimizations - condition annotations" * tag 'for-6.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (137 commits) btrfs: get rid of compressed_bio::compressed_folios[] btrfs: get rid of compressed_folios[] usage for encoded writes btrfs: get rid of compressed_folios[] usage for compressed read btrfs: remove the old btrfs_compress_folios() infrastructure btrfs: switch to btrfs_compress_bio() interface for compressed writes btrfs: introduce btrfs_compress_bio() helper btrfs: zlib: introduce zlib_compress_bio() helper btrfs: zstd: introduce zstd_compress_bio() helper btrfs: lzo: introduce lzo_compress_bio() helper btrfs: zoned: factor out the zone loading part into a testable function btrfs: add cleanup function for btrfs_free_chunk_map btrfs: tests: add cleanup functions for test specific functions btrfs: raid56: fix memory leak of btrfs_raid_bio::stripe_uptodate_bitmap btrfs: tests: add unit tests for pending extent walking functions btrfs: fix EEXIST abort due to non-consecutive gaps in chunk allocation btrfs: fix transaction commit blocking during trim of unallocated space btrfs: handle user interrupt properly in btrfs_trim_fs() btrfs: preserve first error in btrfs_trim_fs() btrfs: continue trimming remaining devices on failure btrfs: do not BUG_ON() in btrfs_remove_block_group() ...
2026-02-09Merge tag 'vfs-7.0-rc1.misc' of ↵Linus Torvalds-20/+47
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains a mix of VFS cleanups, performance improvements, API fixes, documentation, and a deprecation notice. Scalability and performance: - Rework pid allocation to only take pidmap_lock once instead of twice during alloc_pid(), improving thread creation/teardown throughput by 10-16% depending on false-sharing luck. Pad the namespace refcount to reduce false-sharing - Track file lock presence via a flag in ->i_opflags instead of reading ->i_flctx, avoiding false-sharing with ->i_readcount on open/close hot paths. Measured 4-16% improvement on 24-core open-in-a-loop benchmarks - Use a consume fence in locks_inode_context() to match the store-release/load-consume idiom, eliminating a hardware fence on some architectures - Annotate cdev_lock with __cacheline_aligned_in_smp to prevent false-sharing - Remove a redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu() that never fires since the caller already verifies it, eliminating a 100% mispredicted branch - Fix a 100% mispredicted likely() in devcgroup_inode_permission() that became wrong after a prior code reorder Bug fixes and correctness: - Make insert_inode_locked() wait for inode destruction instead of skipping, fixing a corner case where two matching inodes could exist in the hash - Move f_mode initialization before file_ref_init() in alloc_file() to respect the SLAB_TYPESAFE_BY_RCU ordering contract - Add a WARN_ON_ONCE guard in try_to_free_buffers() for folios with no buffers attached, preventing a null pointer dereference when AS_RELEASE_ALWAYS is set but no release_folio op exists - Fix select restart_block to store end_time as timespec64, avoiding truncation of tv_sec on 32-bit architectures - Make dump_inode() use get_kernel_nofault() to safely access inode and superblock fields, matching the dump_mapping() pattern API modernization: - Make posix_acl_to_xattr() allocate the buffer internally since every single caller was doing it anyway. Reduces boilerplate and unnecessary error checking across ~15 filesystems - Replace deprecated simple_strtoul() with kstrtoul() for the ihash_entries, dhash_entries, mhash_entries, and mphash_entries boot parameters, adding proper error handling - Convert chardev code to use guard(mutex) and __free(kfree) cleanup patterns - Replace min_t() with min() or umin() in VFS code to avoid silently truncating unsigned long to unsigned int - Gate LOOKUP_RCU assertions behind CONFIG_DEBUG_VFS since callers already check the flag Deprecation: - Begin deprecating legacy BSD process accounting (acct(2)). The interface has numerous footguns and better alternatives exist (eBPF) Documentation: - Fix and complete kernel-doc for struct export_operations, removing duplicated documentation between ReST and source - Fix kernel-doc warnings for __start_dirop() and ilookup5_nowait() Testing: - Add a kunit test for initramfs cpio handling of entries with filesize > PATH_MAX Misc: - Add missing <linux/init_task.h> include in fs_struct.c" * tag 'vfs-7.0-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (28 commits) posix_acl: make posix_acl_to_xattr() alloc the buffer fs: make insert_inode_locked() wait for inode destruction initramfs_test: kunit test for cpio.filesize > PATH_MAX fs: improve dump_inode() to safely access inode fields fs: add <linux/init_task.h> for 'init_fs' docs: exportfs: Use source code struct documentation fs: move initializing f_mode before file_ref_init() exportfs: Complete kernel-doc for struct export_operations exportfs: Mark struct export_operations functions at kernel-doc exportfs: Fix kernel-doc output for get_name() acct(2): begin the deprecation of legacy BSD process accounting device_cgroup: remove branch hint after code refactor VFS: fix __start_dirop() kernel-doc warnings fs: Describe @isnew parameter in ilookup5_nowait() fs/namei: Remove redundant DCACHE_MANAGED_DENTRY check in __follow_mount_rcu fs: only assert on LOOKUP_RCU when built with CONFIG_DEBUG_VFS select: store end_time as timespec64 in restart block chardev: Switch to guard(mutex) and __free(kfree) namespace: Replace simple_strtoul with kstrtoul to parse boot params dcache: Replace simple_strtoul with kstrtoul in set_dhash_entries ...
2026-02-09Merge tag 'vfs-7.0-rc1.iomap' of ↵Linus Torvalds-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs iomap updates from Christian Brauner: - Erofs page cache sharing preliminaries: Plumb a void *private parameter through iomap_read_folio() and iomap_readahead() into iomap_iter->private, matching iomap DIO. Erofs uses this to replace a bogus kmap_to_page() call, as preparatory work for page cache sharing. - Fix for invalid folio access: Fix an invalid folio access when a folio without iomap_folio_state is fully submitted to the IO helper — the helper may call folio_end_read() at any time, so ctx->cur_folio must be invalidated after full submission. * tag 'vfs-7.0-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: iomap: fix invalid folio access after folio_end_read() erofs: hold read context in iomap_iter if needed iomap: stash iomap read ctx in the private field of iomap_iter
2026-02-09Merge tag 'vfs-7.0-rc1.namespace' of ↵Linus Torvalds-5/+11
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount updates from Christian Brauner: - statmount: accept fd as a parameter Extend struct mnt_id_req with a file descriptor field and a new STATMOUNT_BY_FD flag. When set, statmount() returns mount information for the mount the fd resides on — including detached mounts (unmounted via umount2(MNT_DETACH)). For detached mounts the STATMOUNT_MNT_POINT and STATMOUNT_MNT_NS_ID mask bits are cleared since neither is meaningful. The capability check is skipped for STATMOUNT_BY_FD since holding an fd already implies prior access to the mount and equivalent information is available through fstatfs() and /proc/pid/mountinfo without privilege. Includes comprehensive selftests covering both attached and detached mount cases. - fs: Remove internal old mount API code (1 patch) Now that every in-tree filesystem has been converted to the new mount API, remove all the legacy shim code in fs_context.c that handled unconverted filesystems. This deletes ~280 lines including legacy_init_fs_context(), the legacy_fs_context struct, and associated wrappers. The mount(2) syscall path for userspace remains untouched. Documentation references to the legacy callbacks are cleaned up. - mount: add OPEN_TREE_NAMESPACE to open_tree() Container runtimes currently use CLONE_NEWNS to copy the caller's entire mount namespace — only to then pivot_root() and recursively unmount everything they just copied. With large mount tables and thousands of parallel container launches this creates significant contention on the namespace semaphore. OPEN_TREE_NAMESPACE copies only the specified mount tree (like OPEN_TREE_CLONE) but returns a mount namespace fd instead of a detached mount fd. The new namespace contains the copied tree mounted on top of a clone of the real rootfs. This functions as a combined unshare(CLONE_NEWNS) + pivot_root() in a single syscall. Works with user namespaces: an unshare(CLONE_NEWUSER) followed by OPEN_TREE_NAMESPACE creates a mount namespace owned by the new user namespace. Mount namespace file mounts are excluded from the copy to prevent cycles. Includes ~1000 lines of selftests" * tag 'vfs-7.0-rc1.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: selftests/open_tree: add OPEN_TREE_NAMESPACE tests mount: add OPEN_TREE_NAMESPACE fs: Remove internal old mount API code selftests: statmount: tests for STATMOUNT_BY_FD statmount: accept fd as a parameter statmount: permission check should return EPERM
2026-02-09Merge tag 'vfs-7.0-rc1.atomic_open' of ↵Linus Torvalds-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs atomic_open updates from Christian Brauner: "Allow knfsd to use atomic_open() While knfsd offers combined exclusive create and open results to clients, on some filesystems those results are not atomic. The separate vfs_create() + vfs_open() sequence in dentry_create() can produce races and unexpected errors. For example, open O_CREAT with mode 0 will succeed in creating the file but return -EACCES from vfs_open(). Additionally, network filesystems benefit from reducing remote round-trip operations by using a single atomic_open() call. Teach dentry_create() -- whose sole caller is knfsd -- to use atomic_open() for filesystems that support it" * tag 'vfs-7.0-rc1.atomic_open' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs/namei: fix kernel-doc markup for dentry_create VFS/knfsd: Teach dentry_create() to use atomic_open() VFS: Prepare atomic_open() for dentry_create() VFS: move dentry_create() from fs/open.c to fs/namei.c
2026-02-09Merge tag 'vfs-7.0-rc1.nullfs' of ↵Linus Torvalds-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs nullfs update from Christian Brauner: "Add a completely catatonic minimal pseudo filesystem called "nullfs" and make pivot_root() work in the initramfs. Currently pivot_root() does not work on the real rootfs because it cannot be unmounted. Userspace has to recursively delete initramfs contents manually before continuing boot, using the fragile switch_root sequence (overmount + chroot). Add nullfs, a minimal immutable filesystem that serves as the true root of the mount hierarchy. The mutable rootfs (tmpfs/ramfs) is mounted on top of it. This allows userspace to simply: chdir(new_root); pivot_root(".", "."); umount2(".", MNT_DETACH); without the traditional switch_root workarounds. systemd already handles this correctly. It tries pivot_root() first and falls back to MS_MOVE only when that fails. This also means rootfs mounts in unprivileged namespaces no longer need MNT_LOCKED, since the immutable nullfs guarantees nothing can be revealed by unmounting the covering mount. nullfs is a single-instance filesystem (get_tree_single()) marked SB_NOUSER | SB_I_NOEXEC | SB_I_NODEV with an immutable empty root directory. This means sooner or later it can be used to overmount other directories to hide their contents without any additional protection needed. We enable it unconditionally. If we see any real regression we'll hide it behind a boot option. nullfs has extensions beyond this in the future. It will serve as a concept to support the creation of completely empty mount namespaces - which is work coming up in the next cycle" * tag 'vfs-7.0-rc1.nullfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fs: use nullfs unconditionally as the real rootfs docs: mention nullfs fs: add immutable rootfs fs: add init_pivot_root() fs: ensure that internal tmpfs mount gets mount id zero
2026-02-09Merge tag 'vfs-7.0-rc1.btrfs' of ↵Linus Torvalds-0/+5
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs updates for btrfs from Christian Brauner: "This contains some changes for btrfs that are taken to the vfs tree to stop duplicating VFS code for subvolume/snapshot dentry Btrfs has carried private copies of the VFS may_delete() and may_create() functions in fs/btrfs/ioctl.c for permission checks during subvolume creation and snapshot destruction. These copies have drifted out of sync with the VFS originals — btrfs_may_delete() is missing the uid/gid validity check and btrfs_may_create() is missing the audit_inode_child() call. Export the VFS functions as may_{create,delete}_dentry() and switch btrfs to use them, removing ~70 lines of duplicated code" * tag 'vfs-7.0-rc1.btrfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: btrfs: use may_create_dentry() in btrfs_mksubvol() btrfs: use may_delete_dentry() in btrfs_ioctl_snap_destroy() fs: export may_create() as may_create_dentry() fs: export may_delete() as may_delete_dentry()
2026-02-09Merge tag 'vfs-7.0-rc1.fserror' of ↵Linus Torvalds-3/+84
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs error reporting updates from Christian Brauner: "This contains the changes to support generic I/O error reporting. Filesystems currently have no standard mechanism for reporting metadata corruption and file I/O errors to userspace via fsnotify. Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines EFSCORRUPTED, and error reporting to fanotify is inconsistent or absent entirely. This introduces a generic fserror infrastructure built around struct super_block that gives filesystems a standard way to queue metadata and file I/O error reports for delivery to fsnotify. Errors are queued via mempools and queue_work to avoid holding filesystem locks in the notification path; unmount waits for pending events to drain. A new super_operations::report_error callback lets filesystem drivers respond to file I/O errors themselves (to be used by an upcoming XFS self-healing patchset). On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private per-filesystem definitions to canonical errno.h values across all architectures" * tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ext4: convert to new fserror helpers xfs: translate fsdax media errors into file "data lost" errors when convenient xfs: report fs metadata errors via fsnotify iomap: report file I/O errors to the VFS fs: report filesystem and file I/O errors to fsnotify uapi: promote EFSCORRUPTED and EUCLEAN to errno.h
2026-02-09Merge tag 'vfs-7.0-rc1.leases' of ↵Linus Torvalds-1/+0
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs lease updates from Christian Brauner: "This contains updates for lease support to require filesystems to explicitly opt-in to lease support Currently kernel_setlease() falls through to generic_setlease() when a a filesystem does not define ->setlease(), silently granting lease support to every filesystem regardless of whether it is prepared for it. This is a poor default: most filesystems never intended to support leases, and the silent fallthrough makes it impossible to distinguish "supports leases" from "never thought about it". This inverts the default. It adds explicit .setlease = generic_setlease; assignments to every in-tree filesystem that should retain lease support, then changes kernel_setlease() to return -EINVAL when ->setlease is NULL. With the new default in place, simple_nosetlease() is redundant and is removed along with all references to it" * tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits) fuse: add setlease file operation fs: remove simple_nosetlease() filelock: default to returning -EINVAL when ->setlease operation is NULL xfs: add setlease file operation ufs: add setlease file operation udf: add setlease file operation tmpfs: add setlease file operation squashfs: add setlease file operation overlayfs: add setlease file operation orangefs: add setlease file operation ocfs2: add setlease file operation ntfs3: add setlease file operation nilfs2: add setlease file operation jfs: add setlease file operation jffs2: add setlease file operation gfs2: add a setlease file operation fat: add setlease file operation f2fs: add setlease file operation exfat: add setlease file operation ext4: add setlease file operation ...