summaryrefslogtreecommitdiffstats
path: root/arch/riscv/kernel
AgeCommit message (Collapse)AuthorLines
2026-05-13riscv: misaligned: Make enabling delegation depend on NONPORTABLEVivian Wang-1/+1
The unaligned access emulation code in Linux has various deficiencies. For example, it doesn't emulate vector instructions [1] [2], and doesn't emulate KVM guest accesses. Therefore, requesting misaligned exception delegation with SBI FWFT actually regresses vector instructions' and KVM guests' behavior. Until Linux can handle it properly, guard these sbi_fwft_set() calls behind RISCV_SBI_FWFT_DELEGATE_MISALIGNED, which in turn depends on NONPORTABLE. Those who are sure that this wouldn't be a problem can enable this option, perhaps getting better performance. The rest of the existing code proceeds as before, except as if SBI_FWFT_MISALIGNED_EXC_DELEG is not available, to handle any remaining address misaligned exceptions on a best-effort basis. The KVM SBI FWFT implementation is also not touched, but it is disabled if the firmware emulates unaligned accesses. Cc: stable@vger.kernel.org Fixes: cf5a8abc6560 ("riscv: misaligned: request misaligned exception from SBI") Reported-by: Songsong Zhang <U2FsdGVkX1@gmail.com> # KVM Link: https://lore.kernel.org/linux-riscv/38ce44c1-08cf-4e3f-8ade-20da224f529c@iscas.ac.cn/ [1] Link: https://lore.kernel.org/linux-riscv/b3cfcdac-0337-4db0-a611-258f2868855f@iscas.ac.cn/ [2] Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20260401-riscv-misaligned-dont-delegate-v2-1-5014a288c097@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-05-13riscv: cfi: reduce shadow stack size limit from 4GB to 2GBZong Li-3/+4
Follow the ARM64 GCS (Guarded Control Stack) implementation approach by reducing the shadow stack size allocation from min(RLIMIT_STACK, 4GB) to min(RLIMIT_STACK/2, 2GB). See commit 506496bcbb42 ("arm64/gcs: Ensure that new threads have a GCS") Rationale: 1. Shadow stacks only store return addresses (8 bytes per entry), not local variables, function parameters, or saved registers. A 2GB shadow stack is far more than sufficient for any practical application, even with extremely deep recursion. Using half the size maintains adequate margin while being more resource-efficient. 2. On memory-constrained systems (e.g., platforms with only 4GB of physical memory, which is a common configuration), allocating 4GB of virtual address space for shadow stack per process/thread can lead to virtual memory allocation failures when the overcommit mode is set to OVERCOMMIT_GUESS or OVERCOMMIT_NEVER: Error: "__vm_enough_memory: not enough memory for the allocation" This reduces virtual address space consumption by 50% while maintaining more than adequate space for return address storage. Signed-off-by: Zong Li <zong.li@sifive.com> Link: https://patch.msgid.link/20260428024105.645162-1-zong.li@sifive.com [pjw@kernel.org: clean up patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-05-07riscv: cpufeature: Use pre-defined ISA ext macros to index isa2hwcapHui Wang-8/+8
We have pre-defined ISA extension macros, here use those macros to replace a magic number for isa2hwcap definition and some array indexing for isa2hwcap access. This doesn't change the original functionality, just improve the code maintainability and readability. Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://patch.msgid.link/20260506132152.53239-1-hui.wang@canonical.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-05-01riscv: Fix register corruption from uninitialized cregs on errorMichael Neuling-2/+4
compat_riscv_gpr_set() calls cregs_to_regs() unconditionally, even when user_regset_copyin() fails. Since cregs is an uninitialized stack variable, a copyin failure causes uninitialized stack data to be written into the target task's pt_regs, corrupting its register state and potentially leaking kernel stack contents. compat_restore_sigcontext() has the same issue: it calls cregs_to_regs() even when __copy_from_user() fails, leading to the same corruption of the signal-returning task's register state on error. Only call cregs_to_regs() when the user copy succeeds. Fixes: 4608c159594f ("riscv: compat: ptrace: Add compat_arch_ptrace implement") Fixes: 7383ee05314b ("riscv: compat: signal: Add rt_frame implementation") Signed-off-by: Michael Neuling <mikey@neuling.org> Assisted-by: Cursor:claude-4.6-opus-high-thinking Link: https://patch.msgid.link/20260501062320.2339562-1-mikey@neuling.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-30riscv: cpufeature: Drop this_hwcap clear in T-Head vector workaroundHui Wang-3/+1
The variable this_hwcap is initialized to 0 for each loop, it is not necessary to do the bit clearance since this_hwcap is still 0 at this point, clearing the source_isa is enough here. Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://patch.msgid.link/20260430045350.22213-1-hui.wang@canonical.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-29riscv: Define __riscv_copy_{,vec_}{words,bytes}_unaligned() using ↵Nathan Chancellor-4/+6
SYM_TYPED_FUNC_START After commit 67bdd7b01387 ("riscv: Split out measure_cycles() for reuse") and commit c03ad15f7cf6 ("riscv: Reuse measure_cycles() in check_vector_unaligned_access()"), there are CFI failure when booting kernels with CONFIG_CFI=y: CFI failure at measure_cycles+0x38/0xe0 (target: __riscv_copy_words_unaligned+0x0/0x50; expected type: ...) CFI failure at measure_cycles+0x38/0xe0 (target: __riscv_copy_vec_words_unaligned+0x0/0x24; expected type: ...) The __riscv_copy_*_unaligned() functions are now called indirectly but they are not defined with SYM_TYPED_FUNC_START, which is required for assembly functions called indirectly from C to pass CFI checking. Switch to SYM_TYPED_FUNC_START to clear up the CFI failures. Fixes: 67bdd7b01387 ("riscv: Split out measure_cycles() for reuse") Fixes: c03ad15f7cf6 ("riscv: Reuse measure_cycles() in check_vector_unaligned_access()") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Sami Tolvanen <samitolvanen@google.com> Reviewed-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/20260406-measure_cycles-cfi-failure-v1-1-03e0234ae02f@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-24Merge tag 'riscv-for-linus-7.1-mw1' of ↵Linus Torvalds-225/+123
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: "There is one significant change outside arch/riscv in this pull request: the addition of a set of KUnit tests for strlen(), strnlen(), and strrchr(). Otherwise, the most notable changes are to add some RISC-V-specific string function implementations, to remove XIP kernel support, to add hardware error exception handling, and to optimize our runtime unaligned access speed testing. A few comments on the motivation for removing XIP support. It's been broken in the RISC-V kernel for months. The code is not easy to maintain. Furthermore, for XIP support to truly be useful for RISC-V, we think that compile-time feature switches would need to be added for many of the RISC-V ISA features and microarchitectural properties that are currently implemented with runtime patching. No one has stepped forward to take responsibility for that work, so many of us think it's best to remove it until clear use cases and champions emerge. Summary: - Add Kunit correctness testing and microbenchmarks for strlen(), strnlen(), and strrchr() - Add RISC-V-specific strnlen(), strchr(), strrchr() implementations - Add hardware error exception handling - Clean up and optimize our unaligned access probe code - Enable HAVE_IOREMAP_PROT to be able to use generic_access_phys() - Remove XIP kernel support - Warn when addresses outside the vmemmap range are passed to vmemmap_populate() - Update the ACPI FADT revision check to warn if it's not at least ACPI v6.6, which is when key RISC-V-specific tables were added to the specification - Increase COMMAND_LINE_SIZE to 2048 to match ARM64, x86, PowerPC, etc. - Make kaslr_offset() a static inline function, since there's no need for it to show up in the symbol table - Add KASLR offset and SATP to the VMCOREINFO ELF notes to improve kdump support - Add Makefile cleanup rule for vdso_cfi copied source files, and add a .gitignore for the build artifacts in that directory - Remove some redundant ifdefs that check Kconfig macros - Add missing SPDX license tag to the CFI selftest - Simplify UTS_MACHINE assignment in the RISC-V Makefile - Clarify some unclear comments and remove some superfluous comments - Fix various English typos across the RISC-V codebase" * tag 'riscv-for-linus-7.1-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (31 commits) riscv: Remove support for XIP kernel riscv: Reuse compare_unaligned_access() in check_vector_unaligned_access() riscv: Split out compare_unaligned_access() riscv: Reuse measure_cycles() in check_vector_unaligned_access() riscv: Split out measure_cycles() for reuse riscv: Clean up & optimize unaligned scalar access probe riscv: lib: add strrchr() implementation riscv: lib: add strchr() implementation riscv: lib: add strnlen() implementation lib/string_kunit: extend benchmarks to strnlen() and chr searches lib/string_kunit: add performance benchmark for strlen() lib/string_kunit: add correctness test for strrchr() lib/string_kunit: add correctness test for strnlen() lib/string_kunit: add correctness test for strlen() riscv: vdso_cfi: Add .gitignore for build artifacts riscv: vdso_cfi: Add clean rule for copied sources riscv: enable HAVE_IOREMAP_PROT riscv: mm: WARN_ON() for bad addresses in vmemmap_populate() riscv: acpi: update FADT revision check to 6.6 riscv: add hardware error trap handler support ...
2026-04-15Merge tag 'mm-stable-2026-04-13-21-45' of ↵Linus Torvalds-11/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "maple_tree: Replace big node with maple copy" (Liam Howlett) Mainly prepararatory work for ongoing development but it does reduce stack usage and is an improvement. - "mm, swap: swap table phase III: remove swap_map" (Kairui Song) Offers memory savings by removing the static swap_map. It also yields some CPU savings and implements several cleanups. - "mm: memfd_luo: preserve file seals" (Pratyush Yadav) File seal preservation to LUO's memfd code - "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan Chen) Additional userspace stats reportng to zswap - "arch, mm: consolidate empty_zero_page" (Mike Rapoport) Some cleanups for our handling of ZERO_PAGE() and zero_pfn - "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu Han) A robustness improvement and some cleanups in the kmemleak code - "Improve khugepaged scan logic" (Vernon Yang) Improve khugepaged scan logic and reduce CPU consumption by prioritizing scanning tasks that access memory frequently - "Make KHO Stateless" (Jason Miu) Simplify Kexec Handover by transitioning KHO from an xarray-based metadata tracking system with serialization to a radix tree data structure that can be passed directly to the next kernel - "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas Ballasi and Steven Rostedt) Enhance vmscan's tracepointing - "mm: arch/shstk: Common shadow stack mapping helper and VM_NOHUGEPAGE" (Catalin Marinas) Cleanup for the shadow stack code: remove per-arch code in favour of a generic implementation - "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin) Fix a WARN() which can be emitted the KHO restores a vmalloc area - "mm: Remove stray references to pagevec" (Tal Zussman) Several cleanups, mainly udpating references to "struct pagevec", which became folio_batch three years ago - "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl Shutsemau) Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail pages encode their relationship to the head page - "mm/damon/core: improve DAMOS quota efficiency for core layer filters" (SeongJae Park) Improve two problematic behaviors of DAMOS that makes it less efficient when core layer filters are used - "mm/damon: strictly respect min_nr_regions" (SeongJae Park) Improve DAMON usability by extending the treatment of the min_nr_regions user-settable parameter - "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka) The proper fix for a previously hotfixed SMP=n issue. Code simplifications and cleanups ensued - "mm: cleanups around unmapping / zapping" (David Hildenbrand) A bunch of cleanups around unmapping and zapping. Mostly simplifications, code movements, documentation and renaming of zapping functions - "support batched checking of the young flag for MGLRU" (Baolin Wang) Batched checking of the young flag for MGLRU. It's part cleanups; one benchmark shows large performance benefits for arm64 - "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner) memcg cleanup and robustness improvements - "Allow order zero pages in page reporting" (Yuvraj Sakshith) Enhance free page reporting - it is presently and undesirably order-0 pages when reporting free memory. - "mm: vma flag tweaks" (Lorenzo Stoakes) Cleanup work following from the recent conversion of the VMA flags to a bitmap - "mm/damon: add optional debugging-purpose sanity checks" (SeongJae Park) Add some more developer-facing debug checks into DAMON core - "mm/damon: test and document power-of-2 min_region_sz requirement" (SeongJae Park) An additional DAMON kunit test and makes some adjustments to the addr_unit parameter handling - "mm/damon/core: make passed_sample_intervals comparisons overflow-safe" (SeongJae Park) Fix a hard-to-hit time overflow issue in DAMON core - "mm/damon: improve/fixup/update ratio calculation, test and documentation" (SeongJae Park) A batch of misc/minor improvements and fixups for DAMON - "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David Hildenbrand) Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code movement was required. - "zram: recompression cleanups and tweaks" (Sergey Senozhatsky) A somewhat random mix of fixups, recompression cleanups and improvements in the zram code - "mm/damon: support multiple goal-based quota tuning algorithms" (SeongJae Park) Extend DAMOS quotas goal auto-tuning to support multiple tuning algorithms that users can select - "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao) Fix the khugpaged sysfs handling so we no longer spam the logs with reams of junk when starting/stopping khugepaged - "mm: improve map count checks" (Lorenzo Stoakes) Provide some cleanups and slight fixes in the mremap, mmap and vma code - "mm/damon: support addr_unit on default monitoring targets for modules" (SeongJae Park) Extend the use of DAMON core's addr_unit tunable - "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache) Cleanups to khugepaged and is a base for Nico's planned khugepaged mTHP support - "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand) Code movement and cleanups in the memhotplug and sparsemem code - "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup CONFIG_MIGRATION" (David Hildenbrand) Rationalize some memhotplug Kconfig support - "change young flag check functions to return bool" (Baolin Wang) Cleanups to change all young flag check functions to return bool - "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh Law and SeongJae Park) Fix a few potential DAMON bugs - "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo Stoakes) Convert a lot of the existing use of the legacy vm_flags_t data type to the new vma_flags_t type which replaces it. Mainly in the vma code. - "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes) Expand the mmap_prepare functionality, which is intended to replace the deprecated f_op->mmap hook which has been the source of bugs and security issues for some time. Cleanups, documentation, extension of mmap_prepare into filesystem drivers - "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes) Simplify and clean up zap_huge_pmd(). Additional cleanups around vm_normal_folio_pmd() and the softleaf functionality are performed. * tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits) mm: fix deferred split queue races during migration mm/khugepaged: fix issue with tracking lock mm/huge_memory: add and use has_deposited_pgtable() mm/huge_memory: add and use normal_or_softleaf_folio_pmd() mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio() mm/huge_memory: separate out the folio part of zap_huge_pmd() mm/huge_memory: use mm instead of tlb->mm mm/huge_memory: remove unnecessary sanity checks mm/huge_memory: deduplicate zap deposited table call mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE() mm/huge_memory: add a common exit path to zap_huge_pmd() mm/huge_memory: handle buggy PMD entry in zap_huge_pmd() mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc mm/huge: avoid big else branch in zap_huge_pmd() mm/huge_memory: simplify vma_is_specal_huge() mm: on remap assert that input range within the proposed VMA mm: add mmap_action_map_kernel_pages[_full]() uio: replace deprecated mmap hook with mmap_prepare in uio_info drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare mm: allow handling of stacked mmap_prepare hooks in more drivers ...
2026-04-13Merge tag 'acpi-7.1-rc1' of ↵Linus Torvalds-3/+22
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI support updates from Rafael Wysocki: "These include an update of the CMOS RTC driver and the related ACPI and x86 code that, among other things, switches it over to using the platform device interface for device binding on x86 instead of the PNP device driver interface (which allows the code in question to be simplified quite a bit), a major update of the ACPI Time and Alarm Device (TAD) driver adding an RTC class device interface to it, and updates of core ACPI drivers that remove some unnecessary and not really useful code from them. Apart from that, two drivers are converted to using the platform driver interface for device binding instead of the ACPI driver one, which is slated for removal, support for the Performance Limited register is added to the ACPI CPPC library and there are some janitorial updates of it and the related cpufreq CPPC driver, the ACPI processor driver is fixed and cleaned up, and NVIDIA vendor CPER record handler is added to the APEI GHES code. Also, the interface for obtaining a CPU UID from ACPI is consolidated across architectures and used for fixing a problem with the PCI TPH Steering Tag on ARM64, there are two updates related to ACPICA, a minor ACPI OS Services Layer (OSL) update, and a few assorted updates related to ACPI tables parsing. Specifics: - Update maintainers information regarding ACPICA (Rafael Wysocki) - Replace strncpy() with strscpy_pad() in acpi_ut_safe_strncpy() (Kees Cook) - Trigger an ordered system power off after encountering a fatal error operator in AML (Armin Wolf) - Enable ACPI FPDT parsing on LoongArch (Xi Ruoyao) - Remove the temporary stop-gap acpi_pptt_cache_v1_full structure from the ACPI PPTT parser (Ben Horgan) - Add support for exposing ACPI FPDT subtables FBPT and S3PT (Nate DeSimone) - Address multiple assorted issues and clean up the code in the ACPI processor idle driver (Huisong Li) - Replace strlcat() in the ACPI processor idle drive with a better alternative (Andy Shevchenko) - Rearrange and clean up acpi_processor_errata_piix4() (Rafael Wysocki) - Move reference performance to capabilities and fix an uninitialized variable in the ACPI CPPC library (Pengjie Zhang) - Add support for the Performance Limited Register to the ACPI CPPC library (Sumit Gupta) - Add cppc_get_perf() API to read performance controls, extend cppc_set_epp_perf() for FFH/SystemMemory, and make the ACPI CPPC library warn on missing mandatory DESIRED_PERF register (Sumit Gupta) - Modify the cpufreq CPPC driver to update MIN_PERF/MAX_PERF in target callbacks to allow it to control performance bounds via standard scaling_min_freq and scaling_max_freq sysfs attributes and add sysfs documentation for the Performance Limited Register to it (Sumit Gupta) - Add ACPI support to the platform device interface in the CMOS RTC driver, make the ACPI core device enumeration code create a platform device for the CMOS RTC, and drop CMOS RTC PNP device support (Rafael Wysocki) - Consolidate the x86-specific CMOS RTC handling with the ACPI TAD driver and clean up the CMOS RTC ACPI address space handler (Rafael Wysocki) - Enable ACPI alarm in the CMOS RTC driver if advertised in ACPI FADT and allow that driver to work without a dedicated IRQ if the ACPI alarm is used (Rafael Wysocki) - Clean up the ACPI TAD driver in various ways and add an RTC class device interface, including both the RTC setting/reading and alarm timer support, to it (Rafael Wysocki) - Clean up the ACPI AC and ACPI PAD (processor aggregator device) drivers (Rafael Wysocki) - Rework checking for duplicate video bus devices and consolidate pnp.bus_id workarounds handling in the ACPI video bus driver (Rafael Wysocki) - Update the ACPI core device drivers to stop setting acpi_device_name() unnecessarily (Rafael Wysocki) - Rearrange code using acpi_device_class() in the ACPI core device drivers and update them to stop setting acpi_device_class() unnecessarily (Rafael Wysocki) - Define ACPI_AC_CLASS in one place (Rafael Wysocki) - Convert the ni903x_wdt watchdog driver and the xen ACPI PAD driver to bind to platform devices instead of ACPI devices (Rafael Wysocki) - Add devm_ghes_register_vendor_record_notifier(), use it in the PCI hisi driver, and Add NVIDIA vendor CPER record handler (Kai-Heng Feng) - Consolidate the interface for obtaining a CPU UID from ACPI across architectures and use it to address incorrect PCI TPH Steering Tag on ARM64 resulting from the invalid assumption that the ACPI Processor UID would always be the same as the corresponding logical CPU ID in Linux (Chengwen Feng)" * tag 'acpi-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (73 commits) ACPICA: Update maintainers information watchdog: ni903x_wdt: Convert to a platform driver ACPI: PAD: xen: Convert to a platform driver ACPI: processor: idle: Reset cpuidle on C-state list changes cpuidle: Extract and export no-lock variants of cpuidle_unregister_device() PCI/TPH: Pass ACPI Processor UID to Cache Locality _DSM ACPI: PPTT: Use acpi_get_cpu_uid() and remove get_acpi_id_for_cpu() perf: arm_cspmu: Switch to acpi_get_cpu_uid() from get_acpi_id_for_cpu() ACPI: Centralize acpi_get_cpu_uid() declaration in include/linux/acpi.h x86/acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval LoongArch: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval arm64: acpi: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrieval ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler PCI: hisi: Use devm_ghes_register_vendor_record_notifier() ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier() ACPI: tables: Enable FPDT on LoongArch ACPI: processor: idle: Fix NULL pointer dereference in hotplug path ACPI: processor: idle: Reset power_setup_done flag on initialization failure ACPI: TAD: Add alarm support to the RTC class device interface ...
2026-04-13Merge tag 'hardening-v7.1-rc1' of ↵Linus Torvalds-12/+0
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull hardening updates from Kees Cook: - randomize_kstack: Improve implementation across arches (Ryan Roberts) - lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test - refcount: Remove unused __signed_wrap function annotations * tag 'hardening-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: lkdtm/fortify: Drop unneeded FORTIFY_STR_OBJECT test refcount: Remove unused __signed_wrap function annotations randomize_kstack: Unify random source across arches randomize_kstack: Maintain kstack_offset per task
2026-04-06RISC-V: ACPI: Add acpi_get_cpu_uid() for unified ACPI CPU UID retrievalChengwen Feng-3/+22
As a step towards unifying the interface for retrieving ACPI CPU UID across architectures, introduce a new function acpi_get_cpu_uid() for riscv. While at it, add input validation to make the code more robust. And also update acpi_numa.c and rhct.c to use the new interface instead of the legacy get_acpi_id_for_cpu(). Signed-off-by: Chengwen Feng <fengchengwen@huawei.com> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://patch.msgid.link/20260401081640.26875-4-fengchengwen@huawei.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2026-04-05riscv: shstk: use the new common vm_mmap_shadow_stack() helperCatalin Marinas-11/+1
Replace part of the allocate_shadow_stack() content with a call to vm_mmap_shadow_stack(). There is no functional change. Link: https://lkml.kernel.org/r/20260225161404.3157851-4-catalin.marinas@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Tested-by: Deepak Gupta <debug@rivosinc.com> Reviewed-by: David Hildenbrand (Arm) <david@kernel.org> Cc: Paul Walmsley <pjw@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mark Brown <broonie@kernel.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-04-04riscv: Remove support for XIP kernelNam Cao-67/+1
XIP has a history of being broken for long periods of time. In 2023, it was broken for 18 months before getting fixed [1]. In 2024 it was 4 months [2]. And now it is broken again since commit a44fb5722199 ("riscv: Add runtime constant support"), 10 months ago. These are clear signs that XIP feature is not being used. I occasionally looked after XIP, but mostly because I was bored and had nothing better to do. Remove XIP support. Revert is possible if someone shows up complaining. Link: https://lore.kernel.org/linux-riscv/20231212-customary-hardcover-e19462bf8e75@wendy/ [1] Link: https://lore.kernel.org/linux-riscv/20240526110104.470429-1-namcao@linutronix.de/ [2] Signed-off-by: Nam Cao <namcao@linutronix.de> Cc: Frederik Haxel <haxel@fzi.de> Cc: Vitaly Wool <vitaly.wool@konsulko.com> Reviewed-by: Jisheng Zhang <jszhang@kernel.org> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20260202115403.2119218-1-namcao@linutronix.de [pjw@kernel.org: updated to apply] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: Reuse compare_unaligned_access() in check_vector_unaligned_access()Nam Cao-39/+16
check_vector_unaligned_access() duplicates the logic in compare_unaligned_access(). Use compare_unaligned_access() and deduplicate. Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/f18ca7e1efc2e4f231779a4b0bfae04b29f9dc62.1770830596.git.namcao@linutronix.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: Split out compare_unaligned_access()Nam Cao-19/+40
Scalar misaligned access probe and vector misaligned access probe share very similar code. Split out this similar part from scalar probe into compare_unaligned_access(), which will be reused for vector probe in a follow-up commit. Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/3695f77279d473eead8ed6210d97c941321cd4f1.1770830596.git.namcao@linutronix.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: Reuse measure_cycles() in check_vector_unaligned_access()Nam Cao-46/+8
check_vector_unaligned_access() duplicates the logic in measure_cycles(). Reuse measure_cycles() and deduplicate. Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/be4c66fd4120952195fdcd0e62d245c55f0711e2.1770830596.git.namcao@linutronix.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: Split out measure_cycles() for reuseNam Cao-36/+33
Byte cycle measurement and word cycle measurement of scalar misaligned access are very similar. Split these parts out into a common measure_cycles() function to avoid duplication. This function will also be reused for vector misaligned access probe in a follow-up commit. Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/50d0598e45acc56c95176e52fbbe56e1f4becc84.1770830596.git.namcao@linutronix.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: Clean up & optimize unaligned scalar access probeNam Cao-18/+10
check_unaligned_access_speed_all_cpus() is more complicated than it should be: - It uses on_each_cpu() to probe unaligned memory access on all CPUs but excludes CPU0 with a check in the callback function. So an IPI to CPU0 is wasted. - Probing on CPU0 is done with smp_call_on_cpu(), which is not as fast as on_each_cpu(). The reason for this design is because the probe is timed with jiffies. Therefore on_each_cpu() excludes CPU0 because that CPU needs to tend to jiffies. Instead, replace jiffies usage with ktime_get_mono_fast_ns(). With jiffies out of the way, on_each_cpu() can be used for all CPUs and smp_call_on_cpu() can be dropped. To make ktime_get_mono_fast_ns() usable, move this probe to late_initcall. Anything after clocksource's fs_initcall works, but avoid depending on clocksource staying at fs_initcall. The choice of probe time is now 8000000 ns, which is the same as before (2 jiffies) for riscv defconfig. This is excessive for the CPUs I have, and probably should be reduced; but that's a different discussion. Suggested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Nam Cao <namcao@linutronix.de> Link: https://patch.msgid.link/9b9a20affe2e4f5c380926ceb885a47e20a59395.1770830596.git.namcao@linutronix.de Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: vdso_cfi: Add .gitignore for build artifactsChen Pei-0/+8
The vdso_cfi build process copies source files (*.c, *.S) from the main vdso directory to the build directory. Without a .gitignore file, these copied files appear as untracked files in git status, cluttering the working directory. Add a .gitignore file to exclude: - Copied source files (*.c, *.S) - Temporary build files (vdso.lds, *.tmp, vdso-syms.S) - While preserving vdso-cfi.S which is the original entry point This follows the same pattern used in the main vdso directory and keeps the working directory clean. Signed-off-by: Chen Pei <cp0613@linux.alibaba.com> Link: https://patch.msgid.link/20260320021850.1877-3-cp0613@linux.alibaba.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: vdso_cfi: Add clean rule for copied sourcesChen Pei-0/+3
When building VDSO with CFI support, source files are copied from the main VDSO directory to the CFI build directory as part of the build process. However, these copied source files were not removed during 'make clean', leaving temporary files in the build directory. Add the clean-files variable to ensure that these copied .c and .S files are properly cleaned up. The notdir() function is used to strip the path prefix, as clean-files expects relative file names without directory components. This ensures the build directory is left in a clean state after make clean. Signed-off-by: Chen Pei <cp0613@linux.alibaba.com> Link: https://patch.msgid.link/20260320021850.1877-2-cp0613@linux.alibaba.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: acpi: update FADT revision check to 6.6Yufeng Wang-5/+5
ACPI 6.6 is required for RISC-V as it introduces RISC-V specific tables such as RHCT (RISC-V Hart Capabilities Table) and RIMT (RISC-V I/O Mapping Table). Update the FADT revision check from 6.5 to 6.6 and remove the TODO comment since ACPI 6.6 has been officially released. Signed-off-by: Yufeng Wang <wangyufeng@kylinos.cn> Reviewed-by: Sunil V L <sunilvl@oss.qualcomm.com> Acked-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com> Reviewed-by: Yao Zi <me@ziyao.cc> Link: https://patch.msgid.link/20260305091433.83983-1-r4o5m6e8o@163.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: add hardware error trap handler supportRui Qi-0/+3
Add support for handling hardware error traps (exception code 19) in the RISC-V architecture. The changes include: - Add do_trap_hardware_error function declaration in asm-prototypes.h - Add hardware error trap vector entry in entry.S exception vector table - Implement do_trap_hardware_error handler in traps.c that generates SIGBUS with BUS_MCEERR_AR for hardware errors This enables proper handling of hardware error exceptions that may occur in RISC-V systems, providing appropriate error reporting and signal generation for user space processes. Signed-off-by: Rui Qi <qirui.001@bytedance.com> Link: https://patch.msgid.link/20260202094200.53735-1-qirui.001@bytedance.com [pjw@kernel.org: clean up commit message slightly] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: remove redundant #ifdef check in cpu-hotplugHui Wang-2/+0
The cpu-hotplug.c only is built when CONFIG_HOTPLUG_CPU is defined, it is not needed to check HOTPLUG_CPU in this file. Signed-off-by: Hui Wang <hui.wang@canonical.com> Link: https://patch.msgid.link/20260304033403.238012-2-hui.wang@canonical.com [pjw@kernel.org: removed extra whitespace at EOF] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: export kaslr offset and satp in VMCOREINFO ELF notesAustin Kim-0/+7
The following options are required by the kdump crash utility for RISC-V based vmcore file: - kaslr: If the vmcore is generated from a KASLR-enabled Linux kernel, the KASLR offset is required for the crash utility to load the vmcore. Without the proper kaslr option, the crash utility fails to load the vmcore file. - satp: The exact root page table address helps determine the correct base PGD address. With this patch, RISC-V VMCOREINFO ELF notes now include both kaslr and satp information. Signed-off-by: Austin Kim <austin.kim@lge.com> Link: https://patch.msgid.link/aYwKUE3ZzN7/ZY/A@adminpc-PowerEdge-R7525 Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: smp: Clarify comment "cache" -> "instruction cache"Vivian Wang-2/+2
local_flush_icache_all() only flushes and synchronizes the *instruction* cache, not the data cache. Since RISC-V does have a cbo.flush instruction for data cache flush, clarify the comment to avoid confusion. Fixes: 58661a30f1bc ("riscv: Flush the instruction cache during SMP bringup") Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Link: https://patch.msgid.link/20260204-riscv-smp-comment-update-2026-01-v1-2-8b77aa181530@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: smp: Remove outdated comment about disabling preemptionVivian Wang-4/+0
Commit f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled") removed a call to preempt_disable(), but not the associated comment. Remove the outdated comment. Fixes: f1a0a376ca0c ("sched/core: Initialize the idle task with preemption disabled") Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Link: https://patch.msgid.link/20260204-riscv-smp-comment-update-2026-01-v1-1-8b77aa181530@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: fix various typos in comments and codeSean Chang-9/+9
Fix various typos in RISC-V architecture code and comments. The following changes are included: - arch/riscv/errata/thead/errata.c: "futher" → "further" - arch/riscv/include/asm/atomic.h: "therefor" → "therefore", "arithmatic" → "arithmetic" - arch/riscv/include/asm/elf.h: "availiable" → "available", "coorespends" → "corresponds" - arch/riscv/include/asm/processor.h: "requries" → "is required" - arch/riscv/include/asm/thread_info.h: "returing" → "returning" - arch/riscv/kernel/acpi.c: "compliancy" → "compliance" - arch/riscv/kernel/ftrace.c: "therefor" → "therefore" - arch/riscv/kernel/head.S: "intruction" → "instruction" - arch/riscv/kernel/mcount-dyn.S: "localtion → "location" - arch/riscv/kernel/module-sections.c: "maxinum" → "maximum" - arch/riscv/kernel/probes/kprobes.c: "reenabled" → "re-enabled" - arch/riscv/kernel/probes/uprobes.c: "probbed" → "probed" - arch/riscv/kernel/soc.c: "extremly" → "extremely" - arch/riscv/kernel/suspend.c: "incosistent" → "inconsistent" - arch/riscv/kvm/tlb.c: "cahce" → "cache" - arch/riscv/kvm/vcpu_pmu.c: "indicies" → "indices" - arch/riscv/lib/csum.c: "implmentations" → "implementations" - arch/riscv/lib/memmove.S: "ammount" → "amount" - arch/riscv/mm/cacheflush.c: "visable" → "visible" - arch/riscv/mm/physaddr.c: "aginst" → "against" Signed-off-by: Sean Chang <seanwascoding@gmail.com> Acked-by: Conor Dooley <conor.dooley@microchip.com> Link: https://patch.msgid.link/20260212163325.60389-1-seanwascoding@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04prctl: cfi: change the branch landing pad prctl()s to be more descriptivePaul Walmsley-8/+7
Per Linus' comments requesting the replacement of "INDIR_BR_LP" in the indirect branch tracking prctl()s with something more readable, and suggesting the use of the speculation control prctl()s as an exemplar, reimplement the prctl()s and related constants that control per-task forward-edge control flow integrity. This primarily involves two changes. First, the prctls are restructured to resemble the style of the speculative execution workaround control prctls PR_{GET,SET}_SPECULATION_CTRL, to make them easier to extend in the future. Second, the "indir_br_lp" abbrevation is expanded to "branch_landing_pads" to be less telegraphic. The kselftest and documentation is adjusted accordingly. Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/ Cc: Deepak Gupta <debug@rivosinc.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: ptrace: cfi: expand "SS" references to "shadow stack" in uapi headersPaul Walmsley-5/+5
Similar to the recent change to expand "LP" to "branch landing pad", let's expand "SS" in the ptrace uapi macros to "shadow stack" as well. This aligns with the existing prctl() arguments, which use the expanded "shadow stack" names, rather than just the abbreviation. Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/ Cc: Deepak Gupta <debug@rivosinc.com> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04prctl: rename branch landing pad implementation functions to be more explicitPaul Walmsley-8/+8
Per Linus' comments about the unreadability of abbreviations such as "indir_br_lp", rename the three prctl() implementation functions to be more explicit. This involves renaming "indir_br_lp_status" in the function names to "branch_landing_pad_state". While here, add _prctl_ into the function names, following the speculation control prctl implementation functions. Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/ Cc: Deepak Gupta <debug@rivosinc.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mark Brown <broonie@kernel.org> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: ptrace: expand "LP" references to "branch landing pads" in uapi headersPaul Walmsley-5/+5
Per Linus' comments about the unreadability of abbreviations such as "LP", rename the RISC-V ptrace landing pad CFI macro names to be more explicit. This primarily involves expanding "LP" in the names to some variant of "branch landing pad." Link: https://lore.kernel.org/linux-riscv/CAHk-=whhSLGZAx3N5jJpb4GLFDqH_QvS07D+6BnkPWmCEzTAgw@mail.gmail.com/ Cc: Deepak Gupta <debug@rivosinc.com> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: cfi: clear CFI lock status in start_thread()Zong Li-6/+8
When libc locks the CFI status through the following prctl: - PR_LOCK_SHADOW_STACK_STATUS - PR_LOCK_INDIR_BR_LP_STATUS A newly execd address space will inherit the lock status if it does not clear the lock bits. Since the lock bits remain set, libc will later fail to enable the landing pad and shadow stack. Signed-off-by: Zong Li <zong.li@sifive.com> Link: https://patch.msgid.link/20260323065640.4045713-1-zong.li@sifive.com [pjw@kernel.org: ensure we unlock before changing state; cleaned up subject line] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: ptrace: cfi: fix "PRACE" typo in uapi headerPaul Walmsley-1/+1
A CFI-related macro defined in arch/riscv/uapi/asm/ptrace.h misspells "PTRACE" as "PRACE"; fix this. Fixes: 2af7c9cf021c ("riscv/ptrace: expose riscv CFI status and state via ptrace and in core files") Cc: Deepak Gupta <debug@rivosinc.com> Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: Reset pmm when PR_TAGGED_ADDR_ENABLE is not setZishun Yi-1/+3
In set_tagged_addr_ctrl(), when PR_TAGGED_ADDR_ENABLE is not set, pmlen is correctly set to 0, but it forgets to reset pmm. This results in the CPU pmm state not corresponding to the software pmlen state. Fix this by resetting pmm along with pmlen. Fixes: 2e1743085887 ("riscv: Add support for the tagged address ABI") Signed-off-by: Zishun Yi <vulab@iscas.ac.cn> Reviewed-by: Samuel Holland <samuel.holland@sifive.com> Link: https://patch.msgid.link/20260322160022.21908-1-vulab@iscas.ac.cn Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: patch: Avoid early phys_to_page()Vivian Wang-10/+11
Similarly to commit 8d09e2d569f6 ("arm64: patching: avoid early page_to_phys()"), avoid using phys_to_page() for the kernel address case in patch_map(). Since this is called from apply_boot_alternatives() in setup_arch(), and commit 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE memory model") has moved sparse_init() to after setup_arch(), phys_to_page() is not available there yet, and it panics on boot with SPARSEMEM on RV32, which does not use SPARSEMEM_VMEMMAP. Reported-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Closes: https://lore.kernel.org/r/20260223144108-dcace0b9-02e8-4b67-a7ce-f263bed36f26@linutronix.de/ Fixes: 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE memory model") Suggested-by: Mike Rapoport <rppt@kernel.org> Signed-off-by: Vivian Wang <wangruikang@iscas.ac.cn> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Link: https://patch.msgid.link/20260310-riscv-sparsemem-alternatives-fix-v1-1-659d5dd257e2@iscas.ac.cn [pjw@kernel.org: fix the subject line to align with the patch description] Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-04-04riscv: kgdb: fix several debug register assignment bugsPaul Walmsley-3/+4
Fix several bugs in the RISC-V kgdb implementation: - The element of dbg_reg_def[] that is supposed to pertain to the S1 register embeds instead the struct pt_regs offset of the A1 register. Fix this to use the S1 register offset in struct pt_regs. - The sleeping_thread_to_gdb_regs() function copies the value of the S10 register into the gdb_regs[] array element meant for the S9 register, and copies the value of the S11 register into the array element meant for the S10 register. It also neglects to copy the value of the S11 register. Fix all of these issues. Fixes: fe89bd2be8667 ("riscv: Add KGDB support") Cc: Vincent Chen <vincent.chen@sifive.com> Link: https://patch.msgid.link/fde376f8-bcfd-bfe4-e467-07d8f7608d05@kernel.org Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-03-24randomize_kstack: Unify random source across archesRyan Roberts-12/+0
Previously different architectures were using random sources of differing strength and cost to decide the random kstack offset. A number of architectures (loongarch, powerpc, s390, x86) were using their timestamp counter, at whatever the frequency happened to be. Other arches (arm64, riscv) were using entropy from the crng via get_random_u16(). There have been concerns that in some cases the timestamp counters may be too weak, because they can be easily guessed or influenced by user space. And get_random_u16() has been shown to be too costly for the level of protection kstack offset randomization provides. So let's use a common, architecture-agnostic source of entropy; a per-cpu prng, seeded at boot-time from the crng. This has a few benefits: - We can remove choose_random_kstack_offset(); That was only there to try to make the timestamp counter value a bit harder to influence from user space [*]. - The architecture code is simplified. All it has to do now is call add_random_kstack_offset() in the syscall path. - The strength of the randomness can be reasoned about independently of the architecture. - Arches previously using get_random_u16() now have much faster syscall paths, see below results. [*] Additionally, this gets rid of some redundant work on s390 and x86. Before this patch, those architectures called choose_random_kstack_offset() under arch_exit_to_user_mode_prepare(), which is also called for exception returns to userspace which were *not* syscalls (e.g. regular interrupts). Getting rid of choose_random_kstack_offset() avoids a small amount of redundant work for the non-syscall cases. In some configurations, add_random_kstack_offset() will now call instrumentable code, so for a couple of arches, I have moved the call a bit later to the first point where instrumentation is allowed. This doesn't impact the efficacy of the mechanism. There have been some claims that a prng may be less strong than the timestamp counter if not regularly reseeded. But the prng has a period of about 2^113. So as long as the prng state remains secret, it should not be possible to guess. If the prng state can be accessed, we have bigger problems. Additionally, we are only consuming 6 bits to randomize the stack, so there are only 64 possible random offsets. I assert that it would be trivial for an attacker to brute force by repeating their attack and waiting for the random stack offset to be the desired one. The prng approach seems entirely proportional to this level of protection. Performance data are provided below. The baseline is v6.18 with rndstack on for each respective arch. (I)/(R) indicate statistically significant improvement/regression. arm64 platform is AWS Graviton3 (m7g.metal). x86_64 platform is AWS Sapphire Rapids (m7i.24xlarge): +-----------------+--------------+---------------+---------------+ | Benchmark | Result Class | per-cpu-prng | per-cpu-prng | | | | arm64 (metal) | x86_64 (VM) | +=================+==============+===============+===============+ | syscall/getpid | mean (ns) | (I) -9.50% | (I) -17.65% | | | p99 (ns) | (I) -59.24% | (I) -24.41% | | | p99.9 (ns) | (I) -59.52% | (I) -28.52% | +-----------------+--------------+---------------+---------------+ | syscall/getppid | mean (ns) | (I) -9.52% | (I) -19.24% | | | p99 (ns) | (I) -59.25% | (I) -25.03% | | | p99.9 (ns) | (I) -59.50% | (I) -28.17% | +-----------------+--------------+---------------+---------------+ | syscall/invalid | mean (ns) | (I) -10.31% | (I) -18.56% | | | p99 (ns) | (I) -60.79% | (I) -20.06% | | | p99.9 (ns) | (I) -61.04% | (I) -25.04% | +-----------------+--------------+---------------+---------------+ I tested an earlier version of this change on x86 bare metal and it showed a smaller but still significant improvement. The bare metal system wasn't available this time around so testing was done in a VM instance. I'm guessing the cost of rdtsc is higher for VMs. Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com> Link: https://patch.msgid.link/20260303150840.3789438-3-ryan.roberts@arm.com Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-26kbuild: Split .modinfo out from ELF_DETAILSNathan Chancellor-0/+1
Commit 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") added .modinfo to ELF_DETAILS while removing it from COMMON_DISCARDS, as it was needed in vmlinux.unstripped and ELF_DETAILS was present in all architecture specific vmlinux linker scripts. While this shuffle is fine for vmlinux, ELF_DETAILS and COMMON_DISCARDS may be used by other linker scripts, such as the s390 and x86 compressed boot images, which may not expect to have a .modinfo section. In certain circumstances, this could result in a bootloader failing to load the compressed kernel [1]. Commit ddc6cbef3ef1 ("s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer") recently addressed this for the s390 bzImage but the same bug remains for arm, parisc, and x86. The presence of .modinfo in the x86 bzImage was the root cause of the issue worked around with commit d50f21091358 ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat"). misc.c in arch/x86/boot/compressed includes lib/decompress_unzstd.c, which in turn includes lib/xxhash.c and its MODULE_LICENSE / MODULE_DESCRIPTION macros due to the STATIC definition. Split .modinfo out from ELF_DETAILS into its own macro and handle it in all vmlinux linker scripts. Discard .modinfo in the places where it was previously being discarded from being in COMMON_DISCARDS, as it has never been necessary in those uses. Cc: stable@vger.kernel.org Fixes: 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") Reported-by: Ed W <lists@wildgooses.com> Closes: https://lore.kernel.org/587f25e0-a80e-46a5-9f01-87cb40cfa377@wildgooses.com/ [1] Tested-by: Ed W <lists@wildgooses.com> # x86_64 Link: https://patch.msgid.link/20260225-separate-modinfo-from-elf-details-v1-1-387ced6baf4b@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2026-02-22Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL usesKees Cook-1/+1
Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds-2/+1
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_flex' family to use the new default GFP_KERNEL argumentLinus Torvalds-1/+1
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds-6/+6
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook-13/+11
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-12Merge tag 'riscv-for-linus-7.0-mw1' of ↵Linus Torvalds-90/+1217
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Add support for control flow integrity for userspace processes. This is based on the standard RISC-V ISA extensions Zicfiss and Zicfilp - Improve ptrace behavior regarding vector registers, and add some selftests - Optimize our strlen() assembly - Enable the ISO-8859-1 code page as built-in, similar to ARM64, for EFI volume mounting - Clean up some code slightly, including defining copy_user_page() as copy_page() rather than memcpy(), aligning us with other architectures; and using max3() to slightly simplify an expression in riscv_iommu_init_check() * tag 'riscv-for-linus-7.0-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (42 commits) riscv: lib: optimize strlen loop efficiency selftests: riscv: vstate_exec_nolibc: Use the regular prctl() function selftests: riscv: verify ptrace accepts valid vector csr values selftests: riscv: verify ptrace rejects invalid vector csr inputs selftests: riscv: verify syscalls discard vector context selftests: riscv: verify initial vector state with ptrace selftests: riscv: test ptrace vector interface riscv: ptrace: validate input vector csr registers riscv: csr: define vtype register elements riscv: vector: init vector context with proper vlenb riscv: ptrace: return ENODATA for inactive vector extension kselftest/riscv: add kselftest for user mode CFI riscv: add documentation for shadow stack riscv: add documentation for landing pad / indirect branch tracking riscv: create a Kconfig fragment for shadow stack and landing pad support arch/riscv: add dual vdso creation logic and select vdso based on hw arch/riscv: compile vdso with landing pad and shadow stack note riscv: enable kernel access to shadow stack memory via the FWFT SBI call riscv: add kernel command line option to opt out of user CFI riscv/hwprobe: add zicfilp / zicfiss enumeration in hwprobe ...
2026-02-10Merge tag 'x86_paravirt_for_v7.0_rc1' of ↵Linus Torvalds-10/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...
2026-02-10Merge tag 'perf-core-2026-02-09' of ↵Linus Torvalds-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance event updates from Ingo Molnar: "x86 PMU driver updates: - Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs (Dapeng Mi) Compared to previous iterations of the Intel PMU code, there's been a lot of changes, which center around three main areas: - Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the Off-Core Response (OCR) facility - New PEBS data source encoding layout - Support the new "RDPMC user disable" feature - Likewise, a large series adds uncore PMU support for Intel Diamond Rapids (DMR) CPUs (Zide Chen) This centers around these four main areas: - DMR may have two Integrated I/O and Memory Hub (IMH) dies, separate from the compute tile (CBB) dies. Each CBB and each IMH die has its own discovery domain. - Unlike prior CPUs that retrieve the global discovery table portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON discovery and MSR for CBB PMON discovery. - DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR, PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6. - IIO free-running counters in DMR are MMIO-based, unlike SPR. - Also add support for Add missing PMON units for Intel Panther Lake, and support Nova Lake (NVL), which largely maps to Panther Lake. (Zide Chen) - KVM integration: Add support for mediated vPMUs (by Kan Liang and Sean Christopherson, with fixes and cleanups by Peter Zijlstra, Sandipan Das and Mingwei Zhang) - Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs, which are a low-power variant of Panther Lake (Zide Chen) - Add core, cstate and MSR PMU support for the Airmont NP Intel CPU (aka MaxLinear Lightning Mountain), which maps to the existing Airmont code (Martin Schiller) Performance enhancements: - Speed up kexec shutdown by avoiding unnecessary cross CPU calls (Jan H. Schönherr) - Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim) User-space stack unwinding support: - Various cleanups and refactorings in preparation to generalize the unwinding code for other architectures (Jens Remus) Uprobes updates: - Transition from kmap_atomic to kmap_local_page (Keke Ming) - Fix incorrect lockdep condition in filter_chain() (Breno Leitao) - Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov) Misc fixes and cleanups: - s390: Remove kvm_types.h from Kbuild (Randy Dunlap) - x86/intel/uncore: Convert comma to semicolon (Chen Ni) - x86/uncore: Clean up const mismatch (Greg Kroah-Hartman) - x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)" * tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits) s390: remove kvm_types.h from Kbuild uprobes: Fix incorrect lockdep condition in filter_chain() x86/ibs: Fix typo in dc_l2tlb_miss comment x86/uprobes: Fix XOL allocation failure for 32-bit tasks perf/x86/intel/uncore: Convert comma to semicolon perf/x86/intel: Add support for rdpmc user disable feature perf/x86: Use macros to replace magic numbers in attr_rdpmc perf/x86/intel: Add core PMU support for Novalake perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL perf/x86/intel: Add core PMU support for DMR perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL perf/core: Fix slow perf_event_task_exit() with LBR callstacks perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls uprobes: use kmap_local_page() for temporary page mappings arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol() perf/x86/intel/uncore: Add Nova Lake support ...
2026-02-09Merge tag 'efi-next-for-v7.0' of ↵Linus Torvalds-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Quirk the broken EFI framebuffer geometry on the Valve Steam Deck - Capture the EDID information of the primary display also on non-x86 EFI systems when booting via the EFI stub. * tag 'efi-next-for-v7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: efi: Support EDID information sysfb: Move edid_info into sysfb_primary_display sysfb: Pass sysfb_primary_display to devices sysfb: Replace screen_info with sysfb_primary_display sysfb: Add struct sysfb_display_info efi: sysfb_efi: Reduce number of references to global screen_info efi: earlycon: Reduce number of references to global screen_info efi: sysfb_efi: Fix efidrmfb and simpledrmfb on Valve Steam Deck efi: sysfb_efi: Convert swap width and height quirk to a callback efi: sysfb_efi: Fix lfb_linelength calculation when applying quirks efi: sysfb_efi: Replace open coded swap with the macro
2026-02-09riscv: ptrace: validate input vector csr registersSergey Matyukevich-1/+87
Add strict validation for vector csr registers when setting them via ptrace: - reject attempts to set reserved bits or invalid field combinations - enforce strict VL checks against calculated VLMAX values Vector specs 0.7.1 and 1.0 allow normal applications to set candidate VL values and read back the hardware-adjusted results, see section 6 for details. Disallow such flexibility in vector ptrace operations and strictly enforce valid VL input. The traced process may not update its saved vector context if no vector instructions execute between breakpoints. So the purpose of the strict ptrace approach is to make sure that debuggers maintain an accurate view of the tracee's vector context across multiple halt/resume debug cycles. Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-5-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-02-09riscv: vector: init vector context with proper vlenbSergey Matyukevich-4/+8
The vstate in thread_struct is zeroed when the vector context is initialized. That includes read-only register vlenb, which holds the vector register length in bytes. Zeroed state persists until mstatus.VS becomes 'dirty' and a context switch saves the actual hardware values. This can expose the zero vlenb value to the user-space in early debug scenarios, e.g. when ptrace attaches to a traced process early, before any vector instruction except the first one was executed. Fix this by specifying proper vlenb on vector context init. Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-3-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>
2026-01-29riscv: ptrace: return ENODATA for inactive vector extensionIlya Mamay-2/+8
Currently, ptrace returns EINVAL when the vector extension is supported but not yet activated for the traced process. This error code is not always appropriate since the ptrace arguments may be valid. Debug tools like gdbserver expect ENODATA when the requested register set is not active, e.g. see [1]. This expectation seems to be more appropriate, so modify the vector ptrace implementation to return: - EINVAL when V extension is not supported - ENODATA when V extension is supported but not active [1] https://github.com/bminor/binutils-gdb/blob/637f25e88675fa47e47f9cc5e2cf37384836b8a2/gdbserver/linux-low.cc#L5020 Signed-off-by: Ilya Mamay <mmamayka01@gmail.com> Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com> Reviewed-by: Andy Chiu <andybnac@gmail.com> Tested-by: Andy Chiu <andybnac@gmail.com> Link: https://patch.msgid.link/20251214163537.1054292-2-geomatsi@gmail.com Signed-off-by: Paul Walmsley <pjw@kernel.org>