summaryrefslogtreecommitdiffstats
path: root/arch/x86/platform
AgeCommit message (Collapse)AuthorLines
2026-04-18Merge tag 'memblock-v7.1-rc1' of ↵Linus Torvalds-5/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock Pull memblock updates from Mike Rapoport: - improve debuggability of reserve_mem kernel parameter handling with print outs in case of a failure and debugfs info showing what was actually reserved - Make memblock_free_late() and free_reserved_area() use the same core logic for freeing the memory to buddy and ensure it takes care of updating memblock arrays when ARCH_KEEP_MEMBLOCK is enabled. * tag 'memblock-v7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: x86/alternative: delay freeing of smp_locks section memblock: warn when freeing reserved memory before memory map is initialized memblock, treewide: make memblock_free() handle late freeing memblock: make free_reserved_area() update memblock if ARCH_KEEP_MEMBLOCK=y memblock: extract page freeing from free_reserved_area() into a helper memblock: make free_reserved_area() more robust mm: move free_reserved_area() to mm/memblock.c powerpc: opal-core: pair alloc_pages_exact() with free_pages_exact() powerpc: fadump: pair alloc_pages_exact() with free_pages_exact() memblock: reserve_mem: fix end caclulation in reserve_mem_release_by_name() memblock: move reserve_bootmem_range() to memblock.c and make it static memblock: Add reserve_mem debugfs info memblock: Print out errors on reserve_mem parser
2026-04-17Merge tag 'integrity-v7.1' of ↵Linus Torvalds-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "There are two main changes, one feature removal, some code cleanup, and a number of bug fixes. Main changes: - Detecting secure boot mode was limited to IMA. Make detecting secure boot mode accessible to EVM and other LSMs - IMA sigv3 support was limited to fsverity. Add IMA sigv3 support for IMA regular file hashes and EVM portable signatures Remove: - Remove IMA support for asychronous hash calculation originally added for hardware acceleration Cleanup: - Remove unnecessary Kconfig CONFIG_MODULE_SIG and CONFIG_KEXEC_SIG tests - Add descriptions of the IMA atomic flags Bug fixes: - Like IMA, properly limit EVM "fix" mode - Define and call evm_fix_hmac() to update security.evm - Fallback to using i_version to detect file change for filesystems that do not support STATX_CHANGE_COOKIE - Address missing kernel support for configured (new) TPM hash algorithms - Add missing crypto_shash_final() return value" * tag 'integrity-v7.1' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: evm: Enforce signatures version 3 with new EVM policy 'bit 3' integrity: Allow sigv3 verification on EVM_XATTR_PORTABLE_DIGSIG ima: add support to require IMA sigv3 signatures ima: add regular file data hash signature version 3 support ima: Define asymmetric_verify_v3() to verify IMA sigv3 signatures ima: remove buggy support for asynchronous hashes integrity: Eliminate weak definition of arch_get_secureboot() ima: Add code comments to explain IMA iint cache atomic_flags ima_fs: Correctly create securityfs files for unsupported hash algos ima: check return value of crypto_shash_final() in boot aggregate ima: Define and use a digest_size field in the ima_algo_desc structure powerpc/ima: Drop unnecessary check for CONFIG_MODULE_SIG ima: efi: Drop unnecessary check for CONFIG_MODULE_SIG/CONFIG_KEXEC_SIG ima: fallback to using i_version to detect file change evm: fix security.evm for a file with IMA signature s390: Drop unnecessary CONFIG_IMA_SECURE_AND_OR_TRUSTED_BOOT evm: Don't enable fix mode when secure boot is enabled integrity: Make arch_ima_get_secureboot integrity-wide
2026-04-14Merge tag 'x86_cpu_for_7.1-rc1' of ↵Linus Torvalds-0/+35
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpu updates from Dave Hansen: - Complete LASS enabling: deal with vsyscall and EFI The existing Linear Address Space Separation (LASS) support punted on support for common EFI and vsyscall configs. Complete the implementation by supporting EFI and vsyscall=xonly. - Clean up CPUID usage in newer Intel "avs" audio driver and update the x86-cpuid-db file * tag 'x86_cpu_for_7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: tools/x86/kcpuid: Update bitfields to x86-cpuid-db v3.0 ASoC: Intel: avs: Include CPUID header at file scope ASoC: Intel: avs: Check maximum valid CPUID leaf x86/cpu: Remove LASS restriction on vsyscall emulation x86/vsyscall: Disable LASS if vsyscall mode is set to EMULATE x86/vsyscall: Restore vsyscall=xonly mode under LASS x86/traps: Consolidate user fixups in the #GP handler x86/vsyscall: Reorganize the page fault emulation code x86/cpu: Remove LASS restriction on EFI x86/efi: Disable LASS while executing runtime services x86/cpu: Defer LASS enabling until userspace comes up
2026-04-01memblock, treewide: make memblock_free() handle late freeingMike Rapoport (Microsoft)-5/+2
It shouldn't be responsibility of memblock users to detect if they free memory allocated from memblock late and should use memblock_free_late(). Make memblock_free() and memblock_phys_free() take care of late memory freeing and drop memblock_free_late(). Link: https://patch.msgid.link/20260323074836.3653702-9-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
2026-03-31x86/platform/geode: Fix on-stack property data use-after-return bugDmitry Torokhov-6/+18
The PROPERTY_ENTRY_GPIO macro (and by extension PROPERTY_ENTRY_REF) creates a temporary software_node_ref_args structure on the stack when used in a runtime assignment. This results in the property pointing to data that is invalid once the function returns. Fix this by ensuring the GPIO reference data is not stored on stack and using PROPERTY_ENTRY_REF_ARRAY_LEN() to point directly to the persistent reference data. Fixes: 298c9babadb8 ("x86/platform/geode: switch GPIO buttons and LEDs to software properties") Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Daniel Scally <djrscally@gmail.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Hans de Goede <hansg@kernel.org> Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com> Cc: Sakari Ailus <sakari.ailus@linux.intel.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20260329-property-gpio-fix-v2-1-3cca5ba136d8@gmail.com
2026-03-27Merge tag 'efi-fixes-for-v7.0-3' of ↵Linus Torvalds-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "Fix a potential buffer overrun issue introduced by the previous fix for EFI boot services region reservations on x86" * tag 'efi-fixes-for-v7.0-3' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free size
2026-03-20x86/efi: efi_unmap_boot_services: fix calculation of ranges_to_free sizeMike Rapoport (Microsoft)-1/+1
ranges_to_free array should have enough room to store the entire EFI memmap plus an extra element for NULL entry. The calculation of this array size wrongly adds 1 to the overall size instead of adding 1 to the number of elements. Add parentheses to properly size the array. Reported-by: Guenter Roeck <linux@roeck-us.net> Fixes: a4b0bf6a40f3 ("x86/efi: defer freeing of boot services memory") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2026-03-08Merge tag 'efi-fixes-for-v7.0-2' of ↵Linus Torvalds-4/+53
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI fix from Ard Biesheuvel: "Fix for the x86 EFI workaround keeping boot services code and data regions reserved until after SetVirtualAddressMap() completes: deferred struct page initialization may result in some of this memory being lost permanently" * tag 'efi-fixes-for-v7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: x86/efi: defer freeing of boot services memory
2026-03-07Merge tag 'for-linus-7.0-rc3-tag' of ↵Linus Torvalds-6/+1
git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip Pull xen fixes from Juergen Gross: - a cleanup of arch/x86/kernel/head_64.S removing the pre-built page tables for Xen guests - a small comment update - another cleanup for Xen PVH guests mode - fix an issue with Xen PV-devices backed by driver domains * tag 'for-linus-7.0-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip: xen/xenbus: better handle backend crash xenbus: add xenbus_device parameter to xenbus_read_driver_state() x86/PVH: Use boot params to pass RSDP address in start_info page x86/xen: update outdated comment xen/acpi-processor: fix _CST detection using undersized evaluation buffer x86/xen: Build identity mapping page tables dynamically for XENPV
2026-03-05integrity: Make arch_ima_get_secureboot integrity-wideCoiby Xu-1/+1
EVM and other LSMs need the ability to query the secure boot status of the system, without directly calling the IMA arch_ima_get_secureboot function. Refactor the secure boot status check into a general function named arch_get_secureboot. Reported-and-suggested-by: Mimi Zohar <zohar@linux.ibm.com> Suggested-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Coiby Xu <coxu@redhat.com> Acked-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com>
2026-03-03x86/efi: Disable LASS while executing runtime servicesSohil Mehta-0/+35
Ideally, EFI runtime services should switch to kernel virtual addresses after SetVirtualAddressMap(). However, firmware implementations are known to be buggy in this regard and continue to access physical addresses. The kernel maintains a 1:1 mapping of all runtime services code and data regions to avoid breaking such firmware. LASS enforcement relies on bit 63 of the virtual address, which would block such accesses to the lower half. Unfortunately, not doing anything could lead to #GP faults when users update to a kernel with LASS enabled. One option is to use a STAC/CLAC pair to temporarily disable LASS data enforcement. However, there is no guarantee that the stray accesses would only touch data and not perform instruction fetches. Also, relying on the AC bit would depend on the runtime calls preserving RFLAGS, which is highly unlikely in practice. Instead, use the big hammer and switch off the entire LASS mechanism temporarily by clearing CR4.LASS. Runtime services are called in the context of efi_mm, which has explicitly unmapped any memory EFI isn't allowed to touch (including userspace). So, do this right after switching to efi_mm to avoid any security impact. Some runtime services can be invoked during boot when LASS isn't active. Use a global variable (similar to efi_mm) to save and restore the correct CR4.LASS state. The runtime calls are serialized with the efi_runtime_lock, so no concurrency issues are expected. Signed-off-by: Sohil Mehta <sohil.mehta@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Tony Luck <tony.luck@intel.com> Tested-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com> Link: https://patch.msgid.link/20260120234730.2215498-3-sohil.mehta@intel.com
2026-03-03x86/PVH: Use boot params to pass RSDP address in start_info pageHou Wenlong-6/+1
After commit e6e094e053af75 ("x86/acpi, x86/boot: Take RSDP address from boot params if available"), the RSDP address can be passed in boot params. Therefore, store the RSDP address in start_info page into boot params in the PVH entry instead of registering a different callback. This removes an absolute reference during the PVH entry and is more standardized. Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <76675c4d49d3a8f72252076812ef8f22276230c2.1772282441.git.houwenlong.hwl@antgroup.com>
2026-02-25x86/efi: defer freeing of boot services memoryMike Rapoport (Microsoft)-4/+53
efi_free_boot_services() frees memory occupied by EFI_BOOT_SERVICES_CODE and EFI_BOOT_SERVICES_DATA using memblock_free_late(). There are two issue with that: memblock_free_late() should be used for memory allocated with memblock_alloc() while the memory reserved with memblock_reserve() should be freed with free_reserved_area(). More acutely, with CONFIG_DEFERRED_STRUCT_PAGE_INIT=y efi_free_boot_services() is called before deferred initialization of the memory map is complete. Benjamin Herrenschmidt reports that this causes a leak of ~140MB of RAM on EC2 t3a.nano instances which only have 512MB or RAM. If the freed memory resides in the areas that memory map for them is still uninitialized, they won't be actually freed because memblock_free_late() calls memblock_free_pages() and the latter skips uninitialized pages. Using free_reserved_area() at this point is also problematic because __free_page() accesses the buddy of the freed page and that again might end up in uninitialized part of the memory map. Delaying the entire efi_free_boot_services() could be problematic because in addition to freeing boot services memory it updates efi.memmap without any synchronization and that's undesirable late in boot when there is concurrency. More robust approach is to only defer freeing of the EFI boot services memory. Split efi_free_boot_services() in two. First efi_unmap_boot_services() collects ranges that should be freed into an array then efi_free_boot_services() later frees them after deferred init is complete. Link: https://lore.kernel.org/all/ec2aaef14783869b3be6e3c253b2dcbf67dbc12a.camel@kernel.crashing.org Fixes: 916f676f8dc0 ("x86, efi: Retain boot service code until after switching to virtual mode") Cc: <stable@vger.kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds-4/+4
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook-4/+4
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-10Merge tag 'x86_cleanups_for_v7.0_rc1' of ↵Linus Torvalds-2/+3
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - The usual set of cleanups and simplifications all over the tree * tag 'x86_cleanups_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/segment: Use MOVL when reading segment registers selftests/x86: Clean up sysret_rip coding style x86/mm: Hide mm_free_global_asid() definition under CONFIG_BROADCAST_TLB_FLUSH x86/crash: Use set_memory_p() instead of __set_memory_prot() x86/CPU/AMD: Simplify the spectral chicken fix x86/platform/olpc: Replace strcpy() with strscpy() in xo15_sci_add() x86/split_lock: Remove dead string when split_lock_detect=fatal
2026-02-10Merge tag 'x86-boot-2026-02-09' of ↵Linus Torvalds-2/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86/boot updates from Ingo Molnar: - x86/acpi: Add acpi=spcr to use SPCR-provided default console (Shenghao Yang) - x86/acpi/boot: Correct the acpi_is_processor_usable() check again (Yazen Ghannam) - Refresh the x86 memory map (e820 table) handling code, and make the printouts a bit more informative (Ingo Molnar) * tag 'x86-boot-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (30 commits) x86/acpi: Add acpi=spcr to use SPCR-provided default console x86/boot/e820: Use <linux/sizes.h> symbols for literals x86/boot/e820: Make sure e820_search_gap() finds all gaps x86/boot/e820: Simplify the e820__range_remove() API x86/boot/e820: Remove e820__range_remove()'s unused return parameter x86/boot/e820: Simplify append_e820_table() and remove restriction on single-entry tables x86/boot/e820: Standardize __init/__initdata tag placement x86/boot/e820: Simplify & clarify __e820__range_add() a bit x86/boot/e820: Rename gap_start/gap_size to max_gap_start/max_gap_start in e820_search_gap() et al x86/boot/e820: Change e820_search_gap() to search for the highest-address PCI gap x86/boot/e820: Clean up e820__setup_pci_gap()/e820_search_gap() a bit x86/boot/e820: Change struct e820_table::nr_entries type from __u32 to u32 x86/boot/e820: Standardize e820 table index variable types under 'u32' x86/boot/e820: Standardize e820 table index variable names under 'idx' x86/boot/e820: Remove unnecessary header inclusions x86/boot/e820: Clean up __refdata use a bit x86/boot/e820: Clean up __e820__range_add() a bit x86/boot/e820: Improve e820_print_type() messages x86/boot/e820: Clean up confusing and self-contradictory verbiage around E820 related resource allocations x86/boot/e820: Remove pointless early_panic() indirection ...
2026-01-12x86/xen/pvh: Enable PAE mode for 32-bit guest only when CONFIG_X86_PAE is setHou Wenlong-0/+2
The PVH entry is available for 32-bit KVM guests, and 32-bit KVM guests do not depend on CONFIG_X86_PAE. However, mk_early_pgtbl_32() builds different pagetables depending on whether CONFIG_X86_PAE is set. Therefore, enabling PAE mode for 32-bit KVM guests without CONFIG_X86_PAE being set would result in a boot failure during CR3 loading. Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com> Reviewed-by: Juergen Gross <jgross@suse.com> Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <d09ce9a134eb9cbc16928a5b316969f8ba606b81.1768017442.git.houwenlong.hwl@antgroup.com>
2026-01-05x86/platform/olpc: Replace strcpy() with strscpy() in xo15_sci_add()Thorsten Blum-2/+3
strcpy() has been deprecated¹ because it performs no bounds checking on the destination buffer, which can lead to buffer overflows. Use the safer strscpy() instead. ¹ https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251124125455.5495-2-thorsten.blum@linux.dev
2025-12-14x86/boot/e820: Simplify the e820__range_remove() APIIngo Molnar-2/+1
Right now e820__range_remove() has two parameters to control the E820 type of the range removed: extern void e820__range_remove(u64 start, u64 size, enum e820_type old_type, bool check_type); Since E820 types start at 1, zero has a natural meaning of 'no type. Consolidate the (old_type,check_type) parameters into a single (filter_type) parameter: extern void e820__range_remove(u64 start, u64 size, enum e820_type filter_type); Note that both e820__mapped_raw_any() and e820__mapped_any() already have such semantics for their 'type' parameter, although it's currently not used with '0' by in-kernel code. Also, the __e820__mapped_all() internal helper already has such semantics implemented as well, and the e820__get_entry_type() API uses the '0' type to such effect. This simplifies not just e820__range_remove(), and synchronizes its use of type filters with other E820 API functions, but simplifies usage sites as well, such as parse_memmap_one(), beyond the reduction of the number of parameters: - else if (from) - e820__range_remove(start_at, mem_size, from, 1); else - e820__range_remove(start_at, mem_size, 0, 0); + e820__range_remove(start_at, mem_size, from); The generated code gets smaller as well: add/remove: 0/0 grow/shrink: 0/5 up/down: 0/-66 (-66) Function old new delta parse_memopt 112 107 -5 efi_init 1048 1039 -9 setup_arch 2719 2709 -10 e820__range_remove 283 273 -10 parse_memmap_opt 559 527 -32 Total: Before=22,675,600, After=22,675,534, chg -0.00% Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: H . Peter Anvin <hpa@zytor.com> Cc: Andy Shevchenko <andy@kernel.org> Cc: Arnd Bergmann <arnd@kernel.org> Cc: David Woodhouse <dwmw@amazon.co.uk> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Paul Menzel <pmenzel@molgen.mpg.de> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/20250515120549.2820541-28-mingo@kernel.org
2025-10-11Merge tag 'x86_core_for_v6.18_rc1' of ↵Linus Torvalds-0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 updates from Borislav Petkov: - Remove a bunch of asm implementing condition flags testing in KVM's emulator in favor of int3_emulate_jcc() which is written in C - Replace KVM fastops with C-based stubs which avoids problems with the fastop infra related to latter not adhering to the C ABI due to their special calling convention and, more importantly, bypassing compiler control-flow integrity checking because they're written in asm - Remove wrongly used static branches and other ugliness accumulated over time in hyperv's hypercall implementation with a proper static function call to the correct hypervisor call variant - Add some fixes and modifications to allow running FRED-enabled kernels in KVM even on non-FRED hardware - Add kCFI improvements like validating indirect calls and prepare for enabling kCFI with GCC. Add cmdline params documentation and other code cleanups - Use the single-byte 0xd6 insn as the official #UD single-byte undefined opcode instruction as agreed upon by both x86 vendors - Other smaller cleanups and touchups all over the place * tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86,retpoline: Optimize patch_retpoline() x86,ibt: Use UDB instead of 0xEA x86/cfi: Remove __noinitretpoline and __noretpoline x86/cfi: Add "debug" option to "cfi=" bootparam x86/cfi: Standardize on common "CFI:" prefix for CFI reports x86/cfi: Document the "cfi=" bootparam options x86/traps: Clarify KCFI instruction layout compiler_types.h: Move __nocfi out of compiler-specific header objtool: Validate kCFI calls x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware x86/fred: Install system vector handlers even if FRED isn't fully enabled x86/hyperv: Use direct call to hypercall-page x86/hyperv: Clean up hv_do_hypercall() KVM: x86: Remove fastops KVM: x86: Convert em_salc() to C KVM: x86: Introduce EM_ASM_3WCL KVM: x86: Introduce EM_ASM_1SRC2 KVM: x86: Introduce EM_ASM_2CL KVM: x86: Introduce EM_ASM_2W ...
2025-10-02Merge tag 'mm-stable-2025-10-01-19-00' of ↵Linus Torvalds-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "mm, swap: improve cluster scan strategy" from Kairui Song improves performance and reduces the failure rate of swap cluster allocation - "support large align and nid in Rust allocators" from Vitaly Wool permits Rust allocators to set NUMA node and large alignment when perforning slub and vmalloc reallocs - "mm/damon/vaddr: support stat-purpose DAMOS" from Yueyang Pan extend DAMOS_STAT's handling of the DAMON operations sets for virtual address spaces for ops-level DAMOS filters - "execute PROCMAP_QUERY ioctl under per-vma lock" from Suren Baghdasaryan reduces mmap_lock contention during reads of /proc/pid/maps - "mm/mincore: minor clean up for swap cache checking" from Kairui Song performs some cleanup in the swap code - "mm: vm_normal_page*() improvements" from David Hildenbrand provides code cleanup in the pagemap code - "add persistent huge zero folio support" from Pankaj Raghav provides a block layer speedup by optionalls making the huge_zero_pagepersistent, instead of releasing it when its refcount falls to zero - "kho: fixes and cleanups" from Mike Rapoport adds a few touchups to the recently added Kexec Handover feature - "mm: make mm->flags a bitmap and 64-bit on all arches" from Lorenzo Stoakes turns mm_struct.flags into a bitmap. To end the constant struggle with space shortage on 32-bit conflicting with 64-bit's needs - "mm/swapfile.c and swap.h cleanup" from Chris Li cleans up some swap code - "selftests/mm: Fix false positives and skip unsupported tests" from Donet Tom fixes a few things in our selftests code - "prctl: extend PR_SET_THP_DISABLE to only provide THPs when advised" from David Hildenbrand "allows individual processes to opt-out of THP=always into THP=madvise, without affecting other workloads on the system". It's a long story - the [1/N] changelog spells out the considerations - "Add and use memdesc_flags_t" from Matthew Wilcox gets us started on the memdesc project. Please see https://kernelnewbies.org/MatthewWilcox/Memdescs and https://blogs.oracle.com/linux/post/introducing-memdesc - "Tiny optimization for large read operations" from Chi Zhiling improves the efficiency of the pagecache read path - "Better split_huge_page_test result check" from Zi Yan improves our folio splitting selftest code - "test that rmap behaves as expected" from Wei Yang adds some rmap selftests - "remove write_cache_pages()" from Christoph Hellwig removes that function and converts its two remaining callers - "selftests/mm: uffd-stress fixes" from Dev Jain fixes some UFFD selftests issues - "introduce kernel file mapped folios" from Boris Burkov introduces the concept of "kernel file pages". Using these permits btrfs to account its metadata pages to the root cgroup, rather than to the cgroups of random inappropriate tasks - "mm/pageblock: improve readability of some pageblock handling" from Wei Yang provides some readability improvements to the page allocator code - "mm/damon: support ARM32 with LPAE" from SeongJae Park teaches DAMON to understand arm32 highmem - "tools: testing: Use existing atomic.h for vma/maple tests" from Brendan Jackman performs some code cleanups and deduplication under tools/testing/ - "maple_tree: Fix testing for 32bit compiles" from Liam Howlett fixes a couple of 32-bit issues in tools/testing/radix-tree.c - "kasan: unify kasan_enabled() and remove arch-specific implementations" from Sabyrzhan Tasbolatov moves KASAN arch-specific initialization code into a common arch-neutral implementation - "mm: remove zpool" from Johannes Weiner removes zspool - an indirection layer which now only redirects to a single thing (zsmalloc) - "mm: task_stack: Stack handling cleanups" from Pasha Tatashin makes a couple of cleanups in the fork code - "mm: remove nth_page()" from David Hildenbrand makes rather a lot of adjustments at various nth_page() callsites, eventually permitting the removal of that undesirable helper function - "introduce kasan.write_only option in hw-tags" from Yeoreum Yun creates a KASAN read-only mode for ARM, using that architecture's memory tagging feature. It is felt that a read-only mode KASAN is suitable for use in production systems rather than debug-only - "mm: hugetlb: cleanup hugetlb folio allocation" from Kefeng Wang does some tidying in the hugetlb folio allocation code - "mm: establish const-correctness for pointer parameters" from Max Kellermann makes quite a number of the MM API functions more accurate about the constness of their arguments. This was getting in the way of subsystems (in this case CEPH) when they attempt to improving their own const/non-const accuracy - "Cleanup free_pages() misuse" from Vishal Moola fixes a number of code sites which were confused over when to use free_pages() vs __free_pages() - "Add Rust abstraction for Maple Trees" from Alice Ryhl makes the mapletree code accessible to Rust. Required by nouveau and by its forthcoming successor: the new Rust Nova driver - "selftests/mm: split_huge_page_test: split_pte_mapped_thp improvements" from David Hildenbrand adds a fix and some cleanups to the thp selftesting code - "mm, swap: introduce swap table as swap cache (phase I)" from Chris Li and Kairui Song is the first step along the path to implementing "swap tables" - a new approach to swap allocation and state tracking which is expected to yield speed and space improvements. This patchset itself yields a 5-20% performance benefit in some situations - "Some ptdesc cleanups" from Matthew Wilcox utilizes the new memdesc layer to clean up the ptdesc code a little - "Fix va_high_addr_switch.sh test failure" from Chunyu Hu fixes some issues in our 5-level pagetable selftesting code - "Minor fixes for memory allocation profiling" from Suren Baghdasaryan addresses a couple of minor issues in relatively new memory allocation profiling feature - "Small cleanups" from Matthew Wilcox has a few cleanups in preparation for more memdesc work - "mm/damon: add addr_unit for DAMON_LRU_SORT and DAMON_RECLAIM" from Quanmin Yan makes some changes to DAMON in furtherance of supporting arm highmem - "selftests/mm: Add -Wunreachable-code and fix warnings" from Muhammad Anjum adds that compiler check to selftests code and fixes the fallout, by removing dead code - "Improvements to Victim Process Thawing and OOM Reaper Traversal Order" from zhongjinji makes a number of improvements in the OOM killer: mainly thawing a more appropriate group of victim threads so they can release resources - "mm/damon: misc fixups and improvements for 6.18" from SeongJae Park is a bunch of small and unrelated fixups for DAMON - "mm/damon: define and use DAMON initialization check function" from SeongJae Park implement reliability and maintainability improvements to a recently-added bug fix - "mm/damon/stat: expose auto-tuned intervals and non-idle ages" from SeongJae Park provides additional transparency to userspace clients of the DAMON_STAT information - "Expand scope of khugepaged anonymous collapse" from Dev Jain removes some constraints on khubepaged's collapsing of anon VMAs. It also increases the success rate of MADV_COLLAPSE against an anon vma - "mm: do not assume file == vma->vm_file in compat_vma_mmap_prepare()" from Lorenzo Stoakes moves us further towards removal of file_operations.mmap(). This patchset concentrates upon clearing up the treatment of stacked filesystems - "mm: Improve mlock tracking for large folios" from Kiryl Shutsemau provides some fixes and improvements to mlock's tracking of large folios. /proc/meminfo's "Mlocked" field became more accurate - "mm/ksm: Fix incorrect accounting of KSM counters during fork" from Donet Tom fixes several user-visible KSM stats inaccuracies across forks and adds selftest code to verify these counters - "mm_slot: fix the usage of mm_slot_entry" from Wei Yang addresses some potential but presently benign issues in KSM's mm_slot handling * tag 'mm-stable-2025-10-01-19-00' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (372 commits) mm: swap: check for stable address space before operating on the VMA mm: convert folio_page() back to a macro mm/khugepaged: use start_addr/addr for improved readability hugetlbfs: skip VMAs without shareable locks in hugetlb_vmdelete_list alloc_tag: fix boot failure due to NULL pointer dereference mm: silence data-race in update_hiwater_rss mm/memory-failure: don't select MEMORY_ISOLATION mm/khugepaged: remove definition of struct khugepaged_mm_slot mm/ksm: get mm_slot by mm_slot_entry() when slot is !NULL hugetlb: increase number of reserving hugepages via cmdline selftests/mm: add fork inheritance test for ksm_merging_pages counter mm/ksm: fix incorrect KSM counter handling in mm_struct during fork drivers/base/node: fix double free in register_one_node() mm: remove PMD alignment constraint in execmem_vmalloc() mm/memory_hotplug: fix typo 'esecially' -> 'especially' mm/rmap: improve mlock tracking for large folios mm/filemap: map entire large folio faultaround mm/fault: try to map the entire file folio in finish_fault() mm/rmap: mlock large folios in try_to_unmap_one() mm/rmap: fix a mlock race condition in folio_referenced_one() ...
2025-09-21x86: stop calling page_address() in free_pages()Vishal Moola (Oracle)-1/+1
free_pages() should be used when we only have a virtual address. We should call __free_pages() directly on our page instead. Link: https://lkml.kernel.org/r/20250903185921.1785167-4-vishal.moola@gmail.com Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Andy Lutomirski <luto@kernel.org> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Justin Sanders <justin@coraid.com> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Cc: SeongJae Park <sj@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03x86/boot: Move startup code out of __head sectionArd Biesheuvel-1/+1
Move startup code out of the __head section, now that this no longer has a special significance. Move everything into .text or .init.text as appropriate, so that startup code is not kept around unnecessarily. [ bp: Fold in hunk to fix 32-bit CPU hotplug: Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202509022207.56fd97f4-lkp@intel.com ] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-45-ardb+git@google.com
2025-08-18objtool: Validate kCFI callsPeter Zijlstra-0/+4
Validate that all indirect calls adhere to kCFI rules. Notably doing nocfi indirect call to a cfi function is broken. Apparently some Rust 'core' code violates this and explodes when ran with FineIBT. All the ANNOTATE_NOCFI_SYM sites are prime targets for attackers. - runtime EFI is especially henous because it also needs to disable IBT. Basically calling unknown code without CFI protection at runtime is a massice security issue. - Kexec image handover; if you can exploit this, you get to keep it :-) Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Josh Poimboeuf <jpoimboe@kernel.org> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lkml.kernel.org/r/20250714103441.496787279@infradead.org
2025-07-29Merge tag 'x86_sev_for_v6.17_rc1' of ↵Linus Torvalds-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Map the SNP calling area pages too so that OVMF EFI fw can issue SVSM calls properly with the goal of implementing EFI variable store in the SVSM - a component which is trusted by the guest, vs in the firmware, which is not - Allow the kernel to handle #VC exceptions from EFI runtime services properly when running as a SNP guest - Rework and cleanup the SNP guest request issue glue code a bit * tag 'x86_sev_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Let sev_es_efi_map_ghcbs() map the CA pages too x86/sev/vc: Fix EFI runtime instruction emulation x86/sev: Drop unnecessary parameter in snp_issue_guest_request() x86/sev: Document requirement for linear mapping of guest request buffers x86/sev: Allocate request in TSC_INFO_REQ on stack virt: sev-guest: Contain snp_guest_request_ioctl in sev-guest
2025-06-29serial: 8250: Move CE4100 quirks to a module under 8250 driverAndy Shevchenko-98/+0
There is inconvenient for maintainers and maintainership to have some quirks under architectural code. Move it to the specific quirk file like other 8250-compatible drivers do. Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Link: https://lore.kernel.org/r/20250627182743.1273326-1-andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-27x86/sev: Let sev_es_efi_map_ghcbs() map the CA pages tooGerd Hoffmann-2/+2
OVMF EFI firmware needs access to the CA page to do SVSM protocol calls. For example, when the SVSM implements an EFI variable store, such calls will be necessary. So add that to sev_es_efi_map_ghcbs() and also rename the function to reflect the additional job it is doing now. [ bp: Massage. ] Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250626114014.373748-4-kraxel@redhat.com
2025-06-24serial: ce4100: clean up serial_in/out() hooksJiri Slaby (SUSE)-18/+21
ce4100_mem_serial_in() unnecessarily contains 4 nested 'if's. That makes the code hard to follow. Invert the conditions and return early if the particular conditions do not hold. And use "<<=" for shifting the offset in both of the hooks. Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250623101246.486866-2-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-06-24serial: ce4100: fix build after serial_in/out() changesJiri Slaby (SUSE)-3/+3
This x86_32 driver remain unnoticed, so after the commit below, the compilation now fails with: arch/x86/platform/ce4100/ce4100.c:107:16: error: incompatible function pointer types assigning to '...' from '...' To fix the error, convert also ce4100 to the new uart_port::serial_{in,out}() types. Signed-off-by: "Jiri Slaby (SUSE)" <jirislaby@kernel.org> Fixes: fc9ceb501e38 ("serial: 8250: sanitize uart_port::serial_{in,out}() types") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202506190552.TqNasrC3-lkp@intel.com/ Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: x86@kernel.org Cc: "H. Peter Anvin" <hpa@zytor.com> Link: https://lore.kernel.org/r/20250623101246.486866-1-jirislaby@kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-05-13Merge branch 'x86/msr' into x86/core, to resolve conflictsIngo Molnar-4/+4
Conflicts: arch/x86/boot/startup/sme.c arch/x86/coco/sev/core.c arch/x86/kernel/fpu/core.c arch/x86/kernel/fpu/xstate.c Semantic conflict: arch/x86/include/asm/sev-internal.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/asm' into x86/core, to merge dependent commitsIngo Molnar-2/+1
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-13Merge branch 'x86/alternatives' into x86/core, to merge dependent commitsIngo Molnar-5/+3
Prepare to resolve conflicts with an upstream series of fixes that conflict with pending x86 changes: 6f5bf947bab0 Merge tag 'its-for-linus-20250509' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-06Merge tag 'v6.15-rc4' into x86/asm, to pick up fixesIngo Molnar-2/+2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-06x86/mm: Fix false positive warning in switch_mm_irqs_off()Peter Zijlstra-0/+1
Multiple testers reported the following new warning: WARNING: CPU: 0 PID: 0 at arch/x86/mm/tlb.c:795 Which corresponds to: if (IS_ENABLED(CONFIG_DEBUG_VM) && WARN_ON_ONCE(prev != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))) cpumask_set_cpu(cpu, mm_cpumask(next)); So the problem is that unuse_temporary_mm() explicitly clears that bit; and it has to, because otherwise the flush_tlb_mm_range() in __text_poke() will try sending IPIs, which are not at all needed. See also: https://lore.kernel.org/all/20241113095550.GBZzR3pg-RhJKPDazS@fat_crate.local/ Notably, the whole {,un}use_temporary_mm() thing requires preemption to be disabled across it with the express purpose of keeping all TLB nonsense CPU local, such that invalidations can also stay local etc. However, as a side-effect, we violate this above WARN(), which sorta makes sense for the normal case, but very much doesn't make sense here. Change unuse_temporary_mm() to mark the mm_struct such that a further exception (beyond init_mm) can be grafted, to keep the warning for all the other cases. Reported-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com> Reported-by: Jani Nikula <jani.nikula@linux.intel.com> Signed-off-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Juergen Gross <jgross@suse.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Rik van Riel <riel@surriel.com> Link: https://lore.kernel.org/r/20250430081154.GH4439@noisy.programming.kicks-ass.net
2025-05-02Merge tag 'v6.15-rc4' into x86/msr, to pick up fixes and resolve conflictsIngo Molnar-2/+2
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-04-23x86/mm: Fix _pgd_alloc() for Xen PV modeJuergen Gross-2/+2
Recently _pgd_alloc() was switched from using __get_free_pages() to pagetable_alloc_noprof(), which might return a compound page in case the allocation order is larger than 0. On x86 this will be the case if CONFIG_MITIGATION_PAGE_TABLE_ISOLATION is set, even if PTI has been disabled at runtime. When running as a Xen PV guest (this will always disable PTI), using a compound page for a PGD will result in VM_BUG_ON_PGFLAGS being triggered when the Xen code tries to pin the PGD. Fix the Xen issue together with the not needed 8k allocation for a PGD with PTI disabled by replacing PGD_ALLOCATION_ORDER with an inline helper returning the needed order for PGD allocations. Fixes: a9b3c355c2e6 ("asm-generic: pgalloc: provide generic __pgd_{alloc,free}") Reported-by: Petr Vaněk <arkamar@atlas.cz> Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Tested-by: Petr Vaněk <arkamar@atlas.cz> Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250422131717.25724-1-jgross%40suse.com
2025-04-18x86/asm: Remove semicolon from "rep" prefixesUros Bizjak-2/+1
Minimum version of binutils required to compile the kernel is 2.25. This version correctly handles the "rep" prefixes, so it is possible to remove the semicolon, which was used to support ancient versions of GNU as. Due to the semicolon, the compiler considers "rep; insn" (or its alternate "rep\n\tinsn" form) as two separate instructions. Removing the semicolon makes asm length calculations more accurate, consequently making scheduling and inlining decisions of the compiler more accurate. Removing the semicolon also enables assembler checks involving "rep" prefixes. Trying to assemble e.g. "rep addl %eax, %ebx" results in: Error: invalid instruction `add' after `rep' Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Brian Gerst <brgerst@gmail.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Pavel Machek <pavel@kernel.org> Cc: Rafael J. Wysocki <rafael@kernel.org> Link: https://lore.kernel.org/r/20250418071437.4144391-2-ubizjak@gmail.com
2025-04-12x86/efi: Make efi_enter/leave_mm() use the use_/unuse_temporary_mm() machineryAndy Lutomirski-5/+2
This should be considerably more robust. It's also necessary for optimized for_each_possible_lazymm_cpu() on x86 -- without this patch, EFI calls in lazy context would remove the lazy mm from mm_cpumask(). [ mingo: Merged it on top of x86/alternatives ] Signed-off-by: Andy Lutomirski <luto@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Rik van Riel <riel@surriel.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/r/20250402094540.3586683-7-mingo@kernel.org
2025-04-10x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'Ingo Molnar-1/+1
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-10x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'Ingo Molnar-3/+3
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-01x86/platform/iosf_mbi: Remove unused ↵Dr. David Alan Gilbert-13/+0
iosf_mbi_unregister_pmic_bus_access_notifier() The last use of iosf_mbi_unregister_pmic_bus_access_notifier() was removed in 2017 by: a5266db4d314 ("drm/i915: Acquire PUNIT->PMIC bus for intel_uncore_forcewake_reset()") Remove it. (Note that the '_unlocked' version is still used.) Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Tvrtko Ursulin <tursulin@ursulin.net> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: intel-gfx@lists.freedesktop.org Cc: dri-devel@lists.freedesktop.org Link: https://lore.kernel.org/r/20241225175010.91783-1-linux@treblig.org
2025-03-24Merge tag 'x86-platform-2025-03-22' of ↵Linus Torvalds-3/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 platform updates from Ingo Molnar: "Two small cleanups in the x86 platform support code" * tag 'x86-platform-2025-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/platform/olpc: Remove unused variable 'len' in olpc_dt_compatible_match() x86/platform/olpc-xo1-sci: Don't include <linux/pm_wakeup.h> directly
2025-03-06x86/platform/olpc: Remove unused variable 'len' in olpc_dt_compatible_match()Zeng Heng-2/+1
The following build warning highlights some unused code: arch/x86/platform/olpc/olpc_dt.c: In function ‘olpc_dt_compatible_match’: arch/x86/platform/olpc/olpc_dt.c:222:12: warning: variable ‘len’ set but not used [-Wunused-but-set-variable] The compiler is right, the local variable 'len' is set but never used, so remove it. Fixes: a7a9bacb9a32 ("x86/platform/olpc: Use a correct version when making up a battery node") Signed-off-by: Zeng Heng <zengheng4@huawei.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241025074203.1921344-1-zengheng4@huawei.com
2025-02-21x86/platform/olpc-xo1-sci: Don't include <linux/pm_wakeup.h> directlyWolfram Sang-1/+0
The header clearly states that it does not want to be included directly, only via <linux/(platform_)?device.h>. Which is already present, so delete the superfluous include. Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250210113453.51825-2-wsa+renesas@sang-engineering.com
2025-02-18x86/percpu/64: Use relative percpu offsetsBrian Gerst-3/+2
The percpu section is currently linked at absolute address 0, because older compilers hard-coded the stack protector canary value at a fixed offset from the start of the GS segment. Now that the canary is a normal percpu variable, the percpu section does not need to be linked at a specific address. x86-64 will now calculate the percpu offsets as the delta between the initial percpu address and the dynamically allocated memory, like other architectures. Note that GSBASE is limited to the canonical address width (48 or 57 bits, sign-extended). As long as the kernel text, modules, and the dynamically allocated percpu memory are all in the negative address space, the delta will not overflow this limit. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Reviewed-by: Uros Bizjak <ubizjak@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250123190747.745588-9-brgerst@gmail.com
2025-02-18x86/pvh: Use fixed_percpu_data for early boot GSBASEBrian Gerst-6/+9
Instead of having a private area for the stack canary, use fixed_percpu_data for GSBASE like the native kernel. Signed-off-by: Brian Gerst <brgerst@gmail.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://lore.kernel.org/r/20250123190747.745588-5-brgerst@gmail.com
2025-01-26Merge tag 'mm-stable-2025-01-26-14-59' of ↵Linus Torvalds-5/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "The various patchsets are summarized below. Plus of course many indivudual patches which are described in their changelogs. - "Allocate and free frozen pages" from Matthew Wilcox reorganizes the page allocator so we end up with the ability to allocate and free zero-refcount pages. So that callers (ie, slab) can avoid a refcount inc & dec - "Support large folios for tmpfs" from Baolin Wang teaches tmpfs to use large folios other than PMD-sized ones - "Fix mm/rodata_test" from Petr Tesarik performs some maintenance and fixes for this small built-in kernel selftest - "mas_anode_descend() related cleanup" from Wei Yang tidies up part of the mapletree code - "mm: fix format issues and param types" from Keren Sun implements a few minor code cleanups - "simplify split calculation" from Wei Yang provides a few fixes and a test for the mapletree code - "mm/vma: make more mmap logic userland testable" from Lorenzo Stoakes continues the work of moving vma-related code into the (relatively) new mm/vma.c - "mm/page_alloc: gfp flags cleanups for alloc_contig_*()" from David Hildenbrand cleans up and rationalizes handling of gfp flags in the page allocator - "readahead: Reintroduce fix for improper RA window sizing" from Jan Kara is a second attempt at fixing a readahead window sizing issue. It should reduce the amount of unnecessary reading - "synchronously scan and reclaim empty user PTE pages" from Qi Zheng addresses an issue where "huge" amounts of pte pagetables are accumulated: https://lore.kernel.org/lkml/cover.1718267194.git.zhengqi.arch@bytedance.com/ Qi's series addresses this windup by synchronously freeing PTE memory within the context of madvise(MADV_DONTNEED) - "selftest/mm: Remove warnings found by adding compiler flags" from Muhammad Usama Anjum fixes some build warnings in the selftests code when optional compiler warnings are enabled - "mm: don't use __GFP_HARDWALL when migrating remote pages" from David Hildenbrand tightens the allocator's observance of __GFP_HARDWALL - "pkeys kselftests improvements" from Kevin Brodsky implements various fixes and cleanups in the MM selftests code, mainly pertaining to the pkeys tests - "mm/damon: add sample modules" from SeongJae Park enhances DAMON to estimate application working set size - "memcg/hugetlb: Rework memcg hugetlb charging" from Joshua Hahn provides some cleanups to memcg's hugetlb charging logic - "mm/swap_cgroup: remove global swap cgroup lock" from Kairui Song removes the global swap cgroup lock. A speedup of 10% for a tmpfs-based kernel build was demonstrated - "zram: split page type read/write handling" from Sergey Senozhatsky has several fixes and cleaups for zram in the area of zram_write_page(). A watchdog softlockup warning was eliminated - "move pagetable_*_dtor() to __tlb_remove_table()" from Kevin Brodsky cleans up the pagetable destructor implementations. A rare use-after-free race is fixed - "mm/debug: introduce and use VM_WARN_ON_VMG()" from Lorenzo Stoakes simplifies and cleans up the debugging code in the VMA merging logic - "Account page tables at all levels" from Kevin Brodsky cleans up and regularizes the pagetable ctor/dtor handling. This results in improvements in accounting accuracy - "mm/damon: replace most damon_callback usages in sysfs with new core functions" from SeongJae Park cleans up and generalizes DAMON's sysfs file interface logic - "mm/damon: enable page level properties based monitoring" from SeongJae Park increases the amount of information which is presented in response to DAMOS actions - "mm/damon: remove DAMON debugfs interface" from SeongJae Park removes DAMON's long-deprecated debugfs interfaces. Thus the migration to sysfs is completed - "mm/hugetlb: Refactor hugetlb allocation resv accounting" from Peter Xu cleans up and generalizes the hugetlb reservation accounting - "mm: alloc_pages_bulk: small API refactor" from Luiz Capitulino removes a never-used feature of the alloc_pages_bulk() interface - "mm/damon: extend DAMOS filters for inclusion" from SeongJae Park extends DAMOS filters to support not only exclusion (rejecting), but also inclusion (allowing) behavior - "Add zpdesc memory descriptor for zswap.zpool" from Alex Shi introduces a new memory descriptor for zswap.zpool that currently overlaps with struct page for now. This is part of the effort to reduce the size of struct page and to enable dynamic allocation of memory descriptors - "mm, swap: rework of swap allocator locks" from Kairui Song redoes and simplifies the swap allocator locking. A speedup of 400% was demonstrated for one workload. As was a 35% reduction for kernel build time with swap-on-zram - "mm: update mips to use do_mmap(), make mmap_region() internal" from Lorenzo Stoakes reworks MIPS's use of mmap_region() so that mmap_region() can be made MM-internal - "mm/mglru: performance optimizations" from Yu Zhao fixes a few MGLRU regressions and otherwise improves MGLRU performance - "Docs/mm/damon: add tuning guide and misc updates" from SeongJae Park updates DAMON documentation - "Cleanup for memfd_create()" from Isaac Manjarres does that thing - "mm: hugetlb+THP folio and migration cleanups" from David Hildenbrand provides various cleanups in the areas of hugetlb folios, THP folios and migration - "Uncached buffered IO" from Jens Axboe implements the new RWF_DONTCACHE flag which provides synchronous dropbehind for pagecache reading and writing. To permite userspace to address issues with massive buildup of useless pagecache when reading/writing fast devices - "selftests/mm: virtual_address_range: Reduce memory" from Thomas Weißschuh fixes and optimizes some of the MM selftests" * tag 'mm-stable-2025-01-26-14-59' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm/compaction: fix UBSAN shift-out-of-bounds warning s390/mm: add missing ctor/dtor on page table upgrade kasan: sw_tags: use str_on_off() helper in kasan_init_sw_tags() tools: add VM_WARN_ON_VMG definition mm/damon/core: use str_high_low() helper in damos_wmark_wait_us() seqlock: add missing parameter documentation for raw_seqcount_try_begin() mm/page-writeback: consolidate wb_thresh bumping logic into __wb_calc_thresh mm/page_alloc: remove the incorrect and misleading comment zram: remove zcomp_stream_put() from write_incompressible_page() mm: separate move/undo parts from migrate_pages_batch() mm/kfence: use str_write_read() helper in get_access_type() selftests/mm/mkdirty: fix memory leak in test_uffdio_copy() kasan: hw_tags: Use str_on_off() helper in kasan_init_hw_tags() selftests/mm: virtual_address_range: avoid reading from VM_IO mappings selftests/mm: vm_util: split up /proc/self/smaps parsing selftests/mm: virtual_address_range: unmap chunks after validation selftests/mm: virtual_address_range: mmap() without PROT_WRITE selftests/memfd/memfd_test: fix possible NULL pointer dereference mm: add FGP_DONTCACHE folio creation flag mm: call filemap_fdatawrite_range_kick() after IOCB_DONTCACHE issue ...
2025-01-25mm/memblock: add memblock_alloc_or_panic interfaceGuo Weikang-5/+1
Before SLUB initialization, various subsystems used memblock_alloc to allocate memory. In most cases, when memory allocation fails, an immediate panic is required. To simplify this behavior and reduce repetitive checks, introduce `memblock_alloc_or_panic`. This function ensures that memory allocation failures result in a panic automatically, improving code readability and consistency across subsystems that require this behavior. [guoweikang.kernel@gmail.com: arch/s390: save_area_alloc default failure behavior changed to panic] Link: https://lkml.kernel.org/r/20250109033136.2845676-1-guoweikang.kernel@gmail.com Link: https://lore.kernel.org/lkml/Z2fknmnNtiZbCc7x@kernel.org/ Link: https://lkml.kernel.org/r/20250102072528.650926-1-guoweikang.kernel@gmail.com Signed-off-by: Guo Weikang <guoweikang.kernel@gmail.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> [m68k] Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> [s390] Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-01-24Merge tag 'efi-next-for-v6.14' of ↵Linus Torvalds-10/+5
git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi Pull EFI updates from Ard Biesheuvel: - Increase the headroom in the EFI memory map allocation created by the EFI stub. This is needed because event callbacks called during ExitBootServices() may cause fragmentation, and reallocation is not allowed after that. - Drop obsolete UGA graphics code and switch to a more ergonomic API to traverse handle buffers. Simplify some error paths using a __free() helper while at it. - Fix some W=1 warnings when CONFIG_EFI=n - Rely on the dentry cache to keep track of the contents of the efivarfs filesystem, rather than using a separate linked list. - Improve and extend efivarfs test cases. - Synchronize efivarfs with underlying variable store on resume from hibernation - this is needed because the firmware itself or another OS running on the same machine may have modified it. - Fix x86 EFI stub build with GCC 15. - Fix kexec/x86 false positive warning in EFI memory attributes table sanity check. * tag 'efi-next-for-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (23 commits) x86/efi: skip memattr table on kexec boot efivarfs: add variable resync after hibernation efivarfs: abstract initial variable creation routine efi: libstub: Use '-std=gnu11' to fix build with GCC 15 selftests/efivarfs: add concurrent update tests selftests/efivarfs: fix tests for failed write removal efivarfs: fix error on write to new variable leaving remnants efivarfs: remove unused efivarfs_list efivarfs: move variable lifetime management into the inodes selftests/efivarfs: add check for disallowing file truncation efivarfs: prevent setting of zero size on the inodes in the cache efi: sysfb_efi: fix W=1 warnings when EFI is not set efi/libstub: Use __free() helper for pool deallocations efi/libstub: Use cleanup helpers for freeing copies of the memory map efi/libstub: Simplify PCI I/O handle buffer traversal efi/libstub: Refactor and clean up GOP resolution picker code efi/libstub: Simplify GOP handling code efi/libstub: Use C99-style for loop to traverse handle buffer x86/efistub: Drop long obsolete UGA support efivarfs: make variable_is_present use dcache lookup ...