summaryrefslogtreecommitdiffstats
path: root/arch/um
AgeCommit message (Collapse)AuthorLines
2026-03-06Merge tag 'kbuild-fixes-7.0-2' of ↵Linus Torvalds-0/+2
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild fixes from Nathan Chancellor: - Split out .modinfo section from ELF_DETAILS macro, as that macro may be used in other areas that expect to discard .modinfo, breaking certain image layouts - Adjust genksyms parser to handle optional attributes in certain declarations, necessary after commit 07919126ecfc ("netfilter: annotate NAT helper hook pointers with __rcu") - Include resolve_btfids in external module build created by scripts/package/install-extmod-build when it may be run on external modules - Avoid removing objtool binary with 'make clean', as it is required for external module builds * tag 'kbuild-fixes-7.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: kbuild: Leave objtool binary around with 'make clean' kbuild: install-extmod-build: Package resolve_btfids if necessary genksyms: Fix parsing a declarator with a preceding attribute kbuild: Split .modinfo out from ELF_DETAILS
2026-02-26kbuild: Split .modinfo out from ELF_DETAILSNathan Chancellor-0/+2
Commit 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") added .modinfo to ELF_DETAILS while removing it from COMMON_DISCARDS, as it was needed in vmlinux.unstripped and ELF_DETAILS was present in all architecture specific vmlinux linker scripts. While this shuffle is fine for vmlinux, ELF_DETAILS and COMMON_DISCARDS may be used by other linker scripts, such as the s390 and x86 compressed boot images, which may not expect to have a .modinfo section. In certain circumstances, this could result in a bootloader failing to load the compressed kernel [1]. Commit ddc6cbef3ef1 ("s390/boot/vmlinux.lds.S: Ensure bzImage ends with SecureBoot trailer") recently addressed this for the s390 bzImage but the same bug remains for arm, parisc, and x86. The presence of .modinfo in the x86 bzImage was the root cause of the issue worked around with commit d50f21091358 ("kbuild: align modinfo section for Secureboot Authenticode EDK2 compat"). misc.c in arch/x86/boot/compressed includes lib/decompress_unzstd.c, which in turn includes lib/xxhash.c and its MODULE_LICENSE / MODULE_DESCRIPTION macros due to the STATIC definition. Split .modinfo out from ELF_DETAILS into its own macro and handle it in all vmlinux linker scripts. Discard .modinfo in the places where it was previously being discarded from being in COMMON_DISCARDS, as it has never been necessary in those uses. Cc: stable@vger.kernel.org Fixes: 3e86e4d74c04 ("kbuild: keep .modinfo section in vmlinux.unstripped") Reported-by: Ed W <lists@wildgooses.com> Closes: https://lore.kernel.org/587f25e0-a80e-46a5-9f01-87cb40cfa377@wildgooses.com/ [1] Tested-by: Ed W <lists@wildgooses.com> # x86_64 Link: https://patch.msgid.link/20260225-separate-modinfo-from-elf-details-v1-1-387ced6baf4b@kernel.org Signed-off-by: Nathan Chancellor <nathan@kernel.org>
2026-02-23ubd: Use pointer-to-pointers for io_thread_req arraysKees Cook-5/+5
Having an unbounded array for irq_req_buffer and io_req_buffer doesn't provide any bounds safety, and confuses the needed allocation type, which is returning a pointer to pointers. Instead of the implicit cast, switch the variable types. Reported-by: Nathan Chancellor <nathan@kernel.org> Reported-by: Guenter Roeck <linux@roeck-us.net> Closes: https://lore.kernel.org/all/b04b6c13-7d0e-4a89-9e68-b572b6c686ac@roeck-us.net Fixes: 69050f8d6d07 ("treewide: Replace kmalloc with kmalloc_obj for non-scalar types") Acked-by: Richard Weinberger <richard@nod.at> Link: https://patch.msgid.link/20260223214341.work.846-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-22Convert remaining multi-line kmalloc_obj/flex GFP_KERNEL usesKees Cook-3/+3
Conversion performed via this Coccinelle script: // SPDX-License-Identifier: GPL-2.0-only // Options: --include-headers-for-types --all-includes --include-headers --keep-comments virtual patch @gfp depends on patch && !(file in "tools") && !(file in "samples")@ identifier ALLOC = {kmalloc_obj,kmalloc_objs,kmalloc_flex, kzalloc_obj,kzalloc_objs,kzalloc_flex, kvmalloc_obj,kvmalloc_objs,kvmalloc_flex, kvzalloc_obj,kvzalloc_objs,kvzalloc_flex}; @@ ALLOC(... - , GFP_KERNEL ) $ make coccicheck MODE=patch COCCI=gfp.cocci Build and boot tested x86_64 with Fedora 42's GCC and Clang: Linux version 6.19.0+ (user@host) (gcc (GCC) 15.2.1 20260123 (Red Hat 15.2.1-7), GNU ld version 2.44-12.fc42) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Linux version 6.19.0+ (user@host) (clang version 20.1.8 (Fedora 20.1.8-4.fc42), LLD 20.1.8) #1 SMP PREEMPT_DYNAMIC 1970-01-01 Signed-off-by: Kees Cook <kees@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds-4/+2
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds-15/+15
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook-40/+30
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-13Merge tag 'uml-for-7.0-rc1' of ↵Linus Torvalds-6/+57
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "UML was _really_ quiet, with just four small commits: - two signal handling fixes - dynamic addition of virtio devices - a single code cleanup" * tag 'uml-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: arch/um: remove unused varible err in remove_files_and_dir() um: virtio_uml: Support adding devices via mconsole um: Handle SIGCHLD in seccomp mode like other IRQ signals um: Preserve errno within signal handler
2026-02-12Merge tag 'mm-nonmm-stable-2026-02-12-10-48' of ↵Linus Torvalds-0/+1
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "ocfs2: give ocfs2 the ability to reclaim suballocator free bg" saves disk space by teaching ocfs2 to reclaim suballocator block group space (Heming Zhao) - "Add ARRAY_END(), and use it to fix off-by-one bugs" adds the ARRAY_END() macro and uses it in various places (Alejandro Colomar) - "vmcoreinfo: support VMCOREINFO_BYTES larger than PAGE_SIZE" makes the vmcore code future-safe, if VMCOREINFO_BYTES ever exceeds the page size (Pnina Feder) - "kallsyms: Prevent invalid access when showing module buildid" cleans up kallsyms code related to module buildid and fixes an invalid access crash when printing backtraces (Petr Mladek) - "Address page fault in ima_restore_measurement_list()" fixes a kexec-related crash that can occur when booting the second-stage kernel on x86 (Harshit Mogalapalli) - "kho: ABI headers and Documentation updates" updates the kexec handover ABI documentation (Mike Rapoport) - "Align atomic storage" adds the __aligned attribute to atomic_t and atomic64_t definitions to get natural alignment of both types on csky, m68k, microblaze, nios2, openrisc and sh (Finn Thain) - "kho: clean up page initialization logic" simplifies the page initialization logic in kho_restore_page() (Pratyush Yadav) - "Unload linux/kernel.h" moves several things out of kernel.h and into more appropriate places (Yury Norov) - "don't abuse task_struct.group_leader" removes the usage of ->group_leader when it is "obviously unnecessary" (Oleg Nesterov) - "list private v2 & luo flb" adds some infrastructure improvements to the live update orchestrator (Pasha Tatashin) * tag 'mm-nonmm-stable-2026-02-12-10-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (107 commits) watchdog/hardlockup: simplify perf event probe and remove per-cpu dependency procfs: fix missing RCU protection when reading real_parent in do_task_stat() watchdog/softlockup: fix sample ring index wrap in need_counting_irqs() kcsan, compiler_types: avoid duplicate type issues in BPF Type Format kho: fix doc for kho_restore_pages() tests/liveupdate: add in-kernel liveupdate test liveupdate: luo_flb: introduce File-Lifecycle-Bound global state liveupdate: luo_file: Use private list list: add kunit test for private list primitives list: add primitives for private list manipulations delayacct: fix uapi timespec64 definition panic: add panic_force_cpu= parameter to redirect panic to a specific CPU netclassid: use thread_group_leader(p) in update_classid_task() RDMA/umem: don't abuse current->group_leader drm/pan*: don't abuse current->group_leader drm/amd: kill the outdated "Only the pthreads threading model is supported" checks drm/amdgpu: don't abuse current->group_leader android/binder: use same_thread_group(proc->tsk, current) in binder_mmap() android/binder: don't abuse current->group_leader kho: skip memoryless NUMA nodes when reserving scratch areas ...
2026-02-12Merge tag 'mm-stable-2026-02-11-19-22' of ↵Linus Torvalds-6/+6
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory betewwn the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khupaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...
2026-02-06um: mm: enable MMU_GATHER_RCU_TABLE_FREEQi Zheng-0/+1
On a 64-bit system, madvise(MADV_DONTNEED) may cause a large number of empty PTE page table pages (such as 100GB+). To resolve this problem, first enable MMU_GATHER_RCU_TABLE_FREE to prepare for enabling the PT_RECLAIM feature, which resolves this problem. Link: https://lkml.kernel.org/r/e2217546504668b8a87a39eb0e378839339a1bb4.1769515122.git.zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Richard Weinberger <richard@nod.at> Cc: Anton Ivanov <anton.ivanov@cambridgegreys.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Helge Deller <deller@gmx.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nicholas Piggin <npiggin@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Henderson <richard.henderson@linaro.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: WANG Xuerui <kernel@xen0n.name> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26arch, mm: consolidate initialization of nodes, zones and memory mapMike Rapoport (Microsoft)-5/+0
To initialize node, zone and memory map data structures every architecture calls free_area_init() during setup_arch() and passes it an array of zone limits. Beside code duplication it creates "interesting" ordering cases between allocation and initialization of hugetlb and the memory map. Some architectures allocate hugetlb pages very early in setup_arch() in certain cases, some only create hugetlb CMA areas in setup_arch() and sometimes hugetlb allocations happen mm_core_init(). With arch_zone_limits_init() helper available now on all architectures it is no longer necessary to call free_area_init() from architecture setup code. Rather core MM initialization can call arch_zone_limits_init() in a single place. This allows to unify ordering of hugetlb vs memory map allocation and initialization. Remove the call to free_area_init() from architecture specific code and place it in a new mm_core_init_early() function that is called immediately after setup_arch(). After this refactoring it is possible to consolidate hugetlb allocations and eliminate differences in ordering of hugetlb and memory map initialization among different architectures. As the first step of this consolidation move hugetlb_bootmem_alloc() to mm_core_early_init(). Link: https://lkml.kernel.org/r/20260111082105.290734-24-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-26um: introduce arch_zone_limits_init()Mike Rapoport (Microsoft)-1/+6
Move calculations of zone limits to a dedicated arch_zone_limits_init() function. Later MM core will use this function as an architecture specific callback during nodes and zones initialization and thus there won't be a need to call free_area_init() from every architecture. Link: https://lkml.kernel.org/r/20260111082105.290734-21-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alex Shi <alexs@kernel.org> Cc: Andreas Larsson <andreas@gaisler.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: David Hildenbrand <david@kernel.org> Cc: David S. Miller <davem@davemloft.net> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Guo Ren <guoren@kernel.org> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Klara Modin <klarasmodin@gmail.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Magnus Lindholm <linmag7@gmail.com> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Hocko <mhocko@suse.com> Cc: Michal Simek <monstr@monstr.eu> Cc: Muchun Song <muchun.song@linux.dev> Cc: Oscar Salvador <osalvador@suse.de> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com> Cc: Russell King <linux@armlinux.org.uk> Cc: Stafford Horne <shorne@gmail.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-20kernel.h: drop hex.h and update all hex.h usersRandy Dunlap-0/+1
Remove <linux/hex.h> from <linux/kernel.h> and update all users/callers of hex.h interfaces to directly #include <linux/hex.h> as part of the process of putting kernel.h on a diet. Removing hex.h from kernel.h means that 36K C source files don't have to pay the price of parsing hex.h for the roughly 120 C source files that need it. This change has been build-tested with allmodconfig on most ARCHes. Also, all users/callers of <linux/hex.h> in the entire source tree have been updated if needed (if not already #included). Link: https://lkml.kernel.org/r/20251215005206.2362276-1-rdunlap@infradead.org Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Yury Norov (NVIDIA) <yury.norov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-20treewide: provide a generic clear_user_page() variantDavid Hildenbrand-1/+0
Patch series "mm: folio_zero_user: clear page ranges", v11. This series adds clearing of contiguous page ranges for hugepages. The series improves on the current discontiguous clearing approach in two ways: - clear pages in a contiguous fashion. - use batched clearing via clear_pages() wherever exposed. The first is useful because it allows us to make much better use of hardware prefetchers. The second, enables advertising the real extent to the processor. Where specific instructions support it (ex. string instructions on x86; "mops" on arm64 etc), a processor can optimize based on this because, instead of seeing a sequence of 8-byte stores, or a sequence of 4KB pages, it sees a larger unit being operated on. For instance, AMD Zen uarchs (for extents larger than LLC-size) switch to a mode where they start eliding cacheline allocation. This is helpful not just because it results in higher bandwidth, but also because now the cache is not evicting useful cachelines and replacing them with zeroes. Demand faulting a 64GB region shows performance improvement: $ perf bench mem mmap -p $pg-sz -f demand -s 64GB -l 5 baseline +series (GBps +- %stdev) (GBps +- %stdev) pg-sz=2MB 11.76 +- 1.10% 25.34 +- 1.18% [*] +115.47% preempt=* pg-sz=1GB 24.85 +- 2.41% 39.22 +- 2.32% + 57.82% preempt=none|voluntary pg-sz=1GB (similar) 52.73 +- 0.20% [#] +112.19% preempt=full|lazy [*] This improvement is because switching to sequential clearing allows the hardware prefetchers to do a much better job. [#] For pg-sz=1GB a large part of the improvement is because of the cacheline elision mentioned above. preempt=full|lazy improves upon that because, not needing explicit invocations of cond_resched() to ensure reasonable preemption latency, it can clear the full extent as a single unit. In comparison the maximum extent used for preempt=none|voluntary is PROCESS_PAGES_NON_PREEMPT_BATCH (32MB). When provided the full extent the processor forgoes allocating cachelines on this path almost entirely. (The hope is that eventually, in the fullness of time, the lazy preemption model will be able to do the same job that none or voluntary models are used for, allowing us to do away with cond_resched().) Raghavendra also tested previous version of the series on AMD Genoa and sees similar improvement [1] with preempt=lazy. $ perf bench mem map -p $page-size -f populate -s 64GB -l 10 base patched change pg-sz=2MB 12.731939 GB/sec 26.304263 GB/sec 106.6% pg-sz=1GB 26.232423 GB/sec 61.174836 GB/sec 133.2% This patch (of 8): Let's drop all variants that effectively map to clear_page() and provide it in a generic variant instead. We'll use the macro clear_user_page to indicate whether an architecture provides it's own variant. Also, clear_user_page() is only called from the generic variant of clear_user_highpage(), so define it only if the architecture does not provide a clear_user_highpage(). And, for simplicity define it in linux/highmem.h. Note that for parisc, clear_page() and clear_user_page() map to clear_page_asm(), so we can just get rid of the custom clear_user_page() implementation. There is a clear_user_page_asm() function on parisc, that seems to be unused. Not sure what's up with that. Link: https://lkml.kernel.org/r/20260107072009.1615991-1-ankur.a.arora@oracle.com Link: https://lkml.kernel.org/r/20260107072009.1615991-2-ankur.a.arora@oracle.com Signed-off-by: David Hildenbrand <david@redhat.com> Co-developed-by: Ankur Arora <ankur.a.arora@oracle.com> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ankur Arora <ankur.a.arora@oracle.com> Cc: "Borislav Petkov (AMD)" <bp@alien8.de> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: David Hildenbrand <david@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Konrad Rzessutek Wilk <konrad.wilk@oracle.com> Cc: Lance Yang <ioworker0@gmail.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Mateusz Guzik <mjguzik@gmail.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Raghavendra K T <raghavendra.kt@amd.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-01-13arch/um: remove unused varible err in remove_files_and_dir()Alex Shi-2/+1
err is duplicated with errno, and never used; remove it. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Alex Shi <alexs@kernel.org> Link: https://patch.msgid.link/20260107064009.15380-1-alexs@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-13um: virtio_uml: Support adding devices via mconsoleTiwei Bie-1/+50
It can be used when we want to add virtio devices to UML while it's up and running. Virtio devices can be added to UML using the same syntax as the command line option: (mconsole) config virtio_uml.device=<socket>:<virtio_id>[:<platform_id>] Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20260106004259.1604736-1-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-13um: Handle SIGCHLD in seccomp mode like other IRQ signalsTiwei Bie-0/+3
In seccomp mode, SIGCHLD serves as the child reaper IRQ. So, let's handle it in the same way as other IRQ signals, including preventing them from nesting with each other and allowing interrupted syscalls to be automatically restarted. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20260106001228.1531146-3-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-13um: Preserve errno within signal handlerTiwei Bie-3/+3
We rely on errno to determine whether a syscall has failed, so we need to ensure that accessing errno is async-signal-safe. Currently, we preserve the errno in sig_handler_common(), but it doesn't cover every possible case. Let's do it in hard_handler() instead, which is the signal handler we actually register. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20260106001228.1531146-2-tiwei.btw@antgroup.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2026-01-05um: Fix incorrect __acquires/__releases annotationsMarco Elver-7/+11
With Clang's context analysis, the compiler is a bit more strict about what goes into the __acquires/__releases annotations and can't refer to non-existent variables. On an UM build, mm_id.h is transitively included into mm_types.h, and we can observe the following error (if context analysis is enabled in e.g. stackdepot.c): In file included from lib/stackdepot.c:17: In file included from include/linux/debugfs.h:15: In file included from include/linux/fs.h:5: In file included from include/linux/fs/super.h:5: In file included from include/linux/fs/super_types.h:7: In file included from include/linux/list_lru.h:14: In file included from include/linux/xarray.h:16: In file included from include/linux/gfp.h:7: In file included from include/linux/mmzone.h:22: In file included from include/linux/mm_types.h:26: In file included from arch/um/include/asm/mmu.h:12: >> arch/um/include/shared/skas/mm_id.h:24:54: error: use of undeclared identifier 'turnstile' 24 | void enter_turnstile(struct mm_id *mm_id) __acquires(turnstile); | ^~~~~~~~~ arch/um/include/shared/skas/mm_id.h:25:53: error: use of undeclared identifier 'turnstile' 25 | void exit_turnstile(struct mm_id *mm_id) __releases(turnstile); | ^~~~~~~~~ One (discarded) option was to use token_context_lock(turnstile) to just define a token with the already used name, but that would not allow the compiler to distinguish between different mm_id-dependent instances. Another constraint is that struct mm_id is only declared and incomplete in the header, so even if we tried to construct an expression to get to the mutex instance, this would fail (including more headers transitively everywhere should also be avoided). Instead, just declare an mm_id-dependent helper to return the mutex, and use the mm_id-dependent call expression in the __acquires/__releases attributes; the compiler will consider the identity of the mutex to be the call expression. Then using __get_turnstile() in the lock/unlock wrappers (with context analysis enabled for mmu.c) the compiler will be able to verify the implementation of the wrappers as-is. We leave context analysis disabled in arch/um/kernel/skas/ for now. This change is a preparatory change to allow enabling context analysis in subsystems that include any of the above headers. No functional change intended. Closes: https://lore.kernel.org/oe-kbuild-all/202512171220.vHlvhpCr-lkp@intel.com/ Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Marco Elver <elver@google.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20251219154418.3592607-23-elver@google.com
2025-12-06Merge tag 'objtool-urgent-2025-12-06' of ↵Linus Torvalds-1/+4
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Address various objtool scalability bugs/inefficiencies exposed by allmodconfig builds, plus improve the quality of alternatives instructions generated code and disassembly" * tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Simplify .annotate_insn code generation output some more objtool: Add more robust signal error handling, detect and warn about stack overflows objtool: Remove newlines and tabs from annotation macros objtool: Consolidate annotation macros x86/asm: Remove ANNOTATE_DATA_SPECIAL usage x86/alternative: Remove ANNOTATE_DATA_SPECIAL usage objtool: Fix stack overflow in validate_branch()
2025-12-05Merge tag 'uml-for-linux-6.19-rc1' of ↵Linus Torvalds-600/+943
git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux Pull UML updates from Johannes Berg: "Apart from the usual small churn, we have - initial SMP support (only kernel) - major vDSO cleanups (and fixes for 32-bit)" * tag 'uml-for-linux-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/uml/linux: (33 commits) um: Disable KASAN_INLINE when STATIC_LINK is selected um: Don't rename vmap to kernel_vmap um: drivers: virtio: use string choices helper um: Always set up AT_HWCAP and AT_PLATFORM x86/um: Remove FIXADDR_USER_START and FIXADDR_USE_END um: Remove __access_ok_vsyscall() um: Remove redundant range check from __access_ok_vsyscall() um: Remove fixaddr_user_init() x86/um: Drop gate area handling x86/um: Do not inherit vDSO from host um: Split out default elf_aux_hwcap x86/um: Move ELF_PLATFORM fallback to x86-specific code um: Split out default elf_aux_platform um: Avoid circular dependency on asm-offsets in pgtable.h um: Enable SMP support on x86 asm-generic: percpu: Add assembly guard um: vdso: Remove getcpu support on x86 um: Add initial SMP support um: Define timers on a per-CPU basis um: Determine sleep based on need_resched() ...
2025-12-03Merge tag 'printk-for-6.19' of ↵Linus Torvalds-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Allow creaing nbcon console drivers with an unsafe write_atomic() callback that can only be called by the final nbcon_atomic_flush_unsafe(). Otherwise, the driver would rely on the kthread. It is going to be used as the-best-effort approach for an experimental nbcon netconsole driver, see https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org Note that a safe .write_atomic() callback is supposed to work in NMI context. But some networking drivers are not safe even in IRQ context: https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35 In an ideal world, all networking drivers would be fixed first and the atomic flush would be blocked only in NMI context. But it brings the question how reliable networking drivers are when the system is in a bad state. They might block flushing more reliable serial consoles which are more suitable for serious debugging anyway. - Allow to use the last 4 bytes of the printk ring buffer. - Prevent queuing IRQ work and block printk kthreads when consoles are suspended. Otherwise, they create non-necessary churn or even block the suspend. - Release console_lock() between each record in the kthread used for legacy consoles on RT. It might significantly speed up the boot. - Release nbcon context between each record in the atomic flush. It prevents stalls of the related printk kthread after it has lost the ownership in the middle of a record - Add support for NBCON consoles into KDB - Add %ptsP modifier for printing struct timespec64 and use it where possible - Misc code clean up * tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits) printk: Use console_is_usable on console_unblank arch: um: kmsg_dump: Use console_is_usable drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT lib/vsprintf: Unify FORMAT_STATE_NUM handlers printk: Avoid irq_work for printk_deferred() on suspend printk: Avoid scheduling irq_work on suspend printk: Allow printk_trigger_flush() to flush all types tracing: Switch to use %ptSp scsi: snic: Switch to use %ptSp scsi: fnic: Switch to use %ptSp s390/dasd: Switch to use %ptSp ptp: ocp: Switch to use %ptSp pps: Switch to use %ptSp PCI: epf-test: Switch to use %ptSp net: dsa: sja1105: Switch to use %ptSp mmc: mmc_test: Switch to use %ptSp media: av7110: Switch to use %ptSp ipmi: Switch to use %ptSp igb: Switch to use %ptSp e1000e: Switch to use %ptSp ...
2025-12-03x86/asm: Remove ANNOTATE_DATA_SPECIAL usageJosh Poimboeuf-1/+1
Instead of manually annotating each __ex_table entry, just make the section mergeable and store the entry size in the ELF section header. Either way works for objtool create_fake_symbols(), this way produces cleaner code generation. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/b858cb7891c1ba0080e22a9c32595e6c302435e2.1764694625.git.jpoimboe@kernel.org
2025-12-03x86/alternative: Remove ANNOTATE_DATA_SPECIAL usageJosh Poimboeuf-0/+2
Instead of manually annotating each .altinstructions entry, just make the section mergeable and store the entry size in the ELF section header. Either way works for objtool create_fake_symbols(), this way produces cleaner code generation. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/5ac04e6db5be6453dce8003a771ebb0c47b4cd7a.1764694625.git.jpoimboe@kernel.org
2025-12-01um: Disable KASAN_INLINE when STATIC_LINK is selectedChristophe Leroy (CS GROUP)-4/+1
um doesn't support KASAN_INLINE together with STATIC_LINK. Instead of failing the build, disable KASAN_INLINE when STATIC_LINK is selected. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511290451.x9GZVJ1l-lkp@intel.com/ Fixes: 1e338f4d99e6 ("kasan: introduce ARCH_DEFER_KASAN and unify static key across modes") Signed-off-by: Christophe Leroy (CS GROUP) <chleroy@kernel.org> Link: https://patch.msgid.link/2620ab0bbba640b6237c50b9c0dca1c7d1142f5d.1764410067.git.chleroy@kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-27arch: um: kmsg_dump: Use console_is_usableMarcos Paulo de Souza-1/+1
All consoles found on for_each_console are registered, meaning that all of them have the CON_ENABLED flag set. Since NBCON was introduced it's important to check if a given console also implements the NBCON callbacks. The function console_is_usable does exactly that. Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-2-57b8b78647f4@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-25um: Don't rename vmap to kernel_vmapDavid Gow-7/+5
In order to work around the existence of a vmap symbol in libpcap, the UML makefile unconditionally redefines vmap to kernel_vmap. However, this not only affects the actual vmap symbol, but also anything else named vmap, including a number of struct members in DRM. This would not be too much of a problem, since all uses are also updated, except we now have Rust DRM bindings, which expect the corresponding Rust structs to have 'vmap' names. Since the redefinition applies in bindgen, but not to Rust code, we end up with errors such as: error[E0560]: struct `drm_gem_object_funcs` has no fields named `vmap` --> rust/kernel/drm/gem/mod.rs:210:9 Since libpcap support was removed in commit 12b8e7e69aa7 ("um: Remove obsolete pcap driver"), remove the, now unnecessary, define as well. We also take this opportunity to update the comment. Signed-off-by: David Gow <davidgow@google.com> Acked-by: Miguel Ojeda <ojeda@kernel.org> Link: https://patch.msgid.link/20251122083213.3996586-1-davidgow@google.com Fixes: 12b8e7e69aa7 ("um: Remove obsolete pcap driver") [adjust commmit message a bit] Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-24um: drivers: virtio: use string choices helperKuninori Morimoto-2/+2
Remove hard-coded strings by using the string helper functions Signed-off-by: Kuninori Morimoto <kuninori.morimoto.gx@renesas.com> Link: https://patch.msgid.link/87h5uywtwp.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06um: Always set up AT_HWCAP and AT_PLATFORMThomas Weißschuh-8/+7
Historically the code to set up AT_HWCAP and AT_PLATFORM was only built for 32bit x86 as it was intermingled with the vDSO passthrough code. Now that vDSO passthrough has been removed, always pass through AT_HWCAP and AT_PLATFORM. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-10-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06um: Remove __access_ok_vsyscall()Thomas Weißschuh-7/+1
FIXADDR_USER_START and FIXADDR_USER_END are now always zero. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-8-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06um: Remove redundant range check from __access_ok_vsyscall()Thomas Weißschuh-2/+1
The only caller __access_ok() is already doing the same check through __addr_range_nowrap(). Remove the redundant check. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-7-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06um: Remove fixaddr_user_init()Thomas Weißschuh-107/+0
With the removal of the vDSO passthrough from the host, FIXADDR_USER_START is always 0 and fixaddr_user_init() is dead code. Remove it. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-6-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06x86/um: Drop gate area handlingThomas Weißschuh-4/+0
With the removal of the vDSO passthrough from the host, FIXADDR_USER_START is always 0 and the gate area setup code is dead. Remove it. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-5-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06x86/um: Do not inherit vDSO from hostThomas Weißschuh-33/+0
Inheriting the vDSO from the host is problematic. The values read from the time functions will not be correct for the UML kernel. Furthermore the start and end of the vDSO are not stable or detectable by userspace. Specifically the vDSO datapages start before AT_SYSINFO_EHDR and the vDSO itself is larger than a single page. This codepath is only used on 32bit x86 UML. In my testing with both 32bit and 64bit hosts the passthrough functionality has always been disabled anyways due to the checks against envp in scan_elf_aux(). Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-4-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06um: Split out default elf_aux_hwcapThomas Weißschuh-2/+0
Setting all auxiliary vector values to default values if one of them was not provided by the host will discard perfectly fine values. Remove the elf_aux_platform fallback from the vDSO ones. As zero is the correct fallback anyways, don't create a new conditional. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-3-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06x86/um: Move ELF_PLATFORM fallback to x86-specific codeThomas Weißschuh-3/+0
The generic UM code should not have references to x86-specific value. Move the fallback into the x86-specific header. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-2-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-11-06um: Split out default elf_aux_platformThomas Weißschuh-2/+4
Setting all auxiliary vector values to default values if one of them was not provided by the host will discard perfectly fine values. Move the elf_aux_platform fallback to its own conditional. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://patch.msgid.link/20251028-uml-remove-32bit-pseudo-vdso-v1-1-e930063eff5f@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-28um: Avoid circular dependency on asm-offsets in pgtable.hThomas Weißschuh-0/+2
Recent changes have added an include of as-layout.h to pgtable.h. However this introduces a circular dependency during asm-offsets generation as as-layout.h depends on asm-offsets and pgtable.h is an input for asm-offsets. Building from a clean state results in the following error: CC arch/um/kernel/asm-offsets.s In file included from arch/um/include/asm/pgtable.h:48, from include/linux/pgtable.h:6, from include/linux/mm.h:31, from include/linux/pid_namespace.h:7, from include/linux/ptrace.h:10, from include/linux/audit.h:13, from arch/um/kernel/asm-offsets.c:8: arch/um/include/shared/as-layout.h:9:10: fatal error: generated/asm-offsets.h: No such file or directory 9 | #include <generated/asm-offsets.h> | ^~~~~~~~~~~~~~~~~~~~~~~~~ compilation terminated. make[4]: *** [scripts/Makefile.build:182: arch/um/kernel/asm-offsets.s] Error 1 As the inclusion of as-layout.h in pgtable.h is not yet needed while asm-offsets are generated, break the dependency here. Fixes: a7f7dbae94a5 ("um: Remove file-based iomem emulation support") Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Reviewed-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251028-uml-offsets-circular-v1-1-601c363cfaaa@weissschuh.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Add initial SMP supportTiwei Bie-47/+765
Add initial symmetric multi-processing (SMP) support to UML. With this support enabled, users can tell UML to start multiple virtual processors, each represented as a separate host thread. In UML, kthreads and normal threads (when running in kernel mode) can be scheduled and executed simultaneously on different virtual processors. However, the userspace code of normal threads still runs within their respective single-threaded stubs. That is, SMP support is currently available both within the kernel and across different processes, but still remains limited within threads of the same process in userspace. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-6-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Define timers on a per-CPU basisTiwei Bie-31/+69
Define timers on a per-CPU basis to enable each CPU to have its own timer. This is a preparation for adding SMP support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-5-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Determine sleep based on need_resched()Tiwei Bie-6/+26
With SMP and NO_HZ enabled, the CPU may still need to sleep even if the timer is disarmed. Switch to deciding whether to sleep based on pending resched. Additionally, because disabling IRQs does not block SIGALRM, it is also necessary to check for any pending timer alarms. This is a preparation for adding SMP support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-4-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Turn signals_* into thread-local variablesTiwei Bie-8/+13
Turn signals_enabled, signals_pending and signals_active into thread-local variables. This enables us to control and track signals independently on each CPU thread. This is a preparation for adding SMP support. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-3-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Do not disable kmalloc in initial_thread_cb()Tiwei Bie-4/+0
Currently, initial_thread_cb() temporarily disables kmalloc when it invokes the callback, allowing the callback to bypass kmalloc. This is unnecessary for the current users of initial_thread_cb(), and we should avoid memory allocations that are not under the control of the UML kernel. Therefore, let's stop temporarily disabling kmalloc in initial_thread_cb(). Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027001815.1666872-2-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Remove file-based iomem emulation supportTiwei Bie-293/+4
The file-based iomem emulation was introduced to support writing paravirtualized drivers based on emulated iomem regions. However, the only driver that makes use of it is an example driver called mmapper, which was written over two decades ago. We now have several modern device emulation mechanisms, such as vhost-user-based virtio-uml. Remove the file-based iomem emulation support to reduce the maintenance burden. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-5-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Replace UML_ROUND_UP() with PAGE_ALIGN()Tiwei Bie-7/+3
Although UML_ROUND_UP() is defined in a shared header file, it depends on the PAGE_SIZE and PAGE_MASK macros, so it can only be used in kernel code. Considering its name is not very clear and its functionality is the same as PAGE_ALIGN(), replace its usages with a direct call to PAGE_ALIGN() and remove it. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-4-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Use PAGE_ALIGN() for address alignmentTiwei Bie-6/+3
Use PAGE_ALIGN() instead of open-coded calculations. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-3-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: Make host_task_size a local variableTiwei Bie-3/+1
Currently, host_task_size is a global variable, but it is only used in linux_main() to compute stub_start and task_size. Make it a local variable to limit its scope to where it is actually needed. Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com> Link: https://patch.msgid.link/20251027054519.1996090-2-tiwei.bie@linux.dev Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um: move asm-offsets generation into a single fileJohannes Berg-31/+43
There's nothing subarch dependent here, and it's odd that includes need to be done in the subarch, and then entries defined in the common file. Simplify the whole thing from three files into one. Link: https://patch.msgid.link/20251007071452.367989-4-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2025-10-27um/hostfs: define HOSTFS_ATTR_* via asm-offsetsJohannes Berg-0/+10
The HOSTFS_ATTR_* values were meant to be standalone for communication between hostfs's kernel and user code parts. However, it's easy to forget that HOSTFS_ATTR_* should be used even on the kernel side, and that wasn't consistently done. As a result, the values need to match ATTR_* values, which is not useful to maintain by hand. Instead, generate them via asm-offsets like other constants that UML needs in user-side code that aren't otherwise available in any header files that can be included there. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Reviewed-by: Hongbo Li <lihongbo22@huawei.com> Link: https://patch.msgid.link/20251007071452.367989-3-johannes@sipsolutions.net Signed-off-by: Johannes Berg <johannes.berg@intel.com>