2025-09-13  samples/damon/wsse: avoid starting DAMON before initialization  (SeongJae Park, 1 file, -0/+3)
Patch series "samples/damon: fix boot time enable handling fixup merge mistakes". First three patches of the patch series "mm/damon: fix misc bugs in DAMON modules" [1] were trying to fix boot time DAMON sample modules enabling issues. The issues are the modules can crash if those are enabled before DAMON is enabled, like using boot time parameter options. The three patches were fixing the issues by avoiding starting DAMON before the module initialization phase. However, probably by a mistake during a merge, only half of the change is merged, and the part for avoiding the starting of DAMON before the module initialized is missed. So the problem is not solved and thus the modules can still crash if enabled before DAMON is initialized. Fix those by applying the unmerged parts again. Note that the broken commits are merged into 6.17-rc1, but also backported to relevant stable kernels. So this series also needs to be merged into the stable kernels. Hence Cc-ing stable@. This patch (of 3): Commit 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling") is somehow incompletely applying the origin patch [2]. It is missing the part that avoids starting DAMON before module initialization. Probably a mistake during a merge has happened. Fix it by applying the missed part again. Link: https://lkml.kernel.org/r/20250909022238.2989-1-sj@kernel.org Link: https://lkml.kernel.org/r/20250909022238.2989-2-sj@kernel.org Link: https://lkml.kernel.org/r/20250706193207.39810-1-sj@kernel.org [1] Link: https://lore.kernel.org/20250706193207.39810-2-sj@kernel.org [2] Fixes: 0ed1165c3727 ("samples/damon/wsse: fix boot time enable handling") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13  MAINTAINERS: add Lance Yang as a THP reviewer  (Lance Yang, 1 file, -0/+1)
I've been actively digging into the MM/THP subsystem for over a year now, and there's a real interest in contributing more and getting further involved. Well, missing out on any more cool THP things is really a pain ;) Link: https://lkml.kernel.org/r/20250908104857.35397-1-lance.yang@linux.dev Signed-off-by: Lance Yang <lance.yang@linux.dev> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Zi Yan <ziy@nvidia.com> Acked-by: Barry Song <baohua@kernel.org> Acked-by: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Dev Jain <dev.jain@arm.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13  MAINTAINERS: add Jann Horn as rmap reviewer  (Lorenzo Stoakes, 1 file, -0/+1)
Jann has been an excellent contributor in all areas of memory management, and has demonstrated great expertise in the reverse mapping. It's therefore appropriate for him to become a reviewer. Link: https://lkml.kernel.org/r/20250908194959.820913-1-lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Harry Yoo <harry.yoo@oracle.com> Acked-by: SeongJae Park <sj@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Jann Horn <jannh@google.com> Cc: Rik van Riel <riel@surriel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13  mm/damon/sysfs: use dynamically allocated repeat mode damon_call_control  (SeongJae Park, 1 file, -8/+15)
DAMON sysfs interface is using a single global repeat mode damon_call_control variable for refresh_ms handling, for all DAMON contexts. As a result, when there are more than one context, the single global damon_call_control is unexpectedly over-written (corrupted). Particularly the ->link field is overwritten by the multiple contexts and this can cause a user hangup, and/or a kernel crash. Fix it by using dynamically allocated damon_call_control object per DAMON context. Link: https://lkml.kernel.org/r/20250908201513.60802-3-sj@kernel.org Link: https://lore.kernel.org/20250904011738.930-1-yunjeong.mun@sk.com [1] Link: https://lore.kernel.org/20250905035411.39501-1-sj@kernel.org [2] Fixes: d809a7c64ba8 ("mm/damon/sysfs: implement refresh_ms file internal work") Signed-off-by: SeongJae Park <sj@kernel.org> Reported-by: Yunjeong Mun <yunjeong.mun@sk.com> Closes: https://lore.kernel.org/20250904011738.930-1-yunjeong.mun@sk.com Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13  mm/damon/core: introduce damon_call_control->dealloc_on_cancel  (SeongJae Park, 2 files, -2/+8)
Patch series "mm/damon/sysfs: fix refresh_ms control overwriting on multi-kdamonds usages". Automatic esssential DAMON/DAMOS status update feature of DAMON sysfs interface (refresh_ms) is broken [1] for multiple DAMON contexts (kdamonds) use case, since it uses a global single damon_call_control object for all created DAMON contexts. The fields of the object, particularly the list field is over-written for the contexts and it makes unexpected results including user-space hangup and kernel crashes [2]. Fix it by extending damon_call_control for the use case and updating the usage on DAMON sysfs interface to use per-context dynamically allocated damon_call_control object. This patch (of 2): When damon_call_control->repeat is set, damon_call() is executed asynchronously, and is eventually canceled when kdamond finishes. If the damon_call_control object is dynamically allocated, finding the place to deallocate the object is difficult. Introduce a new damon_call_control field, namely dealloc_on_cancel, to ask the kdamond deallocates those dynamically allocated objects when those are canceled. Link: https://lkml.kernel.org/r/20250908201513.60802-3-sj@kernel.org Link: https://lkml.kernel.org/r/20250908201513.60802-2-sj@kernel.org Fixes: d809a7c64ba8 ("mm/damon/sysfs: implement refresh_ms file internal work") Signed-off-by: SeongJae Park <sj@kernel.org> Cc: Yunjeong Mun <yunjeong.mun@sk.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13  mm: folio_may_be_lru_cached() unless folio_test_large()  (Hugh Dickins, 4 files, -6/+16)
mm/swap.c and mm/mlock.c agree to drain any per-CPU batch as soon as a large folio is added: so collect_longterm_unpinnable_folios() just wastes effort when calling lru_add_drain[_all]() on a large folio. But although there is good reason not to batch up PMD-sized folios, we might well benefit from batching a small number of low-order mTHPs (though unclear how that "small number" limitation will be implemented). So ask if folio_may_be_lru_cached() rather than !folio_test_large(), to insulate those particular checks from future change. Name preferred to "folio_is_batchable" because large folios can well be put on a batch: it's just the per-CPU LRU caches, drained much later, which need care. Marked for stable, to counter the increase in lru_add_drain_all()s from "mm/gup: check ref_count instead of lru before migration". Link: https://lkml.kernel.org/r/57d2eaf8-3607-f318-e0c5-be02dce61ad0@google.com Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region") Signed-off-by: Hugh Dickins <hughd@google.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Keir Fraser <keirf@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shivank Garg <shivankg@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: yangge <yangge1116@126.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
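The new helper itself is tiny; a sketch matching the description above (current behavior batches only non-large folios, but the predicate gives future mTHP batching a single place to change):

    static inline bool folio_may_be_lru_cached(struct folio *folio)
    {
            /*
             * Holding small numbers of low-order mTHPs on per-CPU LRU
             * caches may become worthwhile later; for now, only
             * order-0 folios go onto the batches.
             */
            return !folio_test_large(folio);
    }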
2025-09-13mm: revert "mm: vmscan.c: fix OOM on swap stress test"Hugh Dickins1-1/+1
This reverts commit 0885ef470560: that was a fix to the reverted 33dfe9204f29b415bbc0abb1a50642d1ba94f5e9. Link: https://lkml.kernel.org/r/aa0e9d67-fbcd-9d79-88a1-641dfbe1d9d1@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Keir Fraser <keirf@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shivank Garg <shivankg@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: yangge <yangge1116@126.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13mm: revert "mm/gup: clear the LRU flag of a page before adding to LRU batch"Hugh Dickins1-24/+26
This reverts commit 33dfe9204f29: now that collect_longterm_unpinnable_folios() is checking ref_count instead of lru, and mlock/munlock do not participate in the revised LRU flag clearing, those changes are misleading, and enlarge the window during which mlock/munlock may miss an mlock_count update. It is possible (I'd hesitate to claim probable) that the greater likelihood of missed mlock_count updates would explain the "Realtime threads delayed due to kcompactd0" observed on 6.12 in the Link below. If that is the case, this reversion will help; but a complete solution needs also a further patch, beyond the scope of this series. Included some 80-column cleanup around folio_batch_add_and_move(). The role of folio_test_clear_lru() (before taking per-memcg lru_lock) is questionable since 6.13 removed mem_cgroup_move_account() etc; but perhaps there are still some races which need it - not examined here. Link: https://lore.kernel.org/linux-mm/DU0PR01MB10385345F7153F334100981888259A@DU0PR01MB10385.eurprd01.prod.exchangelabs.com/ Link: https://lkml.kernel.org/r/05905d7b-ed14-68b1-79d8-bdec30367eba@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Keir Fraser <keirf@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shivank Garg <shivankg@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: yangge <yangge1116@126.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-13  mm/gup: local lru_add_drain() to avoid lru_add_drain_all()  (Hugh Dickins, 1 file, -4/+11)
In many cases, if collect_longterm_unpinnable_folios() does need to drain the LRU cache to release a reference, the cache in question is on this same CPU, and much more efficiently drained by a preliminary local lru_add_drain(), than the later cross-CPU lru_add_drain_all(). Marked for stable, to counter the increase in lru_add_drain_all()s from "mm/gup: check ref_count instead of lru before migration". Note for clean backports: can take 6.16 commit a03db236aebf ("gup: optimize longterm pin_user_pages() for large folio") first. Link: https://lkml.kernel.org/r/66f2751f-283e-816d-9530-765db7edc465@google.com Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Keir Fraser <keirf@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shivank Garg <shivankg@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: Will Deacon <will@kernel.org> Cc: yangge <yangge1116@126.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
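The resulting ordering can be sketched like this (illustrative only; `unpinnable` and `expected` are stand-ins, and the real collect_longterm_unpinnable_folios() loop carries more state):

    bool drained_local = false, drained_all = false;

    list_for_each_entry(folio, &unpinnable, lru) {
            if (!folio_may_be_lru_cached(folio))
                    continue;
            if (!drained_local) {
                    lru_add_drain();        /* this CPU only: cheap */
                    drained_local = true;
            }
            if (folio_ref_count(folio) > expected && !drained_all) {
                    lru_add_drain_all();    /* every CPU: IPIs + flushes */
                    drained_all = true;
            }
    }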
2025-09-13  mm/gup: check ref_count instead of lru before migration  (Hugh Dickins, 1 file, -1/+2)
Patch series "mm: better GUP pin lru_add_drain_all()", v2. Series of lru_add_drain_all()-related patches, arising from recent mm/gup migration report from Will Deacon. This patch (of 5): Will Deacon reports:- When taking a longterm GUP pin via pin_user_pages(), __gup_longterm_locked() tries to migrate target folios that should not be longterm pinned, for example because they reside in a CMA region or movable zone. This is done by first pinning all of the target folios anyway, collecting all of the longterm-unpinnable target folios into a list, dropping the pins that were just taken and finally handing the list off to migrate_pages() for the actual migration. It is critically important that no unexpected references are held on the folios being migrated, otherwise the migration will fail and pin_user_pages() will return -ENOMEM to its caller. Unfortunately, it is relatively easy to observe migration failures when running pKVM (which uses pin_user_pages() on crosvm's virtual address space to resolve stage-2 page faults from the guest) on a 6.15-based Pixel 6 device and this results in the VM terminating prematurely. In the failure case, 'crosvm' has called mlock(MLOCK_ONFAULT) on its mapping of guest memory prior to the pinning. Subsequently, when pin_user_pages() walks the page-table, the relevant 'pte' is not present and so the faulting logic allocates a new folio, mlocks it with mlock_folio() and maps it in the page-table. Since commit 2fbb0c10d1e8 ("mm/munlock: mlock_page() munlock_page() batch by pagevec"), mlock/munlock operations on a folio (formerly page), are deferred. For example, mlock_folio() takes an additional reference on the target folio before placing it into a per-cpu 'folio_batch' for later processing by mlock_folio_batch(), which drops the refcount once the operation is complete. Processing of the batches is coupled with the LRU batch logic and can be forcefully drained with lru_add_drain_all() but as long as a folio remains unprocessed on the batch, its refcount will be elevated. This deferred batching therefore interacts poorly with the pKVM pinning scenario as we can find ourselves in a situation where the migration code fails to migrate a folio due to the elevated refcount from the pending mlock operation. Hugh Dickins adds:- !folio_test_lru() has never been a very reliable way to tell if an lru_add_drain_all() is worth calling, to remove LRU cache references to make the folio migratable: the LRU flag may be set even while the folio is held with an extra reference in a per-CPU LRU cache. 5.18 commit 2fbb0c10d1e8 may have made it more unreliable. Then 6.11 commit 33dfe9204f29 ("mm/gup: clear the LRU flag of a page before adding to LRU batch") tried to make it reliable, by moving LRU flag clearing; but missed the mlock/munlock batches, so still unreliable as reported. And it turns out to be difficult to extend 33dfe9204f29's LRU flag clearing to the mlock/munlock batches: if they do benefit from batching, mlock/munlock cannot be so effective when easily suppressed while !LRU. Instead, switch to an expected ref_count check, which was more reliable all along: some more false positives (unhelpful drains) than before, and never a guarantee that the folio will prove migratable, but better. Note on PG_private_2: ceph and nfs are still using the deprecated PG_private_2 flag, with the aid of netfs and filemap support functions. 
Although it is consistently matched by an increment of folio ref_count, folio_expected_ref_count() intentionally does not recognize it, and ceph folio migration currently depends on that for PG_private_2 folios to be rejected. New references to the deprecated flag are discouraged, so do not add it into the collect_longterm_unpinnable_folios() calculation: but longterm pinning of transiently PG_private_2 ceph and nfs folios (an uncommon case) may invoke a redundant lru_add_drain_all(). And this makes easy the backport to earlier releases: up to and including 6.12, btrfs also used PG_private_2, but without a ref_count increment. Note for stable backports: requires 6.16 commit 86ebd50224c0 ("mm: add folio_expected_ref_count() for reference count calculation"). Link: https://lkml.kernel.org/r/41395944-b0e3-c3ac-d648-8ddd70451d28@google.com Link: https://lkml.kernel.org/r/bd1f314a-fca1-8f19-cac0-b936c9614557@google.com Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region") Signed-off-by: Hugh Dickins <hughd@google.com> Reported-by: Will Deacon <will@kernel.org> Closes: https://lore.kernel.org/linux-mm/20250815101858.24352-1-will@kernel.org/ Acked-by: Kiryl Shutsemau <kas@kernel.org> Acked-by: David Hildenbrand <david@redhat.com> Cc: "Aneesh Kumar K.V" <aneesh.kumar@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Keir Fraser <keirf@google.com> Cc: Konstantin Khlebnikov <koct9i@gmail.com> Cc: Li Zhe <lizhe.67@bytedance.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Peter Xu <peterx@redhat.com> Cc: Rik van Riel <riel@surriel.com> Cc: Shivank Garg <shivankg@amd.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Wei Xu <weixugc@google.com> Cc: yangge <yangge1116@126.com> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Yu Zhao <yuzhao@google.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
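The switched check amounts to a ref-count comparison (a sketch; `pins` and `drain_needed` are illustrative names for this GUP's own references and the resulting decision):

    /*
     * If someone else holds references beyond the mapping-implied
     * count plus our own pins, they are likely held by per-CPU LRU
     * (or mlock) batches, so a drain may release them.  A false
     * positive only costs a redundant drain, but this is still more
     * reliable than the old !folio_test_lru() heuristic.
     */
    if (folio_ref_count(folio) > folio_expected_ref_count(folio) + pins)
            drain_needed = true;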
2025-09-08  MAINTAINERS: add tree entry to numa memblocks and emulation block  (Mike Rapoport (Microsoft), 1 file, -0/+1)
Link: https://lkml.kernel.org/r/20250905091557.3529937-1-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: David Hildenbrand <david@redhat.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-08  mm/damon/sysfs: fix use-after-free in state_show()  (Stanislav Fort, 1 file, -5/+9)
state_show() reads kdamond->damon_ctx without holding damon_sysfs_lock. This allows a use-after-free race:

    CPU 0                                     CPU 1
    -----                                     -----
    state_show()                              damon_sysfs_turn_damon_on()
    ctx = kdamond->damon_ctx;                 mutex_lock(&damon_sysfs_lock);
                                              damon_destroy_ctx(kdamond->damon_ctx);
                                              kdamond->damon_ctx = NULL;
                                              mutex_unlock(&damon_sysfs_lock);
    damon_is_running(ctx);   /* ctx is freed */
    mutex_lock(&ctx->kdamond_lock);   /* UAF */

(The race can also occur with damon_sysfs_kdamonds_rm_dirs() and damon_sysfs_kdamond_release(), which free or replace the context under damon_sysfs_lock.)

Fix by taking damon_sysfs_lock before dereferencing the context, mirroring the locking used in pid_show().

The bug has existed since state_show() first accessed kdamond->damon_ctx.

Link: https://lkml.kernel.org/r/20250905101046.2288-1-disclosure@aisle.com
Fixes: a61ea561c871 ("mm/damon/sysfs: link DAMON for virtual address spaces monitoring")
Signed-off-by: Stanislav Fort <disclosure@aisle.com>
Reported-by: Stanislav Fort <disclosure@aisle.com>
Reviewed-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
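A sketch of the fixed reader, following the pid_show() pattern named above (exact structure and field names are assumptions):

    static ssize_t state_show(struct kobject *kobj,
                    struct kobj_attribute *attr, char *buf)
    {
            struct damon_sysfs_kdamond *kdamond = container_of(kobj,
                            struct damon_sysfs_kdamond, kobj);
            struct damon_ctx *ctx;
            bool running;

            if (!mutex_trylock(&damon_sysfs_lock))
                    return -EBUSY;
            ctx = kdamond->damon_ctx;   /* stable while the lock is held */
            running = ctx && damon_is_running(ctx);
            mutex_unlock(&damon_sysfs_lock);
            return sysfs_emit(buf, "%s\n", running ? "on" : "off");
    }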
2025-09-08  proc: fix type confusion in pde_set_flags()  (wangzijie, 1 file, -1/+2)
Commit 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc files") missed a key part of the definition of proc_dir_entry:

    union {
        const struct proc_ops *proc_ops;
        const struct file_operations *proc_dir_ops;
    };

So dereferencing ->proc_ops on the assumption that it is always a proc_ops structure results in type confusion, and makes the NULL check for 'proc_ops' not work for proc directories. Add a !S_ISDIR(dp->mode) test before calling pde_set_flags() to fix it.

Link: https://lkml.kernel.org/r/20250904135715.3972782-1-wangzijie1@honor.com
Fixes: 2ce3d282bd50 ("proc: fix missing pde_set_flags() for net proc files")
Signed-off-by: wangzijie <wangzijie1@honor.com>
Reported-by: Brad Spengler <spender@grsecurity.net>
Closes: https://lore.kernel.org/all/20250903065758.3678537-1-wangzijie1@honor.com/
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Stefano Brivio <sbrivio@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
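The guard itself is small; a sketch at the registration site (the exact caller is an assumption based on the description):

    /*
     * For directories, the active union member is ->proc_dir_ops, so
     * inspecting ->proc_ops would read the same bytes as the wrong
     * type.  Only set the flags for regular proc files.
     */
    if (!S_ISDIR(dp->mode))
            pde_set_flags(dp);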
2025-09-08  compiler-clang.h: define __SANITIZE_*__ macros only when undefined  (Nathan Chancellor, 1 file, -5/+24)
Clang 22 recently added support for defining __SANITIZE__ macros similar to GCC [1], which causes warnings (or errors with CONFIG_WERROR=y or W=e) with the existing defines that the kernel creates to emulate this behavior with existing clang versions:

    In file included from <built-in>:3:
    In file included from include/linux/compiler_types.h:171:
    include/linux/compiler-clang.h:37:9: error: '__SANITIZE_THREAD__' macro redefined [-Werror,-Wmacro-redefined]
       37 | #define __SANITIZE_THREAD__
          |         ^
    <built-in>:352:9: note: previous definition is here
      352 | #define __SANITIZE_THREAD__ 1
          |         ^

Refactor compiler-clang.h to only define the sanitizer macros when they are undefined, and adjust the rest of the code to use these macros for checking if the sanitizers are enabled. This clears up the warnings and allows the kernel to easily drop these defines when the minimum supported version of LLVM for building the kernel becomes 22.0.0 or newer.

Link: https://lkml.kernel.org/r/20250902-clang-update-sanitize-defines-v1-1-cf3702ca3d92@kernel.org
Link: https://github.com/llvm/llvm-project/commit/568c23bbd3303518c5056d7f03444dae4fdc8a9c [1]
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Marco Elver <elver@google.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
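The shape of the refactor can be sketched as follows (simplified; the real header also couples these defines to the kernel's CONFIG_* sanitizer options and the hwaddress variant):

    #if __has_feature(address_sanitizer)
    #ifndef __SANITIZE_ADDRESS__
    #define __SANITIZE_ADDRESS__
    #endif
    #endif

    #if __has_feature(thread_sanitizer)
    #ifndef __SANITIZE_THREAD__
    #define __SANITIZE_THREAD__
    #endif
    #endif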
2025-09-08  mm/vmalloc, mm/kasan: respect gfp mask in kasan_populate_vmalloc()  (Uladzislau Rezki (Sony), 3 files, -14/+31)
kasan_populate_vmalloc() and its helpers ignore the caller's gfp_mask and always allocate memory using the hardcoded GFP_KERNEL flag. This makes them inconsistent with vmalloc(), which was recently extended to support GFP_NOFS and GFP_NOIO allocations. Page table allocations performed during shadow population also ignore the external gfp_mask. To preserve the intended semantics of GFP_NOFS and GFP_NOIO, wrap the apply_to_page_range() calls into the appropriate memalloc scope.

xfs calls vmalloc with GFP_NOFS, so this bug could lead to deadlock. There was a report here: https://lkml.kernel.org/r/686ea951.050a0220.385921.0016.GAE@google.com

This patch:

- Extends kasan_populate_vmalloc() and helpers to take gfp_mask;
- Passes gfp_mask down to alloc_pages_bulk() and __get_free_page();
- Enforces GFP_NOFS/NOIO semantics with memalloc_*_save()/restore() around apply_to_page_range();
- Updates vmalloc.c and percpu allocator call sites accordingly.

Link: https://lkml.kernel.org/r/20250831121058.92971-1-urezki@gmail.com
Fixes: 451769ebb7e7 ("mm/vmalloc: alloc GFP_NO{FS,IO} for vmalloc")
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reported-by: syzbot+3470c9ffee63e4abafeb@syzkaller.appspotmail.com
Reviewed-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Cc: Baoquan He <bhe@redhat.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
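The scoping part can be sketched as follows (a sketch; the callback and data plumbing in mm/kasan/shadow.c differ in detail):

    static int kasan_populate_sketch(unsigned long start, unsigned long end,
                                     gfp_t gfp_mask)
    {
            bool nofs = !(gfp_mask & __GFP_FS);
            bool noio = !(gfp_mask & __GFP_IO);
            unsigned int flags = 0;
            int ret;

            /*
             * Scope the page-table walk so nested allocations inherit
             * the caller's reclaim restrictions.
             */
            if (noio)
                    flags = memalloc_noio_save();
            else if (nofs)
                    flags = memalloc_nofs_save();

            ret = apply_to_page_range(&init_mm, start, end - start,
                                      kasan_populate_vmalloc_pte, NULL);

            if (noio)
                    memalloc_noio_restore(flags);
            else if (nofs)
                    memalloc_nofs_restore(flags);
            return ret;
    }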
2025-09-08  ocfs2: fix recursive semaphore deadlock in fiemap call  (Mark Tinguely, 1 file, -1/+9)
syzbot detected an OCFS2 hang due to a recursive semaphore on a FS_IOC_FIEMAP of the extent list on a specially crafted mmap file.

    context_switch kernel/sched/core.c:5357 [inline]
    __schedule+0x1798/0x4cc0 kernel/sched/core.c:6961
    __schedule_loop kernel/sched/core.c:7043 [inline]
    schedule+0x165/0x360 kernel/sched/core.c:7058
    schedule_preempt_disabled+0x13/0x30 kernel/sched/core.c:7115
    rwsem_down_write_slowpath+0x872/0xfe0 kernel/locking/rwsem.c:1185
    __down_write_common kernel/locking/rwsem.c:1317 [inline]
    __down_write kernel/locking/rwsem.c:1326 [inline]
    down_write+0x1ab/0x1f0 kernel/locking/rwsem.c:1591
    ocfs2_page_mkwrite+0x2ff/0xc40 fs/ocfs2/mmap.c:142
    do_page_mkwrite+0x14d/0x310 mm/memory.c:3361
    wp_page_shared mm/memory.c:3762 [inline]
    do_wp_page+0x268d/0x5800 mm/memory.c:3981
    handle_pte_fault mm/memory.c:6068 [inline]
    __handle_mm_fault+0x1033/0x5440 mm/memory.c:6195
    handle_mm_fault+0x40a/0x8e0 mm/memory.c:6364
    do_user_addr_fault+0x764/0x1390 arch/x86/mm/fault.c:1387
    handle_page_fault arch/x86/mm/fault.c:1476 [inline]
    exc_page_fault+0x76/0xf0 arch/x86/mm/fault.c:1532
    asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
    RIP: 0010:copy_user_generic arch/x86/include/asm/uaccess_64.h:126 [inline]
    RIP: 0010:raw_copy_to_user arch/x86/include/asm/uaccess_64.h:147 [inline]
    RIP: 0010:_inline_copy_to_user include/linux/uaccess.h:197 [inline]
    RIP: 0010:_copy_to_user+0x85/0xb0 lib/usercopy.c:26
    Code: e8 00 bc f7 fc 4d 39 fc 72 3d 4d 39 ec 77 38 e8 91 b9 f7 fc 4c 89 f7 89 de e8 47 25 5b fd 0f 01 cb 4c 89 ff 48 89 d9 4c 89 f6 <f3> a4 0f 1f 00 48 89 cb 0f 01 ca 48 89 d8 5b 41 5c 41 5d 41 5e 41
    RSP: 0018:ffffc9000403f950 EFLAGS: 00050256
    RAX: ffffffff84c7f101 RBX: 0000000000000038 RCX: 0000000000000038
    RDX: 0000000000000000 RSI: ffffc9000403f9e0 RDI: 0000200000000060
    RBP: ffffc9000403fa90 R08: ffffc9000403fa17 R09: 1ffff92000807f42
    R10: dffffc0000000000 R11: fffff52000807f43 R12: 0000200000000098
    R13: 00007ffffffff000 R14: ffffc9000403f9e0 R15: 0000200000000060
    copy_to_user include/linux/uaccess.h:225 [inline]
    fiemap_fill_next_extent+0x1c0/0x390 fs/ioctl.c:145
    ocfs2_fiemap+0x888/0xc90 fs/ocfs2/extent_map.c:806
    ioctl_fiemap fs/ioctl.c:220 [inline]
    do_vfs_ioctl+0x1173/0x1430 fs/ioctl.c:532
    __do_sys_ioctl fs/ioctl.c:596 [inline]
    __se_sys_ioctl+0x82/0x170 fs/ioctl.c:584
    do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
    do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
    entry_SYSCALL_64_after_hwframe+0x77/0x7f
    RIP: 0033:0x7f5f13850fd9
    RSP: 002b:00007ffe3b3518b8 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
    RAX: ffffffffffffffda RBX: 0000200000000000 RCX: 00007f5f13850fd9
    RDX: 0000200000000040 RSI: 00000000c020660b RDI: 0000000000000004
    RBP: 6165627472616568 R08: 0000000000000000 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3b3518f0
    R13: 00007ffe3b351b18 R14: 431bde82d7b634db R15: 00007f5f1389a03b

ocfs2_fiemap() takes a read lock of the ip_alloc_sem semaphore (since v2.6.22-527-g7307de80510a) and calls fiemap_fill_next_extent() to read the extent list of this running mmap executable. The user-supplied buffer to hold the fiemap information page faults, calling ocfs2_page_mkwrite(), which takes a write lock (since v2.6.27-38-g00dc417fa3e7) of the same semaphore. This recursive semaphore acquisition holds filesystem locks and causes a hang of the filesystem.

The ip_alloc_sem protects the inode extent list and size. Release the read semaphore before calling fiemap_fill_next_extent() in ocfs2_fiemap() and ocfs2_fiemap_inline(). This does an unnecessary semaphore lock/unlock on the last extent but simplifies the error path.

Link: https://lkml.kernel.org/r/61d1a62b-2631-4f12-81e2-cd689914360b@oracle.com
Fixes: 00dc417fa3e7 ("ocfs2: fiemap support")
Signed-off-by: Mark Tinguely <mark.tinguely@oracle.com>
Reported-by: syzbot+541dcc6ee768f77103e7@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=541dcc6ee768f77103e7
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Junxiao Bi <junxiao.bi@oracle.com>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
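The reordering inside the extent loop can be sketched as follows (variable names are illustrative; fiemap_fill_next_extent() returns 1 once the user buffer is full, negative on error):

    /*
     * Drop ip_alloc_sem across the copy-out: a page fault in
     * fiemap_fill_next_extent() may enter ocfs2_page_mkwrite(),
     * which takes the same semaphore for writing.
     */
    up_read(&oi->ip_alloc_sem);
    ret = fiemap_fill_next_extent(fieinfo, map_start, phys_bytes,
                                  len_bytes, fe_flags);
    down_read(&oi->ip_alloc_sem);   /* reacquire for the next extent */
    if (ret) {
            if (ret > 0)            /* user buffer is full */
                    ret = 0;
            break;
    }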
2025-09-08  mm/memory-failure: fix VM_BUG_ON_PAGE(PagePoisoned(page)) when unpoison memory  (Miaohe Lin, 1 file, -4/+3)
When I did memory failure tests, below panic occurs: page dumped because: VM_BUG_ON_PAGE(PagePoisoned(page)) kernel BUG at include/linux/page-flags.h:616! Oops: invalid opcode: 0000 [#1] PREEMPT SMP NOPTI CPU: 3 PID: 720 Comm: bash Not tainted 6.10.0-rc1-00195-g148743902568 #40 RIP: 0010:unpoison_memory+0x2f3/0x590 RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246 RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0 RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000 R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0 Call Trace: <TASK> unpoison_memory+0x2f3/0x590 simple_attr_write_xsigned.constprop.0.isra.0+0xb3/0x110 debugfs_attr_write+0x42/0x60 full_proxy_write+0x5b/0x80 vfs_write+0xd5/0x540 ksys_write+0x64/0xe0 do_syscall_64+0xb9/0x1d0 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f08f0314887 RSP: 002b:00007ffece710078 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000000009 RCX: 00007f08f0314887 RDX: 0000000000000009 RSI: 0000564787a30410 RDI: 0000000000000001 RBP: 0000564787a30410 R08: 000000000000fefe R09: 000000007fffffff R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000009 R13: 00007f08f041b780 R14: 00007f08f0417600 R15: 00007f08f0416a00 </TASK> Modules linked in: hwpoison_inject ---[ end trace 0000000000000000 ]--- RIP: 0010:unpoison_memory+0x2f3/0x590 RSP: 0018:ffffa57fc8787d60 EFLAGS: 00000246 RAX: 0000000000000037 RBX: 0000000000000009 RCX: ffff9be25fcdc9c8 RDX: 0000000000000000 RSI: 0000000000000027 RDI: ffff9be25fcdc9c0 RBP: 0000000000300000 R08: ffffffffb4956f88 R09: 0000000000009ffb R10: 0000000000000284 R11: ffffffffb4926fa0 R12: ffffe6b00c000000 R13: ffff9bdb453dfd00 R14: 0000000000000000 R15: fffffffffffffffe FS: 00007f08f04e4740(0000) GS:ffff9be25fcc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000564787a30410 CR3: 000000010d4e2000 CR4: 00000000000006f0 Kernel panic - not syncing: Fatal exception Kernel Offset: 0x31c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff) ---[ end Kernel panic - not syncing: Fatal exception ]--- The root cause is that unpoison_memory() tries to check the PG_HWPoison flags of an uninitialized page. So VM_BUG_ON_PAGE(PagePoisoned(page)) is triggered. This can be reproduced by below steps: 1.Offline memory block: echo offline > /sys/devices/system/memory/memory12/state 2.Get offlined memory pfn: page-types -b n -rlN 3.Write pfn to unpoison-pfn echo <pfn> > /sys/kernel/debug/hwpoison/unpoison-pfn This scenario can be identified by pfn_to_online_page() returning NULL. And ZONE_DEVICE pages are never expected, so we can simply fail if pfn_to_online_page() == NULL to fix the bug. Link: https://lkml.kernel.org/r/20250828024618.1744895-1-linmiaohe@huawei.com Fixes: f1dd2cd13c4b ("mm, memory_hotplug: do not associate hotadded memory to zones until online") Signed-off-by: Miaohe Lin <linmiaohe@huawei.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Cc: Naoya Horiguchi <nao.horiguchi@gmail.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
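In code, the check described above is small (a sketch; the error value and the cleanup label are assumptions based on the description, not verified against the tree):

    page = pfn_to_online_page(pfn);
    if (!page) {
            /*
             * Offline (possibly uninitialized) memmap entries and
             * ZONE_DEVICE pages must not have their flags tested.
             */
            ret = -EIO;
            goto unlock_mutex;
    }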
2025-09-08  mm/mremap: fix regression in vrm->new_addr check  (Carlos Llamas, 1 file, -3/+6)
Commit 3215eaceca87 ("mm/mremap: refactor initial parameter sanity checks") moved the sanity check for vrm->new_addr from mremap_to() to check_mremap_params(). However, this caused a regression as vrm->new_addr is now checked even when MREMAP_FIXED and MREMAP_DONTUNMAP flags are not specified. In this case, vrm->new_addr can be garbage and create unexpected failures. Fix this by moving the new_addr check after the vrm_implies_new_addr() guard. This ensures that the new_addr is only checked when the user has specified one explicitly. Link: https://lkml.kernel.org/r/20250828142657.770502-1-cmllamas@google.com Fixes: 3215eaceca87 ("mm/mremap: refactor initial parameter sanity checks") Signed-off-by: Carlos Llamas <cmllamas@google.com> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Carlos Llamas <cmllamas@google.com> Cc: Jann Horn <jannh@google.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
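A sketch of the corrected ordering in check_mremap_params() (simplified; only the relevant checks are shown, and the alignment test is an assumption):

    /*
     * Without MREMAP_FIXED or MREMAP_DONTUNMAP, userspace did not
     * pass a new address, so vrm->new_addr holds garbage: ignore it.
     */
    if (!vrm_implies_new_addr(vrm))
            return 0;

    /* Only now is validating the caller-supplied address meaningful. */
    if (offset_in_page(vrm->new_addr))
            return -EINVAL;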
2025-09-08  percpu: fix race on alloc failed warning limit  (Vlad Dumitrescu, 1 file, -8/+12)
The 'allocation failed, ...' warning messages can cause unlimited log spam, contrary to the implementation's intent. The warn_limit variable is accessed without synchronization. If more than <warn_limit> threads enter the warning path at the same time, the variable will get decremented past 0. Once it becomes negative, the non-zero check will always return true leading to unlimited log spam. Use atomic operation to access warn_limit and change condition to test for non-negative (>= 0) - atomic_dec_if_positive will return -1 once warn_limit becomes 0. Continue to print disable message alongside the last warning. While the change cited in Fixes is only adjacent, the warning limit implementation was correct before it. Only non-atomic allocations were considered for warnings, and those happened to hold pcpu_alloc_mutex while accessing warn_limit. [vdumitrescu@nvidia.com: prevent warn_limit from going negative, per Christoph Lameter] Link: https://lkml.kernel.org/r/ee87cc59-2717-4dbb-8052-1d2692c5aaaa@nvidia.com Link: https://lkml.kernel.org/r/ab22061a-a62f-4429-945b-744e5cc4ba35@nvidia.com Fixes: f7d77dfc91f7 ("mm/percpu.c: print error message too if atomic alloc failed") Signed-off-by: Vlad Dumitrescu <vdumitrescu@nvidia.com> Reviewed-by: Baoquan He <bhe@redhat.com> Cc: Christoph Lameter (Ampere) <cl@gentwo.org> Cc: Dennis Zhou <dennis@kernel.org> Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
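The atomic pattern can be sketched as follows (message text and surrounding variables are approximations; atomic_dec_if_positive() is the real kernel primitive with the semantics described above):

    static atomic_t warn_limit = ATOMIC_INIT(10);

    if (!(gfp & __GFP_NOWARN)) {
            /* Returns old value minus 1; -1 once the counter is 0. */
            int remaining = atomic_dec_if_positive(&warn_limit);

            if (remaining >= 0) {
                    pr_warn("percpu: allocation failed, size=%zu\n", size);
                    if (!remaining)
                            pr_info("percpu: limit reached, disable warning\n");
            }
    }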
2025-09-03  mm/memory-failure: fix redundant updates for already poisoned pages  (Kyle Meyer, 1 file, -7/+6)
Duplicate memory errors can be reported by multiple sources. Passing an already poisoned page to action_result() causes issues:

* The amount of hardware corrupted memory is incorrectly updated.
* Per NUMA node MF stats are incorrectly updated.
* Redundant "already poisoned" messages are printed.

Avoid those issues by:

* Skipping hardware corrupted memory updates for already poisoned pages.
* Skipping per NUMA node MF stats updates for already poisoned pages.
* Dropping redundant "already poisoned" messages.

Make MF_MSG_ALREADY_POISONED consistent with other action_page_types and make calls to action_result() consistent for already poisoned normal pages and huge pages.

Link: https://lkml.kernel.org/r/aLCiHMy12Ck3ouwC@hpe.com
Fixes: b8b9488d50b7 ("mm/memory-failure: improve memory failure action_result messages")
Signed-off-by: Kyle Meyer <kyle.meyer@hpe.com>
Reviewed-by: Jiaqi Yan <jiaqiyan@google.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jane Chu <jane.chu@oracle.com>
Acked-by: Miaohe Lin <linmiaohe@huawei.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: Kyle Meyer <kyle.meyer@hpe.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Naoya Horiguchi <nao.horiguchi@gmail.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Russ Anderson <russ.anderson@hpe.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  s390: kexec: initialize kexec_buf struct  (Breno Leitao, 3 files, -5/+5)
The kexec_buf structure was previously declared without initialization. Commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly") added a field that is always read but not consistently populated by all architectures. This uninitialized field will contain garbage.

This is also triggering a UBSAN warning when the uninitialized data is accessed:

    ------------[ cut here ]------------
    UBSAN: invalid-load in ./include/linux/kexec.h:210:10
    load of value 252 is not a valid value for type '_Bool'

Zero-initializing kexec_buf at declaration ensures all fields are cleanly set, preventing future instances of uninitialized memory being used.

Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-3-1df9882bb01a@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao@debian.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Baoquan He <bhe@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Coiby Xu <coxu@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  riscv: kexec: initialize kexec_buf struct  (Breno Leitao, 3 files, -4/+4)
The kexec_buf structure was previously declared without initialization. Commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly") added a field that is always read but not consistently populated by all architectures. This uninitialized field will contain garbage.

This is also triggering a UBSAN warning when the uninitialized data is accessed:

    ------------[ cut here ]------------
    UBSAN: invalid-load in ./include/linux/kexec.h:210:10
    load of value 252 is not a valid value for type '_Bool'

Zero-initializing kexec_buf at declaration ensures all fields are cleanly set, preventing future instances of uninitialized memory being used.

Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-2-1df9882bb01a@debian.org
Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly")
Signed-off-by: Breno Leitao <leitao@debian.org>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Baoquan He <bhe@redhat.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Coiby Xu <coxu@redhat.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  arm64: kexec: initialize kexec_buf struct in load_other_segments()  (Breno Leitao, 1 file, -1/+1)
Patch series "kexec: Fix invalid field access". The kexec_buf structure was previously declared without initialization. commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly") added a field that is always read but not consistently populated by all architectures. This un-initialized field will contain garbage. This is also triggering a UBSAN warning when the uninitialized data was accessed: ------------[ cut here ]------------ UBSAN: invalid-load in ./include/linux/kexec.h:210:10 load of value 252 is not a valid value for type '_Bool' Zero-initializing kexec_buf at declaration ensures all fields are cleanly set, preventing future instances of uninitialized memory being used. An initial fix was already landed for arm64[0], and this patchset fixes the problem on the remaining arm64 code and on riscv, as raised by Mark. Discussions about this problem could be found at[1][2]. This patch (of 3): The kexec_buf structure was previously declared without initialization. commit bf454ec31add ("kexec_file: allow to place kexec_buf randomly") added a field that is always read but not consistently populated by all architectures. This un-initialized field will contain garbage. This is also triggering a UBSAN warning when the uninitialized data was accessed: ------------[ cut here ]------------ UBSAN: invalid-load in ./include/linux/kexec.h:210:10 load of value 252 is not a valid value for type '_Bool' Zero-initializing kexec_buf at declaration ensures all fields are cleanly set, preventing future instances of uninitialized memory being used. Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-0-1df9882bb01a@debian.org Link: https://lkml.kernel.org/r/20250827-kbuf_all-v1-1-1df9882bb01a@debian.org Link: https://lore.kernel.org/all/20250826180742.f2471131255ec1c43683ea07@linux-foundation.org/ [0] Link: https://lore.kernel.org/all/oninomspajhxp4omtdapxnckxydbk2nzmrix7rggmpukpnzadw@c67o7njgdgm3/ [1] Link: https://lore.kernel.org/all/20250826-akpm-v1-1-3c831f0e3799@debian.org/ [2] Fixes: bf454ec31add ("kexec_file: allow to place kexec_buf randomly") Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Baoquan He <bhe@redhat.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Gordeev <agordeev@linux.ibm.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christian Borntraeger <borntraeger@linux.ibm.com> Cc: Coiby Xu <coxu@redhat.com> Cc: Heiko Carstens <hca@linux.ibm.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Sven Schnelle <svens@linux.ibm.com> Cc: Vasily Gorbik <gor@linux.ibm.com> Cc: Will Deacon <will@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  mm/damon/reclaim: avoid divide-by-zero in damon_reclaim_apply_parameters()  (Quanmin Yan, 1 file, -0/+5)
When creating a new scheme of DAMON_RECLAIM, the calculation of 'min_age_region' uses 'aggr_interval' as the divisor, which may lead to division-by-zero errors. Fix it by directly returning -EINVAL when such a case occurs. Link: https://lkml.kernel.org/r/20250827115858.1186261-3-yanquanmin1@huawei.com Fixes: f5a79d7c0c87 ("mm/damon: introduce struct damos_access_pattern") Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: ze zuo <zuoze1@huawei.com> Cc: <stable@vger.kernel.org> [6.1+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  mm/damon/lru_sort: avoid divide-by-zero in damon_lru_sort_apply_parameters()  (Quanmin Yan, 1 file, -0/+5)
Patch series "mm/damon: avoid divide-by-zero in DAMON module's parameters application". DAMON's RECLAIM and LRU_SORT modules perform no validation on user-configured parameters during application, which may lead to division-by-zero errors. Avoid the divide-by-zero by adding validation checks when DAMON modules attempt to apply the parameters. This patch (of 2): During the calculation of 'hot_thres' and 'cold_thres', either 'sample_interval' or 'aggr_interval' is used as the divisor, which may lead to division-by-zero errors. Fix it by directly returning -EINVAL when such a case occurs. Additionally, since 'aggr_interval' is already required to be set no smaller than 'sample_interval' in damon_set_attrs(), only the case where 'sample_interval' is zero needs to be checked. Link: https://lkml.kernel.org/r/20250827115858.1186261-2-yanquanmin1@huawei.com Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting") Signed-off-by: Quanmin Yan <yanquanmin1@huawei.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: ze zuo <zuoze1@huawei.com> Cc: <stable@vger.kernel.org> [6.0+] Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  mm/damon/core: set quota->charged_from to jiffies at first charge window  (Sang-Heon Jeon, 1 file, -0/+4)
Kernel initializes the "jiffies" timer as 5 minutes below zero, as shown in include/linux/jiffies.h /* * Have the 32 bit jiffies value wrap 5 minutes after boot * so jiffies wrap bugs show up earlier. */ #define INITIAL_JIFFIES ((unsigned long)(unsigned int) (-300*HZ)) And jiffies comparison help functions cast unsigned value to signed to cover wraparound #define time_after_eq(a,b) \ (typecheck(unsigned long, a) && \ typecheck(unsigned long, b) && \ ((long)((a) - (b)) >= 0)) When quota->charged_from is initialized to 0, time_after_eq() can incorrectly return FALSE even after reset_interval has elapsed. This occurs when (jiffies - reset_interval) produces a value with MSB=1, which is interpreted as negative in signed arithmetic. This issue primarily affects 32-bit systems because: On 64-bit systems: MSB=1 values occur after ~292 million years from boot (assuming HZ=1000), almost impossible. On 32-bit systems: MSB=1 values occur during the first 5 minutes after boot, and the second half of every jiffies wraparound cycle, starting from day 25 (assuming HZ=1000) When above unexpected FALSE return from time_after_eq() occurs, the charging window will not reset. The user impact depends on esz value at that time. If esz is 0, scheme ignores configured quotas and runs without any limits. If esz is not 0, scheme stops working once the quota is exhausted. It remains until the charging window finally resets. So, change quota->charged_from to jiffies at damos_adjust_quota() when it is considered as the first charge window. By this change, we can avoid unexpected FALSE return from time_after_eq() Link: https://lkml.kernel.org/r/20250822025057.1740854-1-ekffu200098@gmail.com Fixes: 2b8a248d5873 ("mm/damon/schemes: implement size quota for schemes application speed control") # 5.16 Signed-off-by: Sang-Heon Jeon <ekffu200098@gmail.com> Reviewed-by: SeongJae Park <sj@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  mm/hugetlb: add missing hugetlb_lock in __unmap_hugepage_range()  (Jeongjun Park, 1 file, -3/+6)
When restoring a reservation for an anonymous page, we need to check whether a surplus huge page should be freed. However, __unmap_hugepage_range() causes a data race because it reads h->surplus_huge_pages without the protection of hugetlb_lock.

Also, adjust_reservation is a boolean variable that indicates whether the reservation for the anonymous pages in each folio should be restored, so it should be reset to false for each round of the loop. However, it is only initialized where the variable is defined. This means that once adjust_reservation is set to true even once within the loop, reservations for anonymous pages are restored unconditionally in all subsequent rounds, regardless of the folio's state.

To fix this, add the missing hugetlb_lock, unlock page_table_lock earlier so that hugetlb_lock is never taken inside page_table_lock, and initialize adjust_reservation to false on each round within the loop.

Link: https://lkml.kernel.org/r/20250823182115.1193563-1-aha310510@gmail.com
Fixes: df7a6d1f6405 ("mm/hugetlb: restore the reservation if needed")
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Reported-by: syzbot+417aeb05fd190f3a6da9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=417aeb05fd190f3a6da9
Reviewed-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Breno Leitao <leitao@debian.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
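The three changes inside the per-folio loop can be sketched as follows (heavily simplified; the real predicate in __unmap_hugepage_range() is more involved):

    adjust_reservation = false;        /* reset every iteration */

    spin_unlock(ptl);                  /* drop page_table_lock first */
    spin_lock(&hugetlb_lock);          /* never nest inside ptl */
    if (h->surplus_huge_pages)         /* simplified surplus check */
            adjust_reservation = true;
    spin_unlock(&hugetlb_lock);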
2025-09-03  init/main.c: fix boot time tracing crash  (Mike Rapoport (Microsoft), 1 file, -1/+1)
Steven Rostedt reported a crash with "ftrace=function" kernel command line: [ 0.159269] BUG: kernel NULL pointer dereference, address: 000000000000001c [ 0.160254] #PF: supervisor read access in kernel mode [ 0.160975] #PF: error_code(0x0000) - not-present page [ 0.161697] PGD 0 P4D 0 [ 0.162055] Oops: Oops: 0000 [#1] SMP PTI [ 0.162619] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.17.0-rc2-test-00006-g48d06e78b7cb-dirty #9 PREEMPT(undef) [ 0.164141] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [ 0.165439] RIP: 0010:kmem_cache_alloc_noprof (mm/slub.c:4237) [ 0.166186] Code: 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 49 89 fc 53 48 83 e4 f0 48 83 ec 20 8b 05 c9 b6 7e 01 <44> 8b 77 1c 65 4c 8b 2d b5 ea 20 02 4c 89 6c 24 18 41 89 f5 21 f0 [ 0.168811] RSP: 0000:ffffffffb2e03b30 EFLAGS: 00010086 [ 0.169545] RAX: 0000000001fff33f RBX: 0000000000000000 RCX: 0000000000000000 [ 0.170544] RDX: 0000000000002800 RSI: 0000000000002800 RDI: 0000000000000000 [ 0.171554] RBP: ffffffffb2e03b80 R08: 0000000000000004 R09: ffffffffb2e03c90 [ 0.172549] R10: ffffffffb2e03c90 R11: 0000000000000000 R12: 0000000000000000 [ 0.173544] R13: ffffffffb2e03c90 R14: ffffffffb2e03c90 R15: 0000000000000001 [ 0.174542] FS: 0000000000000000(0000) GS:ffff9d2808114000(0000) knlGS:0000000000000000 [ 0.175684] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 0.176486] CR2: 000000000000001c CR3: 000000007264c001 CR4: 00000000000200b0 [ 0.177483] Call Trace: [ 0.177828] <TASK> [ 0.178123] mas_alloc_nodes (lib/maple_tree.c:176 (discriminator 2) lib/maple_tree.c:1255 (discriminator 2)) [ 0.178692] mas_store_gfp (lib/maple_tree.c:5468) [ 0.179223] execmem_cache_add_locked (mm/execmem.c:207) [ 0.179870] execmem_alloc (mm/execmem.c:213 mm/execmem.c:313 mm/execmem.c:335 mm/execmem.c:475) [ 0.180397] ? ftrace_caller (arch/x86/kernel/ftrace_64.S:169) [ 0.180922] ? __pfx_ftrace_caller (arch/x86/kernel/ftrace_64.S:158) [ 0.181517] execmem_alloc_rw (mm/execmem.c:487) [ 0.182052] arch_ftrace_update_trampoline (arch/x86/kernel/ftrace.c:266 arch/x86/kernel/ftrace.c:344 arch/x86/kernel/ftrace.c:474) [ 0.182778] ? ftrace_caller_op_ptr (arch/x86/kernel/ftrace_64.S:182) [ 0.183388] ftrace_update_trampoline (kernel/trace/ftrace.c:7947) [ 0.184024] __register_ftrace_function (kernel/trace/ftrace.c:368) [ 0.184682] ftrace_startup (kernel/trace/ftrace.c:3048) [ 0.185205] ? __pfx_function_trace_call (kernel/trace/trace_functions.c:210) [ 0.185877] register_ftrace_function_nolock (kernel/trace/ftrace.c:8717) [ 0.186595] register_ftrace_function (kernel/trace/ftrace.c:8745) [ 0.187254] ? __pfx_function_trace_call (kernel/trace/trace_functions.c:210) [ 0.187924] function_trace_init (kernel/trace/trace_functions.c:170) [ 0.188499] tracing_set_tracer (kernel/trace/trace.c:5916 kernel/trace/trace.c:6349) [ 0.189088] register_tracer (kernel/trace/trace.c:2391) [ 0.189642] early_trace_init (kernel/trace/trace.c:11075 kernel/trace/trace.c:11149) [ 0.190204] start_kernel (init/main.c:970) [ 0.190732] x86_64_start_reservations (arch/x86/kernel/head64.c:307) [ 0.191381] x86_64_start_kernel (??:?) [ 0.191955] common_startup_64 (arch/x86/kernel/head_64.S:419) [ 0.192534] </TASK> [ 0.192839] Modules linked in: [ 0.193267] CR2: 000000000000001c [ 0.193730] ---[ end trace 0000000000000000 ]--- The crash happens because on x86 ftrace allocations from execmem require maple tree to be initialized. 
Move maple tree initialization that depends only on slab availability earlier in boot so that it will happen right after mm_core_init(). Link: https://lkml.kernel.org/r/20250824130759.1732736-1-rppt@kernel.org Fixes: 5d79c2be5081 ("x86/ftrace: enable EXECMEM_ROX_CACHE for ftrace allocations") Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reported-by: Steven Rostedt (Google) <rostedt@goodmis.org> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Closes: https://lore.kernel.org/all/20250820184743.0302a8b5@gandalf.local.home/ Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
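The fix is an ordering change in start_kernel() (a sketch with surrounding calls abridged; exact neighbors in init/main.c may differ):

    /* init/main.c, start_kernel(), abridged: */
    mm_core_init();        /* slab is up from here on */
    maple_tree_init();     /* moved earlier: needs only slab, and
                            * execmem's ROX cache (used by ftrace)
                            * allocates maple nodes very early */
    /* ... */
    early_trace_init();    /* "ftrace=function" starts tracing here */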
2025-09-03  mm/memory_hotplug: fix hwpoisoned large folio handling in do_migrate_range()  (Jinjiang Tu, 1 file, -2/+8)
In do_migrate_range(), the hwpoisoned folio may be large folio, which can't be handled by unmap_poisoned_folio(). I can reproduce this issue in qemu after adding delay in memory_failure() BUG: kernel NULL pointer dereference, address: 0000000000000000 Workqueue: kacpi_hotplug acpi_hotplug_work_fn RIP: 0010:try_to_unmap_one+0x16a/0xfc0 <TASK> rmap_walk_anon+0xda/0x1f0 try_to_unmap+0x78/0x80 ? __pfx_try_to_unmap_one+0x10/0x10 ? __pfx_folio_not_mapped+0x10/0x10 ? __pfx_folio_lock_anon_vma_read+0x10/0x10 unmap_poisoned_folio+0x60/0x140 do_migrate_range+0x4d1/0x600 ? slab_memory_callback+0x6a/0x190 ? notifier_call_chain+0x56/0xb0 offline_pages+0x3e6/0x460 memory_subsys_offline+0x130/0x1f0 device_offline+0xba/0x110 acpi_bus_offline+0xb7/0x130 acpi_scan_hot_remove+0x77/0x290 acpi_device_hotplug+0x1e0/0x240 acpi_hotplug_work_fn+0x1a/0x30 process_one_work+0x186/0x340 Besides, do_migrate_range() may be called between memory_failure set hwpoison flag and isolate the folio from lru, so remove WARN_ON(). In other places, unmap_poisoned_folio() is called when the folio is isolated, obey it in do_migrate_range() too. [david@redhat.com: don't abort offlining, fixed typo, add comment] Link: https://lkml.kernel.org/r/3c214dff-9649-4015-840f-10de0e03ebe4@redhat.com Fixes: b15c87263a69 ("hwpoison, memory_hotplug: allow hwpoisoned pages to be offlined") Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com> Signed-off-by: David Hildenbrand <david@redhat.com> Acked-by: Zi Yan <ziy@nvidia.com> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Luis Chamberalin <mcgrof@kernel.org> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Michal Hocko <mhocko@kernel.org> Cc: Oscar Salvador <osalvador@suse.de> Cc: Pankaj Raghav <kernel@pankajraghav.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-09-03  mm/khugepaged: fix the address passed to notifier on testing young  (Wei Yang, 1 file, -2/+2)
Commit 8ee53820edfd ("thp: mmu_notifier_test_young") introduced mmu_notifier_test_young(), but we are passing the wrong address. In xxx_scan_pmd(), the actual iteration address is "_address", not "address". We seem to have misused the variable from the very beginning. Change it to the right one.

[akpm@linux-foundation.org: fix whitespace, per everyone]
Link: https://lkml.kernel.org/r/20250822063318.11644-1-richard.weiyang@gmail.com
Fixes: 8ee53820edfd ("thp: mmu_notifier_test_young")
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Dev Jain <dev.jain@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Barry Song <baohua@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
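As a sketch of the one-line diff (context abridged; the exact surrounding condition in the khugepaged scan loop differs):

    -        if (mmu_notifier_test_young(vma->vm_mm, address))
    +        if (mmu_notifier_test_young(vma->vm_mm, _address))
                     referenced++;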
2025-09-01  mm: fix possible deadlock in kmemleak  (Gu Bowen, 1 file, -7/+20)
There are some AA deadlock issues in kmemleak, similar to the situation reported by Breno [1]. The deadlock path is as follows:

    mem_pool_alloc()
    -> raw_spin_lock_irqsave(&kmemleak_lock, flags);
       -> pr_warn()
          -> netconsole subsystem
             -> netpoll
                -> __alloc_skb
                   -> __create_object
                      -> raw_spin_lock_irqsave(&kmemleak_lock, flags);

To solve this problem, switch to printk_safe mode before printing the warning message: this redirects all printk()-s to a special per-CPU buffer, which is flushed later from a safe context (irq work), so the deadlock is avoided. The proper API to use is printk_deferred_enter()/printk_deferred_exit() [2]. Another way would be to place the warning print after the kmemleak lock is released.

Link: https://lkml.kernel.org/r/20250822073541.1886469-1-gubowen5@huawei.com
Link: https://lore.kernel.org/all/20250731-kmemleak_lock-v1-1-728fd470198f@debian.org/#t [1]
Link: https://lore.kernel.org/all/5ca375cd-4a20-4807-b897-68b289626550@redhat.com/ [2]
Signed-off-by: Gu Bowen <gubowen5@huawei.com>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Lu Jialin <lujialin4@huawei.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
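The printk_safe pattern can be sketched as follows (a sketch; the surrounding condition in kmemleak differs):

    raw_spin_lock_irqsave(&kmemleak_lock, flags);
    if (!object) {
            /*
             * Defer console output while kmemleak_lock is held, so a
             * printk() path (e.g. netconsole -> __alloc_skb) cannot
             * re-enter the allocator and retake the same lock.
             */
            printk_deferred_enter();
            pr_warn("kmemleak: mem_pool allocation failed\n");
            printk_deferred_exit();
    }
    raw_spin_unlock_irqrestore(&kmemleak_lock, flags);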
2025-09-01ALSA: hda/hdmi: Add pin fix for another HP EliteDesk 800 G4 modelTakashi Iwai1-0/+1
It was reported that the HP EliteDesk 800 G4 DM 65W (SSID 103c:845a) needs a similar quirk for enabling its HDMI outputs, too. This patch adds the corresponding quirk entry. Cc: <stable@vger.kernel.org> Link: https://patch.msgid.link/20250901115009.27498-1-tiwai@suse.de Signed-off-by: Takashi Iwai <tiwai@suse.de>
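The entry presumably takes the same shape as the earlier EliteDesk 800 G4 fixups in sound/pci/hda/patch_hdmi.c (the table name is assumed from those fixes, not confirmed by this commit):

    static const struct snd_pci_quirk force_connect_list[] = {
            /* ... existing entries ... */
            SND_PCI_QUIRK(0x103c, 0x845a, "HP EliteDesk 800 G4 DM 65W", 1),
            {}
    };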
2025-09-01ALSA: usb-audio: Allow Focusrite devices to use low sampleratesTina Wuest1-4/+8
Commit 05f254a6369a ("ALSA: usb-audio: Improve filtering of sample rates on Focusrite devices") changed the max_rate check in an overly restrictive way, forcing devices to use very high sample rates if they support them, even though lower rates are supported as well. This patch keeps the intended outcome (ensuring that the selected sample rates are supported) while allowing devices with higher maximum sample rates to be opened at any supported rate. This patch was tested with a Clarett+ 8Pre USB. Fixes: 05f254a6369a ("ALSA: usb-audio: Improve filtering of sample rates on Focusrite devices") Signed-off-by: Tina Wuest <tina@wuest.me> Link: https://patch.msgid.link/20250901092024.140993-1-tina@wuest.me Signed-off-by: Takashi Iwai <tiwai@suse.de>
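A sketch of the relaxed filtering (variable and table names are hypothetical; only the accept-up-to-maximum logic follows the commit text):

    /* Accept every rate the device advertises up to its maximum,
     * instead of rejecting everything below the maximum. */
    for (i = 0; i < nr_rates; i++) {
            if (rate_table[i] >= min_rate && rate_table[i] <= max_rate)
                    fp->rate_mask |= snd_pcm_rate_to_rate_bit(rate_table[i]);
    }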
2025-08-31Linux 6.17-rc4v6.17-rc4Linus Torvalds1-1/+1
2025-08-30kselftest/arm64: Don't open code SVE_PT_SIZE() in fp-ptraceMark Brown1-3/+2
In fp-ptrace, when allocating a buffer for writing SVE register data, we open-coded the addition of the header size to the VL-dependent register data size, which led to an under-allocation bug when the code was cut'n'pasted for FPSIMD format writes. Use the SVE_PT_SIZE() macro that the kernel UAPI provides for this. Fixes: b84d2b27954f ("kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250812-arm64-fp-trace-macro-v1-1-317cfff986a5@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
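A minimal userspace illustration of the macro (the UAPI names are real; the surrounding helper is made up):

    #include <stdlib.h>
    #include <asm/ptrace.h>  /* SVE_PT_SIZE(), SVE_PT_REGS_SVE */

    /* Size the NT_ARM_SVE write buffer for a given vector quantum:
     * header plus VL-dependent payload, with no open-coded sums. */
    static void *alloc_sve_write_buf(unsigned int vq, size_t *size)
    {
            *size = SVE_PT_SIZE(vq, SVE_PT_REGS_SVE);
            return calloc(1, *size);
    }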
2025-08-30arm64: mm: Fix CFI failure due to kpti_ng_pgd_alloc function signatureKees Cook3-9/+10
Seen during KPTI initialization:

    CFI failure at create_kpti_ng_temp_pgd+0x124/0xce8
    (target: kpti_ng_pgd_alloc+0x0/0x14; expected type: 0xd61b88b6)

The call site is alloc_init_pud() at arch/arm64/mm/mmu.c:

    pud_phys = pgtable_alloc(TABLE_PUD);

alloc_init_pud() has the prototype:

    static void alloc_init_pud(p4d_t *p4dp, unsigned long addr,
                               unsigned long end, phys_addr_t phys,
                               pgprot_t prot,
                               phys_addr_t (*pgtable_alloc)(enum pgtable_type),
                               int flags)

where the pgtable_alloc() prototype is declared. The target (kpti_ng_pgd_alloc) is used in arch/arm64/kernel/cpufeature.c:

    create_kpti_ng_temp_pgd(kpti_ng_temp_pgd, __pa(alloc),
                            KPTI_NG_TEMP_VA, PAGE_SIZE, PAGE_KERNEL,
                            kpti_ng_pgd_alloc, 0);

which is an alias for __create_pgd_mapping_locked() with prototype:

    extern __alias(__create_pgd_mapping_locked)
    void create_kpti_ng_temp_pgd(pgd_t *pgdir, phys_addr_t phys,
                                 unsigned long virt, phys_addr_t size,
                                 pgprot_t prot,
                                 phys_addr_t (*pgtable_alloc)(enum pgtable_type),
                                 int flags);

__create_pgd_mapping_locked() passes the function pointer down:

    __create_pgd_mapping_locked()
      -> alloc_init_p4d()
        -> alloc_init_pud()

But the target function (kpti_ng_pgd_alloc) has the wrong signature:

    static phys_addr_t __init kpti_ng_pgd_alloc(int shift);

The "int" should be "enum pgtable_type". To make "enum pgtable_type" available to cpufeature.c, move the enum pgtable_type definition from arch/arm64/mm/mmu.c to arch/arm64/include/asm/mmu.h. Adjust kpti_ng_pgd_alloc to use "enum pgtable_type" instead of "int". The function behavior remains identical (the parameter is unused). Fixes: c64f46ee1377 ("arm64: mm: use enum to identify pgtable level instead of *_SHIFT") Cc: <stable@vger.kernel.org> # 6.16.x Signed-off-by: Kees Cook <kees@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250829190721.it.373-kees@kernel.org Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
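The adjusted allocator then looks roughly like this (a sketch; the backing variable name is assumed from the existing cpufeature.c code):

    static phys_addr_t __init kpti_ng_pgd_alloc(enum pgtable_type type)
    {
            /* Behavior unchanged: the parameter is still unused; only
             * the signature now matches the indirect-call type that
             * KCFI checks at the call site. */
            return kpti_ng_temp_alloc;
    }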
2025-08-30ALSA: hda: tas2781: reorder tas2563 calibration variablesGergo Koteles1-1/+1
The tasdev_load_calibrated_data() function expects the calibration data values in the cali_data buffer in the order R0, R0Low, InvR0, Power, TLim, which is not the order in which tas2563_save_calibration() writes them. Reorder the EFI variables in the tas2563_save_calibration() function so that the values land in the buffer in the correct order. Fixes: 4fe238513407 ("ALSA: hda/tas2781: Move and unified the calibrated-data getting function for SPI and I2C into the tas2781_hda lib") Cc: <stable@vger.kernel.org> Signed-off-by: Gergo Koteles <soyer@irl.hu> Link: https://patch.msgid.link/20250829160450.66623-2-soyer@irl.hu Signed-off-by: Takashi Iwai <tiwai@suse.de>
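In sketch form (the variable names are hypothetical; only the ordering is the point):

    /* The save order must match the order in which
     * tasdev_load_calibrated_data() consumes cali_data:
     * R0, R0Low, InvR0, Power, TLim. */
    static const char * const tas2563_cali_vars[] = {
            "R0", "R0Low", "InvR0", "Power", "TLim",
    };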
2025-08-30ALSA: hda: tas2781: fix tas2563 EFI data endiannessGergo Koteles1-0/+3
Before the conversion that unified calibration data management, the tas2563_apply_calib() function performed the big-endian conversion and wrote the calibration data to the device. The writing is now done by the common tasdev_load_calibrated_data() function, but without the conversion. Put the values into the calibration data buffer with the expected endianness. Fixes: 4fe238513407 ("ALSA: hda/tas2781: Move and unified the calibrated-data getting function for SPI and I2C into the tas2781_hda lib") Cc: <stable@vger.kernel.org> Signed-off-by: Gergo Koteles <soyer@irl.hu> Link: https://patch.msgid.link/20250829160450.66623-1-soyer@irl.hu Signed-off-by: Takashi Iwai <tiwai@suse.de>
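The conversion, sketched (buffer layout and variable names assumed, not the exact upstream diff):

    /* The common loader writes cali_data to the device verbatim, so
     * each word must already be big-endian when placed in the buffer. */
    for (i = 0; i < cali_cnt; i++)
            ((__be32 *)cali_data)[i] = cpu_to_be32(efi_vals[i]);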
2025-08-30ALSA: firewire-motu: drop EPOLLOUT from poll return values as write is not supportedTakashi Sakamoto1-1/+1
The ALSA HwDep character device of the firewire-motu driver incorrectly returns EPOLLOUT from poll(2), even though the driver implements no operation for write(2). This misleads userspace applications into believing that write() is allowed, potentially resulting in unnecessary wakeups. The issue dates back to the driver's initial code, added by commit 71c3797779d3 ("ALSA: firewire-motu: add hwdep interface"), and persisted when POLLOUT was updated to EPOLLOUT by commit a9a08845e9ac ("vfs: do bulk POLL* -> EPOLL* replacement"). This commit fixes the bug. Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp> Link: https://patch.msgid.link/20250829233749.366222-1-o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai <tiwai@suse.de>
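A sketch of the corrected poll callback (the field names are illustrative, not necessarily the driver's exact ones; the snd_hwdep op signature is the real one):

    static __poll_t hwdep_poll(struct snd_hwdep *hwdep, struct file *file,
                               poll_table *wait)
    {
            struct snd_motu *motu = hwdep->private_data;
            __poll_t events = 0;

            poll_wait(file, &motu->hwdep_wait, wait);
            if (motu->msg)
                    events |= EPOLLIN | EPOLLRDNORM;  /* no EPOLLOUT */
            return events;
    }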
2025-08-29hardening: Require clang 20.1.0 for __counted_byNathan Chancellor1-4/+5
After an innocuous change in -next that modified a structure containing __counted_by, clang-19 started crashing when building certain files in drivers/gpu/drm/xe. With assertions enabled, the more descriptive failure is:

    clang: clang/lib/AST/RecordLayoutBuilder.cpp:3335:
    const ASTRecordLayout &clang::ASTContext::getASTRecordLayout(const RecordDecl *) const:
    Assertion `D && "Cannot get layout of forward declarations!"' failed.

According to a reverse bisect, a tangential change to the LLVM IR generation phase of clang during the LLVM 20 development cycle [1] resolves this problem. Bump the version of clang that enables CONFIG_CC_HAS_COUNTED_BY to 20.1.0 to ensure that this issue cannot be hit. Link: https://github.com/llvm/llvm-project/commit/160fb1121cdf703c3ef5e61fb26c5659eb581489 [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20250807-fix-counted_by-clang-19-v1-1-902c86c1d515@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
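For context, the annotation pattern being guarded is shown below (a made-up struct, purely to illustrate what CONFIG_CC_HAS_COUNTED_BY enables; __counted_by is the kernel macro wrapping the compiler attribute):

    struct entry {
            int val;
    };

    struct container {
            unsigned int count;
            /* clang < 20.1.0 could crash while laying out structures
             * that use this annotation in certain configurations. */
            struct entry entries[] __counted_by(count);
    };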
2025-08-29ALSA: docs: Add documents for recent changes in snd-usb-audioCryolitia PukNgae1-4/+25
Changed:
  - ignore_ctl_error
  - lowlatency
  - skip_validation
  - quirk_flags[19:24]
[ corrected a typo -- tiwai ] Signed-off-by: Cryolitia PukNgae <cryolitia@uniontech.com> Link: https://patch.msgid.link/20250829-sound-doc-v1-1-e0110452b03d@uniontech.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-29ALSA: usb-audio: Add mute TLV for playback volumes on more devicesqaqland1-0/+10
Apply the existing quirk so that the lowest Playback mixer volume setting mutes the audio output on more devices. Suggested-by: Cryolitia PukNgae <cryolitia@uniontech.com> Signed-off-by: qaqland <anguoli@uniontech.com> Link: https://patch.msgid.link/20250829-sound_quirk-v1-1-745529b44440@uniontech.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
2025-08-28drm/mediatek: mtk_hdmi: Fix inverted parameters in some regmap_update_bits callsLouis-Alexis Eyraud1-4/+4
In the mtk_hdmi driver, a recent change replaced custom register access helpers with regmap calls, but two of the conversions to regmap_update_bits() were done incorrectly: the original offset and mask parameters were swapped. Fix them. Fixes: d6e25b3590a0 ("drm/mediatek: hdmi: Use regmap instead of iomem for main registers") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250818-mt8173-fix-hdmi-issue-v1-1-55aff9b0295d@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
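The API order is regmap_update_bits(map, reg, mask, val), so the broken calls had the shape of the first line below and were corrected to the second (the register and mask names here are hypothetical):

    regmap_update_bits(hdmi->regs, HDMI_MASK_BITS, HDMI_REG_OFFSET, val); /* wrong */
    regmap_update_bits(hdmi->regs, HDMI_REG_OFFSET, HDMI_MASK_BITS, val); /* fixed */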
2025-08-28MAINTAINERS: mark bcachefs externally maintainedLinus Torvalds1-1/+1
As per many long discussion threads, public and private. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-08-28KVM: arm64: nv: Fix ATS12 handling of single-stage translationMarc Zyngier1-3/+3
Volodymyr reports that using a Xen DomU as a nested guest (where HCR_EL2.E2H == 0), ATS12 results in a translation that stops at the L2's S1, which isn't something you'd normally expect. Comparing the code against the spec proves to be illuminating, and suggests that the author of such code must have been tired, cross-eyed, drunk, or maybe all of the above. The gist of it is that, apart from HCR_EL2.VM or HCR_EL2.DC being 0, only the use of the EL2&0 translation regime limits the walk to S1 only; in any other case the S2 walk must be completed. Explicitly checking for the EL2&0 regime fixes the above issue, since E2H==0 indicates that ATS12 walks the EL1&0 translation regime. Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: be04cebf3e788 ("KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}") Link: https://lore.kernel.org/r/20250806141707.3479194-2-volodymyr_babchuk@epam.com Link: https://lore.kernel.org/r/20250809144811.2314038-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
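The resulting condition, as a rough sketch (the helper names are invented; only the logic follows the commit text, and HCR_VM/HCR_DC are the real HCR_EL2 bit definitions):

    /* Stop at stage-1 only when stage-2 is disabled, or when the
     * EL2&0 translation regime is in use; E2H==0 means ATS12 walks
     * EL1&0 and must therefore complete the stage-2 walk. */
    bool s1_only = !(hcr & (HCR_VM | HCR_DC)) || regime_is_el20(vcpu);

    if (!s1_only)
            par = compute_stage2_walk(vcpu, ipa);  /* hypothetical */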
2025-08-28ASoC: SOF: Intel: WCL: Add the sdw_process_wakeen opAjye Huang1-0/+1
Add the missing op in the device description to avoid issues with jack detection. Fixes: 6b04629ae97a ("ASoC: SOF: Intel: add initial support for WCL") Acked-by: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Signed-off-by: Ajye Huang <ajye_huang@compal.corp-partner.google.com> Message-ID: <20250826154040.2723998-1-ajye_huang@compal.corp-partner.google.com> Signed-off-by: Mark Brown <broonie@kernel.org>
2025-08-28KVM: arm64: Remove __vcpu_{read,write}_sys_reg_{from,to}_cpu()Marc Zyngier2-111/+80
There is no point having __vcpu_{read,write}_sys_reg_{from,to}_cpu() exposed to the rest of the kernel, as the only callers are in sys_regs.c. Move them where they belong, which is another opportunity to simplify things a bit. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817121926.217900-5-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessorsMarc Zyngier2-112/+162
Volodymyr reports (again!) that under some circumstances (E2H==0, walking S1 PTs), PAR_EL1 doesn't report the value of the latest walk in the CPU register, but that instead the value is written to the backing store. Further investigation indicates that the root cause of this is that a group of registers (PAR_EL1, TPIDR*_EL{0,1}, the *32_EL2 dregs) should always be considered as "on CPU", as they are not remapped between EL1 and EL2. We fail to treat them accordingly, and end up considering that the register (PAR_EL1 in this example) should be written to memory instead of to the CPU register. While it would be possible to quickly work around it, it is obvious that the way we track these things at the moment is pretty horrible, and could do with some improvement. Revamp the whole thing by:

  - defining a location for a register (memory, cpu), potentially depending on the state of the vcpu
  - defining a transformation for this register (mapped register, potential translation, special register needing some particular attention)
  - conveying this information in a structure that can be easily passed around

As a result, the accessors themselves become much simpler, as the state is explicit instead of being driven by hard-to-understand conventions. We get rid of the "pure EL2 register" notion, which wasn't very useful, and add sanitisation of the values by applying the RESx masks as required, something that was missing so far. And of course, we add the missing registers to the list, with the indication that they are always loaded. Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: fedc612314acf ("KVM: arm64: nv: Handle virtual EL2 registers in vcpu_read/write_sys_reg()") Link: https://lore.kernel.org/r/20250806141707.3479194-3-volodymyr_babchuk@epam.com Link: https://lore.kernel.org/r/20250817121926.217900-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
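The shape of the new tracking, sketched after the list above (field and enum names are illustrative, not necessarily the merged ones):

    /* Where a register's current value lives, possibly depending on
     * vcpu state, plus how to transform it on access. */
    enum sr_loc_attr {
            SR_LOC_MEMORY,   /* backing store only */
            SR_LOC_LOADED,   /* resident in a CPU register */
            SR_LOC_MAPPED,   /* aliases another register */
            SR_LOC_SPECIAL,  /* needs dedicated handling */
    };

    struct sr_loc {
            enum sr_loc_attr loc;
            enum vcpu_sysreg map_reg;   /* valid for SR_LOC_MAPPED */
            u64 (*xlate)(u64);          /* optional translation */
    };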
2025-08-28KVM: arm64: Simplify sysreg access on exception deliveryMarc Zyngier1-12/+4
Distinguishing between NV and VHE is slightly pointless, and only serves as an extra complication, or a way to introduce bugs, such as the way SPSR_EL1 gets written without checking whether the state is resident. Get rid of this silly distinction, and fix the bug in one go. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817121926.217900-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28KVM: arm64: Check for SYSREGS_ON_CPU before accessing the 32bit stateMarc Zyngier1-2/+2
Just like c6e35dff58d3 ("KVM: arm64: Check for SYSREGS_ON_CPU before accessing the CPU state") fixed the 64bit state access, add a check for the 32bit state actually being on the CPU before writing it. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817121926.217900-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>