<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux, branch v6.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-05-25T23:09:23Z</updated>
<entry>
<title>Linux 6.15</title>
<updated>2025-05-25T23:09:23Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-05-25T23:09:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0ff41df1cb268fc69e703a08a57ee14ae967d0ca'/>
<id>urn:sha1:0ff41df1cb268fc69e703a08a57ee14ae967d0ca</id>
<content type='text'>
</content>
</entry>
<entry>
<title>Disable FOP_DONTCACHE for now due to bugs</title>
<updated>2025-05-25T22:43:36Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-05-25T22:43:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=478ad02d6844217cc7568619aeb0809d93ade43d'/>
<id>urn:sha1:478ad02d6844217cc7568619aeb0809d93ade43d</id>
<content type='text'>
This is kind of last-minute, but Al Viro reported that the new
FOP_DONTCACHE flag causes memory corruption due to use-after-free
issues.

This was triggered by commit 974c5e6139db ("xfs: flag as supporting
FOP_DONTCACHE"), but that is not the underlying bug - it is just the
first user of the flag.

Vlastimil Babka suspects the underlying problem stems from the
folio_end_writeback() logic introduced in commit fb7d3bc414939
("mm/filemap: drop streaming/uncached pages when writeback completes").

The most straightforward fix would be to just revert the commit that
exposed this, but Matthew Wilcox points out that other filesystems are
also starting to enable the FOP_DONTCACHE logic, so this instead
disables that bit globally for now.

The fix will hopefully end up being trivial and we can just re-enable
this logic after more testing, but until such a time we'll have to
disable the new FOP_DONTCACHE flag.

Reported-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Link: https://lore.kernel.org/all/20250525083209.GS2023217@ZenIV/
Triggered-by: 974c5e6139db ("xfs: flag as supporting FOP_DONTCACHE")
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Darrick J. Wong &lt;djwong@kernel.org&gt;
Cc: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-hotfixes-stable-2025-05-25-00-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2025-05-25T14:48:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-05-25T14:48:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0f8c0258bf042a7da8645148f96d063b9c2060b9'/>
<id>urn:sha1:0f8c0258bf042a7da8645148f96d063b9c2060b9</id>
<content type='text'>
Pull hotfixes from Andrew Morton:
 "22 hotfixes.

  13 are cc:stable and the remainder address post-6.14 issues or aren't
  considered necessary for -stable kernels. 19 are for MM"

* tag 'mm-hotfixes-stable-2025-05-25-00-58' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (22 commits)
  mailmap: add Jarkko's employer email address
  mm: fix copy_vma() error handling for hugetlb mappings
  memcg: always call cond_resched() after fn()
  mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios
  mm: vmalloc: only zero-init on vrealloc shrink
  mm: vmalloc: actually use the in-place vrealloc region
  alloc_tag: allocate percpu counters for module tags dynamically
  module: release codetag section when module load fails
  mm/cma: make detection of highmem_start more robust
  MAINTAINERS: add mm memory policy section
  MAINTAINERS: add mm ksm section
  kasan: avoid sleepable page allocation from atomic context
  highmem: add folio_test_partial_kmap()
  MAINTAINERS: add hung-task detector section
  taskstats: fix struct taskstats breaks backward compatibility since version 15
  mm/truncate: fix out-of-bounds when doing a right-aligned split
  MAINTAINERS: add mm reclaim section
  MAINTAINERS: update page allocator section
  mm: fix VM_UFFD_MINOR == VM_SHADOW_STACK on USERFAULTFD=y &amp;&amp; ARM64_GCS=y
  mm: mmap: map MAP_STACK to VM_NOHUGEPAGE only if THP is enabled
  ...
</content>
</entry>
<entry>
<title>mailmap: add Jarkko's employer email address</title>
<updated>2025-05-25T07:53:49Z</updated>
<author>
<name>Jarkko Sakkinen</name>
<email>jarkko@kernel.org</email>
</author>
<published>2025-05-23T12:11:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1ec971da1c10e6376411e6d4a3f3b2351217d94f'/>
<id>urn:sha1:1ec971da1c10e6376411e6d4a3f3b2351217d94f</id>
<content type='text'>
Add the current employer email address to mailmap.

Link: https://lkml.kernel.org/r/20250523121105.15850-1-jarkko@kernel.org
Signed-off-by: Jarkko Sakkinen &lt;jarkko@kernel.org&gt;
Cc: Alexander Sverdlin &lt;alexander.sverdlin@gmail.com&gt;
Cc: Antonio Quartulli &lt;antonio@openvpn.net&gt;
Cc: Carlos Bilbao &lt;carlos.bilbao@kernel.org&gt;
Cc: Kees Cook &lt;kees@kernel.org&gt;
Cc: Simon Wunderlich &lt;sw@simonwunderlich.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix copy_vma() error handling for hugetlb mappings</title>
<updated>2025-05-25T07:53:49Z</updated>
<author>
<name>Ricardo Cañuelo Navarro</name>
<email>rcn@igalia.com</email>
</author>
<published>2025-05-23T12:19:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ee40c9920ac286c5bfe7c811e66ff899266d2582'/>
<id>urn:sha1:ee40c9920ac286c5bfe7c811e66ff899266d2582</id>
<content type='text'>
If, during a mremap() operation for a hugetlb-backed memory mapping,
copy_vma() fails after the source vma has been duplicated and opened (ie. 
vma_link() fails), the error is handled by closing the new vma.  This
updates the hugetlbfs reservation counter of the reservation map which at
this point is referenced by both the source vma and the new copy.  As a
result, once the new vma has been freed and copy_vma() returns, the
reservation counter for the source vma will be incorrect.

This patch addresses this corner case by clearing the hugetlb private page
reservation reference for the new vma and decrementing the reference
before closing the vma, so that vma_close() won't update the reservation
counter.  This is also what copy_vma_and_data() does with the source vma
if copy_vma() succeeds, so a helper function has been added to do the
fixup in both functions.

The issue was reported by a private syzbot instance and can be reproduced
using the C reproducer in [1].  It's also a possible duplicate of public
syzbot report [2].  The WARNING report is:

============================================================
page_counter underflow: -1024 nr_pages=1024
WARNING: CPU: 0 PID: 3287 at mm/page_counter.c:61 page_counter_cancel+0xf6/0x120
Modules linked in:
CPU: 0 UID: 0 PID: 3287 Comm: repro__WARNING_ Not tainted 6.15.0-rc7+ #54 NONE
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
RIP: 0010:page_counter_cancel+0xf6/0x120
Code: ff 5b 41 5e 41 5f 5d c3 cc cc cc cc e8 f3 4f 8f ff c6 05 64 01 27 06 01 48 c7 c7 60 15 f8 85 48 89 de 4c 89 fa e8 2a a7 51 ff &lt;0f&gt; 0b e9 66 ff ff ff 44 89 f9 80 e1 07 38 c1 7c 9d 4c 81
RSP: 0018:ffffc900025df6a0 EFLAGS: 00010246
RAX: 2edfc409ebb44e00 RBX: fffffffffffffc00 RCX: ffff8880155f0000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: dffffc0000000000 R08: ffffffff81c4a23c R09: 1ffff1100330482a
R10: dffffc0000000000 R11: ffffed100330482b R12: 0000000000000000
R13: ffff888058a882c0 R14: ffff888058a882c0 R15: 0000000000000400
FS:  0000000000000000(0000) GS:ffff88808fc53000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000004b33e0 CR3: 00000000076d6000 CR4: 00000000000006f0
Call Trace:
 &lt;TASK&gt;
 page_counter_uncharge+0x33/0x80
 hugetlb_cgroup_uncharge_counter+0xcb/0x120
 hugetlb_vm_op_close+0x579/0x960
 ? __pfx_hugetlb_vm_op_close+0x10/0x10
 remove_vma+0x88/0x130
 exit_mmap+0x71e/0xe00
 ? __pfx_exit_mmap+0x10/0x10
 ? __mutex_unlock_slowpath+0x22e/0x7f0
 ? __pfx_exit_aio+0x10/0x10
 ? __up_read+0x256/0x690
 ? uprobe_clear_state+0x274/0x290
 ? mm_update_next_owner+0xa9/0x810
 __mmput+0xc9/0x370
 exit_mm+0x203/0x2f0
 ? __pfx_exit_mm+0x10/0x10
 ? taskstats_exit+0x32b/0xa60
 do_exit+0x921/0x2740
 ? do_raw_spin_lock+0x155/0x3b0
 ? __pfx_do_exit+0x10/0x10
 ? __pfx_do_raw_spin_lock+0x10/0x10
 ? _raw_spin_lock_irq+0xc5/0x100
 do_group_exit+0x20c/0x2c0
 get_signal+0x168c/0x1720
 ? __pfx_get_signal+0x10/0x10
 ? schedule+0x165/0x360
 arch_do_signal_or_restart+0x8e/0x7d0
 ? __pfx_arch_do_signal_or_restart+0x10/0x10
 ? __pfx___se_sys_futex+0x10/0x10
 syscall_exit_to_user_mode+0xb8/0x2c0
 do_syscall_64+0x75/0x120
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x422dcd
Code: Unable to access opcode bytes at 0x422da3.
RSP: 002b:00007ff266cdb208 EFLAGS: 00000246 ORIG_RAX: 00000000000000ca
RAX: 0000000000000001 RBX: 00007ff266cdbcdc RCX: 0000000000422dcd
RDX: 00000000000f4240 RSI: 0000000000000081 RDI: 00000000004c7bec
RBP: 00007ff266cdb220 R08: 203a6362696c6720 R09: 203a6362696c6720
R10: 0000200000c00000 R11: 0000000000000246 R12: ffffffffffffffd0
R13: 0000000000000002 R14: 00007ffe1cb5f520 R15: 00007ff266cbb000
 &lt;/TASK&gt;
============================================================

Link: https://lkml.kernel.org/r/20250523-warning_in_page_counter_cancel-v2-1-b6df1a8cfefd@igalia.com
Link: https://people.igalia.com/rcn/kernel_logs/20250422__WARNING_in_page_counter_cancel__repro.c [1]
Link: https://lore.kernel.org/all/67000a50.050a0220.49194.048d.GAE@google.com/ [2]
Signed-off-by: Ricardo Cañuelo Navarro &lt;rcn@igalia.com&gt;
Suggested-by: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Reviewed-by: Liam R. Howlett &lt;Liam.Howlett@oracle.com&gt;
Cc: Florent Revest &lt;revest@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memcg: always call cond_resched() after fn()</title>
<updated>2025-05-25T07:53:49Z</updated>
<author>
<name>Breno Leitao</name>
<email>leitao@debian.org</email>
</author>
<published>2025-05-23T17:21:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=06717a7b6c86514dbd6ab322e8083ffaa4db5712'/>
<id>urn:sha1:06717a7b6c86514dbd6ab322e8083ffaa4db5712</id>
<content type='text'>
I am seeing soft lockup on certain machine types when a cgroup OOMs.  This
is happening because killing the process in certain machine might be very
slow, which causes the soft lockup and RCU stalls.  This happens usually
when the cgroup has MANY processes and memory.oom.group is set.

Example I am seeing in real production:

       [462012.244552] Memory cgroup out of memory: Killed process 3370438 (crosvm) ....
       ....
       [462037.318059] Memory cgroup out of memory: Killed process 4171372 (adb) ....
       [462037.348314] watchdog: BUG: soft lockup - CPU#64 stuck for 26s! [stat_manager-ag:1618982]
       ....

Quick look at why this is so slow, it seems to be related to serial flush
for certain machine types.  For all the crashes I saw, the target CPU was
at console_flush_all().

In the case above, there are thousands of processes in the cgroup, and it
is soft locking up before it reaches the 1024 limit in the code (which
would call the cond_resched()).  So, cond_resched() in 1024 blocks is not
sufficient.

Remove the counter-based conditional rescheduling logic and call
cond_resched() unconditionally after each task iteration, after fn() is
called.  This avoids the lockup independently of how slow fn() is.

Link: https://lkml.kernel.org/r/20250523-memcg_fix-v1-1-ad3eafb60477@debian.org
Fixes: ade81479c7dd ("memcg: fix soft lockup in the OOM process")
Signed-off-by: Breno Leitao &lt;leitao@debian.org&gt;
Suggested-by: Rik van Riel &lt;riel@surriel.com&gt;
Acked-by: Shakeel Butt &lt;shakeel.butt@linux.dev&gt;
Cc: Michael van der Westhuizen &lt;rmikey@meta.com&gt;
Cc: Usama Arif &lt;usamaarif642@gmail.com&gt;
Cc: Pavel Begunkov &lt;asml.silence@gmail.com&gt;
Cc: Chen Ridong &lt;chenridong@huawei.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hugetlb: fix kernel NULL pointer dereference when replacing free hugetlb folios</title>
<updated>2025-05-25T07:53:48Z</updated>
<author>
<name>Ge Yang</name>
<email>yangge1116@126.com</email>
</author>
<published>2025-05-22T03:22:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=113ed54ad276c352ee5ce109bdcf0df118a43bda'/>
<id>urn:sha1:113ed54ad276c352ee5ce109bdcf0df118a43bda</id>
<content type='text'>
A kernel crash was observed when replacing free hugetlb folios:

BUG: kernel NULL pointer dereference, address: 0000000000000028
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 28 UID: 0 PID: 29639 Comm: test_cma.sh Tainted 6.15.0-rc6-zp #41 PREEMPT(voluntary)
RIP: 0010:alloc_and_dissolve_hugetlb_folio+0x1d/0x1f0
RSP: 0018:ffffc9000b30fa90 EFLAGS: 00010286
RAX: 0000000000000000 RBX: 0000000000342cca RCX: ffffea0043000000
RDX: ffffc9000b30fb08 RSI: ffffea0043000000 RDI: 0000000000000000
RBP: ffffc9000b30fb20 R08: 0000000000001000 R09: 0000000000000000
R10: ffff88886f92eb00 R11: 0000000000000000 R12: ffffea0043000000
R13: 0000000000000000 R14: 00000000010c0200 R15: 0000000000000004
FS:  00007fcda5f14740(0000) GS:ffff8888ec1d8000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000028 CR3: 0000000391402000 CR4: 0000000000350ef0
Call Trace:
&lt;TASK&gt;
 replace_free_hugepage_folios+0xb6/0x100
 alloc_contig_range_noprof+0x18a/0x590
 ? srso_return_thunk+0x5/0x5f
 ? down_read+0x12/0xa0
 ? srso_return_thunk+0x5/0x5f
 cma_range_alloc.constprop.0+0x131/0x290
 __cma_alloc+0xcf/0x2c0
 cma_alloc_write+0x43/0xb0
 simple_attr_write_xsigned.constprop.0.isra.0+0xb2/0x110
 debugfs_attr_write+0x46/0x70
 full_proxy_write+0x62/0xa0
 vfs_write+0xf8/0x420
 ? srso_return_thunk+0x5/0x5f
 ? filp_flush+0x86/0xa0
 ? srso_return_thunk+0x5/0x5f
 ? filp_close+0x1f/0x30
 ? srso_return_thunk+0x5/0x5f
 ? do_dup2+0xaf/0x160
 ? srso_return_thunk+0x5/0x5f
 ksys_write+0x65/0xe0
 do_syscall_64+0x64/0x170
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

There is a potential race between __update_and_free_hugetlb_folio() and
replace_free_hugepage_folios():

CPU1                              CPU2
__update_and_free_hugetlb_folio   replace_free_hugepage_folios
                                    folio_test_hugetlb(folio)
                                    -- It's still hugetlb folio.

  __folio_clear_hugetlb(folio)
  hugetlb_free_folio(folio)
                                    h = folio_hstate(folio)
                                    -- Here, h is NULL pointer

When the above race condition occurs, folio_hstate(folio) returns NULL,
and subsequent access to this NULL pointer will cause the system to crash.
To resolve this issue, execute folio_hstate(folio) under the protection
of the hugetlb_lock lock, ensuring that folio_hstate(folio) does not
return NULL.

Link: https://lkml.kernel.org/r/1747884137-26685-1-git-send-email-yangge1116@126.com
Fixes: 04f13d241b8b ("mm: replace free hugepage folios after migration")
Signed-off-by: Ge Yang &lt;yangge1116@126.com&gt;
Reviewed-by: Muchun Song &lt;muchun.song@linux.dev&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: Barry Song &lt;21cnbao@gmail.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: vmalloc: only zero-init on vrealloc shrink</title>
<updated>2025-05-25T07:53:48Z</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-05-15T21:42:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=70d1eb031a68cbde4eed8099674be21778441c94'/>
<id>urn:sha1:70d1eb031a68cbde4eed8099674be21778441c94</id>
<content type='text'>
The common case is to grow reallocations, and since init_on_alloc will
have already zeroed the whole allocation, we only need to zero when
shrinking the allocation.

Link: https://lkml.kernel.org/r/20250515214217.619685-2-kees@kernel.org
Fixes: a0309faf1cb0 ("mm: vmalloc: support more granular vrealloc() sizing")
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Tested-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Cc: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Cc: "Erhard F." &lt;erhard_f@mailbox.org&gt;
Cc: Shung-Hsi Yu &lt;shung-hsi.yu@suse.com&gt;
Cc: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: vmalloc: actually use the in-place vrealloc region</title>
<updated>2025-05-25T07:53:48Z</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-05-15T21:42:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f7a35a3c36d1e36059c5654737d9bee3454f01a3'/>
<id>urn:sha1:f7a35a3c36d1e36059c5654737d9bee3454f01a3</id>
<content type='text'>
Patch series "mm: vmalloc: Actually use the in-place vrealloc region".

This fixes a performance regression[1] with vrealloc()[1].


The refactoring to not build a new vmalloc region only actually worked
when shrinking.  Actually return the resized area when it grows.  Ugh.

Link: https://lkml.kernel.org/r/20250515214217.619685-1-kees@kernel.org
Fixes: a0309faf1cb0 ("mm: vmalloc: support more granular vrealloc() sizing")
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
Reported-by: Shung-Hsi Yu &lt;shung-hsi.yu@suse.com&gt;
Closes: https://lore.kernel.org/all/20250515-bpf-verifier-slowdown-vwo2meju4cgp2su5ckj@6gi6ssxbnfqg [1]
Tested-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Tested-by: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Tested-by: Shung-Hsi Yu &lt;shung-hsi.yu@suse.com&gt;
Reviewed-by: "Uladzislau Rezki (Sony)" &lt;urezki@gmail.com&gt;
Reviewed-by: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: "Erhard F." &lt;erhard_f@mailbox.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>alloc_tag: allocate percpu counters for module tags dynamically</title>
<updated>2025-05-25T07:53:48Z</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2025-05-17T00:07:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=12ca42c237756182aad8ab04654c952765cb9061'/>
<id>urn:sha1:12ca42c237756182aad8ab04654c952765cb9061</id>
<content type='text'>
When a module gets unloaded it checks whether any of its tags are still in
use and if so, we keep the memory containing module's allocation tags
alive until all tags are unused.  However percpu counters referenced by
the tags are freed by free_module().  This will lead to UAF if the memory
allocated by a module is accessed after module was unloaded.

To fix this we allocate percpu counters for module allocation tags
dynamically and we keep it alive for tags which are still in use after
module unloading.  This also removes the requirement of a larger
PERCPU_MODULE_RESERVE when memory allocation profiling is enabled because
percpu memory for counters does not need to be reserved anymore.

Link: https://lkml.kernel.org/r/20250517000739.5930-1-surenb@google.com
Fixes: 0db6f8d7820a ("alloc_tag: load module tags into separate contiguous memory")
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Reported-by: David Wang &lt;00107082@163.com&gt;
Closes: https://lore.kernel.org/all/20250516131246.6244-1-00107082@163.com/
Tested-by: David Wang &lt;00107082@163.com&gt;
Cc: Christoph Lameter (Ampere) &lt;cl@gentwo.org&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: Pasha Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
