<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/arch/x86/lib, branch master</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=master</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2026-04-13T15:39:51Z</updated>
<entry>
<title>Merge branch 'nocache-cleanup'</title>
<updated>2026-04-13T15:39:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-04-13T15:39:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fdcbb1bc06508eb7ad961b3876b16382ae678ef8'/>
<id>urn:sha1:fdcbb1bc06508eb7ad961b3876b16382ae678ef8</id>
<content type='text'>
This series cleans up some of the special user copy functions naming and
semantics.  In particular, get rid of the (very traditional) double
underscore names and behavior: the whole "optimize away the range check"
model has been largely excised from the other user accessors because
it's so subtle and can be unsafe, but also because it's just not a
relevant optimization any more.

To do that, a couple of drivers that misused the "user" copies as kernel
copies in order to get non-temporal stores had to be fixed up, but that
kind of code should never have been allowed anyway.

The x86-only "nocache" version was also renamed to more accurately
reflect what it actually does.

This was all done because I looked at this code due to a report by Jann
Horn, and I just couldn't stand the inconsistent naming, the horrible
semantics, and the random misuse of these functions.  This code should
probably be cleaned up further, but it's at least slightly closer to
normal semantics.

I had a more intrusive series that went even further in trying to
normalize the semantics, but that ended up hitting so many other
inconsistencies between different architectures in this area (eg
'size_t' vs 'unsigned long' vs 'int' as size arguments, and various
iovec check differences that Vasily Gorbik pointed out) that I ended up
with this more limited version that fixed the worst of the issues.

Reported-by: Jann Horn &lt;jannh@google.com&gt;
Tested-by: Will Deacon &lt;will@kernel.org&gt;
Link: https://lore.kernel.org/all/CAHk-=wgg1QVWNWG-UCFo1hx0zqrPnB3qhPzUTrWNft+MtXQXig@mail.gmail.com/

* nocache-cleanup:
  x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache
  x86: rename and clean up __copy_from_user_inatomic_nocache()
  x86-64: rename misleadingly named '__copy_user_nocache()' function
</content>
</entry>
<entry>
<title>x86-64/arm64/powerpc: clean up and rename __copy_from_user_flushcache</title>
<updated>2026-03-30T22:05:57Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-30T21:52:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=809b997a5ce945ab470f70c187048fe4f5df20bf'/>
<id>urn:sha1:809b997a5ce945ab470f70c187048fe4f5df20bf</id>
<content type='text'>
This finishes the work on these odd functions that were only implemented
by a handful of architectures.

The 'flushcache' function was only used from the iterator code, and
let's make it do the same thing that the nontemporal version does:
remove the two underscores and add the user address checking.

Yes, yes, the user address checking is also done at iovec import time,
but we have long since walked away from the old double-underscore thing
where we try to avoid address checking overhead at access time, and
these functions shouldn't be so special and old-fashioned.

The arm64 version already did the address check, in fact, so there it's
just a matter of renaming it.  For powerpc and x86-64 we now do the
proper user access boilerplate.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86: rename and clean up __copy_from_user_inatomic_nocache()</title>
<updated>2026-03-30T22:05:57Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-30T20:11:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5de7bcaadf160c1716b20a263cf8f5b06f658959'/>
<id>urn:sha1:5de7bcaadf160c1716b20a263cf8f5b06f658959</id>
<content type='text'>
Similarly to the previous commit, this renames the somewhat confusingly
named function.  But in this case, it was at least less confusing: the
__copy_from_user_inatomic_nocache is indeed copying from user memory,
and it is indeed ok to be used in an atomic context, so it will not warn
about it.

But the previous commit also removed the NTB mis-use of the
__copy_from_user_inatomic_nocache() function, and as a result every
call-site is now _actually_ doing a real user copy.  That means that we
can now do the proper user pointer verification too.

End result: add proper address checking, remove the double underscores,
and change the "nocache" to "nontemporal" to more accurately describe
what this x86-only function actually does.  It might be worth noting
that only the target is non-temporal: the actual user accesses are
normal memory accesses.

Also worth noting is that non-x86 targets (and on older 32-bit x86 CPU's
before XMM2 in the Pentium III) we end up just falling back on a regular
user copy, so nothing can actually depend on the non-temporal semantics,
but that has always been true.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86-64: rename misleadingly named '__copy_user_nocache()' function</title>
<updated>2026-03-30T22:05:56Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-03-30T17:39:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d187a86de793f84766ea40b9ade7ac60aabbb4fe'/>
<id>urn:sha1:d187a86de793f84766ea40b9ade7ac60aabbb4fe</id>
<content type='text'>
This function was a masterclass in bad naming, for various historical
reasons.

It claimed to be a non-cached user copy.  It is literally _neither_ of
those things.  It's a specialty memory copy routine that uses
non-temporal stores for the destination (but not the source), and that
does exception handling for both source and destination accesses.

Also note that while it works for unaligned targets, any unaligned parts
(whether at beginning or end) will not use non-temporal stores, since
only words and quadwords can be non-temporal on x86.

The exception handling means that it _can_ be used for user space
accesses, but not on its own - it needs all the normal "start user space
access" logic around it.

But typically the user space access would be the source, not the
non-temporal destination.  That was the original intention of this,
where the destination was some fragile persistent memory target that
needed non-temporal stores in order to catch machine check exceptions
synchronously and deal with them gracefully.

Thus that non-descriptive name: one use case was to copy from user space
into a non-cached kernel buffer.  However, the existing users are a mix
of that intended use-case, and a couple of random drivers that just did
this as a performance tweak.

Some of those random drivers then actively misused the user copying
version (with STAC/CLAC and all) to do kernel copies without ever even
caring about the exception handling, _just_ for the non-temporal
destination.

Rename it as a first small step to actually make it halfway sane, and
change the prototype to be more normal: it doesn't take a user pointer
unless the caller has done the proper conversion, and the argument size
is the full size_t (it still won't actually copy more than 4GB in one
go, but there's also no reason to silently truncate the size argument in
the caller).

Finally, use this now sanely named function in the NTB code, which
mis-used a user copy version (with STAC/CLAC and all) of this interface
despite it not actually being a user copy at all.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2026-02-12T19:32:37Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-12T19:32:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4cff5c05e076d2ee4e34122aa956b84a2eaac587'/>
<id>urn:sha1:4cff5c05e076d2ee4e34122aa956b84a2eaac587</id>
<content type='text'>
Pull MM updates from Andrew Morton:

 - "powerpc/64s: do not re-activate batched TLB flush" makes
   arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)

   It adds a generic enter/leave layer and switches architectures to use
   it. Various hacks were removed in the process.

 - "zram: introduce compressed data writeback" implements data
   compression for zram writeback (Richard Chang and Sergey Senozhatsky)

 - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
   page ranges for hugepages. Large improvements during demand faulting
   are demonstrated (David Hildenbrand)

 - "memcg cleanups" tidies up some memcg code (Chen Ridong)

 - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
   stats" improves DAMOS stat's provided information, deterministic
   control, and readability (SeongJae Park)

 - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
   issues in the hugetlb cgroup charging selftests (Li Wang)

 - "Fix va_high_addr_switch.sh test failure - again" addresses several
   issues in the va_high_addr_switch test (Chunyu Hu)

 - "mm/damon/tests/core-kunit: extend existing test scenarios" improves
   the KUnit test coverage for DAMON (Shu Anzai)

 - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
   glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
   transiently return -EAGAIN (Shivank Garg)

 - "arch, mm: consolidate hugetlb early reservation" reworks and
   consolidates a pile of straggly code related to reservation of
   hugetlb memory from bootmem and creation of CMA areas for hugetlb
   (Mike Rapoport)

 - "mm: clean up anon_vma implementation" cleans up the anon_vma
   implementation in various ways (Lorenzo Stoakes)

 - "tweaks for __alloc_pages_slowpath()" does a little streamlining of
   the page allocator's slowpath code (Vlastimil Babka)

 - "memcg: separate private and public ID namespaces" cleans up the
   memcg ID code and prevents the internal-only private IDs from being
   exposed to userspace (Shakeel Butt)

 - "mm: hugetlb: allocate frozen gigantic folio" cleans up the
   allocation of frozen folios and avoids some atomic refcount
   operations (Kefeng Wang)

 - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
   of memory betewwn the active and inactive LRUs and adds auto-tuning
   of the ratio-based quotas and of monitoring intervals (SeongJae Park)

 - "Support page table check on PowerPC" makes
   CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)

 - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
   nodes_and() and nodes_andnot() propagate the return values from the
   underlying bit operations, enabling some cleanup in calling code
   (Yury Norov)

 - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
   some DAMON internal interfaces (SeongJae Park)

 - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
   in khupaged and fixes a scan limit accounting issue (Shivank Garg)

 - "mm: balloon infrastructure cleanups" goes to town on the balloon
   infrastructure and its page migration function. Mainly cleanups, also
   some locking simplification (David Hildenbrand)

 - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
   additional tracepoints to the page reclaim code (Jiayuan Chen)

 - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
   part of Marco's kernel-wide migration from the legacy workqueue APIs
   over to the preferred unbound workqueues (Marco Crivellari)

 - "Various mm kselftests improvements/fixes" provides various unrelated
   improvements/fixes for the mm kselftests (Kevin Brodsky)

 - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
   folio allocation, mainly by avoiding unnecessary work in
   pfn_range_valid_contig() (Kefeng Wang)

 - "selftests/damon: improve leak detection and wss estimation
   reliability" improves the reliability of two of the DAMON selftests
   (SeongJae Park)

 - "mm/damon: cleanup kdamond, damon_call(), damos filter and
   DAMON_MIN_REGION" does some cleanup work in the core DAMON code
   (SeongJae Park)

 - "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
   performs maintenance work on the DAMON documentation (SeongJae Park)

 - "mm: add and use vma_assert_stabilised() helper" refactors and cleans
   up the core VMA code. The main aim here is to be able to use the mmap
   write lock's lockdep state to perform various assertions regarding
   the locking which the VMA code requires (Lorenzo Stoakes)

 - "mm, swap: swap table phase II: unify swapin use" removes some old
   swap code (swap cache bypassing and swap synchronization) which
   wasn't working very well. Various other cleanups and simplifications
   were made. The end result is a 20% speedup in one benchmark (Kairui
   Song)

 - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
   available on 64-bit alpha, loongarch, mips, parisc, and um. Various
   cleanups were performed along the way (Qi Zheng)

* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
  mm/memory: handle non-split locks correctly in zap_empty_pte_table()
  mm: move pte table reclaim code to memory.c
  mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
  mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
  um: mm: enable MMU_GATHER_RCU_TABLE_FREE
  parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
  mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
  LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
  alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
  mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
  mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
  zsmalloc: make common caches global
  mm: add SPDX id lines to some mm source files
  mm/zswap: use %pe to print error pointers
  mm/vmscan: use %pe to print error pointers
  mm/readahead: fix typo in comment
  mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
  mm: refactor vma_map_pages to use vm_insert_pages
  mm/damon: unify address range representation with damon_addr_range
  mm/cma: replace snprintf with strscpy in cma_new_area
  ...
</content>
</entry>
<entry>
<title>Merge tag 'x86_misc_for_7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2026-02-11T03:52:18Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-02-11T03:52:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=45a1b8cc6c0690d82f176ec9c3ca8ad0aa050511'/>
<id>urn:sha1:45a1b8cc6c0690d82f176ec9c3ca8ad0aa050511</id>
<content type='text'>
Pull misc x86 updates from Dave Hansen:
 "The usual smattering of x86/misc changes.

  The IPv6 patch in here surprised me in a couple of ways. First, the
  function it inlines is able to eat a lot more CPU time than I would
  have expected. Second, the inlining does not seem to bloat the kernel,
  at least in the configs folks have tested.

   - Inline x86-specific IPv6 checksum helper

   - Update IOMMU docs to use stable identifiers

   - Print unhashed pointers on fatal stack overflows"

* tag 'x86_misc_for_7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/traps: Print unhashed pointers on stack overflow
  Documentation/x86: Update IOMMU spec references to use stable identifiers
  x86/lib: Inline csum_ipv6_magic()
</content>
</entry>
<entry>
<title>x86/mm: simplify clear_page_*</title>
<updated>2026-01-21T03:24:40Z</updated>
<author>
<name>Ankur Arora</name>
<email>ankur.a.arora@oracle.com</email>
</author>
<published>2026-01-07T07:20:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=54a6b89a3db2ecb4462abcd6e6e52dfebaa7e6c4'/>
<id>urn:sha1:54a6b89a3db2ecb4462abcd6e6e52dfebaa7e6c4</id>
<content type='text'>
clear_page_rep() and clear_page_erms() are wrappers around "REP; STOS"
variations.  Inlining gets rid of an unnecessary CALL/RET (which isn't
free when using RETHUNK speculative execution mitigations.) Fixup and
rename clear_page_orig() to adapt to the changed calling convention.

Also add a comment from Dave Hansen detailing various clearing mechanisms
used in clear_page().

Link: https://lkml.kernel.org/r/20260107072009.1615991-5-ankur.a.arora@oracle.com
Signed-off-by: Ankur Arora &lt;ankur.a.arora@oracle.com&gt;
Tested-by: Raghavendra K T &lt;raghavendra.kt@amd.com&gt;
Reviewed-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: David Hildenbrand &lt;david@kernel.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Konrad Rzessutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Lance Yang &lt;ioworker0@gmail.com&gt;
Cc: "Liam R. Howlett" &lt;Liam.Howlett@oracle.com&gt;
Cc: Li Zhe &lt;lizhe.67@bytedance.com&gt;
Cc: Lorenzo Stoakes &lt;lorenzo.stoakes@oracle.com&gt;
Cc: Mateusz Guzik &lt;mjguzik@gmail.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Rapoport &lt;rppt@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86/paravirt: Remove not needed includes of paravirt.h</title>
<updated>2026-01-12T10:26:52Z</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2026-01-05T11:05:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=07f2961235ac26da12479535cf19a82cde529865'/>
<id>urn:sha1:07f2961235ac26da12479535cf19a82cde529865</id>
<content type='text'>
In some places asm/paravirt.h is included without really being needed.

Remove the related #include statements.

Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20260105110520.21356-2-jgross@suse.com
</content>
</entry>
<entry>
<title>x86/lib: Inline csum_ipv6_magic()</title>
<updated>2026-01-05T18:14:05Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-11-13T15:45:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=529676cabcf4a5046d217bba2c8f3b94a3f6a10f'/>
<id>urn:sha1:529676cabcf4a5046d217bba2c8f3b94a3f6a10f</id>
<content type='text'>
Inline this small helper. It has been observed to consume up
to 0.75%, which is significant for such a small function.

This should reduce register pressure, as saddr and daddr are often
back to back in memory.

For instance code inlined in tcp6_gro_receive() will look like:

 55a:	48 03 73 28          	add    0x28(%rbx),%rsi
 55e:	8b 43 70             	mov    0x70(%rbx),%eax
 561:	29 f8                	sub    %edi,%eax
 563:	0f c8                	bswap  %eax
 565:	89 c0                	mov    %eax,%eax
 567:	48 05 00 06 00 00    	add    $0x600,%rax
 56d:	48 03 46 08          	add    0x8(%rsi),%rax
 571:	48 13 46 10          	adc    0x10(%rsi),%rax
 575:	48 13 46 18          	adc    0x18(%rsi),%rax
 579:	48 13 46 20          	adc    0x20(%rsi),%rax
 57d:	48 83 d0 00          	adc    $0x0,%rax
 581:	48 89 c6             	mov    %rax,%rsi
 584:	48 c1 ee 20          	shr    $0x20,%rsi
 588:	01 f0                	add    %esi,%eax
 58a:	83 d0 00             	adc    $0x0,%eax
 58d:	89 c6                	mov    %eax,%esi
 58f:	66 31 c0             	xor    %ax,%ax

Surprisingly, this inlining does not seem to bloat kernel text size.
It at least two cases[1], it either has no effect or results in a
slightly smaller kernel.

1. https://lore.kernel.org/all/CANn89iJzcb_XO9oCApKYfRxsMMmg7BHukRDqWTca3ZLQ8HT0iQ@mail.gmail.com/

[ dhansen: add justification and note about lack of kernel bloat ]

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Link: https://patch.msgid.link/20251113154545.594580-1-edumazet@google.com
</content>
</entry>
<entry>
<title>Merge tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2025-12-06T19:56:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-06T19:56:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=08b8ddac1f4339fbf950df45590a032578ec35f7'/>
<id>urn:sha1:08b8ddac1f4339fbf950df45590a032578ec35f7</id>
<content type='text'>
Pull objtool fixes from Ingo Molnar:
 "Address various objtool scalability bugs/inefficiencies exposed by
  allmodconfig builds, plus improve the quality of alternatives
  instructions generated code and disassembly"

* tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  objtool: Simplify .annotate_insn code generation output some more
  objtool: Add more robust signal error handling, detect and warn about stack overflows
  objtool: Remove newlines and tabs from annotation macros
  objtool: Consolidate annotation macros
  x86/asm: Remove ANNOTATE_DATA_SPECIAL usage
  x86/alternative: Remove ANNOTATE_DATA_SPECIAL usage
  objtool: Fix stack overflow in validate_branch()
</content>
</entry>
</feed>
