<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm, branch v6.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-01-05T17:58:32Z</updated>
<entry>
<title>mm: shrinker: use kvzalloc_node() from expand_one_shrinker_info()</title>
<updated>2024-01-05T17:58:32Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2024-01-03T01:52:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7fba9420b726561966e1671004df60a08b39beb3'/>
<id>urn:sha1:7fba9420b726561966e1671004df60a08b39beb3</id>
<content type='text'>
syzbot is reporting uninit-value at shrinker_alloc(), for commit
307bececcd12 ("mm: shrinker: add a secondary array for
shrinker_info::{map, nr_deferred}") which assumed that the -&gt;unit was
allocated with __GFP_ZERO forgot to replace kvmalloc_node() in
expand_one_shrinker_info() with kvzalloc_node().

Link: https://lkml.kernel.org/r/9226cc0a-10e0-4489-80c5-58c3b5b4359c@I-love.SAKURA.ne.jp
Reported-by: syzbot &lt;syzbot+1e0ed05798af62917464@syzkaller.appspotmail.com&gt;
Closes: https://syzkaller.appspot.com/bug?extid=1e0ed05798af62917464
Fixes: 307bececcd12 ("mm: shrinker: add a secondary array for shrinker_info::{map, nr_deferred}")
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/mglru: skip special VMAs in lru_gen_look_around()</title>
<updated>2023-12-29T19:06:48Z</updated>
<author>
<name>Yu Zhao</name>
<email>yuzhao@google.com</email>
</author>
<published>2023-12-23T04:56:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c28ac3c7eb945fee6e20f47d576af68fdff1392a'/>
<id>urn:sha1:c28ac3c7eb945fee6e20f47d576af68fdff1392a</id>
<content type='text'>
Special VMAs like VM_PFNMAP can contain anon pages from COW.  There isn't
much profit in doing lookaround on them.  Besides, they can trigger the
pte_special() warning in get_pte_pfn().

Skip them in lru_gen_look_around().

Link: https://lkml.kernel.org/r/20231223045647.1566043-1-yuzhao@google.com
Fixes: 018ee47f1489 ("mm: multi-gen LRU: exploit locality in rmap")
Signed-off-by: Yu Zhao &lt;yuzhao@google.com&gt;
Reported-by: syzbot+03fd9b3f71641f0ebf2d@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/000000000000f9ff00060d14c256@google.com/
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix unmap_mapping_range high bits shift bug</title>
<updated>2023-12-29T19:06:48Z</updated>
<author>
<name>Jiajun Xie</name>
<email>jiajun.xie.sh@gmail.com</email>
</author>
<published>2023-12-20T05:28:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9eab0421fa94a3dde0d1f7e36ab3294fc306c99d'/>
<id>urn:sha1:9eab0421fa94a3dde0d1f7e36ab3294fc306c99d</id>
<content type='text'>
The bug happens when highest bit of holebegin is 1, suppose holebegin is
0x8000000111111000, after shift, hba would be 0xfff8000000111111, then
vma_interval_tree_foreach would look it up fail or leads to the wrong
result.

error call seq e.g.:
- mmap(..., offset=0x8000000111111000)
  |- syscall(mmap, ... unsigned long, off):
     |- ksys_mmap_pgoff( ... , off &gt;&gt; PAGE_SHIFT);

  here pgoff is correctly shifted to 0x8000000111111,
  but pass 0x8000000111111000 as holebegin to unmap
  would then cause terrible result, as shown below:

- unmap_mapping_range(..., loff_t const holebegin)
  |- pgoff_t hba = holebegin &gt;&gt; PAGE_SHIFT;
          /* hba = 0xfff8000000111111 unexpectedly */

The issue happens in Heterogeneous computing, where the device(e.g. 
gpu) and host share the same virtual address space.

A simple workflow pattern which hit the issue is:
        /* host */
    1. userspace first mmap a file backed VA range with specified offset.
                        e.g. (offset=0x800..., mmap return: va_a)
    2. write some data to the corresponding sys page
                         e.g. (va_a = 0xAABB)
        /* device */
    3. gpu workload touches VA, triggers gpu fault and notify the host.
        /* host */
    4. reviced gpu fault notification, then it will:
            4.1 unmap host pages and also takes care of cpu tlb
                  (use unmap_mapping_range with offset=0x800...)
            4.2 migrate sys page to device
            4.3 setup device page table and resolve device fault.
        /* device */
    5. gpu workload continued, it accessed va_a and got 0xAABB.
    6. gpu workload continued, it wrote 0xBBCC to va_a.
        /* host */
    7. userspace access va_a, as expected, it will:
            7.1 trigger cpu vm fault.
            7.2 driver handling fault to migrate gpu local page to host.
    8. userspace then could correctly get 0xBBCC from va_a
    9. done

But in step 4.1, if we hit the bug this patch mentioned, then userspace
would never trigger cpu fault, and still get the old value: 0xAABB.

Making holebegin unsigned first fixes the bug.

Link: https://lkml.kernel.org/r/20231220052839.26970-1-jiajun.xie.sh@gmail.com
Signed-off-by: Jiajun Xie &lt;jiajun.xie.sh@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memcg: fix split queue list crash when large folio migration</title>
<updated>2023-12-29T19:06:47Z</updated>
<author>
<name>Baolin Wang</name>
<email>baolin.wang@linux.alibaba.com</email>
</author>
<published>2023-12-20T06:51:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9bcef5973e31020e5aa8571eb994d67b77318356'/>
<id>urn:sha1:9bcef5973e31020e5aa8571eb994d67b77318356</id>
<content type='text'>
When running autonuma with enabling multi-size THP, I encountered the
following kernel crash issue:

[  134.290216] list_del corruption. prev-&gt;next should be fffff9ad42e1c490,
but was dead000000000100. (prev=fffff9ad42399890)
[  134.290877] kernel BUG at lib/list_debug.c:62!
[  134.291052] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  134.291210] CPU: 56 PID: 8037 Comm: numa01 Kdump: loaded Tainted:
G            E      6.7.0-rc4+ #20
[  134.291649] RIP: 0010:__list_del_entry_valid_or_report+0x97/0xb0
......
[  134.294252] Call Trace:
[  134.294362]  &lt;TASK&gt;
[  134.294440]  ? die+0x33/0x90
[  134.294561]  ? do_trap+0xe0/0x110
......
[  134.295681]  ? __list_del_entry_valid_or_report+0x97/0xb0
[  134.295842]  folio_undo_large_rmappable+0x99/0x100
[  134.296003]  destroy_large_folio+0x68/0x70
[  134.296172]  migrate_folio_move+0x12e/0x260
[  134.296264]  ? __pfx_remove_migration_pte+0x10/0x10
[  134.296389]  migrate_pages_batch+0x495/0x6b0
[  134.296523]  migrate_pages+0x1d0/0x500
[  134.296646]  ? __pfx_alloc_misplaced_dst_folio+0x10/0x10
[  134.296799]  migrate_misplaced_folio+0x12d/0x2b0
[  134.296953]  do_numa_page+0x1f4/0x570
[  134.297121]  __handle_mm_fault+0x2b0/0x6c0
[  134.297254]  handle_mm_fault+0x107/0x270
[  134.300897]  do_user_addr_fault+0x167/0x680
[  134.304561]  exc_page_fault+0x65/0x140
[  134.307919]  asm_exc_page_fault+0x22/0x30

The reason for the crash is that, the commit 85ce2c517ade ("memcontrol:
only transfer the memcg data for migration") removed the charging and
uncharging operations of the migration folios and cleared the memcg data
of the old folio.

During the subsequent release process of the old large folio in
destroy_large_folio(), if the large folio needs to be removed from the
split queue, an incorrect split queue can be obtained (which is
pgdat-&gt;deferred_split_queue) because the old folio's memcg is NULL now. 
This can lead to list operations being performed under the wrong split
queue lock protection, resulting in a list crash as above.

After the migration, the old folio is going to be freed, so we can remove
it from the split queue in mem_cgroup_migrate() a bit earlier before
clearing the memcg data to avoid getting incorrect split queue.

[akpm@linux-foundation.org: fix comment, per Zi Yan]
Link: https://lkml.kernel.org/r/61273e5e9b490682388377c20f52d19de4a80460.1703054559.git.baolin.wang@linux.alibaba.com
Fixes: 85ce2c517ade ("memcontrol: only transfer the memcg data for migration")
Signed-off-by: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Reviewed-by: Nhat Pham &lt;nphamcs@gmail.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Reviewed-by: Zi Yan &lt;ziy@nvidia.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Muchun Song &lt;muchun.song@linux.dev&gt;
Cc: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix arithmetic for max_prop_frac when setting max_ratio</title>
<updated>2023-12-29T19:06:47Z</updated>
<author>
<name>Jingbo Xu</name>
<email>jefflexu@linux.alibaba.com</email>
</author>
<published>2023-12-19T14:25:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fa151a39a6879144b587f35c0dfcc15e1be9450f'/>
<id>urn:sha1:fa151a39a6879144b587f35c0dfcc15e1be9450f</id>
<content type='text'>
Since now bdi-&gt;max_ratio is part per million, fix the wrong arithmetic for
max_prop_frac when setting max_ratio.  Otherwise the miscalculated
max_prop_frac will affect the incrementing of writeout completion count
when max_ratio is not 100%.

Link: https://lkml.kernel.org/r/20231219142508.86265-3-jefflexu@linux.alibaba.com
Fixes: efc3e6ad53ea ("mm: split off __bdi_set_max_ratio() function")
Signed-off-by: Jingbo Xu &lt;jefflexu@linux.alibaba.com&gt;
Cc: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Stefan Roesch &lt;shr@devkernel.io&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix arithmetic for bdi min_ratio</title>
<updated>2023-12-29T19:06:47Z</updated>
<author>
<name>Jingbo Xu</name>
<email>jefflexu@linux.alibaba.com</email>
</author>
<published>2023-12-19T14:25:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e0646b7590084a5bf3b056d3ad871d9379d2c25a'/>
<id>urn:sha1:e0646b7590084a5bf3b056d3ad871d9379d2c25a</id>
<content type='text'>
Since now bdi-&gt;min_ratio is part per million, fix the wrong arithmetic. 
Otherwise it will fail with -EINVAL when setting a reasonable min_ratio,
as it tries to set min_ratio to (min_ratio * BDI_RATIO_SCALE) in
percentage unit, which exceeds 100% anyway.

    # cat /sys/class/bdi/253\:0/min_ratio
    0
    # cat /sys/class/bdi/253\:0/max_ratio
    100
    # echo 1 &gt; /sys/class/bdi/253\:0/min_ratio
    -bash: echo: write error: Invalid argument

Link: https://lkml.kernel.org/r/20231219142508.86265-2-jefflexu@linux.alibaba.com
Fixes: 8021fb3232f2 ("mm: split off __bdi_set_min_ratio() function")
Signed-off-by: Jingbo Xu &lt;jefflexu@linux.alibaba.com&gt;
Reported-by: Joseph Qi &lt;joseph.qi@linux.alibaba.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Stefan Roesch &lt;shr@devkernel.io&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: align larger anonymous mappings on THP boundaries</title>
<updated>2023-12-29T19:06:47Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@surriel.com</email>
</author>
<published>2023-12-14T22:34:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=efa7df3e3bb5da8e6abbe37727417f32a37fba47'/>
<id>urn:sha1:efa7df3e3bb5da8e6abbe37727417f32a37fba47</id>
<content type='text'>
Align larger anonymous memory mappings on THP boundaries by going through
thp_get_unmapped_area if THPs are enabled for the current process.

With this patch, larger anonymous mappings are now THP aligned.  When a
malloc library allocates a 2MB or larger arena, that arena can now be
mapped with THPs right from the start, which can result in better TLB hit
rates and execution time.

Link: https://lkml.kernel.org/r/20220809142457.4751229f@imladris.surriel.com
Link: https://lkml.kernel.org/r/20231214223423.1133074-1-yang@os.amperecomputing.com
Signed-off-by: Rik van Riel &lt;riel@surriel.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Christopher Lameter &lt;cl@linux.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure: cast index to loff_t before shifting it</title>
<updated>2023-12-20T21:46:20Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2023-12-18T13:58:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=39ebd6dce62d8cfe3864e16148927a139f11bc9a'/>
<id>urn:sha1:39ebd6dce62d8cfe3864e16148927a139f11bc9a</id>
<content type='text'>
On 32-bit systems, we'll lose the top bits of index because arithmetic
will be performed in unsigned long instead of unsigned long long.  This
affects files over 4GB in size.

Link: https://lkml.kernel.org/r/20231218135837.3310403-4-willy@infradead.org
Fixes: 6100e34b2526 ("mm, memory_failure: Teach memory_failure() about dev_pagemap pages")
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure: check the mapcount of the precise page</title>
<updated>2023-12-20T21:46:19Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2023-12-18T13:58:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c79c5a0a00a9457718056b588f312baadf44e471'/>
<id>urn:sha1:c79c5a0a00a9457718056b588f312baadf44e471</id>
<content type='text'>
A process may map only some of the pages in a folio, and might be missed
if it maps the poisoned page but not the head page.  Or it might be
unnecessarily hit if it maps the head page, but not the poisoned page.

Link: https://lkml.kernel.org/r/20231218135837.3310403-3-willy@infradead.org
Fixes: 7af446a841a2 ("HWPOISON, hugetlb: enable error handling path for hugepage")
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure: pass the folio and the page to collect_procs()</title>
<updated>2023-12-20T21:46:19Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2023-12-18T13:58:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=376907f3a0b34a17e80417825f8cc1c40fcba81b'/>
<id>urn:sha1:376907f3a0b34a17e80417825f8cc1c40fcba81b</id>
<content type='text'>
Patch series "Three memory-failure fixes".

I've been looking at the memory-failure code and I believe I have found
three bugs that need fixing -- one going all the way back to 2010!  I'll
have more patches later to use folios more extensively but didn't want
these bugfixes to get caught up in that.


This patch (of 3):

Both collect_procs_anon() and collect_procs_file() iterate over the VMA
interval trees looking for a single pgoff, so it is wrong to look for the
pgoff of the head page as is currently done.  However, it is also wrong to
look at page-&gt;mapping of the precise page as this is invalid for tail
pages.  Clear up the confusion by passing both the folio and the precise
page to collect_procs().

Link: https://lkml.kernel.org/r/20231218135837.3310403-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20231218135837.3310403-2-willy@infradead.org
Fixes: 415c64c1453a ("mm/memory-failure: split thp earlier in memory error handling")
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
