<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm, branch v5.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-08-25T19:25:12Z</updated>
<entry>
<title>mm/memory_hotplug: fix potential permanent lru cache disable</title>
<updated>2021-08-25T19:25:12Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2021-08-25T19:17:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=946746d1ad921e5f493b536533dda02ea22ca609'/>
<id>urn:sha1:946746d1ad921e5f493b536533dda02ea22ca609</id>
<content type='text'>
If offline_pages() fails after lru_cache_disable(), it never calls
lru_cache_enable() in the error path, so the lru cache would be left
disabled permanently in this case.
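
A minimal sketch of the intended pairing (lru_cache_disable() and
lru_cache_enable() are the interfaces named above; the surrounding
function and do_offline_work() are illustrative stand-ins, not the
actual offline_pages() code):

	int offline_pages_sketch(void)
	{
		int ret;

		lru_cache_disable();
		ret = do_offline_work();	/* hypothetical stand-in */
		/*
		 * Re-enable the LRU cache on success and failure alike;
		 * skipping this on the error path leaves it disabled
		 * forever.
		 */
		lru_cache_enable();
		return ret;
	}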

Link: https://lkml.kernel.org/r/20210821094246.10149-3-linmiaohe@huawei.com
Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Chris Goldsworthy &lt;cgoldswo@codeaurora.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlb: don't pass page cache pages to restore_reserve_on_error</title>
<updated>2021-08-20T18:31:42Z</updated>
<author>
<name>Mike Kravetz</name>
<email>mike.kravetz@oracle.com</email>
</author>
<published>2021-08-20T02:04:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c7b1850dfb41d0b4154aca8dbc04777fbd75616f'/>
<id>urn:sha1:c7b1850dfb41d0b4154aca8dbc04777fbd75616f</id>
<content type='text'>
syzbot hit kernel BUG at fs/hugetlbfs/inode.c:532 as described in [1].
This BUG triggers if the HPageRestoreReserve flag is set on a page in
the page cache.  It should never be set, as the routine
huge_add_to_page_cache explicitly clears the flag after adding a page to
the cache.

The only code other than huge page allocation which sets the flag is
restore_reserve_on_error.  It can set the flag in rare out-of-memory
conditions.  syzbot was injecting faults to force memory allocation
errors, which exercised this specific path.

The code in restore_reserve_on_error is doing the right thing.  However,
there are instances where pages in the page cache were being passed to
restore_reserve_on_error.  This is incorrect: once a page goes into the
cache, its reservation information is not modified until the page is
removed from the cache.  Error paths do not remove pages from the cache,
so even in the error case the page remains in the cache and no
reservation adjustment is needed.

Modify routines that potentially call restore_reserve_on_error with a
page cache page to no longer do so.
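
The shape of the change at the affected call sites, as a hedged sketch
(restore_reserve_on_error() is the routine named above with its mainline
arguments; the page_was_in_cache condition is illustrative, not the
kernel's actual bookkeeping):

	/*
	 * Error path in a caller: only pages that never made it into
	 * the page cache get their reservation state restored.  Pages
	 * in the cache keep their reservation accounting untouched.
	 */
	if (!page_was_in_cache)	/* illustrative condition */
		restore_reserve_on_error(h, vma, address, page);
	put_page(page);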

Note on fixes tag: Prior to commit 846be08578ed ("mm/hugetlb: expand
restore_reserve_on_error functionality") the routine would not process
page cache pages, because the HPageRestoreReserve flag is not set on
such pages.  Therefore, this issue could not be triggered.  The code
added by commit 846be08578ed ("mm/hugetlb: expand
restore_reserve_on_error functionality") is needed and correct.  It
exposed the incorrect calls to restore_reserve_on_error which are the
root cause addressed by this commit.

[1] https://lore.kernel.org/linux-mm/00000000000050776d05c9b7c7f0@google.com/

Link: https://lkml.kernel.org/r/20210818213304.37038-1-mike.kravetz@oracle.com
Fixes: 846be08578ed ("mm/hugetlb: expand restore_reserve_on_error functionality")
Signed-off-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reported-by: &lt;syzbot+67654e51e54455f1c585@syzkaller.appspotmail.com&gt;
Cc: Mina Almasry &lt;almasrymina@google.com&gt;
Cc: Axel Rasmussen &lt;axelrasmussen@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@linux.dev&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: vmscan: fix missing psi annotation for node_reclaim()</title>
<updated>2021-08-20T18:31:42Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2021-08-20T02:04:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=57f29762cdd4687a02f245d1b1e78de046388eac'/>
<id>urn:sha1:57f29762cdd4687a02f245d1b1e78de046388eac</id>
<content type='text'>
In a debugging session the other day, Rik noticed that node_reclaim()
was missing memstall annotations.  This means we'll miss pressure and
lost productivity resulting from reclaim on an overloaded local NUMA
node when vm.zone_reclaim_mode is enabled.

There haven't been any reports, but that's likely because
vm.zone_reclaim_mode hasn't been a commonly used feature recently, and
the intersection between such setups and psi users is probably nil.

But secondary memory such as CXL-connected DIMMs, persistent memory,
etc., and the page demotion patches that handle them
(https://lore.kernel.org/lkml/20210401183216.443C4443@viggo.jf.intel.com/)
could soon make this a more common codepath again.
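
The fix amounts to bracketing the reclaim work with the psi memstall
annotation, roughly as below (psi_memstall_enter()/psi_memstall_leave()
are the real annotation API; do_node_reclaim() is an illustrative
stand-in for the body of node reclaim):

	unsigned long node_reclaim_sketch(void)
	{
		unsigned long pflags, reclaimed;

		/* Account this task as stalled on memory for the
		 * duration of direct node reclaim. */
		psi_memstall_enter(&amp;pflags);
		reclaimed = do_node_reclaim();	/* illustrative stand-in */
		psi_memstall_leave(&amp;pflags);
		return reclaimed;
	}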

Link: https://lkml.kernel.org/r/20210818152457.35846-1-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Rik van Riel &lt;riel@surriel.com&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hwpoison: retry with shake_page() for unhandlable pages</title>
<updated>2021-08-20T18:31:42Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>naoya.horiguchi@nec.com</email>
</author>
<published>2021-08-20T02:04:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fcc00621d88b274b5dffd8daeea71d0e4c28b84e'/>
<id>urn:sha1:fcc00621d88b274b5dffd8daeea71d0e4c28b84e</id>
<content type='text'>
HWPoisonHandlable() sometimes returns false for typical user pages due
to races with routine memory events such as transfers between LRU
lists.  This causes failures in hwpoison handling.

There is retry code for such a case, but it does not work, because the
retry loop reaches the retry limit too quickly, before the page settles
down to a handlable state.  Let get_any_page() call shake_page() to fix
it.
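
Roughly, the retry loop becomes the following (the retry bound and the
handlable predicate are illustrative; shake_page()'s second argument
follows the v5.14-era signature):

	for (try = 0; try &lt; MAX_TRIES; try++) {
		if (page_is_handlable(page))	/* illustrative predicate */
			return 0;
		/* Give transient states (e.g. pages sitting in LRU
		 * pagevecs) a chance to settle before the next try. */
		shake_page(page, 1);
	}
	return -EIO;	/* retry limit reached, per the follow-up note */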

[naoya.horiguchi@nec.com: get_any_page(): return -EIO when retry limit reached]
  Link: https://lkml.kernel.org/r/20210819001958.2365157-1-naoya.horiguchi@linux.dev

Link: https://lkml.kernel.org/r/20210817053703.2267588-1-naoya.horiguchi@linux.dev
Fixes: 25182f05ffed ("mm,hwpoison: fix race with hugetlb page allocation")
Signed-off-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Reported-by: Tony Luck &lt;tony.luck@intel.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;		[5.13+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memcontrol: fix occasional OOMs due to proportional memory.low reclaim</title>
<updated>2021-08-20T18:31:42Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2021-08-20T02:04:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f56ce412a59d7d938b81de8878faef128812482c'/>
<id>urn:sha1:f56ce412a59d7d938b81de8878faef128812482c</id>
<content type='text'>
We've noticed occasional OOM killing when memory.low settings are in
effect for cgroups.  This is unexpected and undesirable as memory.low is
supposed to express non-OOMing memory priorities between cgroups.

The reason for this is proportional memory.low reclaim.  When cgroups
are below their memory.low threshold, reclaim passes them over in the
first round, and then retries if it couldn't find pages anywhere else.
But when cgroups are slightly above their memory.low setting, scan
pressure is scaled down in proportion to the overage, to the point
where it can cause reclaim to fail as well.  In that case, however, we
currently don't retry, and instead trigger OOM.

To fix this, hook proportional reclaim into the same retry logic we have
in place for when cgroups are skipped entirely.  This way if reclaim
fails and some cgroups were scanned with diminished pressure, we'll try
another full-force cycle before giving up and OOMing.
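
Schematically, the retry logic this hooks into looks as follows (sc is
the scan_control; memcg_low_skipped and memcg_low_reclaim follow the
mainline field names, but the control flow here is condensed).  The fix
makes proportionally diminished scans set memcg_low_skipped too, so they
now reach this path:

retry:
	nr_reclaimed = shrink_zones(zonelist, sc);
	if (!nr_reclaimed &amp;&amp; sc-&gt;memcg_low_skipped) {
		/*
		 * Some cgroups were skipped or scanned with reduced
		 * pressure for memory.low; retry once at full force
		 * before giving up and triggering OOM.
		 */
		sc-&gt;memcg_low_reclaim = 1;
		sc-&gt;memcg_low_skipped = 0;
		goto retry;
	}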

[akpm@linux-foundation.org: coding-style fixes]

Link: https://lkml.kernel.org/r/20210817180506.220056-1-hannes@cmpxchg.org
Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Leon Yang &lt;lnyng@fb.com&gt;
Reviewed-by: Rik van Riel &lt;riel@surriel.com&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Acked-by: Chris Down &lt;chris@chrisdown.name&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;		[5.4+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc: don't corrupt pcppage_migratetype</title>
<updated>2021-08-20T18:31:42Z</updated>
<author>
<name>Doug Berger</name>
<email>opendmb@gmail.com</email>
</author>
<published>2021-08-20T02:04:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=47aef6010b837657e1433021bfdeeee7a26a174c'/>
<id>urn:sha1:47aef6010b837657e1433021bfdeeee7a26a174c</id>
<content type='text'>
When placing pages on a pcp list, migratetype values over
MIGRATE_PCPTYPES get added to the MIGRATE_MOVABLE pcp list.

However, the actual migratetype is preserved in the page and should
not be changed to MIGRATE_MOVABLE or the page may end up on the wrong
free_list.

The impact is that HIGHATOMIC or CMA pages getting bulk freed from the
PCP lists could potentially end up on the wrong buddy list.  There are
various consequences, but at a minimum NR_FREE_CMA_PAGES accounting
could get screwed up.
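
The essence, as a sketch (get_pcppage_migratetype() and MIGRATE_PCPTYPES
follow mm/page_alloc.c naming; the surrounding free path is condensed):

	migratetype = pcpmigratetype = get_pcppage_migratetype(page);
	if (pcpmigratetype &gt;= MIGRATE_PCPTYPES)
		pcpmigratetype = MIGRATE_MOVABLE;	/* pcp list choice only */
	/*
	 * Only the pcp list index is clamped; the migratetype stored
	 * in the page is left intact, so a later bulk free can place
	 * the page on the correct buddy free_list.
	 */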

[mgorman@techsingularity.net: changelog update]

Link: https://lkml.kernel.org/r/20210811182917.2607994-1-opendmb@gmail.com
Fixes: df1acc856923 ("mm/page_alloc: avoid conflating IRQs disabled with zone-&gt;lock")
Signed-off-by: Doug Berger &lt;opendmb@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: "Peter Zijlstra (Intel)" &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm: swap: check if swap backing device is congested or not"</title>
<updated>2021-08-20T18:31:42Z</updated>
<author>
<name>Yang Shi</name>
<email>shy828301@gmail.com</email>
</author>
<published>2021-08-20T02:04:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c04b3d06904368b71ab9e09336ecfc91f4009bc9'/>
<id>urn:sha1:c04b3d06904368b71ab9e09336ecfc91f4009bc9</id>
<content type='text'>
Due to the change in how the block layer detects congestion, the
justification of commit 8fd2e0b505d1 ("mm: swap: check if swap backing
device is congested or not") no longer stands, so that commit can
simply be reverted in order to solve the race reported by commit
2efa33fc7f6e ("mm/shmem: fix shmem_swapin() race with swapoff").  The
fix was reverted by the previous patch.

Link: https://lkml.kernel.org/r/20210810202936.2672-3-shy828301@gmail.com
Signed-off-by: Yang Shi &lt;shy828301@gmail.com&gt;
Suggested-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm/shmem: fix shmem_swapin() race with swapoff"</title>
<updated>2021-08-20T18:31:41Z</updated>
<author>
<name>Yang Shi</name>
<email>shy828301@gmail.com</email>
</author>
<published>2021-08-20T02:04:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b1e1ef345433fb03742003677ddfb980d148092b'/>
<id>urn:sha1:b1e1ef345433fb03742003677ddfb980d148092b</id>
<content type='text'>
Due to the change in how the block layer detects congestion, the
justification of commit 8fd2e0b505d1 ("mm: swap: check if swap backing
device is congested or not") no longer stands, and that commit can
simply be reverted in order to solve the race reported by commit
2efa33fc7f6e ("mm/shmem: fix shmem_swapin() race with swapoff"); the
fix commit can therefore be reverted as well.

That fix is also somewhat buggy in its own right, as discussed in [1]
and [2].

[1] https://lore.kernel.org/linux-mm/24187e5e-069-9f3f-cefe-39ac70783753@google.com/
[2] https://lore.kernel.org/linux-mm/e82380b9-3ad4-4a52-be50-6d45c7f2b5da@google.com/

Link: https://lkml.kernel.org/r/20210810202936.2672-2-shy828301@gmail.com
Signed-off-by: Yang Shi &lt;shy828301@gmail.com&gt;
Suggested-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memcg: fix incorrect flushing of lruvec data in obj_stock</title>
<updated>2021-08-14T00:09:32Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2021-08-13T23:54:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7fa0dacbaf1259fd3d1dda6d602fdd084dea9c0e'/>
<id>urn:sha1:7fa0dacbaf1259fd3d1dda6d602fdd084dea9c0e</id>
<content type='text'>
When mod_objcg_state() is called with a pgdat that is different from
the one cached in the obj_stock, the old lruvec data cached in the
obj_stock are flushed out.  Unfortunately, they are flushed to the new
pgdat, so the data end up on the wrong node.  This screws up the slab
data reported in /sys/devices/system/node/node*/meminfo.

Fix that by flushing the data to the cached pgdat instead.
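
In outline (cached_pgdat is the field named in the Fixes commit;
flush_lruvec_stats() is an illustrative helper, not a kernel function):

	if (stock-&gt;cached_pgdat != pgdat) {
		/* Flush the deltas to the node they were accumulated
		 * for, i.e. the old cached_pgdat, not the new pgdat. */
		flush_lruvec_stats(stock, stock-&gt;cached_pgdat);
		stock-&gt;cached_pgdat = pgdat;
	}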

Link: https://lkml.kernel.org/r/20210802143834.30578-1-longman@redhat.com
Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Vladimir Davydov &lt;vdavydov.dev@gmail.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Alex Shi &lt;alex.shi@linux.alibaba.com&gt;
Cc: Chris Down &lt;chris@chrisdown.name&gt;
Cc: Yafang Shao &lt;laoar.shao@gmail.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Masayoshi Mizuma &lt;msys.mizuma@gmail.com&gt;
Cc: Xing Zhengjun &lt;zhengjun.xing@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/madvise: report SIGBUS as -EFAULT for MADV_POPULATE_(READ|WRITE)</title>
<updated>2021-08-14T00:09:31Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2021-08-13T23:54:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eb2faa513c246ed47ae34a205928ab663bc5a18f'/>
<id>urn:sha1:eb2faa513c246ed47ae34a205928ab663bc5a18f</id>
<content type='text'>
While doing some extended tests and polishing the man page update for
MADV_POPULATE_(READ|WRITE), I realized that we also end up converting
SIGBUS (via -EFAULT) to -EINVAL, making it look like yet another
madvise() usage error.

We want to report as -EINVAL only problematic mappings and permission
problems that the user could have known about.

Let's not convert -EFAULT arising due to SIGBUS (or SIGSEGV) to -EINVAL,
but instead indicate -EFAULT to user space.  While we could also convert
it to -ENOMEM, -EFAULT looks more helpful when user space wants to
troubleshoot what's going wrong: MADV_POPULATE_(READ|WRITE) is not yet
part of a final Linux release, so we can still adjust the behavior.
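
Sketch of the resulting error mapping in the populate loop (the case
list is abridged and illustrative; the -EFAULT comment reflects the
reasoning above):

	switch (pages) {
	case -EINTR:
	case -EAGAIN:
		continue;	/* transient, retry the range */
	case -EFAULT:	/* VM_FAULT_SIGBUS or VM_FAULT_SIGSEGV */
		return -EFAULT;	/* was folded into -EINVAL before */
	default:
		return -EINVAL;	/* genuine usage error */
	}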

Link: https://lkml.kernel.org/r/20210726154932.102880-1-david@redhat.com
Fixes: 4ca9b3859dac ("mm/madvise: introduce MADV_POPULATE_(READ|WRITE) to prefault page tables")
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Thomas Bogendoerfer &lt;tsbogend@alpha.franken.de&gt;
Cc: "James E.J. Bottomley" &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: Max Filippov &lt;jcmvbkbc@gmail.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Rolf Eike Beer &lt;eike-kernel@sf-tec.de&gt;
Cc: Ram Pai &lt;linuxram@us.ibm.com&gt;
Cc: Shuah Khan &lt;shuah@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
