<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/memory_hotplug.c, branch v5.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-08-25T19:25:12Z</updated>
<entry>
<title>mm/memory_hotplug: fix potential permanent lru cache disable</title>
<updated>2021-08-25T19:25:12Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2021-08-25T19:17:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=946746d1ad921e5f493b536533dda02ea22ca609'/>
<id>urn:sha1:946746d1ad921e5f493b536533dda02ea22ca609</id>
<content type='text'>
If offline_pages failed after lru_cache_disable(), it forgot to do
lru_cache_enable() in error path.  So we would have lru cache disabled
permanently in this case.

Link: https://lkml.kernel.org/r/20210821094246.10149-3-linmiaohe@huawei.com
Fixes: d479960e44f2 ("mm: disable LRU pagevec during the migration temporarily")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Chris Goldsworthy &lt;cgoldswo@codeaurora.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: fix kerneldoc comment for __remove_memory</title>
<updated>2021-07-01T18:06:02Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2021-07-01T01:53:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5640c9ca7ed2e54628938f9d505c969b48e3fa67'/>
<id>urn:sha1:5640c9ca7ed2e54628938f9d505c969b48e3fa67</id>
<content type='text'>
make W=1 generates the following warning for __remove_memory

  mm/memory_hotplug.c:2044: warning: expecting prototype for remove_memory(). Prototype was for __remove_memory() instead

Commit eca499ab3749 ("mm/hotplug: make remove_memory() interface usable")
introduced the kerneldoc comment and function but the kerneldoc name and
function name did not match.

Link: https://lkml.kernel.org/r/20210520084809.8576-9-mgorman@techsingularity.net
Fixes: eca499ab3749 ("mm/hotplug: make remove_memory() interface usable")
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: fix kerneldoc comment for __try_online_node</title>
<updated>2021-07-01T18:06:02Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2021-07-01T01:53:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ba2d26660d0e13b3465917022aca78d49e259b59'/>
<id>urn:sha1:ba2d26660d0e13b3465917022aca78d49e259b59</id>
<content type='text'>
make W=1 generates the following warning for try_online_node

mm/memory_hotplug.c:1087: warning: expecting prototype for try_online_node(). Prototype was for __try_online_node() instead

Commit b9ff036082cd ("mm/memory_hotplug.c: make add_memory_resource use
__try_online_node") renamed the function but did not update the associated
kerneldoc.  The function is static and somewhat specialised in nature so
it's not clear it warrants being a kerneldoc by moving the comment to
try_online_node.  Hence, leave the comment of the internal helper in place
but leave it out of kerneldoc and correct the function name in the
comment.

Link: https://lkml.kernel.org/r/20210520084809.8576-8-mgorman@techsingularity.net
Fixes: Commit b9ff036082cd ("mm/memory_hotplug.c: make add_memory_resource use __try_online_node")
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix spelling mistakes</title>
<updated>2021-07-01T18:06:02Z</updated>
<author>
<name>Zhen Lei</name>
<email>thunder.leizhen@huawei.com</email>
</author>
<published>2021-07-01T01:53:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=041711ce7cdf023f53d76f64d82b75210248e18d'/>
<id>urn:sha1:041711ce7cdf023f53d76f64d82b75210248e18d</id>
<content type='text'>
Fix some spelling mistakes in comments:
each having differents usage ==&gt; each has a different usage
statments ==&gt; statements
adresses ==&gt; addresses
aggresive ==&gt; aggressive
datas ==&gt; data
posion ==&gt; poison
higer ==&gt; higher
precisly ==&gt; precisely
wont ==&gt; won't
We moves tha ==&gt; We move the
endianess ==&gt; endianness

Link: https://lkml.kernel.org/r/20210519065853.7723-2-thunder.leizhen@huawei.com
Signed-off-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Reviewed-by: Souptick Joarder &lt;jrdr.linux@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm,memory_hotplug: drop unneeded locking</title>
<updated>2021-07-01T03:47:31Z</updated>
<author>
<name>Oscar Salvador</name>
<email>osalvador@suse.de</email>
</author>
<published>2021-07-01T01:52:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=27cacaad16c549ce5dd30ae84100b7e680536822'/>
<id>urn:sha1:27cacaad16c549ce5dd30ae84100b7e680536822</id>
<content type='text'>
Currently, memory-hotplug code takes zone's span_writelock and pgdat's
resize_lock when resizing the node/zone's spanned pages via
{move_pfn_range_to_zone(),remove_pfn_range_from_zone()} and when resizing
node and zone's present pages via adjust_present_page_count().

These locks are also taken during the initialization of the system at boot
time, where it protects parallel struct page initialization, but they
should not really be needed in memory-hotplug where all operations are a)
synchronized on device level and b) serialized by the mem_hotplug_lock
lock.

[akpm@linux-foundation.org: remove now-unused locals]

Link: https://lkml.kernel.org/r/20210531093958.15021-1-osalvador@suse.de
Signed-off-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: rate limit page migration warnings</title>
<updated>2021-07-01T03:47:31Z</updated>
<author>
<name>Liam Mark</name>
<email>lmark@codeaurora.org</email>
</author>
<published>2021-07-01T01:52:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=786dee864804f8e851cf0f258df2ccbb4ee03d80'/>
<id>urn:sha1:786dee864804f8e851cf0f258df2ccbb4ee03d80</id>
<content type='text'>
When offlining memory the system can attempt to migrate a lot of pages, if
there are problems with migration this can flood the logs.  Printing all
the data hogs the CPU and cause some RT threads to run for a long time,
which may have some bad consequences.

Rate limit the page migration warnings in order to avoid this.

Link: https://lkml.kernel.org/r/20210505140542.24935-1-georgi.djakov@linaro.org
Signed-off-by: Liam Mark &lt;lmark@codeaurora.org&gt;
Signed-off-by: Georgi Djakov &lt;georgi.djakov@linaro.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: sparsemem: use huge PMD mapping for vmemmap pages</title>
<updated>2021-07-01T03:47:26Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-07-01T01:48:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d7a21715f25122779e2bed17db8c57aa01e922f'/>
<id>urn:sha1:2d7a21715f25122779e2bed17db8c57aa01e922f</id>
<content type='text'>
The preparation of splitting huge PMD mapping of vmemmap pages is ready,
so switch the mapping from PTE to PMD.

Link: https://lkml.kernel.org/r/20210616094915.34432-3-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Chen Huang &lt;chenhuang5@huawei.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memory_hotplug: disable memmap_on_memory when hugetlb_free_vmemmap enabled</title>
<updated>2021-07-01T03:47:25Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-07-01T01:47:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4bab4964a59f277915285787c828b810151de7a1'/>
<id>urn:sha1:4bab4964a59f277915285787c828b810151de7a1</id>
<content type='text'>
The parameter of memory_hotplug.memmap_on_memory is not compatible with
hugetlb_free_vmemmap.  So disable it when hugetlb_free_vmemmap is enabled.

[akpm@linux-foundation.org: remove unneeded include, per Oscar]

Link: https://lkml.kernel.org/r/20210510030027.56044-9-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Barry Song &lt;song.bao.hua@hisilicon.com&gt;
Cc: Bodeddula Balasubramaniam &lt;bodeddub@amazon.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Chen Huang &lt;chenhuang5@huawei.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: HORIGUCHI NAOYA &lt;naoya.horiguchi@nec.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joao Martins &lt;joao.m.martins@oracle.com&gt;
Cc: Joerg Roedel &lt;jroedel@suse.de&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mina Almasry &lt;almasrymina@google.com&gt;
Cc: Oliver Neukum &lt;oneukum@suse.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: memory_hotplug: factor out bootmem core functions to bootmem_info.c</title>
<updated>2021-07-01T03:47:25Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2021-07-01T01:47:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=426e5c429d16e4cd5ded46e21ff8e939bf8abd0f'/>
<id>urn:sha1:426e5c429d16e4cd5ded46e21ff8e939bf8abd0f</id>
<content type='text'>
Patch series "Free some vmemmap pages of HugeTLB page", v23.

This patch series will free some vmemmap pages(struct page structures)
associated with each HugeTLB page when preallocated to save memory.

In order to reduce the difficulty of the first version of code review.  In
this version, we disable PMD/huge page mapping of vmemmap if this feature
was enabled.  This acutely eliminates a bunch of the complex code doing
page table manipulation.  When this patch series is solid, we cam add the
code of vmemmap page table manipulation in the future.

The struct page structures (page structs) are used to describe a physical
page frame.  By default, there is an one-to-one mapping from a page frame
to it's corresponding page struct.

The HugeTLB pages consist of multiple base page size pages and is
supported by many architectures.  See hugetlbpage.rst in the Documentation
directory for more details.  On the x86 architecture, HugeTLB pages of
size 2MB and 1GB are currently supported.  Since the base page size on x86
is 4KB, a 2MB HugeTLB page consists of 512 base pages and a 1GB HugeTLB
page consists of 4096 base pages.  For each base page, there is a
corresponding page struct.

Within the HugeTLB subsystem, only the first 4 page structs are used to
contain unique information about a HugeTLB page.  HUGETLB_CGROUP_MIN_ORDER
provides this upper limit.  The only 'useful' information in the remaining
page structs is the compound_head field, and this field is the same for
all tail pages.

By removing redundant page structs for HugeTLB pages, memory can returned
to the buddy allocator for other uses.

When the system boot up, every 2M HugeTLB has 512 struct page structs which
size is 8 pages(sizeof(struct page) * 512 / PAGE_SIZE).

    HugeTLB                  struct pages(8 pages)         page frame(8 pages)
 +-----------+ ---virt_to_page---&gt; +-----------+   mapping to   +-----------+
 |           |                     |     0     | -------------&gt; |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | -------------&gt; |     1     |
 |           |                     +-----------+                +-----------+
 |           |                     |     2     | -------------&gt; |     2     |
 |           |                     +-----------+                +-----------+
 |           |                     |     3     | -------------&gt; |     3     |
 |           |                     +-----------+                +-----------+
 |           |                     |     4     | -------------&gt; |     4     |
 |    2MB    |                     +-----------+                +-----------+
 |           |                     |     5     | -------------&gt; |     5     |
 |           |                     +-----------+                +-----------+
 |           |                     |     6     | -------------&gt; |     6     |
 |           |                     +-----------+                +-----------+
 |           |                     |     7     | -------------&gt; |     7     |
 |           |                     +-----------+                +-----------+
 |           |
 |           |
 |           |
 +-----------+

The value of page-&gt;compound_head is the same for all tail pages.  The
first page of page structs (page 0) associated with the HugeTLB page
contains the 4 page structs necessary to describe the HugeTLB.  The only
use of the remaining pages of page structs (page 1 to page 7) is to point
to page-&gt;compound_head.  Therefore, we can remap pages 2 to 7 to page 1.
Only 2 pages of page structs will be used for each HugeTLB page.  This
will allow us to free the remaining 6 pages to the buddy allocator.

Here is how things look after remapping.

    HugeTLB                  struct pages(8 pages)         page frame(8 pages)
 +-----------+ ---virt_to_page---&gt; +-----------+   mapping to   +-----------+
 |           |                     |     0     | -------------&gt; |     0     |
 |           |                     +-----------+                +-----------+
 |           |                     |     1     | -------------&gt; |     1     |
 |           |                     +-----------+                +-----------+
 |           |                     |     2     | ----------------^ ^ ^ ^ ^ ^
 |           |                     +-----------+                   | | | | |
 |           |                     |     3     | ------------------+ | | | |
 |           |                     +-----------+                     | | | |
 |           |                     |     4     | --------------------+ | | |
 |    2MB    |                     +-----------+                       | | |
 |           |                     |     5     | ----------------------+ | |
 |           |                     +-----------+                         | |
 |           |                     |     6     | ------------------------+ |
 |           |                     +-----------+                           |
 |           |                     |     7     | --------------------------+
 |           |                     +-----------+
 |           |
 |           |
 |           |
 +-----------+

When a HugeTLB is freed to the buddy system, we should allocate 6 pages
for vmemmap pages and restore the previous mapping relationship.

Apart from 2MB HugeTLB page, we also have 1GB HugeTLB page.  It is similar
to the 2MB HugeTLB page.  We also can use this approach to free the
vmemmap pages.

In this case, for the 1GB HugeTLB page, we can save 4094 pages.  This is a
very substantial gain.  On our server, run some SPDK/QEMU applications
which will use 1024GB HugeTLB page.  With this feature enabled, we can
save ~16GB (1G hugepage)/~12GB (2MB hugepage) memory.

Because there are vmemmap page tables reconstruction on the
freeing/allocating path, it increases some overhead.  Here are some
overhead analysis.

1) Allocating 10240 2MB HugeTLB pages.

   a) With this patch series applied:
   # time echo 10240 &gt; /proc/sys/vm/nr_hugepages

   real     0m0.166s
   user     0m0.000s
   sys      0m0.166s

   # bpftrace -e 'kprobe:alloc_fresh_huge_page { @start[tid] = nsecs; }
     kretprobe:alloc_fresh_huge_page /@start[tid]/ { @latency = hist(nsecs -
     @start[tid]); delete(@start[tid]); }'
   Attaching 2 probes...

   @latency:
   [8K, 16K)           5476 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
   [16K, 32K)          4760 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@       |
   [32K, 64K)             4 |                                                    |

   b) Without this patch series:
   # time echo 10240 &gt; /proc/sys/vm/nr_hugepages

   real     0m0.067s
   user     0m0.000s
   sys      0m0.067s

   # bpftrace -e 'kprobe:alloc_fresh_huge_page { @start[tid] = nsecs; }
     kretprobe:alloc_fresh_huge_page /@start[tid]/ { @latency = hist(nsecs -
     @start[tid]); delete(@start[tid]); }'
   Attaching 2 probes...

   @latency:
   [4K, 8K)           10147 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
   [8K, 16K)             93 |                                                    |

   Summarize: this feature is about ~2x slower than before.

2) Freeing 10240 2MB HugeTLB pages.

   a) With this patch series applied:
   # time echo 0 &gt; /proc/sys/vm/nr_hugepages

   real     0m0.213s
   user     0m0.000s
   sys      0m0.213s

   # bpftrace -e 'kprobe:free_pool_huge_page { @start[tid] = nsecs; }
     kretprobe:free_pool_huge_page /@start[tid]/ { @latency = hist(nsecs -
     @start[tid]); delete(@start[tid]); }'
   Attaching 2 probes...

   @latency:
   [8K, 16K)              6 |                                                    |
   [16K, 32K)         10227 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
   [32K, 64K)             7 |                                                    |

   b) Without this patch series:
   # time echo 0 &gt; /proc/sys/vm/nr_hugepages

   real     0m0.081s
   user     0m0.000s
   sys      0m0.081s

   # bpftrace -e 'kprobe:free_pool_huge_page { @start[tid] = nsecs; }
     kretprobe:free_pool_huge_page /@start[tid]/ { @latency = hist(nsecs -
     @start[tid]); delete(@start[tid]); }'
   Attaching 2 probes...

   @latency:
   [4K, 8K)            6805 |@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@|
   [8K, 16K)           3427 |@@@@@@@@@@@@@@@@@@@@@@@@@@                          |
   [16K, 32K)             8 |                                                    |

   Summary: The overhead of __free_hugepage is about ~2-3x slower than before.

Although the overhead has increased, the overhead is not significant.
Like Mike said, "However, remember that the majority of use cases create
HugeTLB pages at or shortly after boot time and add them to the pool.  So,
additional overhead is at pool creation time.  There is no change to
'normal run time' operations of getting a page from or returning a page to
the pool (think page fault/unmap)".

Despite the overhead and in addition to the memory gains from this series.
The following data is obtained by Joao Martins.  Very thanks to his
effort.

There's an additional benefit which is page (un)pinners will see an improvement
and Joao presumes because there are fewer memmap pages and thus the tail/head
pages are staying in cache more often.

Out of the box Joao saw (when comparing linux-next against linux-next +
this series) with gup_test and pinning a 16G HugeTLB file (with 1G pages):

	get_user_pages(): ~32k -&gt; ~9k
	unpin_user_pages(): ~75k -&gt; ~70k

Usually any tight loop fetching compound_head(), or reading tail pages
data (e.g.  compound_head) benefit a lot.  There's some unpinning
inefficiencies Joao was fixing[2], but with that in added it shows even
more:

	unpin_user_pages(): ~27k -&gt; ~3.8k

[1] https://lore.kernel.org/linux-mm/20210409205254.242291-1-mike.kravetz@oracle.com/
[2] https://lore.kernel.org/linux-mm/20210204202500.26474-1-joao.m.martins@oracle.com/

This patch (of 9):

Move bootmem info registration common API to individual bootmem_info.c.
And we will use {get,put}_page_bootmem() to initialize the page for the
vmemmap pages or free the vmemmap pages to buddy in the later patch.  So
move them out of CONFIG_MEMORY_HOTPLUG_SPARSE.  This is just code movement
without any functional change.

Link: https://lkml.kernel.org/r/20210510030027.56044-1-songmuchun@bytedance.com
Link: https://lkml.kernel.org/r/20210510030027.56044-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Reviewed-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Tested-by: Chen Huang &lt;chenhuang5@huawei.com&gt;
Tested-by: Bodeddula Balasubramaniam &lt;bodeddub@amazon.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: x86@kernel.org
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Pawan Gupta &lt;pawan.kumar.gupta@linux.intel.com&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Oliver Neukum &lt;oneukum@suse.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Joerg Roedel &lt;jroedel@suse.de&gt;
Cc: Mina Almasry &lt;almasrymina@google.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Barry Song &lt;song.bao.hua@hisilicon.com&gt;
Cc: HORIGUCHI NAOYA &lt;naoya.horiguchi@nec.com&gt;
Cc: Joao Martins &lt;joao.m.martins@oracle.com&gt;
Cc: Xiongchun Duan &lt;duanxiongchun@bytedance.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc: disassociate the pcp-&gt;high from pcp-&gt;batch</title>
<updated>2021-06-29T17:53:54Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2021-06-29T02:42:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b92ca18e8ca596f4f3d80c1fe833bc57a1b2458c'/>
<id>urn:sha1:b92ca18e8ca596f4f3d80c1fe833bc57a1b2458c</id>
<content type='text'>
The pcp high watermark is based on the batch size but there is no
relationship between them other than it is convenient to use early in
boot.

This patch takes the first step and bases pcp-&gt;high on the zone low
watermark split across the number of CPUs local to a zone while the batch
size remains the same to avoid increasing allocation latencies.  The
intent behind the default pcp-&gt;high is "set the number of PCP pages such
that if they are all full that background reclaim is not started
prematurely".

Note that in this patch the pcp-&gt;high values are adjusted after memory
hotplug events, min_free_kbytes adjustments and watermark scale factor
adjustments but not CPU hotplug events which is handled later in the
series.

On a test KVM instance;

Before grep -E "high:|batch" /proc/zoneinfo | tail -2
              high:  378
              batch: 63

After grep -E "high:|batch" /proc/zoneinfo | tail -2
              high:  649
              batch: 63

[mgorman@techsingularity.net:  fix __setup_per_zone_wmarks for parallel memory
hotplug]
  Link: https://lkml.kernel.org/r/20210528105925.GN30378@techsingularity.net

Link: https://lkml.kernel.org/r/20210525080119.5455-3-mgorman@techsingularity.net
Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Hillf Danton &lt;hdanton@sina.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
