<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/memory_hotplug.c, branch v4.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-03-17T22:09:34Z</updated>
<entry>
<title>mm: coalesce split strings</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2016-03-17T21:19:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=756a025f00091918d9d09ca3229defb160b409c0'/>
<id>urn:sha1:756a025f00091918d9d09ca3229defb160b409c0</id>
<content type='text'>
Kernel style prefers a single string over split strings when the string is
'user-visible'.

Miscellanea:

 - Add a missing newline
 - Realign arguments

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;	[percpu]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory hotplug: print debug message in the proper way for online_pages</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Chen Yucong</name>
<email>slaoub@gmail.com</email>
</author>
<published>2016-03-17T21:19:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e33e33b4d1c699d06fb8ccd6da80b309b84ec975'/>
<id>urn:sha1:e33e33b4d1c699d06fb8ccd6da80b309b84ec975</id>
<content type='text'>
online_pages() simply returns an error value if
memory_notify(MEM_GOING_ONLINE, &amp;arg) return a value that is not what we
want for successfully onlining target pages.  This patch arms to print
more failure information like offline_pages() in online_pages.

This patch also converts printk(KERN_&lt;LEVEL&gt;) to pr_&lt;level&gt;(), and moves
__offline_pages() to not print failure information with KERN_INFO
according to David Rientjes's suggestion[1].

[1] https://lkml.org/lkml/2016/2/24/1094

Signed-off-by: Chen Yucong &lt;slaoub@gmail.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: introduce page reference manipulation functions</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2016-03-17T21:19:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe896d1878949ea92ba547587bc3075cc688fb8f'/>
<id>urn:sha1:fe896d1878949ea92ba547587bc3075cc688fb8f</id>
<content type='text'>
The success of CMA allocation largely depends on the success of
migration and key factor of it is page reference count.  Until now, page
reference is manipulated by direct calling atomic functions so we cannot
follow up who and where manipulate it.  Then, it is hard to find actual
reason of CMA allocation failure.  CMA allocation should be guaranteed
to succeed so finding offending place is really important.

In this patch, call sites where page reference is manipulated are
converted to introduced wrapper function.  This is preparation step to
add tracepoint to each page reference manipulation function.  With this
facility, we can easily find reason of CMA allocation failure.  There is
no functional change in this patch.

In addition, this patch also converts reference read sites.  It will
help a second step that renames page._count to something else and
prevents later attempt to direct access to it (Suggested by Andrew).

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Acked-by: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Sergey Senozhatsky &lt;sergey.senozhatsky.work@gmail.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, memory hotplug: small cleanup in online_pages()</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2016-03-17T21:18:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e888ca3545dc6823603b976e40b62af2c68b6fcc'/>
<id>urn:sha1:e888ca3545dc6823603b976e40b62af2c68b6fcc</id>
<content type='text'>
We can reuse the nid we've determined instead of repeated pfn_to_nid()
usages.  Also zone_to_nid() should be a bit cheaper in general than
pfn_to_nid().

Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, compaction: introduce kcompactd</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Vlastimil Babka</name>
<email>vbabka@suse.cz</email>
</author>
<published>2016-03-17T21:18:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=698b1b30642f1ff0ea10ef1de9745ab633031377'/>
<id>urn:sha1:698b1b30642f1ff0ea10ef1de9745ab633031377</id>
<content type='text'>
Memory compaction can be currently performed in several contexts:

 - kswapd balancing a zone after a high-order allocation failure
 - direct compaction to satisfy a high-order allocation, including THP
   page fault attemps
 - khugepaged trying to collapse a hugepage
 - manually from /proc

The purpose of compaction is two-fold.  The obvious purpose is to
satisfy a (pending or future) high-order allocation, and is easy to
evaluate.  The other purpose is to keep overal memory fragmentation low
and help the anti-fragmentation mechanism.  The success wrt the latter
purpose is more

The current situation wrt the purposes has a few drawbacks:

 - compaction is invoked only when a high-order page or hugepage is not
   available (or manually).  This might be too late for the purposes of
   keeping memory fragmentation low.
 - direct compaction increases latency of allocations.  Again, it would
   be better if compaction was performed asynchronously to keep
   fragmentation low, before the allocation itself comes.
 - (a special case of the previous) the cost of compaction during THP
   page faults can easily offset the benefits of THP.
 - kswapd compaction appears to be complex, fragile and not working in
   some scenarios.  It could also end up compacting for a high-order
   allocation request when it should be reclaiming memory for a later
   order-0 request.

To improve the situation, we should be able to benefit from an
equivalent of kswapd, but for compaction - i.e. a background thread
which responds to fragmentation and the need for high-order allocations
(including hugepages) somewhat proactively.

One possibility is to extend the responsibilities of kswapd, which could
however complicate its design too much.  It should be better to let
kswapd handle reclaim, as order-0 allocations are often more critical
than high-order ones.

Another possibility is to extend khugepaged, but this kthread is a
single instance and tied to THP configs.

This patch goes with the option of a new set of per-node kthreads called
kcompactd, and lays the foundations, without introducing any new
tunables.  The lifecycle mimics kswapd kthreads, including the memory
hotplug hooks.

For compaction, kcompactd uses the standard compaction_suitable() and
ompact_finished() criteria and the deferred compaction functionality.
Unlike direct compaction, it uses only sync compaction, as there's no
allocation latency to minimize.

This patch doesn't yet add a call to wakeup_kcompactd.  The kswapd
compact/reclaim loop for high-order pages will be replaced by waking up
kcompactd in the next patch with the description of what's wrong with
the old approach.

Waking up of the kcompactd threads is also tied to kswapd activity and
follows these rules:
 - we don't want to affect any fastpaths, so wake up kcompactd only from
   the slowpath, as it's done for kswapd
 - if kswapd is doing reclaim, it's more important than compaction, so
   don't invoke kcompactd until kswapd goes to sleep
 - the target order used for kswapd is passed to kcompactd

Future possible future uses for kcompactd include the ability to wake up
kcompactd on demand in special situations, such as when hugepages are
not available (currently not done due to __GFP_NO_KSWAPD) or when a
fragmentation event (i.e.  __rmqueue_fallback()) occurs.  It's also
possible to perform periodic compaction with kcompactd.

[arnd@arndb.de: fix build errors with kcompactd]
[paul.gortmaker@windriver.com: don't use modular references for non modular code]
Signed-off-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous</title>
<updated>2016-03-15T23:55:16Z</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2016-03-15T21:57:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7cf91a98e607c2f935dbcc177d70011e95b8faff'/>
<id>urn:sha1:7cf91a98e607c2f935dbcc177d70011e95b8faff</id>
<content type='text'>
There is a performance drop report due to hugepage allocation and in
there half of cpu time are spent on pageblock_pfn_to_page() in
compaction [1].

In that workload, compaction is triggered to make hugepage but most of
pageblocks are un-available for compaction due to pageblock type and
skip bit so compaction usually fails.  Most costly operations in this
case is to find valid pageblock while scanning whole zone range.  To
check if pageblock is valid to compact, valid pfn within pageblock is
required and we can obtain it by calling pageblock_pfn_to_page().  This
function checks whether pageblock is in a single zone and return valid
pfn if possible.  Problem is that we need to check it every time before
scanning pageblock even if we re-visit it and this turns out to be very
expensive in this workload.

Although we have no way to skip this pageblock check in the system where
hole exists at arbitrary position, we can use cached value for zone
continuity and just do pfn_to_page() in the system where hole doesn't
exist.  This optimization considerably speeds up in above workload.

Before vs After
  Max: 1096 MB/s vs 1325 MB/s
  Min: 635 MB/s 1015 MB/s
  Avg: 899 MB/s 1194 MB/s

Avg is improved by roughly 30% [2].

[1]: http://www.spinics.net/lists/linux-mm/msg97378.html
[2]: https://lkml.org/lkml/2015/12/9/23

[akpm@linux-foundation.org: don't forget to restore zone-&gt;contiguous on error path, per Vlastimil]
Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Reported-by: Aaron Lu &lt;aaron.lu@intel.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Tested-by: Aaron Lu &lt;aaron.lu@intel.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory-hotplug: add automatic onlining policy for the newly added memory</title>
<updated>2016-03-15T23:55:16Z</updated>
<author>
<name>Vitaly Kuznetsov</name>
<email>vkuznets@redhat.com</email>
</author>
<published>2016-03-15T21:56:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=31bc3858ea3ebcc3157b3f5f0e624c5962f5a7a6'/>
<id>urn:sha1:31bc3858ea3ebcc3157b3f5f0e624c5962f5a7a6</id>
<content type='text'>
Currently, all newly added memory blocks remain in 'offline' state
unless someone onlines them, some linux distributions carry special udev
rules like:

  SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"

to make this happen automatically.  This is not a great solution for
virtual machines where memory hotplug is being used to address high
memory pressure situations as such onlining is slow and a userspace
process doing this (udev) has a chance of being killed by the OOM killer
as it will probably require to allocate some memory.

Introduce default policy for the newly added memory blocks in
/sys/devices/system/memory/auto_online_blocks file with two possible
values: "offline" which preserves the current behavior and "online"
which causes all newly added memory blocks to go online as soon as
they're added.  The default is "offline".

Signed-off-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Reviewed-by: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Daniel Kiper &lt;daniel.kiper@oracle.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: "K. Y. Srinivasan" &lt;kys@microsoft.com&gt;
Cc: Igor Mammedov &lt;imammedo@redhat.com&gt;
Cc: Kay Sievers &lt;kay@vrfy.org&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM</title>
<updated>2016-01-30T08:49:58Z</updated>
<author>
<name>Toshi Kani</name>
<email>toshi.kani@hpe.com</email>
</author>
<published>2016-01-26T20:57:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=782b86641e5d471e9eb1cf0072c012d2f758e568'/>
<id>urn:sha1:782b86641e5d471e9eb1cf0072c012d2f758e568</id>
<content type='text'>
Set IORESOURCE_SYSTEM_RAM in struct resource.flags of "System
RAM" entries.

Signed-off-by: Toshi Kani &lt;toshi.kani@hpe.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: David Vrabel &lt;david.vrabel@citrix.com&gt; # xen
Cc: Andrew Banman &lt;abanman@sgi.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Brian Gerst &lt;brgerst@gmail.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Denys Vlasenko &lt;dvlasenk@redhat.com&gt;
Cc: Gu Zheng &lt;guz.fnst@cn.fujitsu.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Luis R. Rodriguez &lt;mcgrof@suse.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Toshi Kani &lt;toshi.kani@hp.com&gt;
Cc: linux-arch@vger.kernel.org
Cc: linux-mm &lt;linux-mm@kvack.org&gt;
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1453841853-11383-9-git-send-email-bp@alien8.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86, mm: introduce vmem_altmap to augment vmemmap_populate()</title>
<updated>2016-01-16T01:56:32Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2016-01-16T00:56:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4b94ffdc4163bae1ec73b6e977ffb7a7da3d06d3'/>
<id>urn:sha1:4b94ffdc4163bae1ec73b6e977ffb7a7da3d06d3</id>
<content type='text'>
In support of providing struct page for large persistent memory
capacities, use struct vmem_altmap to change the default policy for
allocating memory for the memmap array.  The default vmemmap_populate()
allocates page table storage area from the page allocator.  Given
persistent memory capacities relative to DRAM it may not be feasible to
store the memmap in 'System Memory'.  Instead vmem_altmap represents
pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf()
requests.

Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory-hotplug: don't BUG() in register_memory_resource()</title>
<updated>2016-01-15T00:00:49Z</updated>
<author>
<name>Vitaly Kuznetsov</name>
<email>vkuznets@redhat.com</email>
</author>
<published>2016-01-14T23:21:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6f754ba4cfcc044078d4836056ac45404e1b6e85'/>
<id>urn:sha1:6f754ba4cfcc044078d4836056ac45404e1b6e85</id>
<content type='text'>
Out of memory condition is not a bug and while we can't add new memory
in such case crashing the system seems wrong.  Propagating the return
value from register_memory_resource() requires interface change.

Signed-off-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Reviewed-by: Igor Mammedov &lt;imammedo@redhat.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Tang Chen &lt;tangchen@cn.fujitsu.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Cc: Sheng Yong &lt;shengyong1@huawei.com&gt;
Cc: Zhu Guihua &lt;zhugh.fnst@cn.fujitsu.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
