linux/mm/memory_hotplug.c, branch v4.6

mm: coalesce split strings

2016-03-17T22:09:34Z

Kernel style prefers a single string over split strings when the string is 'user-visible'. Miscellanea: - Add a missing newline - Realign arguments Signed-off-by: Joe Perches Acked-by: Tejun Heo [percpu] Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm, memory hotplug: print debug message in the proper way for online_pages

2016-03-17T22:09:34Z

online_pages() simply returns an error value if memory_notify(MEM_GOING_ONLINE, &arg) return a value that is not what we want for successfully onlining target pages. This patch arms to print more failure information like offline_pages() in online_pages. This patch also converts printk(KERN_) to pr_(), and moves __offline_pages() to not print failure information with KERN_INFO according to David Rientjes's suggestion[1]. [1] https://lkml.org/lkml/2016/2/24/1094 Signed-off-by: Chen Yucong Acked-by: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm: introduce page reference manipulation functions

2016-03-17T22:09:34Z

The success of CMA allocation largely depends on the success of migration and key factor of it is page reference count. Until now, page reference is manipulated by direct calling atomic functions so we cannot follow up who and where manipulate it. Then, it is hard to find actual reason of CMA allocation failure. CMA allocation should be guaranteed to succeed so finding offending place is really important. In this patch, call sites where page reference is manipulated are converted to introduced wrapper function. This is preparation step to add tracepoint to each page reference manipulation function. With this facility, we can easily find reason of CMA allocation failure. There is no functional change in this patch. In addition, this patch also converts reference read sites. It will help a second step that renames page._count to something else and prevents later attempt to direct access to it (Suggested by Andrew). Signed-off-by: Joonsoo Kim Acked-by: Michal Nazarewicz Acked-by: Vlastimil Babka Cc: Minchan Kim Cc: Mel Gorman Cc: "Kirill A. Shutemov" Cc: Sergey Senozhatsky Cc: Steven Rostedt Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm, memory hotplug: small cleanup in online_pages()

2016-03-17T22:09:34Z

We can reuse the nid we've determined instead of repeated pfn_to_nid() usages. Also zone_to_nid() should be a bit cheaper in general than pfn_to_nid(). Signed-off-by: Vlastimil Babka Cc: Andrea Arcangeli Cc: "Kirill A. Shutemov" Cc: Rik van Riel Cc: Joonsoo Kim Cc: Mel Gorman Cc: David Rientjes Cc: Michal Hocko Cc: Johannes Weiner Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm, compaction: introduce kcompactd

2016-03-17T22:09:34Z

Memory compaction can be currently performed in several contexts: - kswapd balancing a zone after a high-order allocation failure - direct compaction to satisfy a high-order allocation, including THP page fault attemps - khugepaged trying to collapse a hugepage - manually from /proc The purpose of compaction is two-fold. The obvious purpose is to satisfy a (pending or future) high-order allocation, and is easy to evaluate. The other purpose is to keep overal memory fragmentation low and help the anti-fragmentation mechanism. The success wrt the latter purpose is more The current situation wrt the purposes has a few drawbacks: - compaction is invoked only when a high-order page or hugepage is not available (or manually). This might be too late for the purposes of keeping memory fragmentation low. - direct compaction increases latency of allocations. Again, it would be better if compaction was performed asynchronously to keep fragmentation low, before the allocation itself comes. - (a special case of the previous) the cost of compaction during THP page faults can easily offset the benefits of THP. - kswapd compaction appears to be complex, fragile and not working in some scenarios. It could also end up compacting for a high-order allocation request when it should be reclaiming memory for a later order-0 request. To improve the situation, we should be able to benefit from an equivalent of kswapd, but for compaction - i.e. a background thread which responds to fragmentation and the need for high-order allocations (including hugepages) somewhat proactively. One possibility is to extend the responsibilities of kswapd, which could however complicate its design too much. It should be better to let kswapd handle reclaim, as order-0 allocations are often more critical than high-order ones. Another possibility is to extend khugepaged, but this kthread is a single instance and tied to THP configs. This patch goes with the option of a new set of per-node kthreads called kcompactd, and lays the foundations, without introducing any new tunables. The lifecycle mimics kswapd kthreads, including the memory hotplug hooks. For compaction, kcompactd uses the standard compaction_suitable() and ompact_finished() criteria and the deferred compaction functionality. Unlike direct compaction, it uses only sync compaction, as there's no allocation latency to minimize. This patch doesn't yet add a call to wakeup_kcompactd. The kswapd compact/reclaim loop for high-order pages will be replaced by waking up kcompactd in the next patch with the description of what's wrong with the old approach. Waking up of the kcompactd threads is also tied to kswapd activity and follows these rules: - we don't want to affect any fastpaths, so wake up kcompactd only from the slowpath, as it's done for kswapd - if kswapd is doing reclaim, it's more important than compaction, so don't invoke kcompactd until kswapd goes to sleep - the target order used for kswapd is passed to kcompactd Future possible future uses for kcompactd include the ability to wake up kcompactd on demand in special situations, such as when hugepages are not available (currently not done due to __GFP_NO_KSWAPD) or when a fragmentation event (i.e. __rmqueue_fallback()) occurs. It's also possible to perform periodic compaction with kcompactd. [arnd@arndb.de: fix build errors with kcompactd] [paul.gortmaker@windriver.com: don't use modular references for non modular code] Signed-off-by: Vlastimil Babka Cc: Andrea Arcangeli Cc: "Kirill A. Shutemov" Cc: Rik van Riel Cc: Joonsoo Kim Cc: Mel Gorman Cc: David Rientjes Cc: Michal Hocko Cc: Johannes Weiner Signed-off-by: Arnd Bergmann Signed-off-by: Paul Gortmaker Cc: Hugh Dickins Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm/compaction: speed up pageblock_pfn_to_page() when zone is contiguous

2016-03-15T23:55:16Z

There is a performance drop report due to hugepage allocation and in there half of cpu time are spent on pageblock_pfn_to_page() in compaction [1]. In that workload, compaction is triggered to make hugepage but most of pageblocks are un-available for compaction due to pageblock type and skip bit so compaction usually fails. Most costly operations in this case is to find valid pageblock while scanning whole zone range. To check if pageblock is valid to compact, valid pfn within pageblock is required and we can obtain it by calling pageblock_pfn_to_page(). This function checks whether pageblock is in a single zone and return valid pfn if possible. Problem is that we need to check it every time before scanning pageblock even if we re-visit it and this turns out to be very expensive in this workload. Although we have no way to skip this pageblock check in the system where hole exists at arbitrary position, we can use cached value for zone continuity and just do pfn_to_page() in the system where hole doesn't exist. This optimization considerably speeds up in above workload. Before vs After Max: 1096 MB/s vs 1325 MB/s Min: 635 MB/s 1015 MB/s Avg: 899 MB/s 1194 MB/s Avg is improved by roughly 30% [2]. [1]: http://www.spinics.net/lists/linux-mm/msg97378.html [2]: https://lkml.org/lkml/2015/12/9/23 [akpm@linux-foundation.org: don't forget to restore zone->contiguous on error path, per Vlastimil] Signed-off-by: Joonsoo Kim Reported-by: Aaron Lu Acked-by: Vlastimil Babka Tested-by: Aaron Lu Cc: Mel Gorman Cc: Rik van Riel Cc: David Rientjes Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

memory-hotplug: add automatic onlining policy for the newly added memory

2016-03-15T23:55:16Z

Currently, all newly added memory blocks remain in 'offline' state unless someone onlines them, some linux distributions carry special udev rules like: SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online" to make this happen automatically. This is not a great solution for virtual machines where memory hotplug is being used to address high memory pressure situations as such onlining is slow and a userspace process doing this (udev) has a chance of being killed by the OOM killer as it will probably require to allocate some memory. Introduce default policy for the newly added memory blocks in /sys/devices/system/memory/auto_online_blocks file with two possible values: "offline" which preserves the current behavior and "online" which causes all newly added memory blocks to go online as soon as they're added. The default is "offline". Signed-off-by: Vitaly Kuznetsov Reviewed-by: Daniel Kiper Cc: Jonathan Corbet Cc: Greg Kroah-Hartman Cc: Daniel Kiper Cc: Dan Williams Cc: Tang Chen Cc: David Vrabel Acked-by: David Rientjes Cc: Naoya Horiguchi Cc: Xishi Qiu Cc: Mel Gorman Cc: "K. Y. Srinivasan" Cc: Igor Mammedov Cc: Kay Sievers Cc: Konrad Rzeszutek Wilk Cc: Boris Ostrovsky Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

xen, mm: Set IORESOURCE_SYSTEM_RAM to System RAM

2016-01-30T08:49:58Z

Set IORESOURCE_SYSTEM_RAM in struct resource.flags of "System RAM" entries. Signed-off-by: Toshi Kani Signed-off-by: Borislav Petkov Acked-by: David Vrabel # xen Cc: Andrew Banman Cc: Andrew Morton Cc: Andy Lutomirski Cc: Boris Ostrovsky Cc: Borislav Petkov Cc: Brian Gerst Cc: Dan Williams Cc: David Rientjes Cc: Denys Vlasenko Cc: Gu Zheng Cc: H. Peter Anvin Cc: Konrad Rzeszutek Wilk Cc: Linus Torvalds Cc: Luis R. Rodriguez Cc: Mel Gorman Cc: Naoya Horiguchi Cc: Peter Zijlstra Cc: Tang Chen Cc: Thomas Gleixner Cc: Toshi Kani Cc: linux-arch@vger.kernel.org Cc: linux-mm Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1453841853-11383-9-git-send-email-bp@alien8.de Signed-off-by: Ingo Molnar

x86, mm: introduce vmem_altmap to augment vmemmap_populate()

2016-01-16T01:56:32Z

In support of providing struct page for large persistent memory capacities, use struct vmem_altmap to change the default policy for allocating memory for the memmap array. The default vmemmap_populate() allocates page table storage area from the page allocator. Given persistent memory capacities relative to DRAM it may not be feasible to store the memmap in 'System Memory'. Instead vmem_altmap represents pre-allocated "device pages" to satisfy vmemmap_alloc_block_buf() requests. Signed-off-by: Dan Williams Reported-by: kbuild test robot Cc: Thomas Gleixner Cc: Ingo Molnar Cc: "H. Peter Anvin" Cc: Dave Hansen Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

memory-hotplug: don't BUG() in register_memory_resource()

2016-01-15T00:00:49Z

Out of memory condition is not a bug and while we can't add new memory in such case crashing the system seems wrong. Propagating the return value from register_memory_resource() requires interface change. Signed-off-by: Vitaly Kuznetsov Reviewed-by: Igor Mammedov Acked-by: David Rientjes Cc: Tang Chen Cc: Naoya Horiguchi Cc: Xishi Qiu Cc: Sheng Yong Cc: Zhu Guihua Cc: Dan Williams Cc: David Vrabel Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds