<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/sparse.c, branch v3.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-07-11T23:04:49Z</updated>
<entry>
<title>mm: sparse: fix usemap allocation above node descriptor section</title>
<updated>2012-07-11T23:04:49Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=99ab7b19440a72ebdf225f99b20f8ef40decee86'/>
<id>urn:sha1:99ab7b19440a72ebdf225f99b20f8ef40decee86</id>
<content type='text'>
After commit f5bf18fa22f8 ("bootmem/sparsemem: remove limit constraint
in alloc_bootmem_section"), usemap allocations may easily be placed
outside the optimal section that holds the node descriptor, even if
there is space available in that section.  This results in unnecessary
hotplug dependencies that need to have the node unplugged before the
section holding the usemap.

The reason is that the bootmem allocator doesn't guarantee a linear
search starting from the passed allocation goal but may start out at a
much higher address absent an upper limit.

Fix this by trying the allocation with the limit at the section end,
then retry without if that fails.  This keeps the fix from f5bf18fa22f8
of not panicking if the allocation does not fit in the section, but
still makes sure to try to stay within the section at first.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.3.x, 3.4.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: sparse: fix section usemap placement calculation</title>
<updated>2012-07-11T23:04:49Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=07b4e2bc9c35ea88cbd36d806fcd5e3bcbf022be'/>
<id>urn:sha1:07b4e2bc9c35ea88cbd36d806fcd5e3bcbf022be</id>
<content type='text'>
Commit 238305bb4d41 ("mm: remove sparsemem allocation details from the
bootmem allocator") introduced a bug in the allocation goal calculation
that put section usemaps not in the same section as the node
descriptors, creating unnecessary hotplug dependencies between them:

  node 0 must be removed before remove section 16399
  node 1 must be removed before remove section 16399
  node 2 must be removed before remove section 16399
  node 3 must be removed before remove section 16399
  node 4 must be removed before remove section 16399
  node 5 must be removed before remove section 16399
  node 6 must be removed before remove section 16399

The reason is that it applies PAGE_SECTION_MASK to the physical address
of the node descriptor when finding a suitable place to put the usemap,
when this mask is actually intended to be used with PFNs.  Because the
PFN mask is wider, the target address will point beyond the wanted
section holding the node descriptor and the node must be offlined before
the section holding the usemap can go.

Fix this by extending the mask to address width before use.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remove sparsemem allocation details from the bootmem allocator</title>
<updated>2012-05-29T23:22:22Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2012-05-29T22:06:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=238305bb4d418c95977162ba13c11880685fc731'/>
<id>urn:sha1:238305bb4d418c95977162ba13c11880685fc731</id>
<content type='text'>
alloc_bootmem_section() derives allocation area constraints from the
specified sparsemem section.  This is a bit specific for a generic memory
allocator like bootmem, though, so move it over to sparsemem.

As __alloc_bootmem_node_nopanic() already retries failed allocations with
relaxed area constraints, the fallback code in sparsemem.c can be removed
and the code becomes a bit more compact overall.

[akpm@linux-foundation.org: fix build]
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bootmem/sparsemem: remove limit constraint in alloc_bootmem_section</title>
<updated>2012-03-22T00:54:58Z</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@linux.vnet.ibm.com</email>
</author>
<published>2012-03-21T23:34:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f5bf18fa22f8c41a13eb8762c7373eb3a93a7333'/>
<id>urn:sha1:f5bf18fa22f8c41a13eb8762c7373eb3a93a7333</id>
<content type='text'>
While testing AMS (Active Memory Sharing) / CMO (Cooperative Memory
Overcommit) on powerpc, we tripped the following:

  kernel BUG at mm/bootmem.c:483!
  cpu 0x0: Vector: 700 (Program Check) at [c000000000c03940]
      pc: c000000000a62bd8: .alloc_bootmem_core+0x90/0x39c
      lr: c000000000a64bcc: .sparse_early_usemaps_alloc_node+0x84/0x29c
      sp: c000000000c03bc0
     msr: 8000000000021032
    current = 0xc000000000b0cce0
    paca    = 0xc000000001d80000
      pid   = 0, comm = swapper
  kernel BUG at mm/bootmem.c:483!
  enter ? for help
  [c000000000c03c80] c000000000a64bcc
  .sparse_early_usemaps_alloc_node+0x84/0x29c
  [c000000000c03d50] c000000000a64f10 .sparse_init+0x12c/0x28c
  [c000000000c03e20] c000000000a474f4 .setup_arch+0x20c/0x294
  [c000000000c03ee0] c000000000a4079c .start_kernel+0xb4/0x460
  [c000000000c03f90] c000000000009670 .start_here_common+0x1c/0x2c

This is

        BUG_ON(limit &amp;&amp; goal + size &gt; limit);

and after some debugging, it seems that

	goal = 0x7ffff000000
	limit = 0x80000000000

and sparse_early_usemaps_alloc_node -&gt;
sparse_early_usemaps_alloc_pgdat_section calls

	return alloc_bootmem_section(usemap_size() * count, section_nr);

This is on a system with 8TB available via the AMS pool, and as a quirk
of AMS in firmware, all of that memory shows up in node 0.  So, we end
up with an allocation that will fail the goal/limit constraints.

In theory, we could "fall-back" to alloc_bootmem_node() in
sparse_early_usemaps_alloc_node(), but since we actually have HOTREMOVE
defined, we'll BUG_ON() instead.  A simple solution appears to be to
unconditionally remove the limit condition in alloc_bootmem_section,
meaning allocations are allowed to cross section boundaries (necessary
for systems of this size).

Johannes Weiner pointed out that if alloc_bootmem_section() no longer
guarantees section-locality, we need check_usemap_section_nr() to print
possible cross-dependencies between node descriptors and the usemaps
allocated through it.  That makes the two loops in
sparse_early_usemaps_alloc_node() identical, so re-factor the code a
bit.

[akpm@linux-foundation.org: code simplification]
Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: Anton Blanchard &lt;anton@au1.ibm.com&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Ben Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Robert Jennings &lt;rcj@linux.vnet.ibm.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.3.1]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: Map most files to use export.h instead of module.h</title>
<updated>2011-10-31T13:20:12Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2011-10-16T06:01:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b95f1b31b75588306e32b2afd32166cad48f670b'/>
<id>urn:sha1:b95f1b31b75588306e32b2afd32166cad48f670b</id>
<content type='text'>
The files changed within are only using the EXPORT_SYMBOL
macro variants.  They are not using core modular infrastructure
and hence don't need module.h but only the export.h header.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>mm: make some struct page's const</title>
<updated>2011-07-26T03:57:07Z</updated>
<author>
<name>Ian Campbell</name>
<email>ian.campbell@citrix.com</email>
</author>
<published>2011-07-26T00:11:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=33dd4e0ec91138c3d80e790c08a3db47426c81f2'/>
<id>urn:sha1:33dd4e0ec91138c3d80e790c08a3db47426c81f2</id>
<content type='text'>
These uses are read-only and in a subsequent patch I have a const struct
page in my hand...

[akpm@linux-foundation.org: fix warnings in lowmem_page_address()]
Signed-off-by: Ian Campbell &lt;ian.campbell@citrix.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Fix common misspellings</title>
<updated>2011-03-31T14:26:23Z</updated>
<author>
<name>Lucas De Marchi</name>
<email>lucas.demarchi@profusion.mobi</email>
</author>
<published>2011-03-31T01:57:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=25985edcedea6396277003854657b5f3cb31a628'/>
<id>urn:sha1:25985edcedea6396277003854657b5f3cb31a628</id>
<content type='text'>
Fixes generated by 'codespell' and manually reviewed.

Signed-off-by: Lucas De Marchi &lt;lucas.demarchi@profusion.mobi&gt;
</content>
</entry>
<entry>
<title>thp: remove PG_buddy</title>
<updated>2011-01-14T01:32:43Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2011-01-13T23:47:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5f24ce5fd34c3ca1b3d10d30da754732da64d5c0'/>
<id>urn:sha1:5f24ce5fd34c3ca1b3d10d30da754732da64d5c0</id>
<content type='text'>
PG_buddy can be converted to _mapcount == -2.  So the PG_compound_lock can
be added to page-&gt;flags without overflowing (because of the sparse section
bits increasing) with CONFIG_X86_PAE=y and CONFIG_X86_PAT=y.  This also
has to move the memory hotplug code from _mapcount to lru.next to avoid
any risk of clashes.  We can't use lru.next for PG_buddy removal, but
memory hotplug can use lru.next even more easily than the mapcount
instead.

Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sparsemem: on no vmemmap path put mem_map on node high too</title>
<updated>2010-05-25T15:06:56Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2010-05-24T21:31:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e48e67e08c340def3d0349c2910d23c7985fb6fa'/>
<id>urn:sha1:e48e67e08c340def3d0349c2910d23c7985fb6fa</id>
<content type='text'>
We need to put mem_map high when virtual memmap is not used.

before this patch
free mem pfn range on first node:
[    0.000000]  19 - 1f
[    0.000000]  28 40 - 80 95
[    0.000000]  702 740 - 1000 1000
[    0.000000]  347c - 347e
[    0.000000]  34e7 3500 - 3b80 3b8b
[    0.000000]  73b8b 73bc0 - 73c00 73c00
[    0.000000]  73ddd - 73e00
[    0.000000]  73fdd - 74000
[    0.000000]  741dd - 74200
[    0.000000]  743dd - 74400
[    0.000000]  745dd - 74600
[    0.000000]  747dd - 74800
[    0.000000]  749dd - 74a00
[    0.000000]  74bdd - 74c00
[    0.000000]  74ddd - 74e00
[    0.000000]  74fdd - 75000
[    0.000000]  751dd - 75200
[    0.000000]  753dd - 75400
[    0.000000]  755dd - 75600
[    0.000000]  757dd - 75800
[    0.000000]  759dd - 75a00
[    0.000000]  79bdd 79c00 - 7d540 7d550
[    0.000000]  7f745 - 7f750
[    0.000000]  10000b 100040 - 2080000 2080000
so only 79c00 - 7d540 are major free block under 4g...

after this patch, we will get
[    0.000000]  19 - 1f
[    0.000000]  28 40 - 80 95
[    0.000000]  702 740 - 1000 1000
[    0.000000]  347c - 347e
[    0.000000]  34e7 3500 - 3600 3600
[    0.000000]  37dd - 3800
[    0.000000]  39dd - 3a00
[    0.000000]  3bdd - 3c00
[    0.000000]  3ddd - 3e00
[    0.000000]  3fdd - 4000
[    0.000000]  41dd - 4200
[    0.000000]  43dd - 4400
[    0.000000]  45dd - 4600
[    0.000000]  47dd - 4800
[    0.000000]  49dd - 4a00
[    0.000000]  4bdd - 4c00
[    0.000000]  4ddd - 4e00
[    0.000000]  4fdd - 5000
[    0.000000]  51dd - 5200
[    0.000000]  53dd - 5400
[    0.000000]  95dd 9600 - 7d540 7d550
[    0.000000]  7f745 - 7f750
[    0.000000]  17000b 170040 - 2080000 2080000
we will have 9600 - 7d540 for major free block...

sparse-vmemmap path already used __alloc_bootmem_node_high()

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h</title>
<updated>2010-03-30T13:02:32Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2010-03-24T08:04:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5a0e3ad6af8660be21ca98a971cd00f331318c05'/>
<id>urn:sha1:5a0e3ad6af8660be21ca98a971cd00f331318c05</id>
<content type='text'>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files.  percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.

percpu.h -&gt; slab.h dependency is about to be removed.  Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability.  As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.

  http://userweb.kernel.org/~tj/misc/slabh-sweep.py

The script does the followings.

* Scan files for gfp and slab usages and update includes such that
  only the necessary includes are there.  ie. if only gfp is used,
  gfp.h, if slab is used, slab.h.

* When the script inserts a new include, it looks at the include
  blocks and try to put the new include such that its order conforms
  to its surrounding.  It's put in the include block which contains
  core kernel includes, in the same order that the rest are ordered -
  alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
  doesn't seem to be any matching order.

* If the script can't find a place to put a new include (mostly
  because the file doesn't have fitting include block), it prints out
  an error message indicating which .h file needs to be added to the
  file.

The conversion was done in the following steps.

1. The initial automatic conversion of all .c files updated slightly
   over 4000 files, deleting around 700 includes and adding ~480 gfp.h
   and ~3000 slab.h inclusions.  The script emitted errors for ~400
   files.

2. Each error was manually checked.  Some didn't need the inclusion,
   some needed manual addition while adding it to implementation .h or
   embedding .c file was more appropriate for others.  This step added
   inclusions to around 150 files.

3. The script was run again and the output was compared to the edits
   from #2 to make sure no file was left behind.

4. Several build tests were done and a couple of problems were fixed.
   e.g. lib/decompress_*.c used malloc/free() wrappers around slab
   APIs requiring slab.h to be added manually.

5. The script was run on all .h files but without automatically
   editing them as sprinkling gfp.h and slab.h inclusions around .h
   files could easily lead to inclusion dependency hell.  Most gfp.h
   inclusion directives were ignored as stuff from gfp.h was usually
   wildly available and often used in preprocessor macros.  Each
   slab.h inclusion directive was examined and added manually as
   necessary.

6. percpu.h was updated not to include slab.h.

7. Build test were done on the following configurations and failures
   were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
   distributed build env didn't work with gcov compiles) and a few
   more options had to be turned off depending on archs to make things
   build (like ipr on powerpc/64 which failed due to missing writeq).

   * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
   * powerpc and powerpc64 SMP allmodconfig
   * sparc and sparc64 SMP allmodconfig
   * ia64 SMP allmodconfig
   * s390 SMP allmodconfig
   * alpha SMP allmodconfig
   * um on x86_64 SMP allmodconfig

8. percpu.h modifications were reverted so that it could be applied as
   a separate patch and serve as bisection point.

Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Guess-its-ok-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Lee Schermerhorn &lt;Lee.Schermerhorn@hp.com&gt;
</content>
</entry>
</feed>
