<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/sparse.c, branch v3.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-12-12T01:22:23Z</updated>
<entry>
<title>memory-hotplug, mm/sparse.c: clear the memory to store struct page</title>
<updated>2012-12-12T01:22:23Z</updated>
<author>
<name>Wen Congyang</name>
<email>wency@cn.fujitsu.com</email>
</author>
<published>2012-12-12T00:00:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3ac19f8efe26451cacac31d0be34fa9c51114c2a'/>
<id>urn:sha1:3ac19f8efe26451cacac31d0be34fa9c51114c2a</id>
<content type='text'>
If sparse memory vmemmap is enabled, we can't free the memory to store
struct page when a memory device is hotremoved, because we may store
struct page in the memory to manage the memory which doesn't belong to
this memory device.  When we hotadded this memory device again, we will
reuse this memory to store struct page, and struct page may contain some
obsolete information, and we will get bad-page state:

  init_memory_mapping: [mem 0x80000000-0x9fffffff]
  Built 2 zonelists in Node order, mobility grouping on.  Total pages: 547617
  Policy zone: Normal
  BUG: Bad page state in process bash  pfn:9b6dc
  page:ffffea0002200020 count:0 mapcount:0 mapping:          (null) index:0xfdfdfdfdfdfdfdfd
  page flags: 0x2fdfdfdfd5df9fd(locked|referenced|uptodate|dirty|lru|active|slab|owner_priv_1|private|private_2|writeback|head|tail|swapcache|reclaim|swapbacked|unevictable|uncached|compound_lock)
  Modules linked in: netconsole acpiphp pci_hotplug acpi_memhotplug loop kvm_amd kvm microcode tpm_tis tpm tpm_bios evdev psmouse serio_raw i2c_piix4 i2c_core parport_pc parport processor button thermal_sys ext3 jbd mbcache sg sr_mod cdrom ata_generic virtio_net ata_piix virtio_blk libata virtio_pci virtio_ring virtio scsi_mod
  Pid: 988, comm: bash Not tainted 3.6.0-rc7-guest #12
  Call Trace:
   [&lt;ffffffff810e9b30&gt;] ? bad_page+0xb0/0x100
   [&lt;ffffffff810ea4c3&gt;] ? free_pages_prepare+0xb3/0x100
   [&lt;ffffffff810ea668&gt;] ? free_hot_cold_page+0x48/0x1a0
   [&lt;ffffffff8112cc08&gt;] ? online_pages_range+0x68/0xa0
   [&lt;ffffffff8112cba0&gt;] ? __online_page_increment_counters+0x10/0x10
   [&lt;ffffffff81045561&gt;] ? walk_system_ram_range+0x101/0x110
   [&lt;ffffffff814c4f95&gt;] ? online_pages+0x1a5/0x2b0
   [&lt;ffffffff8135663d&gt;] ? __memory_block_change_state+0x20d/0x270
   [&lt;ffffffff81356756&gt;] ? store_mem_state+0xb6/0xf0
   [&lt;ffffffff8119e482&gt;] ? sysfs_write_file+0xd2/0x160
   [&lt;ffffffff8113769a&gt;] ? vfs_write+0xaa/0x160
   [&lt;ffffffff81137977&gt;] ? sys_write+0x47/0x90
   [&lt;ffffffff814e2f25&gt;] ? async_page_fault+0x25/0x30
   [&lt;ffffffff814ea239&gt;] ? system_call_fastpath+0x16/0x1b
  Disabling lock debugging due to kernel taint

This patch clears the memory to store struct page to avoid unexpected error.

Signed-off-by: Wen Congyang &lt;wency@cn.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Jiang Liu &lt;liuj97@gmail.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Acked-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Reported-by: Vasilis Liaskovitis &lt;vasilis.liaskovitis@profitbricks.com&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory-hotplug: update mce_bad_pages when removing the memory</title>
<updated>2012-12-12T01:22:22Z</updated>
<author>
<name>Wen Congyang</name>
<email>wency@cn.fujitsu.com</email>
</author>
<published>2012-12-12T00:00:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=95a4774d055c72d96ab192a1c6675cbf4d513f71'/>
<id>urn:sha1:95a4774d055c72d96ab192a1c6675cbf4d513f71</id>
<content type='text'>
When we hotremove a memory device, we will free the memory to store struct
page.  If the page is hwpoisoned page, we should decrease mce_bad_pages.

[akpm@linux-foundation.org: cleanup ifdefs]
Signed-off-by: Wen Congyang &lt;wency@cn.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Jiang Liu &lt;liuj97@gmail.com&gt;
Cc: Len Brown &lt;len.brown@intel.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Cc: Dave Hansen &lt;dave@linux.vnet.ibm.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/vmemmap: fix wrong use of virt_to_page</title>
<updated>2012-11-30T16:51:17Z</updated>
<author>
<name>Jianguo Wu</name>
<email>wujianguo@huawei.com</email>
</author>
<published>2012-11-29T21:54:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ae64ffcac35de0db628ba9631edf8ff34c5cd7ac'/>
<id>urn:sha1:ae64ffcac35de0db628ba9631edf8ff34c5cd7ac</id>
<content type='text'>
I enable CONFIG_DEBUG_VIRTUAL and CONFIG_SPARSEMEM_VMEMMAP, when doing
memory hotremove, there is a kernel BUG at arch/x86/mm/physaddr.c:20.

It is caused by free_section_usemap()-&gt;virt_to_page(), virt_to_page() is
only used for kernel direct mapping address, but sparse-vmemmap uses
vmemmap address, so it is going wrong here.

  ------------[ cut here ]------------
  kernel BUG at arch/x86/mm/physaddr.c:20!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: acpihp_drv acpihp_slot edd cpufreq_conservative cpufreq_userspace cpufreq_powersave acpi_cpufreq mperf fuse vfat fat loop dm_mod coretemp kvm crc32c_intel ipv6 ixgbe igb iTCO_wdt i7core_edac edac_core pcspkr iTCO_vendor_support ioatdma microcode joydev sr_mod i2c_i801 dca lpc_ich mfd_core mdio tpm_tis i2c_core hid_generic tpm cdrom sg tpm_bios rtc_cmos button ext3 jbd mbcache usbhid hid uhci_hcd ehci_hcd usbcore usb_common sd_mod crc_t10dif processor thermal_sys hwmon scsi_dh_alua scsi_dh_hp_sw scsi_dh_rdac scsi_dh_emc scsi_dh ata_generic ata_piix libata megaraid_sas scsi_mod
  CPU 39
  Pid: 6454, comm: sh Not tainted 3.7.0-rc1-acpihp-final+ #45 QCI QSSC-S4R/QSSC-S4R
  RIP: 0010:[&lt;ffffffff8103c908&gt;]  [&lt;ffffffff8103c908&gt;] __phys_addr+0x88/0x90
  RSP: 0018:ffff8804440d7c08  EFLAGS: 00010006
  RAX: 0000000000000006 RBX: ffffea0012000000 RCX: 000000000000002c
  ...

Signed-off-by: Jianguo Wu &lt;wujianguo@huawei.com&gt;
Signed-off-by: Jiang Liu &lt;jiang.liu@huawei.com&gt;
Reviewd-by: Wen Congyang &lt;wency@cn.fujitsu.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse: remove index_init_lock</title>
<updated>2012-08-01T01:42:49Z</updated>
<author>
<name>Gavin Shan</name>
<email>shangw@linux.vnet.ibm.com</email>
</author>
<published>2012-07-31T23:46:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1c9518331969f97ea403bac66f0fd4a85d204d5'/>
<id>urn:sha1:c1c9518331969f97ea403bac66f0fd4a85d204d5</id>
<content type='text'>
sparse_index_init() uses the index_init_lock spinlock to protect root
mem_section assignment.  The lock is not necessary anymore because the
function is called only during boot (during paging init which is executed
only from a single CPU) and from the hotplug code (by add_memory() via
arch_add_memory()) which uses mem_hotplug_mutex.

The lock was introduced by 28ae55c9 ("sparsemem extreme: hotplug
preparation") and sparse_index_init() was used only during boot at that
time.

Later when the hotplug code (and add_memory()) was introduced there was no
synchronization so it was possible to online more sections from the same
root probably (though I am not 100% sure about that).  The first
synchronization has been added by 6ad696d2 ("mm: allow memory hotplug and
hibernation in the same kernel") which was later replaced by the
mem_hotplug_mutex - 20d6c96b ("mem-hotplug: introduce
{un}lock_memory_hotplug()").

Let's remove the lock as it is not needed and it makes the code more
confusing.

[mhocko@suse.cz: changelog]
Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse: more checks on mem_section number</title>
<updated>2012-08-01T01:42:49Z</updated>
<author>
<name>Gavin Shan</name>
<email>shangw@linux.vnet.ibm.com</email>
</author>
<published>2012-07-31T23:46:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=db36a46113e101a8aa2d6ede41e78f2eaabed3f1'/>
<id>urn:sha1:db36a46113e101a8aa2d6ede41e78f2eaabed3f1</id>
<content type='text'>
__section_nr() was implemented to retrieve the corresponding memory
section number according to its descriptor.  It's possible that the
specified memory section descriptor doesn't exist in the global array.  So
add more checking on that and report an error for a wrong case.

Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse: optimize sparse_index_alloc</title>
<updated>2012-08-01T01:42:49Z</updated>
<author>
<name>Gavin Shan</name>
<email>shangw@linux.vnet.ibm.com</email>
</author>
<published>2012-07-31T23:46:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5b760e64a64c8940cdccd0ba6fce19a9bd010d20'/>
<id>urn:sha1:5b760e64a64c8940cdccd0ba6fce19a9bd010d20</id>
<content type='text'>
With CONFIG_SPARSEMEM_EXTREME, the two levels of memory section
descriptors are allocated from slab or bootmem.  When allocating from
slab, let slab/bootmem allocator clear the memory chunk.  We needn't clear
it explicitly.

Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: setup pageblock_order before it's used by sparsemem</title>
<updated>2012-08-01T01:42:43Z</updated>
<author>
<name>Xishi Qiu</name>
<email>qiuxishi@huawei.com</email>
</author>
<published>2012-07-31T23:43:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ca57df79d4f64e1a4886606af4289d40636189c5'/>
<id>urn:sha1:ca57df79d4f64e1a4886606af4289d40636189c5</id>
<content type='text'>
On architectures with CONFIG_HUGETLB_PAGE_SIZE_VARIABLE set, such as
Itanium, pageblock_order is a variable with default value of 0.  It's set
to the right value by set_pageblock_order() in function
free_area_init_core().

But pageblock_order may be used by sparse_init() before free_area_init_core()
is called along path:
sparse_init()
    -&gt;sparse_early_usemaps_alloc_node()
	-&gt;usemap_size()
	    -&gt;SECTION_BLOCKFLAGS_BITS
		-&gt;((1UL &lt;&lt; (PFN_SECTION_SHIFT - pageblock_order)) *
NR_PAGEBLOCK_BITS)

The uninitialized pageblock_size will cause memory wasting because
usemap_size() returns a much bigger value then it's really needed.

For example, on an Itanium platform,
sparse_init() pageblock_order=0 usemap_size=24576
free_area_init_core() before pageblock_order=0, usemap_size=24576
free_area_init_core() after pageblock_order=12, usemap_size=8

That means 24K memory has been wasted for each section, so fix it by calling
set_pageblock_order() from sparse_init().

Signed-off-by: Xishi Qiu &lt;qiuxishi@huawei.com&gt;
Signed-off-by: Jiang Liu &lt;liuj97@gmail.com&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Keping Chen &lt;chenkeping@huawei.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: sparse: fix usemap allocation above node descriptor section</title>
<updated>2012-07-11T23:04:49Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=99ab7b19440a72ebdf225f99b20f8ef40decee86'/>
<id>urn:sha1:99ab7b19440a72ebdf225f99b20f8ef40decee86</id>
<content type='text'>
After commit f5bf18fa22f8 ("bootmem/sparsemem: remove limit constraint
in alloc_bootmem_section"), usemap allocations may easily be placed
outside the optimal section that holds the node descriptor, even if
there is space available in that section.  This results in unnecessary
hotplug dependencies that need to have the node unplugged before the
section holding the usemap.

The reason is that the bootmem allocator doesn't guarantee a linear
search starting from the passed allocation goal but may start out at a
much higher address absent an upper limit.

Fix this by trying the allocation with the limit at the section end,
then retry without if that fails.  This keeps the fix from f5bf18fa22f8
of not panicking if the allocation does not fit in the section, but
still makes sure to try to stay within the section at first.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.3.x, 3.4.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: sparse: fix section usemap placement calculation</title>
<updated>2012-07-11T23:04:49Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=07b4e2bc9c35ea88cbd36d806fcd5e3bcbf022be'/>
<id>urn:sha1:07b4e2bc9c35ea88cbd36d806fcd5e3bcbf022be</id>
<content type='text'>
Commit 238305bb4d41 ("mm: remove sparsemem allocation details from the
bootmem allocator") introduced a bug in the allocation goal calculation
that put section usemaps not in the same section as the node
descriptors, creating unnecessary hotplug dependencies between them:

  node 0 must be removed before remove section 16399
  node 1 must be removed before remove section 16399
  node 2 must be removed before remove section 16399
  node 3 must be removed before remove section 16399
  node 4 must be removed before remove section 16399
  node 5 must be removed before remove section 16399
  node 6 must be removed before remove section 16399

The reason is that it applies PAGE_SECTION_MASK to the physical address
of the node descriptor when finding a suitable place to put the usemap,
when this mask is actually intended to be used with PFNs.  Because the
PFN mask is wider, the target address will point beyond the wanted
section holding the node descriptor and the node must be offlined before
the section holding the usemap can go.

Fix this by extending the mask to address width before use.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remove sparsemem allocation details from the bootmem allocator</title>
<updated>2012-05-29T23:22:22Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2012-05-29T22:06:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=238305bb4d418c95977162ba13c11880685fc731'/>
<id>urn:sha1:238305bb4d418c95977162ba13c11880685fc731</id>
<content type='text'>
alloc_bootmem_section() derives allocation area constraints from the
specified sparsemem section.  This is a bit specific for a generic memory
allocator like bootmem, though, so move it over to sparsemem.

As __alloc_bootmem_node_nopanic() already retries failed allocations with
relaxed area constraints, the fallback code in sparsemem.c can be removed
and the code becomes a bit more compact overall.

[akpm@linux-foundation.org: fix build]
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
