<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/sparse.c, branch v5.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-04-07T17:43:40Z</updated>
<entry>
<title>mm/sparse.c: move subsection_map related functions together</title>
<updated>2020-04-07T17:43:40Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-07T03:07:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6ecb0fc61290e16217a8f6165225b53b193b337c'/>
<id>urn:sha1:6ecb0fc61290e16217a8f6165225b53b193b337c</id>
<content type='text'>
No functional change.

[bhe@redhat.com: move functions into CONFIG_MEMORY_HOTPLUG ifdeffery scope]
  Link: http://lkml.kernel.org/r/20200316045804.GC3486@MiWiFi-R3L-srv
Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20200312124414.439-6-bhe@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse.c: add note about only VMEMMAP supporting sub-section hotplug</title>
<updated>2020-04-07T17:43:40Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-07T03:07:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=95a5a34dfe22be11bc4af58854585e90124b1db0'/>
<id>urn:sha1:95a5a34dfe22be11bc4af58854585e90124b1db0</id>
<content type='text'>
And tell check_pfn_span() gating the porper alignment and size of hot
added memory region.

And also move the code comments from inside section_deactivate() to being
above it.  The code comments are reasonable for the whole function, and
the moving makes code cleaner.

Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Link: http://lkml.kernel.org/r/20200312124414.439-5-bhe@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse.c: only use subsection map in VMEMMAP case</title>
<updated>2020-04-07T17:43:40Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-07T03:07:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0a9f9f62316606ee827fa3318e95a1c489d9acf5'/>
<id>urn:sha1:0a9f9f62316606ee827fa3318e95a1c489d9acf5</id>
<content type='text'>
Currently, to support subsection aligned memory region adding for pmem,
subsection map is added to track which subsection is present.

However, config ZONE_DEVICE depends on SPARSEMEM_VMEMMAP.  It means
subsection map only makes sense when SPARSEMEM_VMEMMAP enabled.  For the
classic sparse, it's meaningless.  Even worse, it may confuse people when
checking code related to the classic sparse.

About the classic sparse which doesn't support subsection hotplug, Dan
said it's more because the effort and maintenance burden outweighs the
benefit.  Besides, the current 64 bit ARCHes all enable
SPARSEMEM_VMEMMAP_ENABLE by default.

Combining the above reasons, no need to provide subsection map and the
relevant handling for the classic sparse.  Let's remove them.

Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Link: http://lkml.kernel.org/r/20200312124414.439-4-bhe@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse.c: introduce a new function clear_subsection_map()</title>
<updated>2020-04-07T17:43:40Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-07T03:07:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=37bc15020a96035cb157be5864b958672fe02d7c'/>
<id>urn:sha1:37bc15020a96035cb157be5864b958672fe02d7c</id>
<content type='text'>
Factor out the code which clear subsection map of one memory region from
section_deactivate() into clear_subsection_map().

And also add helper function is_subsection_map_empty() to check if the
current subsection map is empty or not.

Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Link: http://lkml.kernel.org/r/20200312124414.439-3-bhe@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse.c: introduce new function fill_subsection_map()</title>
<updated>2020-04-07T17:43:40Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-07T03:07:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5d87255cadde243763ca22b35e01312550114167'/>
<id>urn:sha1:5d87255cadde243763ca22b35e01312550114167</id>
<content type='text'>
Patch series "mm/hotplug: Only use subsection map for VMEMMAP", v4.

Memory sub-section hotplug was added to fix the issue that nvdimm could be
mapped at non-section aligned starting address.  A subsection map is added
into struct mem_section_usage to implement it.

However, config ZONE_DEVICE depends on SPARSEMEM_VMEMMAP.  It means
subsection map only makes sense when SPARSEMEM_VMEMMAP enabled.  For the
classic sparse, subsection map is meaningless and confusing.

About the classic sparse which doesn't support subsection hotplug, Dan
said it's more because the effort and maintenance burden outweighs the
benefit.  Besides, the current 64 bit ARCHes all enable
SPARSEMEM_VMEMMAP_ENABLE by default.

This patch (of 5):

Factor out the code that fills the subsection map from section_activate()
into fill_subsection_map(), this makes section_activate() cleaner and
easier to follow.

Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Link: http://lkml.kernel.org/r/20200312124414.439-2-bhe@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse.c: allocate memmap preferring the given node</title>
<updated>2020-04-02T16:35:30Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-02T04:09:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4027149abde8d57da4c4c4f498b310c85a297bba'/>
<id>urn:sha1:4027149abde8d57da4c4c4f498b310c85a297bba</id>
<content type='text'>
When allocating memmap for hot added memory with the classic sparse, the
specified 'nid' is ignored in populate_section_memmap().

While in allocating memmap for the classic sparse during boot, the node
given by 'nid' is preferred.  And VMEMMAP prefers the node of 'nid' in
both boot stage and memory hot adding.  So seems no reason to not respect
the node of 'nid' for the classic sparse when hot adding memory.

Use kvmalloc_node instead to use the passed in 'nid'.

Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Link: http://lkml.kernel.org/r/20200316125625.GH3486@MiWiFi-R3L-srv
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse.c: use kvmalloc/kvfree to alloc/free memmap for the classic sparse</title>
<updated>2020-04-02T16:35:30Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-04-02T04:09:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3af776f601dc13e1cae1f0f461407533669cf666'/>
<id>urn:sha1:3af776f601dc13e1cae1f0f461407533669cf666</id>
<content type='text'>
This change makes populate_section_memmap()/depopulate_section_memmap
much simpler.

Suggested-by: Michal Hocko &lt;mhocko@kernel.org&gt;
Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Link: http://lkml.kernel.org/r/20200316125450.GG3486@MiWiFi-R3L-srv
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparsemem: get address to page struct instead of address to pfn</title>
<updated>2020-04-02T16:35:30Z</updated>
<author>
<name>Wei Yang</name>
<email>richardw.yang@linux.intel.com</email>
</author>
<published>2020-04-02T04:09:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4627d76dcf0482c56e925a3477948df136255f0c'/>
<id>urn:sha1:4627d76dcf0482c56e925a3477948df136255f0c</id>
<content type='text'>
memmap should be the address to page struct instead of address to pfn.

As mentioned by David, if system memory and devmem sit within a section,
the mismatch address would lead kdump to dump unexpected memory.

Since sub-section only works for SPARSEMEM_VMEMMAP, pfn_to_page() is valid
to get the page struct address at this point.

Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Link: http://lkml.kernel.org/r/20200210005048.10437-1-richardw.yang@linux.intel.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/sparse: fix kernel crash with pfn_section_valid check</title>
<updated>2020-03-29T16:47:06Z</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.ibm.com</email>
</author>
<published>2020-03-29T02:17:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b943f045a9af9fd02f923e43fe8d7517e9961701'/>
<id>urn:sha1:b943f045a9af9fd02f923e43fe8d7517e9961701</id>
<content type='text'>
Fix the crash like this:

    BUG: Kernel NULL pointer dereference on read at 0x00000000
    Faulting instruction address: 0xc000000000c3447c
    Oops: Kernel access of bad area, sig: 11 [#1]
    LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
    CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
    ...
    NIP [c000000000c3447c] vmemmap_populated+0x98/0xc0
    LR [c000000000088354] vmemmap_free+0x144/0x320
    Call Trace:
       section_deactivate+0x220/0x240
       __remove_pages+0x118/0x170
       arch_remove_memory+0x3c/0x150
       memunmap_pages+0x1cc/0x2f0
       devm_action_release+0x30/0x50
       release_nodes+0x2f8/0x3e0
       device_release_driver_internal+0x168/0x270
       unbind_store+0x130/0x170
       drv_attr_store+0x44/0x60
       sysfs_kf_write+0x68/0x80
       kernfs_fop_write+0x100/0x290
       __vfs_write+0x3c/0x70
       vfs_write+0xcc/0x240
       ksys_write+0x7c/0x140
       system_call+0x5c/0x68

The crash is due to NULL dereference at

	test_bit(idx, ms-&gt;usage-&gt;subsection_map);

due to ms-&gt;usage = NULL in pfn_section_valid()

With commit d41e2f3bd546 ("mm/hotplug: fix hot remove failure in
SPARSEMEM|!VMEMMAP case") section_mem_map is set to NULL after
depopulate_section_mem().  This was done so that pfn_page() can work
correctly with kernel config that disables SPARSEMEM_VMEMMAP.  With that
config pfn_to_page does

	__section_mem_map_addr(__sec) + __pfn;

where

  static inline struct page *__section_mem_map_addr(struct mem_section *section)
  {
	unsigned long map = section-&gt;section_mem_map;
	map &amp;= SECTION_MAP_MASK;
	return (struct page *)map;
  }

Now with SPASEMEM_VMEMAP enabled, mem_section-&gt;usage-&gt;subsection_map is
used to check the pfn validity (pfn_valid()).  Since section_deactivate
release mem_section-&gt;usage if a section is fully deactivated,
pfn_valid() check after a subsection_deactivate cause a kernel crash.

  static inline int pfn_valid(unsigned long pfn)
  {
  ...
	return early_section(ms) || pfn_section_valid(ms, pfn);
  }

where

  static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
  {
	int idx = subsection_map_index(pfn);

	return test_bit(idx, ms-&gt;usage-&gt;subsection_map);
  }

Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section-&gt;usage is
freed.  For architectures like ppc64 where large pages are used for
vmmemap mapping (16MB), a specific vmemmap mapping can cover multiple
sections.  Hence before a vmemmap mapping page can be freed, the kernel
needs to make sure there are no valid sections within that mapping.
Clearing the section valid bit before depopulate_section_memap enables
this.

[aneesh.kumar@linux.ibm.com: add comment]
  Link: http://lkml.kernel.org/r/20200326133235.343616-1-aneesh.kumar@linux.ibm.comLink: http://lkml.kernel.org/r/20200325031914.107660-1-aneesh.kumar@linux.ibm.com
Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case")
Reported-by: Sachin Sant &lt;sachinp@linux.vnet.ibm.com&gt;
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Tested-by: Sachin Sant &lt;sachinp@linux.vnet.ibm.com&gt;
Reviewed-by: Baoquan He &lt;bhe@redhat.com&gt;
Reviewed-by: Wei Yang &lt;richard.weiyang@gmail.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hotplug: fix hot remove failure in SPARSEMEM|!VMEMMAP case</title>
<updated>2020-03-22T01:56:06Z</updated>
<author>
<name>Baoquan He</name>
<email>bhe@redhat.com</email>
</author>
<published>2020-03-22T01:22:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d41e2f3bd54699f85b3d6f45abd09fa24a222cb9'/>
<id>urn:sha1:d41e2f3bd54699f85b3d6f45abd09fa24a222cb9</id>
<content type='text'>
In section_deactivate(), pfn_to_page() doesn't work any more after
ms-&gt;section_mem_map is resetting to NULL in SPARSEMEM|!VMEMMAP case.  It
causes a hot remove failure:

  kernel BUG at mm/page_alloc.c:4806!
  invalid opcode: 0000 [#1] SMP PTI
  CPU: 3 PID: 8 Comm: kworker/u16:0 Tainted: G        W         5.5.0-next-20200205+ #340
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
  Workqueue: kacpi_hotplug acpi_hotplug_work_fn
  RIP: 0010:free_pages+0x85/0xa0
  Call Trace:
   __remove_pages+0x99/0xc0
   arch_remove_memory+0x23/0x4d
   try_remove_memory+0xc8/0x130
   __remove_memory+0xa/0x11
   acpi_memory_device_remove+0x72/0x100
   acpi_bus_trim+0x55/0x90
   acpi_device_hotplug+0x2eb/0x3d0
   acpi_hotplug_work_fn+0x1a/0x30
   process_one_work+0x1a7/0x370
   worker_thread+0x30/0x380
   kthread+0x112/0x130
   ret_from_fork+0x35/0x40

Let's move the -&gt;section_mem_map resetting after
depopulate_section_memmap() to fix it.

[akpm@linux-foundation.org: remove unneeded initialization, per David]
Fixes: ba72b4c8cf60 ("mm/sparsemem: support sub-section hotplug")
Signed-off-by: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Pankaj Gupta &lt;pankaj.gupta.linux@gmail.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Wei Yang &lt;richardw.yang@linux.intel.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/20200307084229.28251-2-bhe@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
