<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/page_alloc.c, branch v3.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2013-02-18T17:58:02Z</updated>
<entry>
<title>mm: fix pageblock bitmap allocation</title>
<updated>2013-02-18T17:58:02Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-02-18T17:58:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7c45512df987c5619db041b5c9b80d281e26d3db'/>
<id>urn:sha1:7c45512df987c5619db041b5c9b80d281e26d3db</id>
<content type='text'>
Commit c060f943d092 ("mm: use aligned zone start for pfn_to_bitidx
calculation") fixed our calculation of the index into the pageblock
bitmap when a !SPARSEMEM zone was not aligned to pageblock_nr_pages.

However, the _allocation_ of that bitmap had never taken this alignment
requirement into account, so depending on the exact size and alignment
of the zone, the use of that index could then access past the
allocation, resulting in some very subtle memory corruption.
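
The allocation side is fixed by sizing the pageblock bitmap from the
pageblock-aligned start of the zone, so the partial pageblock in front
of an unaligned zone start is covered.  Roughly (a sketch of the shape
of the fix, not the exact diff):

  static unsigned long __init usemap_size(unsigned long zone_start_pfn,
                                          unsigned long zonesize)
  {
          unsigned long usemapsize;

          /* cover the partial pageblock before an unaligned zone start */
          zonesize += zone_start_pfn &amp; (pageblock_nr_pages - 1);
          usemapsize = roundup(zonesize, pageblock_nr_pages);
          usemapsize = usemapsize &gt;&gt; pageblock_order;
          usemapsize *= NR_PAGEBLOCK_BITS;
          usemapsize = roundup(usemapsize, 8 * sizeof(unsigned long));

          return usemapsize / 8;  /* bits to bytes */
  }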

This was reported (and bisected) by Ingo Molnar: one of his random
config builds would hang with certain very specific kernel command line
options.

In the meantime, commit c060f943d092 has been marked for stable, so this
fix needs to be backported to the stable kernels that backported that
commit to use the right alignment.

Bisected-and-tested-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: cma: fix accounting of CMA pages placed in high memory</title>
<updated>2013-02-12T22:34:00Z</updated>
<author>
<name>Marek Szyprowski</name>
<email>m.szyprowski@samsung.com</email>
</author>
<published>2013-02-12T21:46:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=41a7973447b0b8717f0a214d4328dc31ec2291d7'/>
<id>urn:sha1:41a7973447b0b8717f0a214d4328dc31ec2291d7</id>
<content type='text'>
The total number of low memory pages is determined as totalram_pages -
totalhigh_pages, so without this patch all CMA pageblocks placed in
highmem were accounted to low memory.
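
The fix is to bump totalhigh_pages as well when a CMA pageblock that
lives in highmem is handed to the buddy allocator.  A sketch of the
relevant lines in init_cma_reserved_pageblock() (not the exact diff):

  totalram_pages += pageblock_nr_pages;
  #ifdef CONFIG_HIGHMEM
          if (PageHighMem(page))
                  totalhigh_pages += pageblock_nr_pages;
  #endif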

Signed-off-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Acked-by: Kyungmin Park &lt;kyungmin.park@samsung.com&gt;
Cc: Minchan Kim &lt;minchan.kim@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: compaction: partially revert capture of suitable high-order page</title>
<updated>2013-01-11T22:54:56Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2013-01-11T22:32:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8fb74b9fb2b182d54beee592350d9ea1f325917a'/>
<id>urn:sha1:8fb74b9fb2b182d54beee592350d9ea1f325917a</id>
<content type='text'>
Eric Wong reported on 3.7 and 3.8-rc2 that ppoll() got stuck when
waiting for POLLIN on a local TCP socket.  It was easier to trigger if
there was disk IO and dirty pages at the same time and he bisected it to
commit 1fb3f8ca0e92 ("mm: compaction: capture a suitable high-order page
immediately when it is made available").

The intention of that patch was to improve high-order allocations under
memory pressure after changes made to reclaim in 3.6 drastically hurt
THP allocations, but the approach was flawed.  For Eric, the problem was
that page-&gt;pfmemalloc was not being cleared for captured pages, leading
to a poor interaction with swap-over-NFS support and causing the packets
to be dropped.  However, I identified a few more problems with the
patch, including the fact that it can increase contention on zone-&gt;lock
in some cases, which could result in async direct compaction being
aborted early.

In retrospect the capture patch took the wrong approach.  What it should
have done is mark the pageblock being migrated as MIGRATE_ISOLATE if it
was allocating for THP and avoided races that way.  While the patch was
shown to improve allocation success rates at the time, the benefit is
marginal given the relative complexity, and it should be revisited from
scratch in the context of the other reclaim-related changes that have
taken place since the patch was first written and tested.  This patch
partially reverts commit 1fb3f8ca0e92 ("mm: compaction: capture a
suitable high-order page immediately when it is made available").

Reported-and-tested-by: Eric Wong &lt;normalperson@yhbt.net&gt;
Tested-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: use aligned zone start for pfn_to_bitidx calculation</title>
<updated>2013-01-11T22:54:55Z</updated>
<author>
<name>Laura Abbott</name>
<email>lauraa@codeaurora.org</email>
</author>
<published>2013-01-11T22:31:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c060f943d0929f3e429c5d9522290584f6281d6e'/>
<id>urn:sha1:c060f943d0929f3e429c5d9522290584f6281d6e</id>
<content type='text'>
The current calculation in pfn_to_bitidx assumes that (pfn -
zone-&gt;zone_start_pfn) &gt;&gt; pageblock_order will return the same bit for
all pfn in a pageblock.  If zone_start_pfn is not aligned to
pageblock_nr_pages, this may not always be correct.

Consider the following with pageblock order = 10, zone start 2MB:

  pfn     | pfn - zone start | (pfn - zone start) &gt;&gt; pageblock order
  -------------------------------------------------------------------
  0x26000 | 0x25e00          | 0x97
  0x26100 | 0x25f00          | 0x97
  0x26200 | 0x26000          | 0x98
  0x26300 | 0x26100          | 0x98

This means that calling {get,set}_pageblock_migratetype on a single page
will not set the migratetype for the full block.  Fix this by rounding
down zone_start_pfn when doing the bitidx calculation.
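
For the !SPARSEMEM case, the corrected calculation is roughly (a
sketch; the SPARSEMEM branch is unchanged):

  static inline int pfn_to_bitidx(struct zone *zone, unsigned long pfn)
  {
          /* round the zone start down so a whole pageblock shares a bitidx */
          pfn = pfn - round_down(zone-&gt;zone_start_pfn, pageblock_nr_pages);
          return (pfn &gt;&gt; pageblock_order) * NR_PAGEBLOCK_BITS;
  }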

For our use case, the effects of this bug were mostly tied to the fact
that CMA allocations would either take a long time or fail to happen.
Depending on the driver using CMA, this could result in anything from
visual glitches to application failures.

Signed-off-by: Laura Abbott &lt;lauraa@codeaurora.org&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix zone_watermark_ok_safe() accounting of isolated pages</title>
<updated>2013-01-05T00:11:46Z</updated>
<author>
<name>Bartlomiej Zolnierkiewicz</name>
<email>b.zolnierkie@samsung.com</email>
</author>
<published>2013-01-04T23:35:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a458431e176ddb27e8ef8b98c2a681b217337393'/>
<id>urn:sha1:a458431e176ddb27e8ef8b98c2a681b217337393</id>
<content type='text'>
Commit 702d1a6e0766 ("memory-hotplug: fix kswapd looping forever
problem") added an isolated pageblocks counter (nr_pageblock_isolate in
struct zone) and used it to adjust the free pages counter in
zone_watermark_ok_safe() to prevent kswapd from looping forever.

Later, commit 2139cbe627b8 ("cma: fix counting of isolated pages") fixed
the accounting of isolated pages in the global free pages counter.  That
made the earlier zone_watermark_ok_safe() fix unnecessary and
potentially harmful (isolated pages may now be accounted twice, making
the free pages counter incorrect).

This patch removes the special isolated pageblocks counter altogether,
which fixes the zone_watermark_ok_safe() free pages check.
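
After the removal, zone_watermark_ok_safe() is left with only the
per-cpu drift handling.  Roughly (a sketch of the resulting function):

  bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark,
                              int classzone_idx, int alloc_flags)
  {
          long free_pages = zone_page_state(z, NR_FREE_PAGES);

          /* fold in per-cpu deltas if the counter may have drifted */
          if (z-&gt;percpu_drift_mark &amp;&amp; free_pages &lt; z-&gt;percpu_drift_mark)
                  free_pages = zone_page_state_snapshot(z, NR_FREE_PAGES);

          /* no nr_pageblock_isolate adjustment: since commit 2139cbe627b8,
           * NR_FREE_PAGES already excludes isolated pages */
          return __zone_watermark_ok(z, order, mark, classzone_idx,
                                     alloc_flags, free_pages);
  }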

Reported-by: Tomasz Stanislawski &lt;t.stanislaws@samsung.com&gt;
Signed-off-by: Bartlomiej Zolnierkiewicz &lt;b.zolnierkie@samsung.com&gt;
Signed-off-by: Kyungmin Park &lt;kyungmin.park@samsung.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Aaditya Kumar &lt;aaditya.kumar.30@gmail.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: cma: WARN if freed memory is still in use</title>
<updated>2012-12-21T01:40:19Z</updated>
<author>
<name>Marek Szyprowski</name>
<email>m.szyprowski@samsung.com</email>
</author>
<published>2012-12-20T23:05:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bcc2b02f4c1b36bc67272df7119b75bac78525ab'/>
<id>urn:sha1:bcc2b02f4c1b36bc67272df7119b75bac78525ab</id>
<content type='text'>
Memory returned to free_contig_range() must have no other references.
Make the kernel complain loudly if the page reference count is not
equal to 1.
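
Roughly, the resulting check looks like this (a sketch, not the exact
diff):

  void free_contig_range(unsigned long pfn, unsigned nr_pages)
  {
          unsigned int count = 0;

          for (; nr_pages--; pfn++) {
                  struct page *page = pfn_to_page(pfn);

                  count += page_count(page) != 1;
                  __free_page(page);
          }
          WARN(count != 0, "%d pages are still in use!\n", count);
  }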

[rientjes@google.com: support sparsemem]
Signed-off-by: Marek Szyprowski &lt;m.szyprowski@samsung.com&gt;
Reviewed-by: Kyungmin Park &lt;kyungmin.park@samsung.com&gt;
Acked-by: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: allocate kernel pages to the right memcg</title>
<updated>2012-12-18T23:02:12Z</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2012-12-18T22:22:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6a1a0d3b625a4091e7a0eb249aefc6a644385149'/>
<id>urn:sha1:6a1a0d3b625a4091e7a0eb249aefc6a644385149</id>
<content type='text'>
When a process tries to allocate a page with the __GFP_KMEMCG flag, the
page allocator will call the corresponding memcg functions to validate
the allocation.  Tasks in the root memcg can always proceed.

To avoid adding markers to the page (and a kmem flag that would
necessarily follow), as well as doing page_cgroup lookups for no reason,
whoever marks its allocations with the __GFP_KMEMCG flag is responsible
for telling the page allocator that this is such an allocation at
free_pages() time.  This is done by calling __free_accounted_pages() or
free_accounted_pages().
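
A sketch of the expected calling convention (the wrapper names here are
hypothetical, for illustration only):

  static void *my_buf_alloc(unsigned int order)
  {
          struct page *page = alloc_pages(GFP_KERNEL | __GFP_KMEMCG, order);

          return page ? page_address(page) : NULL;
  }

  static void my_buf_free(void *addr, unsigned int order)
  {
          /* tells the allocator this was a __GFP_KMEMCG allocation */
          free_accounted_pages((unsigned long)addr, order);
  }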

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: Kamezawa Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Suleiman Souhlal &lt;suleiman@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@redhat.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: JoonSoo Kim &lt;js1304@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/page_alloc.c: remove duplicate check</title>
<updated>2012-12-18T23:02:12Z</updated>
<author>
<name>Gavin Shan</name>
<email>shangw@linux.vnet.ibm.com</email>
</author>
<published>2012-12-18T22:21:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0bb2c7637ef0db4f44528698fa725179fb4917ad'/>
<id>urn:sha1:0bb2c7637ef0db4f44528698fa725179fb4917ad</id>
<content type='text'>
While allocating pages from the buddy allocator, a compound page may be
split up into free pages.  Under these circumstances, the compound page
should be destroyed by destroy_compound_page().  However, there is a
duplicate check to judge whether the page is compound.

Remove the duplicate check: compound_order() returns 0 when the page
doesn't have PG_head set, so destroy_compound_page() needn't also check
PageHead().
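
The redundancy is visible from compound_order() itself; roughly (a
sketch, not the exact diff):

  static inline unsigned long compound_order(struct page *page)
  {
          if (!PageHead(page))
                  return 0;   /* non-compound pages report order 0 */
          return (unsigned long)page[1].lru.prev;
  }

  /* so in destroy_compound_page() the sanity check reduces to: */
  if (unlikely(compound_order(page) != order)) {
          bad_page(page);
          bad++;
  }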

Signed-off-by: Gavin Shan &lt;shangw@linux.vnet.ibm.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Pekka Enberg &lt;penberg@cs.helsinki.fi&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma</title>
<updated>2012-12-16T23:18:08Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-16T22:33:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3d59eebc5e137bd89c6351e4c70e90ba1d0dc234'/>
<id>urn:sha1:3d59eebc5e137bd89c6351e4c70e90ba1d0dc234</id>
<content type='text'>
Pull Automatic NUMA Balancing bare-bones from Mel Gorman:
 "There are three implementations for NUMA balancing, this tree
  (balancenuma), numacore which has been developed in tip/master and
  autonuma which is in aa.git.

  In almost all respects balancenuma is the dumbest of the three because
  its main impact is on the VM side with no attempt to be smart about
  scheduling.  In the interest of getting the ball rolling, it would be
  desirable to see this much merged for 3.8 with the view to building
  scheduler smarts on top and adapting the VM where required for 3.9.

  The most recent set of comparisons available from different people are

    mel:    https://lkml.org/lkml/2012/12/9/108
    mingo:  https://lkml.org/lkml/2012/12/7/331
    tglx:   https://lkml.org/lkml/2012/12/10/437
    srikar: https://lkml.org/lkml/2012/12/10/397

  The results are a mixed bag.  In my own tests, balancenuma does
  reasonably well.  It's dumb as rocks and does not regress against
  mainline.  On the other hand, Ingo's tests show that balancenuma is
  incapable of converging for the workloads driven by perf, which is bad
  but potentially explained by the lack of scheduler smarts.  Thomas'
  results show balancenuma improves on mainline but falls far short of
  numacore or autonuma.  Srikar's results indicate we all suffer on a
  large machine with imbalanced node sizes.

  My own testing showed that recent numacore results have improved
  dramatically, particularly in the last week but not universally.
  We've butted heads heavily on system CPU usage and high levels of
  migration even when it shows that overall performance is better.
  There are also cases where it regresses.  Of interest is that for
  specjbb in some configurations it will regress for lower numbers of
  warehouses and show gains for higher numbers, which is not reported by
  the tool by default and is sometimes missed in reports.  Recently I
  reported for numacore that the JVM was crashing with
  NullPointerExceptions, but currently it's unclear what the source of
  this problem is.  Initially I thought it was in how numacore
  batch-handles PTEs, but I no longer think this is the case.  It's
  possible numacore is just able to trigger it due to higher rates of
  migration.

  These reports were quite late in the cycle so I/we would like to start
  with this tree as it contains much of the code we can agree on and has
  not changed significantly over the last 2-3 weeks."

* tag 'balancenuma-v11' of git://git.kernel.org/pub/scm/linux/kernel/git/mel/linux-balancenuma: (50 commits)
  mm/rmap, migration: Make rmap_walk_anon() and try_to_unmap_anon() more scalable
  mm/rmap: Convert the struct anon_vma::mutex to an rwsem
  mm: migrate: Account a transhuge page properly when rate limiting
  mm: numa: Account for failed allocations and isolations as migration failures
  mm: numa: Add THP migration for the NUMA working set scanning fault case build fix
  mm: numa: Add THP migration for the NUMA working set scanning fault case.
  mm: sched: numa: Delay PTE scanning until a task is scheduled on a new node
  mm: sched: numa: Control enabling and disabling of NUMA balancing if !SCHED_DEBUG
  mm: sched: numa: Control enabling and disabling of NUMA balancing
  mm: sched: Adapt the scanning rate if a NUMA hinting fault does not migrate
  mm: numa: Use a two-stage filter to restrict pages being migrated for unlikely task&lt;-&gt;node relationships
  mm: numa: migrate: Set last_nid on newly allocated page
  mm: numa: split_huge_page: Transfer last_nid on tail page
  mm: numa: Introduce last_nid to the page frame
  sched: numa: Slowly increase the scanning period as NUMA faults are handled
  mm: numa: Rate limit setting of pte_numa if node is saturated
  mm: numa: Rate limit the amount of memory that is migrated between nodes
  mm: numa: Structures for Migrate On Fault per NUMA migration rate limiting
  mm: numa: Migrate pages handled during a pmd_numa hinting fault
  mm: numa: Migrate on reference policy
  ...
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (Andrew's patch-bomb)</title>
<updated>2012-12-13T21:11:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-13T21:11:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f6e858a00af788bab0fd4c0b7f5cd788000edc18'/>
<id>urn:sha1:f6e858a00af788bab0fd4c0b7f5cd788000edc18</id>
<content type='text'>
Merge misc VM changes from Andrew Morton:
 "The rest of most-of-MM.  The other MM bits await a slab merge.

  This patch includes the addition of a huge zero_page.  Not a
  performance boost, but it can save large amounts of physical memory in
  some situations.

  Also, a bunch of Fujitsu engineers are working on memory hotplug
  which, as it turns out, was badly broken.  About half of their patches
  are included here; the remainder are 3.8 material."

However, this merge disables CONFIG_MOVABLE_NODE, which was totally
broken.  We don't add new features with "default y", nor do we add
Kconfig questions that are incomprehensible to most people without any
help text.  Does the feature even make sense without compaction or
memory hotplug?

* akpm: (54 commits)
  mm/bootmem.c: remove unused wrapper function reserve_bootmem_generic()
  mm/memory.c: remove unused code from do_wp_page()
  asm-generic, mm: pgtable: consolidate zero page helpers
  mm/hugetlb.c: fix warning on freeing hwpoisoned hugepage
  hwpoison, hugetlbfs: fix RSS-counter warning
  hwpoison, hugetlbfs: fix "bad pmd" warning in unmapping hwpoisoned hugepage
  mm: protect against concurrent vma expansion
  memcg: do not check for mm in __mem_cgroup_count_vm_event
  tmpfs: support SEEK_DATA and SEEK_HOLE (reprise)
  mm: provide more accurate estimation of pages occupied by memmap
  fs/buffer.c: remove redundant initialization in alloc_page_buffers()
  fs/buffer.c: do not inline exported function
  writeback: fix a typo in comment
  mm: introduce new field "managed_pages" to struct zone
  mm, oom: remove statically defined arch functions of same name
  mm, oom: remove redundant sleep in pagefault oom handler
  mm, oom: cleanup pagefault oom handler
  memory_hotplug: allow online/offline memory to result movable node
  numa: add CONFIG_MOVABLE_NODE for movable-dedicated node
  mm, memcg: avoid unnecessary function call when memcg is disabled
  ...
</content>
</entry>
</feed>
