<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm, branch v3.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-07-17T23:24:09Z</updated>
<entry>
<title>Merge branch 'akpm' (Andrew's patch-bomb)</title>
<updated>2012-07-17T23:24:09Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-07-17T23:24:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=de74646c603fa71d1587f1ba5c761d009624abd7'/>
<id>urn:sha1:de74646c603fa71d1587f1ba5c761d009624abd7</id>
<content type='text'>
Merge Andrew's remaining patches for 3.5:
 "Nine fixes"

* Merge emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (9 commits)
  mm: fix lost kswapd wakeup in kswapd_stop()
  m32r: make memset() global for CONFIG_KERNEL_BZIP2=y
  m32r: add memcpy() for CONFIG_KERNEL_GZIP=y
  m32r: consistently use "suffix-$(...)"
  m32r: fix 'fix breakage from "m32r: use generic ptrace_resume code"' fallout
  m32r: fix pull clearing RESTORE_SIGMASK into block_sigmask() fallout
  m32r: remove duplicate definition of PTRACE_O_TRACESYSGOOD
  mn10300: fix "pull clearing RESTORE_SIGMASK into block_sigmask()" fallout
  bootmem: make ___alloc_bootmem_node_nopanic() really nopanic
</content>
</entry>
<entry>
<title>mm: fix lost kswapd wakeup in kswapd_stop()</title>
<updated>2012-07-17T23:21:30Z</updated>
<author>
<name>Aaditya Kumar</name>
<email>aaditya.kumar.30@gmail.com</email>
</author>
<published>2012-07-17T22:48:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1c7e7f6c0703d03af6bcd5ccc11fc15d23e5ecbe'/>
<id>urn:sha1:1c7e7f6c0703d03af6bcd5ccc11fc15d23e5ecbe</id>
<content type='text'>
Offlining memory may block forever, waiting for kswapd() to wake up
because kswapd() does not check the event kthread-&gt;should_stop before
sleeping.

The proper pattern, from Documentation/memory-barriers.txt, is:

   ---  waker  ---
   event_indicated = 1;
   wake_up_process(event_daemon);

   ---  sleeper  ---
   for (;;) {
      set_current_state(TASK_UNINTERRUPTIBLE);
      if (event_indicated)
         break;
      schedule();
   }

   set_current_state() may be wrapped by:
      prepare_to_wait();

In the kswapd() case, event_indicated is kthread-&gt;should_stop.

  === offlining memory (waker) ===
   kswapd_stop()
      kthread_stop()
         kthread-&gt;should_stop = 1
         wake_up_process()
         wait_for_completion()

  ===  kswapd_try_to_sleep (sleeper) ===
   kswapd_try_to_sleep()
      prepare_to_wait()
           .
           .
      schedule()
           .
           .
      finish_wait()

The schedule() needs to be protected by a test of kthread-&gt;should_stop,
which is wrapped by kthread_should_stop().

Reproducer:
   Do heavy file I/O in background.
   Do a memory offline/online in a tight loop

Signed-off-by: Aaditya Kumar &lt;aaditya.kumar@ap.sony.com&gt;
Acked-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Reviewed-by: Minchan Kim &lt;minchan@kernel.org&gt;
Acked-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>bootmem: make ___alloc_bootmem_node_nopanic() really nopanic</title>
<updated>2012-07-17T23:21:29Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-17T22:47:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c8f4a2d095bcb7ff798f984b9c7d16b4c8d194c3'/>
<id>urn:sha1:c8f4a2d095bcb7ff798f984b9c7d16b4c8d194c3</id>
<content type='text'>
In reaction to commit 99ab7b19440a ("mm: sparse: fix usemap allocation
above node descriptor section") Johannes said:
| while backporting the below patch, I realised that your fix busted
| f5bf18fa22f8 again.  The problem was not a panicking version on
| allocation failure but when the usemap size was too large such that
| goal + size &gt; limit triggers the BUG_ON in the bootmem allocator.  So
| we need a version that passes limit ONLY if the usemap is smaller than
| the section.

after checking the code, the name of ___alloc_bootmem_node_nopanic()
does not reflect the fact.

Make bootmem really not panic.

Hope will kill bootmem sooner.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;    [3.3.x, 3.4.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping</title>
<updated>2012-07-17T15:43:12Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-07-17T15:43:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5bb93f1a212e756834bc380682210647b5cdfcf1'/>
<id>urn:sha1:5bb93f1a212e756834bc380682210647b5cdfcf1</id>
<content type='text'>
Pull CMA and DMA-mapping fixes from Marek Szyprowski:
 "Another set of minor fixups for recently merged Contiguous Memory
  Allocator and ARM DMA-mapping changes.  Those patches fix mysterious
  crashes on systems with CMA and Himem enabled as well as some corner
  cases caused by typical off-by-one bug."

* 'fixes-for-linus' of git://git.linaro.org/people/mszyprowski/linux-dma-mapping:
  ARM: dma-mapping: modify condition check while freeing pages
  mm: cma: fix condition check when setting global cma area
  mm: cma: don't replace lowmem pages with highmem
</content>
</entry>
<entry>
<title>memblock: free allocated memblock_reserved_regions later</title>
<updated>2012-07-11T23:04:50Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=29f6738609e40227dabcc63bfb3b84b3726a75bd'/>
<id>urn:sha1:29f6738609e40227dabcc63bfb3b84b3726a75bd</id>
<content type='text'>
memblock_free_reserved_regions() calls memblock_free(), but
memblock_free() would double reserved.regions too, so we could free the
old range for reserved.regions.

Also tj said there is another bug which could be related to this.

| I don't think we're saving any noticeable
| amount by doing this "free - give it to page allocator - reserve
| again" dancing.  We should just allocate regions aligned to page
| boundaries and free them later when memblock is no longer in use.

in that case, when DEBUG_PAGEALLOC, will get panic:

     memblock_free: [0x0000102febc080-0x0000102febf080] memblock_free_reserved_regions+0x37/0x39
  BUG: unable to handle kernel paging request at ffff88102febd948
  IP: [&lt;ffffffff836a5774&gt;] __next_free_mem_range+0x9b/0x155
  PGD 4826063 PUD cf67a067 PMD cf7fa067 PTE 800000102febd160
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  CPU 0
  Pid: 0, comm: swapper Not tainted 3.5.0-rc2-next-20120614-sasha #447
  RIP: 0010:[&lt;ffffffff836a5774&gt;]  [&lt;ffffffff836a5774&gt;] __next_free_mem_range+0x9b/0x155

See the discussion at https://lkml.org/lkml/2012/6/13/469

So try to allocate with PAGE_SIZE alignment and free it later.

Reported-by: Sasha Levin &lt;levinsasha928@gmail.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: sparse: fix usemap allocation above node descriptor section</title>
<updated>2012-07-11T23:04:49Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=99ab7b19440a72ebdf225f99b20f8ef40decee86'/>
<id>urn:sha1:99ab7b19440a72ebdf225f99b20f8ef40decee86</id>
<content type='text'>
After commit f5bf18fa22f8 ("bootmem/sparsemem: remove limit constraint
in alloc_bootmem_section"), usemap allocations may easily be placed
outside the optimal section that holds the node descriptor, even if
there is space available in that section.  This results in unnecessary
hotplug dependencies that need to have the node unplugged before the
section holding the usemap.

The reason is that the bootmem allocator doesn't guarantee a linear
search starting from the passed allocation goal but may start out at a
much higher address absent an upper limit.

Fix this by trying the allocation with the limit at the section end,
then retry without if that fails.  This keeps the fix from f5bf18fa22f8
of not panicking if the allocation does not fit in the section, but
still makes sure to try to stay within the section at first.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.3.x, 3.4.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: sparse: fix section usemap placement calculation</title>
<updated>2012-07-11T23:04:49Z</updated>
<author>
<name>Yinghai Lu</name>
<email>yinghai@kernel.org</email>
</author>
<published>2012-07-11T21:02:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=07b4e2bc9c35ea88cbd36d806fcd5e3bcbf022be'/>
<id>urn:sha1:07b4e2bc9c35ea88cbd36d806fcd5e3bcbf022be</id>
<content type='text'>
Commit 238305bb4d41 ("mm: remove sparsemem allocation details from the
bootmem allocator") introduced a bug in the allocation goal calculation
that put section usemaps not in the same section as the node
descriptors, creating unnecessary hotplug dependencies between them:

  node 0 must be removed before remove section 16399
  node 1 must be removed before remove section 16399
  node 2 must be removed before remove section 16399
  node 3 must be removed before remove section 16399
  node 4 must be removed before remove section 16399
  node 5 must be removed before remove section 16399
  node 6 must be removed before remove section 16399

The reason is that it applies PAGE_SECTION_MASK to the physical address
of the node descriptor when finding a suitable place to put the usemap,
when this mask is actually intended to be used with PFNs.  Because the
PFN mask is wider, the target address will point beyond the wanted
section holding the node descriptor and the node must be offlined before
the section holding the usemap can go.

Fix this by extending the mask to address width before use.

Signed-off-by: Yinghai Lu &lt;yinghai@kernel.org&gt;
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>shmem: cleanup shmem_add_to_page_cache</title>
<updated>2012-07-11T23:04:48Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-07-11T21:02:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b065b4321fa78e83bf8f5b0d79d0b5424b57998b'/>
<id>urn:sha1:b065b4321fa78e83bf8f5b0d79d0b5424b57998b</id>
<content type='text'>
shmem_add_to_page_cache() has three callsites, but only one of them wants
the radix_tree_preload() (an exceptional entry guarantees that the radix
tree node is present in the other cases), and only that site can achieve
mem_cgroup_uncharge_cache_page() (PageSwapCache makes it a no-op in the
other cases).  We did it this way originally to reflect
add_to_page_cache_locked(); but it's confusing now, so move the radix_tree
preloading and mem_cgroup uncharging to that one caller.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>shmem: fix negative rss in memcg memory.stat</title>
<updated>2012-07-11T23:04:48Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-07-11T21:02:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d189922862e03ce6c7adc1e99d3b94e632dc8e89'/>
<id>urn:sha1:d189922862e03ce6c7adc1e99d3b94e632dc8e89</id>
<content type='text'>
When adding the page_private checks before calling shmem_replace_page(), I
did realize that there is a further race, but thought it too unlikely to
need a hurried fix.

But independently I've been chasing why a mem cgroup's memory.stat
sometimes shows negative rss after all tasks have gone: I expected it to
be a stats gathering bug, but actually it's shmem swapping's fault.

It's an old surprise, that when you lock_page(lookup_swap_cache(swap)),
the page may have been removed from swapcache before getting the lock; or
it may have been freed and reused and be back in swapcache; and it can
even be using the same swap location as before (page_private same).

The swapoff case is already secure against this (swap cannot be reused
until the whole area has been swapped off, and a new swapped on); and
shmem_getpage_gfp() is protected by shmem_add_to_page_cache()'s check for
the expected radix_tree entry - but a little too late.

By that time, we might have already decided to shmem_replace_page(): I
don't know of a problem from that, but I'd feel more at ease not to do so
spuriously.  And we have already done mem_cgroup_cache_charge(), on
perhaps the wrong mem cgroup: and this charge is not then undone on the
error path, because PageSwapCache ends up preventing that.

It's this last case which causes the occasional negative rss in
memory.stat: the page is charged here as cache, but (sometimes) found to
be anon when eventually it's uncharged - and in between, it's an
undeserved charge on the wrong memcg.

Fix this by adding an earlier check on the radix_tree entry: it's
inelegant to descend the tree twice, but swapping is not the fast path,
and a better solution would need a pair (try+commit) of memcg calls, and a
rework of shmem_replace_page() to keep out of the swapcache.

We can use the added shmem_confirm_swap() function to replace the
find_get_page+page_cache_release we were already doing on the error path.
And add a comment on that -EEXIST: it seems a peculiar errno to be using,
but originates from its use in radix_tree_insert().

[It can be surprising to see positive rss left in a memcg's memory.stat
after all tasks have gone, since it is supposed to count anonymous but not
shmem.  Aside from sharing anon pages via fork with a task in some other
memcg, it often happens after swapping: because a swap page can't be freed
while under writeback, nor while locked.  So it's not an error, and these
residual pages are easily freed once pressure demands.]

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>tmpfs: revert SEEK_DATA and SEEK_HOLE</title>
<updated>2012-07-11T23:04:48Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2012-07-11T21:02:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f21f8062201fc6361f65de92e758a76375ba8c59'/>
<id>urn:sha1:f21f8062201fc6361f65de92e758a76375ba8c59</id>
<content type='text'>
Revert 4fb5ef089b28 ("tmpfs: support SEEK_DATA and SEEK_HOLE").  I believe
it's correct, and it's been nice to have from rc1 to rc6; but as the
original commit said:

I don't know who actually uses SEEK_DATA or SEEK_HOLE, and whether it
would be of any use to them on tmpfs.  This code adds 92 lines and 752
bytes on x86_64 - is that bloat or worthwhile?

Nobody asked for it, so I conclude that it's bloat: let's revert tmpfs to
the dumb generic support for v3.5.  We can always reinstate it later if
useful, and anyone needing it in a hurry can just get it out of git.

Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Josef Bacik &lt;josef@redhat.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Andreas Dilger &lt;adilger@dilger.ca&gt;
Cc: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Marco Stornelli &lt;marco.stornelli@gmail.com&gt;
Cc: Jeff liu &lt;jeff.liu@oracle.com&gt;
Cc: Chris Mason &lt;chris.mason@fusionio.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
