<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/exit.c, branch v3.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-06-20T21:39:36Z</updated>
<entry>
<title>pidns: find_new_reaper() can no longer switch to init_pid_ns.child_reaper</title>
<updated>2012-06-20T21:39:36Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-06-20T19:53:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=50d75f8daead8a1f850c40a3b6c6575ab19b48cf'/>
<id>urn:sha1:50d75f8daead8a1f850c40a3b6c6575ab19b48cf</id>
<content type='text'>
find_new_reaper() changes pid_ns-&gt;child_reaper, see add0d4df ("pid_ns:
zap_pid_ns_processes: fix the -&gt;child_reaper changing").

The original reason has gone away after the previous patch, -&gt;children
list must be empty after zap_pid_ns_processes().

However now we can not switch to init_pid_ns.child_reaper.
__unhash_process() relies on the "-&gt;child_reaper == parent" check, but
this check does not work if the last exiting task is also the child
reaper.

As Eric sugested, we can change __unhash_process() to use the parent's
pid_ns and remove this code.

Also, with this change we can move detach_pid(PIDTYPE_PID) back, where it
was before the previous fix.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Louis Rilling &lt;louis.rilling@kerlabs.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Acked-by: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Tested-by: Andrew Wagin &lt;avagin@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>pidns: guarantee that the pidns init will be the last pidns process reaped</title>
<updated>2012-06-20T21:39:36Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2012-06-20T19:53:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6347e90091041e34bea625370794c92f4ce71228'/>
<id>urn:sha1:6347e90091041e34bea625370794c92f4ce71228</id>
<content type='text'>
Today we have a twofold bug.  Sometimes release_task on pid == 1 in a pid
namespace can run before other processes in a pid namespace have had
release task called.  With the result that pid_ns_release_proc can be
called before the last proc_flus_task() is done using upid-&gt;ns-&gt;proc_mnt,
resulting in the use of a stale pointer.  This same set of circumstances
can lead to waitpid(...) returning for a processes started with
clone(CLONE_NEWPID) before the every process in the pid namespace has
actually exited.

To fix this modify zap_pid_ns_processess wait until all other processes in
the pid namespace have exited, even EXIT_DEAD zombies.

The delay_group_leader and related tests ensure that the thread gruop
leader will be the last thread of a process group to be reaped, or to
become EXIT_DEAD and self reap.  With the change to zap_pid_ns_processes
we get the guarantee that pid == 1 in a pid namespace will be the last
task that release_task is called on.

With pid == 1 being the last task to pass through release_task
pid_ns_release_proc can no longer be called too early nor can wait return
before all of the EXIT_DEAD tasks in a pid namespace have exited.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Louis Rilling &lt;louis.rilling@kerlabs.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Acked-by: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Tested-by: Andrew Wagin &lt;avagin@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: correctly synchronize rss-counters at exit/exec</title>
<updated>2012-06-20T21:39:36Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@openvz.org</email>
</author>
<published>2012-06-20T19:53:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4fe7efdbdfb1c7e7a7f31decfd831c0f31d37091'/>
<id>urn:sha1:4fe7efdbdfb1c7e7a7f31decfd831c0f31d37091</id>
<content type='text'>
do_exit() and exec_mmap() call sync_mm_rss() before mm_release() does
put_user(clear_child_tid) which can update task-&gt;rss_stat and thus make
mm-&gt;rss_stat inconsistent.  This triggers the "BUG:" printk in check_mm().

Let's fix this bug in the safest way, and optimize/cleanup this later.

Reported-by: Markus Trippelsdorf &lt;markus@trippelsdorf.de&gt;
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Revert "mm: correctly synchronize rss-counters at exit/exec"</title>
<updated>2012-06-08T00:54:07Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-06-08T00:54:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=48d212a2eecaca2e1875925837ad27b2f43f48a3'/>
<id>urn:sha1:48d212a2eecaca2e1875925837ad27b2f43f48a3</id>
<content type='text'>
This reverts commit 40af1bbdca47e5c8a2044039bb78ca8fd8b20f94.

It's horribly and utterly broken for at least the following reasons:

 - calling sync_mm_rss() from mmput() is fundamentally wrong, because
   there's absolutely no reason to believe that the task that does the
   mmput() always does it on its own VM.  Example: fork, ptrace, /proc -
   you name it.

 - calling it *after* having done mmdrop() on it is doubly insane, since
   the mm struct may well be gone now.

 - testing mm against NULL before you call it is insane too, since a
NULL mm there would have caused oopses long before.

.. and those are just the three bugs I found before I decided to give up
looking for me and revert it asap.  I should have caught it before I
even took it, but I trusted Andrew too much.

Cc: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Cc: Markus Trippelsdorf &lt;markus@trippelsdorf.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: correctly synchronize rss-counters at exit/exec</title>
<updated>2012-06-07T21:43:55Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@openvz.org</email>
</author>
<published>2012-06-07T21:21:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=40af1bbdca47e5c8a2044039bb78ca8fd8b20f94'/>
<id>urn:sha1:40af1bbdca47e5c8a2044039bb78ca8fd8b20f94</id>
<content type='text'>
mm-&gt;rss_stat counters have per-task delta: task-&gt;rss_stat.  Before
changing task-&gt;mm pointer the kernel must flush this delta with
sync_mm_rss().

do_exit() already calls sync_mm_rss() to flush the rss-counters before
committing the rss statistics into task-&gt;signal-&gt;maxrss, taskstats,
audit and other stuff.  Unfortunately the kernel does this before
calling mm_release(), which can call put_user() for processing
task-&gt;clear_child_tid.  So at this point we can trigger page-faults and
task-&gt;rss_stat becomes non-zero again.  As a result mm-&gt;rss_stat becomes
inconsistent and check_mm() will print something like this:

| BUG: Bad rss-counter state mm:ffff88020813c380 idx:1 val:-1
| BUG: Bad rss-counter state mm:ffff88020813c380 idx:2 val:1

This patch moves sync_mm_rss() into mm_release(), and moves mm_release()
out of do_exit() and calls it earlier.  After mm_release() there should
be no pagefaults.

[akpm@linux-foundation.org: tweak comment]
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Reported-by: Markus Trippelsdorf &lt;markus@trippelsdorf.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;		[3.4.x]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal</title>
<updated>2012-06-01T01:47:30Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-06-01T01:47:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fb21affa49204acd409328415b49bfe90136653c'/>
<id>urn:sha1:fb21affa49204acd409328415b49bfe90136653c</id>
<content type='text'>
Pull second pile of signal handling patches from Al Viro:
 "This one is just task_work_add() series + remaining prereqs for it.

  There probably will be another pull request from that tree this
  cycle - at least for helpers, to get them out of the way for per-arch
  fixes remaining in the tree."

Fix trivial conflict in kernel/irq/manage.c: the merge of Andrew's pile
had brought in commit 97fd75b7b8e0 ("kernel/irq/manage.c: use the
pr_foo() infrastructure to prefix printks") which changed one of the
pr_err() calls that this merge moves around.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal:
  keys: kill task_struct-&gt;replacement_session_keyring
  keys: kill the dummy key_replace_session_keyring()
  keys: change keyctl_session_to_parent() to use task_work_add()
  genirq: reimplement exit_irq_thread() hook via task_work_add()
  task_work_add: generic process-context callbacks
  avr32: missed _TIF_NOTIFY_RESUME on one of do_notify_resume callers
  parisc: need to check NOTIFY_RESUME when exiting from syscall
  move key_repace_session_keyring() into tracehook_notify_resume()
  TIF_NOTIFY_RESUME is defined on all targets now
</content>
</entry>
<entry>
<title>stack usage: add pid to warning printk in check_stack_usage</title>
<updated>2012-06-01T00:49:28Z</updated>
<author>
<name>Tim Bird</name>
<email>tim.bird@am.sony.com</email>
</author>
<published>2012-05-31T23:26:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=168eeccbc956d2ec083c3a513f7706784ee0dc5f'/>
<id>urn:sha1:168eeccbc956d2ec083c3a513f7706784ee0dc5f</id>
<content type='text'>
In embedded systems, sometimes the same program (busybox) is the cause of
multiple warnings.  Outputting the pid with the program name in the
warning printk helps distinguish which instances of a program are using
the stack most.

This is a small patch, but useful.

Signed-off-by: Tim Bird &lt;tim.bird@am.sony.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>cred: remove task_is_dead() from __task_cred() validation</title>
<updated>2012-06-01T00:49:28Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-31T23:26:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=43e13cc107cf6cd3c15fbe1cef849435c2223d50'/>
<id>urn:sha1:43e13cc107cf6cd3c15fbe1cef849435c2223d50</id>
<content type='text'>
Commit 8f92054e7ca1 ("CRED: Fix __task_cred()'s lockdep check and banner
comment"):

    add the following validation condition:

        task-&gt;exit_state &gt;= 0

    to permit the access if the target task is dead and therefore
    unable to change its own credentials.

OK, but afaics currently this can only help wait_task_zombie() which calls
__task_cred() without rcu lock.

Remove this validation and change wait_task_zombie() to use task_uid()
instead.  This means we do rcu_read_lock() only to shut up the lockdep,
but we already do the same in, say, wait_task_stopped().

task_is_dead() should die, task-&gt;exit_state != 0 means that this task has
passed exit_notify(), only do_wait-like code paths should use this.

Unfortunately, we can't kill task_is_dead() right now, it has already
acquired buggy users in drivers/staging.  The fix already exists.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>genirq: reimplement exit_irq_thread() hook via task_work_add()</title>
<updated>2012-05-24T02:11:12Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-11T00:59:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4d1d61a6b203d957777d73fcebf19d90b038b5b2'/>
<id>urn:sha1:4d1d61a6b203d957777d73fcebf19d90b038b5b2</id>
<content type='text'>
exit_irq_thread() and task-&gt;irq_thread are needed to handle the unexpected
(and unlikely) exit of irq-thread.

We can use task_work instead and make this all private to
kernel/irq/manage.c, cleanup plus micro-optimization.

1. rename exit_irq_thread() to irq_thread_dtor(), make it
   static, and move it up before irq_thread().

2. change irq_thread() to do task_work_add(irq_thread_dtor)
   at the start and task_work_cancel() before return.

   tracehook_notify_resume() can never play with kthreads,
   only do_exit()-&gt;exit_task_work() can call the callback
   and this is what we want.

3. remove task_struct-&gt;irq_thread and the special hook
   in do_exit().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>task_work_add: generic process-context callbacks</title>
<updated>2012-05-24T02:09:21Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2012-05-11T00:59:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e73f8959af0439d114847eab5a8a5ce48f1217c4'/>
<id>urn:sha1:e73f8959af0439d114847eab5a8a5ce48f1217c4</id>
<content type='text'>
Provide a simple mechanism that allows running code in the (nonatomic)
context of the arbitrary task.

The caller does task_work_add(task, task_work) and this task executes
task_work-&gt;func() either from do_notify_resume() or from do_exit().  The
callback can rely on PF_EXITING to detect the latter case.

"struct task_work" can be embedded in another struct, still it has "void
*data" to handle the most common/simple case.

This allows us to kill the -&gt;replacement_session_keyring hack, and
potentially this can have more users.

Performance-wise, this adds 2 "unlikely(!hlist_empty())" checks into
tracehook_notify_resume() and do_exit().  But at the same time we can
remove the "replacement_session_keyring != NULL" checks from
arch/*/signal.c and exit_creds().

Note: task_work_add/task_work_run abuses -&gt;pi_lock.  This is only because
this lock is already used by lookup_pi_state() to synchronize with
do_exit() setting PF_EXITING.  Fortunately the scope of this lock in
task_work.c is really tiny, and the code is unlikely anyway.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: David Howells &lt;dhowells@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Richard Kuo &lt;rkuo@codeaurora.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Alexander Gordeev &lt;agordeev@redhat.com&gt;
Cc: Chris Zankel &lt;chris@zankel.net&gt;
Cc: David Smith &lt;dsmith@redhat.com&gt;
Cc: "Frank Ch. Eigler" &lt;fche@redhat.com&gt;
Cc: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Cc: Larry Woodman &lt;lwoodman@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
</feed>
