<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/exit.c, branch v5.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-07-30T17:57:14Z</updated>
<entry>
<title>exit: make setting exit_state consistent</title>
<updated>2019-07-30T17:57:14Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian@brauner.io</email>
</author>
<published>2019-07-29T15:48:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=30b692d3b390c6fe78a5064be0c4bbd44a41be59'/>
<id>urn:sha1:30b692d3b390c6fe78a5064be0c4bbd44a41be59</id>
<content type='text'>
Since commit b191d6491be6 ("pidfd: fix a poll race when setting exit_state")
we unconditionally set exit_state to EXIT_ZOMBIE before calling into
do_notify_parent(). This was done to eliminate a race when querying
exit_state in do_notify_pidfd().
Back then we decided to do the absolute minimal thing to fix this and
not touch the rest of the exit_notify() function where exit_state is
set.
Since this fix has not caused any issues change the setting of
exit_state to EXIT_DEAD in the autoreap case to account for the fact hat
exit_state is set to EXIT_ZOMBIE unconditionally. This fix was planned
but also explicitly requested in [1] and makes the whole code more
consistent.

/* References */
[1]: https://lore.kernel.org/lkml/CAHk-=wigcxGFR2szue4wavJtH5cYTTeNES=toUBVGsmX0rzX+g@mail.gmail.com

Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>pidfd: fix a poll race when setting exit_state</title>
<updated>2019-07-22T14:02:03Z</updated>
<author>
<name>Suren Baghdasaryan</name>
<email>surenb@google.com</email>
</author>
<published>2019-07-17T17:21:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b191d6491be67cef2b3fa83015561caca1394ab9'/>
<id>urn:sha1:b191d6491be67cef2b3fa83015561caca1394ab9</id>
<content type='text'>
There is a race between reading task-&gt;exit_state in pidfd_poll and
writing it after do_notify_parent calls do_notify_pidfd. Expected
sequence of events is:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
  tsk-&gt;exit_state = EXIT_DEAD
                                  pidfd_poll
                                     if (tsk-&gt;exit_state)

However nothing prevents the following sequence:

CPU 0                            CPU 1
------------------------------------------------
exit_notify
  do_notify_parent
    do_notify_pidfd
                                   pidfd_poll
                                      if (tsk-&gt;exit_state)
  tsk-&gt;exit_state = EXIT_DEAD

This causes a polling task to wait forever, since poll blocks because
exit_state is 0 and the waiting task is not notified again. A stress
test continuously doing pidfd poll and process exits uncovered this bug.

To fix it, we make sure that the task's exit_state is always set before
calling do_notify_pidfd.

Fixes: b53b0b9d9a6 ("pidfd: add polling support")
Cc: kernel-team@android.com
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Signed-off-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Link: https://lore.kernel.org/r/20190717172100.261204-1-joel@joelfernandes.org
[christian@brauner.io: adapt commit message and drop unneeded changes from wait_task_zombie]
Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
</content>
</entry>
<entry>
<title>cgroup: Call cgroup_release() before __exit_signal()</title>
<updated>2019-05-31T17:38:57Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2019-05-31T17:38:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6b115bf58e6f013ca75e7115aabcbd56c20ff31d'/>
<id>urn:sha1:6b115bf58e6f013ca75e7115aabcbd56c20ff31d</id>
<content type='text'>
cgroup_release() calls cgroup_subsys-&gt;release() which is used by the
pids controller to uncharge its pid.  We want to use it to manage
iteration of dying tasks which requires putting it before
__unhash_process().  Move cgroup_release() above __exit_signal().
While this makes it uncharge before the pid is freed, pid is RCU freed
anyway and the window is very narrow.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
</content>
</entry>
<entry>
<title>treewide: Add SPDX license identifier for missed files</title>
<updated>2019-05-21T08:50:45Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-19T12:08:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=457c89965399115e5cd8bf38f9c597293405703d'/>
<id>urn:sha1:457c89965399115e5cd8bf38f9c597293405703d</id>
<content type='text'>
Add SPDX license identifiers to all files which:

 - Have no license information of any form

 - Have EXPORT_.*_SYMBOL_GPL inside which was used in the
   initial scan/conversion to ignore the file

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm: change mm_update_next_owner() to update mm-&gt;owner with WRITE_ONCE</title>
<updated>2019-05-15T02:52:47Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2019-05-14T22:40:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=987717e5e016a0dd3011d3bb16546672713f94e2'/>
<id>urn:sha1:987717e5e016a0dd3011d3bb16546672713f94e2</id>
<content type='text'>
The RCU reader uses rcu_dereference() inside rcu_read_lock critical
sections, so the writer shall use WRITE_ONCE.  Just a cleanup, we still
rely on gcc to emit atomic writes in other places.

Link: http://lkml.kernel.org/r/20190325225636.11635-3-aarcange@redhat.com
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Mike Rapoport &lt;rppt@linux.vnet.ibm.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: zhong jiang &lt;zhongjiang@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2019-03-07T18:11:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-07T18:11:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1fc1cd8399ab5541a488a7e47b2f21537dd76c2d'/>
<id>urn:sha1:1fc1cd8399ab5541a488a7e47b2f21537dd76c2d</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:

 - Oleg's pids controller accounting update which gets rid of rcu delay
   in pids accounting updates

 - rstat (cgroup hierarchical stat collection mechanism) optimization

 - Doc updates

* 'for-5.1' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: remove unused task_has_mempolicy()
  cgroup, rstat: Don't flush subtree root unless necessary
  cgroup: add documentation for pids.events file
  Documentation: cgroup-v2: eliminate markup warnings
  MAINTAINERS: Update cgroup entry
  cgroup/pids: turn cgroup_subsys-&gt;free() into cgroup_subsys-&gt;release() to fix the accounting
</content>
</entry>
<entry>
<title>kernel/exit.c: release ptraced tasks before zap_pid_ns_processes</title>
<updated>2019-02-01T23:46:23Z</updated>
<author>
<name>Andrei Vagin</name>
<email>avagin@gmail.com</email>
</author>
<published>2019-02-01T22:20:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8fb335e078378c8426fabeed1ebee1fbf915690c'/>
<id>urn:sha1:8fb335e078378c8426fabeed1ebee1fbf915690c</id>
<content type='text'>
Currently, exit_ptrace() adds all ptraced tasks in a dead list, then
zap_pid_ns_processes() waits on all tasks in a current pidns, and only
then are tasks from the dead list released.

zap_pid_ns_processes() can get stuck on waiting tasks from the dead
list.  In this case, we will have one unkillable process with one or
more dead children.

Thanks to Oleg for the advice to release tasks in find_child_reaper().

Link: http://lkml.kernel.org/r/20190110175200.12442-1-avagin@gmail.com
Fixes: 7c8bd2322c7f ("exit: ptrace: shift "reap dead" code from exit_ptrace() to forget_original_parent()")
Signed-off-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>cgroup/pids: turn cgroup_subsys-&gt;free() into cgroup_subsys-&gt;release() to fix the accounting</title>
<updated>2019-01-31T14:55:57Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2019-01-28T16:00:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=51bee5abeab2058ea5813c5615d6197a23dbf041'/>
<id>urn:sha1:51bee5abeab2058ea5813c5615d6197a23dbf041</id>
<content type='text'>
The only user of cgroup_subsys-&gt;free() callback is pids_cgrp_subsys which
needs pids_free() to uncharge the pid.

However, -&gt;free() is called from __put_task_struct()-&gt;cgroup_free() and this
is too late. Even the trivial program which does

	for (;;) {
		int pid = fork();
		assert(pid &gt;= 0);
		if (pid)
			wait(NULL);
		else
			exit(0);
	}

can run out of limits because release_task()-&gt;call_rcu(delayed_put_task_struct)
implies an RCU gp after the task/pid goes away and before the final put().

Test-case:

	mkdir -p /tmp/CG
	mount -t cgroup2 none /tmp/CG
	echo '+pids' &gt; /tmp/CG/cgroup.subtree_control

	mkdir /tmp/CG/PID
	echo 2 &gt; /tmp/CG/PID/pids.max

	perl -e 'while ($p = fork) { wait; } $p // die "fork failed: $!\n"' &amp;
	echo $! &gt; /tmp/CG/PID/cgroup.procs

Without this patch the forking process fails soon after migration.

Rename cgroup_subsys-&gt;free() to cgroup_subsys-&gt;release() and move the callsite
into the new helper, cgroup_release(), called by release_task() which actually
frees the pid(s).

Reported-by: Herton R. Krzesinski &lt;hkrzesin@redhat.com&gt;
Reported-by: Jan Stancek &lt;jstancek@redhat.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/wait: Fix rcuwait_wake_up() ordering</title>
<updated>2019-01-21T10:15:36Z</updated>
<author>
<name>Prateek Sood</name>
<email>prsood@codeaurora.org</email>
</author>
<published>2018-11-30T15:10:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6dc080eeb2ba01973bfff0d79844d7a59e12542e'/>
<id>urn:sha1:6dc080eeb2ba01973bfff0d79844d7a59e12542e</id>
<content type='text'>
For some peculiar reason rcuwait_wake_up() has the right barrier in
the comment, but not in the code.

This mistake has been observed to cause a deadlock in the following
situation:

    P1					P2

    percpu_up_read()			percpu_down_write()
      rcu_sync_is_idle() // false
					  rcu_sync_enter()
					  ...
      __percpu_up_read()

[S] ,-  __this_cpu_dec(*sem-&gt;read_count)
    |   smp_rmb();
[L] |   task = rcu_dereference(w-&gt;task) // NULL
    |
    |				    [S]	    w-&gt;task = current
    |					    smp_mb();
    |				    [L]	    readers_active_check() // fail
    `-&gt; &lt;store happens here&gt;

Where the smp_rmb() (obviously) fails to constrain the store.

[ peterz: Added changelog. ]

Signed-off-by: Prateek Sood &lt;prsood@codeaurora.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Andrea Parri &lt;andrea.parri@amarulasolutions.com&gt;
Acked-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 8f95c90ceb54 ("sched/wait, RCU: Introduce rcuwait machinery")
Link: https://lkml.kernel.org/r/1543590656-7157-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2019-01-15T17:13:36Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-01-15T17:13:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e8746440bf68212f19688f1454dad593c74abee1'/>
<id>urn:sha1:e8746440bf68212f19688f1454dad593c74abee1</id>
<content type='text'>
Pull networking fixes from David Miller:

 1) Fix regression in multi-SKB responses to RTM_GETADDR, from Arthur
    Gautier.

 2) Fix ipv6 frag parsing in openvswitch, from Yi-Hung Wei.

 3) Unbounded recursion in ipv4 and ipv6 GUE tunnels, from Stefano
    Brivio.

 4) Use after free in hns driver, from Yonglong Liu.

 5) icmp6_send() needs to handle the case of NULL skb, from Eric
    Dumazet.

 6) Missing rcu read lock in __inet6_bind() when operating on mapped
    addresses, from David Ahern.

 7) Memory leak in tipc-nl_compat_publ_dump(), from Gustavo A. R. Silva.

 8) Fix PHY vs r8169 module loading ordering issues, from Heiner
    Kallweit.

 9) Fix bridge vlan memory leak, from Ido Schimmel.

10) Dev refcount leak in AF_PACKET, from Jason Gunthorpe.

11) Infoleak in ipv6_local_error(), flow label isn't completely
    initialized. From Eric Dumazet.

12) Handle mv88e6390 errata, from Andrew Lunn.

13) Making vhost/vsock CID hashing consistent, from Zha Bin.

14) Fix lack of UMH cleanup when it unexpectedly exits, from Taehee Yoo.

15) Bridge forwarding must clear skb-&gt;tstamp, from Paolo Abeni.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (87 commits)
  bnxt_en: Fix context memory allocation.
  bnxt_en: Fix ring checking logic on 57500 chips.
  mISDN: hfcsusb: Use struct_size() in kzalloc()
  net: clear skb-&gt;tstamp in bridge forwarding path
  net: bpfilter: disallow to remove bpfilter module while being used
  net: bpfilter: restart bpfilter_umh when error occurred
  net: bpfilter: use cleanup callback to release umh_info
  umh: add exit routine for UMH process
  isdn: i4l: isdn_tty: Fix some concurrency double-free bugs
  vhost/vsock: fix vhost vsock cid hashing inconsistent
  net: stmmac: Prevent RX starvation in stmmac_napi_poll()
  net: stmmac: Fix the logic of checking if RX Watchdog must be enabled
  net: stmmac: Check if CBS is supported before configuring
  net: stmmac: dwxgmac2: Only clear interrupts that are active
  net: stmmac: Fix PCI module removal leak
  tools/bpf: fix bpftool map dump with bitfields
  tools/bpf: test btf bitfield with &gt;=256 struct member offset
  bpf: fix bpffs bitfield pretty print
  net: ethernet: mediatek: fix warning in phy_start_aneg
  tcp: change txhash on SYN-data timeout
  ...
</content>
</entry>
</feed>
