<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/cgroup.c, branch v3.13</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.13</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.13'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2013-12-17T13:11:52Z</updated>
<entry>
<title>cgroup: don't recycle cgroup id until all csses have been destroyed</title>
<updated>2013-12-17T13:11:52Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2013-12-17T03:13:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1a71504e9715812a2d15e7c03b5aa147ae70ded'/>
<id>urn:sha1:c1a71504e9715812a2d15e7c03b5aa147ae70ded</id>
<content type='text'>
Hugh reported this bug:

&gt; CONFIG_MEMCG_SWAP is broken in 3.13-rc.  Try something like this:
&gt;
&gt; mkdir -p /tmp/tmpfs /tmp/memcg
&gt; mount -t tmpfs -o size=1G tmpfs /tmp/tmpfs
&gt; mount -t cgroup -o memory memcg /tmp/memcg
&gt; mkdir /tmp/memcg/old
&gt; echo 512M &gt;/tmp/memcg/old/memory.limit_in_bytes
&gt; echo $$ &gt;/tmp/memcg/old/tasks
&gt; cp /dev/zero /tmp/tmpfs/zero 2&gt;/dev/null
&gt; echo $$ &gt;/tmp/memcg/tasks
&gt; rmdir /tmp/memcg/old
&gt; sleep 1	# let rmdir work complete
&gt; mkdir /tmp/memcg/new
&gt; umount /tmp/tmpfs
&gt; dmesg | grep WARNING
&gt; rmdir /tmp/memcg/new
&gt; umount /tmp/memcg
&gt;
&gt; Shows lots of WARNING: CPU: 1 PID: 1006 at kernel/res_counter.c:91
&gt;                            res_counter_uncharge_locked+0x1f/0x2f()
&gt;
&gt; Breakage comes from 34c00c319ce7 ("memcg: convert to use cgroup id").
&gt;
&gt; The lifetime of a cgroup id is different from the lifetime of the
&gt; css id it replaced: memsw's css_get()s do nothing to hold on to the
&gt; old cgroup id, it soon gets recycled to a new cgroup, which then
&gt; mysteriously inherits the old's swap, without any charge for it.

Instead of removing cgroup id right after all the csses have been
offlined, we should do that after csses have been destroyed.

To make sure an invalid css pointer won't be returned after the css
is destroyed, make sure css_from_id() returns NULL in this case.
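
A minimal sketch of the lifetime rule above, written in Python rather
than kernel C.  The IdTable class and its method names are hypothetical
and only model the ordering; they are not the kernel's idr interface or
the actual css_from_id() code.

```python
class IdTable:
    """Toy id allocator that hands out the lowest free id first."""

    def __init__(self):
        self.owner = {}  # id mapped to the object that holds it

    def alloc(self, obj):
        new_id = 1
        while new_id in self.owner:
            new_id += 1
        self.owner[new_id] = obj
        return new_id

    def offline(self, obj_id):
        # Deliberately keeps the id reserved: others may still refer to it.
        pass

    def destroy(self, obj_id):
        # Only at destruction may the id be recycled.
        del self.owner[obj_id]

    def lookup(self, obj_id):
        # Returns None once the owner has been destroyed.
        return self.owner.get(obj_id)


table = IdTable()
old_id = table.alloc("old")
table.offline(old_id)
new_id = table.alloc("new")  # "old" is only offline, so its id is not reused
table.destroy(old_id)
```

In this model a group created between offline and destroy cannot
inherit the old id, and a lookup of the old id after destroy yields
None instead of a stale object.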

tj: Updated comment to note planned changes for cgrp-&gt;id.

Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: fix cgroup_create() error handling path</title>
<updated>2013-12-06T20:08:50Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-12-06T20:07:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=266ccd505e8acb98717819cef9d91d66c7b237cc'/>
<id>urn:sha1:266ccd505e8acb98717819cef9d91d66c7b237cc</id>
<content type='text'>
ae7f164a09 ("cgroup: move cgroup-&gt;subsys[] assignment to
online_css()") moved cgroup-&gt;subsys[] assignments later in
cgroup_create() but didn't update error handling path accordingly
leading to the following oops and leaking later css's after an
online_css() failure.  The oops is from cgroup destruction path being
invoked on the partially constructed cgroup which is not ready to
handle empty slots in cgrp-&gt;subsys[] array.

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  IP: [&lt;ffffffff810eeaa8&gt;] cgroup_destroy_locked+0x118/0x2f0
  PGD a780a067 PUD aadbe067 PMD 0
  Oops: 0000 [#1] SMP
  Modules linked in:
  CPU: 6 PID: 7360 Comm: mkdir Not tainted 3.13.0-rc2+ #69
  Hardware name:
  task: ffff8800b9dbec00 ti: ffff8800a781a000 task.ti: ffff8800a781a000
  RIP: 0010:[&lt;ffffffff810eeaa8&gt;]  [&lt;ffffffff810eeaa8&gt;] cgroup_destroy_locked+0x118/0x2f0
  RSP: 0018:ffff8800a781bd98  EFLAGS: 00010282
  RAX: ffff880586903878 RBX: ffff880586903800 RCX: ffff880586903820
  RDX: ffff880586903860 RSI: ffff8800a781bdb0 RDI: ffff880586903820
  RBP: ffff8800a781bde8 R08: ffff88060e0b8048 R09: ffffffff811d7bc1
  R10: 000000000000008c R11: 0000000000000001 R12: ffff8800a72286c0
  R13: 0000000000000000 R14: ffffffff81cf7a40 R15: 0000000000000001
  FS:  00007f60ecda57a0(0000) GS:ffff8806272c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000008 CR3: 00000000a7a03000 CR4: 00000000000007e0
  Stack:
   ffff880586903860 ffff880586903910 ffff8800a72286c0 ffff880586903820
   ffffffff81cf7a40 ffff880586903800 ffff88060e0b8018 ffffffff81cf7a40
   ffff8800b9dbec00 ffff8800b9dbf098 ffff8800a781bec8 ffffffff810ef5bf
  Call Trace:
   [&lt;ffffffff810ef5bf&gt;] cgroup_mkdir+0x55f/0x5f0
   [&lt;ffffffff811c90ae&gt;] vfs_mkdir+0xee/0x140
   [&lt;ffffffff811cb07e&gt;] SyS_mkdirat+0x6e/0xf0
   [&lt;ffffffff811c6a19&gt;] SyS_mkdir+0x19/0x20
   [&lt;ffffffff8169e569&gt;] system_call_fastpath+0x16/0x1b

This patch moves reference bumping inside online_css() loop, clears
css_ar[] as css's are brought online successfully, and updates
err_destroy path so that either a css is fully online and destroyed by
cgroup_destroy_locked() or the error path frees it.  This duplicates
the css free logic in the error path but it will be cleaned up soon.
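
The ownership rule can be sketched as follows.  This is a simplified
Python model, not the kernel code: create_group(), the dict fields and
the fail_at parameter are all hypothetical names used for illustration.

```python
def create_group(controllers, fail_at=None):
    # pending[i] is allocated but not yet online; online[i] is fully online.
    pending = [{"name": name, "online": False} for name in controllers]
    online = [None] * len(controllers)
    freed = []

    for i, css in enumerate(pending):
        if i == fail_at:
            # Destroy path: handles online entries and skips empty slots.
            for done in online:
                if done is None:
                    continue
                done["online"] = False
                freed.append(done["name"])
            # Error path: frees whatever never came online.
            for left in pending:
                if left is not None:
                    freed.append(left["name"])
            return None, freed
        css["online"] = True
        online[i] = css
        pending[i] = None  # ownership moves to the online array
    return online, freed
```

Every entry is owned by exactly one of the two arrays at any time, so
a failure part-way through frees each entry once and never touches an
empty slot.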

v2: Li pointed out that cgroup_destroy_locked() would do NULL-deref if
    invoked with a cgroup which doesn't have all css's populated.
    Update cgroup_destroy_locked() so that it skips NULL css's.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Reported-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Cc: stable@vger.kernel.org # v3.12+
</content>
</entry>
<entry>
<title>cgroup: fix cgroup_subsys_state leak for seq_files</title>
<updated>2013-11-27T23:16:21Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-11-27T23:16:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e605b36575e896edd8161534550c9ea021b03bc0'/>
<id>urn:sha1:e605b36575e896edd8161534550c9ea021b03bc0</id>
<content type='text'>
If a cgroup file implements either read_map() or read_seq_string(),
such file is served using seq_file by overriding file-&gt;f_op to
cgroup_seqfile_operations, which also overrides the release method to
single_release() from cgroup_file_release().

Because cgroup_file_open() didn't use to acquire any resources, this
used to be fine, but since f7d58818ba42 ("cgroup: pin
cgroup_subsys_state when opening a cgroupfs file"), cgroup_file_open()
pins the css (cgroup_subsys_state) which is put by
cgroup_file_release().  The patch forgot to update the release path
for seq_files and each open/release cycle leaks a css reference.

Fix it by updating cgroup_file_release() to also handle seq_files and
using it for seq_file release path too.
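
A toy Python model of the leak and the fix.  The Css class and the
function names are hypothetical stand-ins for the open and release
paths described above, not the kernel functions themselves.

```python
class Css:
    def __init__(self):
        self.refcnt = 0


def file_open(css, uses_seq_file):
    css.refcnt += 1  # open pins the css
    return {"css": css, "seq": uses_seq_file}


def release_buggy(handle):
    if handle["seq"]:
        return  # seq_file-only release: the pin is never dropped
    handle["css"].refcnt -= 1


def release_fixed(handle):
    if handle["seq"]:
        pass  # seq_file teardown would happen here
    handle["css"].refcnt -= 1  # always drop the pin taken at open
```

With release_buggy() each open and release of a seq_file-backed file
leaves refcnt one higher; with release_fixed() it returns to zero.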

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org # v3.12
</content>
</entry>
<entry>
<title>cgroup: use a dedicated workqueue for cgroup destruction</title>
<updated>2013-11-22T22:14:39Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-11-22T22:14:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e5fca243abae1445afbfceebda5f08462ef869d3'/>
<id>urn:sha1:e5fca243abae1445afbfceebda5f08462ef869d3</id>
<content type='text'>
Since be44562613851 ("cgroup: remove synchronize_rcu() from
cgroup_diput()"), cgroup destruction path makes use of workqueue.  css
freeing is performed from a work item from that point on and a later
commit, ea15f8ccdb430 ("cgroup: split cgroup destruction into two
steps"), moves css offlining to workqueue too.

As cgroup destruction isn't depended upon for memory reclaim, the
destruction work items were put on the system_wq; unfortunately, some
controllers may block in the destruction path for a considerable
duration while holding cgroup_mutex.  As a large part of the
destruction path is synchronized through cgroup_mutex, when combined
with a high rate of cgroup removals, this has the potential to fill up
system_wq's max_active of 256.

Also, it turns out that memcg's css destruction path ends up queueing
and waiting for work items on system_wq through work_on_cpu().  If
such operation happens while system_wq is fully occupied by cgroup
destruction work items, work_on_cpu() can't make forward progress
because system_wq is full and other destruction work items on
system_wq can't make forward progress because the work item waiting
for work_on_cpu() is holding cgroup_mutex, leading to deadlock.

This can be fixed by queueing destruction work items on a separate
workqueue.  This patch creates a dedicated workqueue -
cgroup_destroy_wq - for this purpose.  As these work items shouldn't
have inter-dependencies and are mostly serialized by cgroup_mutex
anyway, giving a high concurrency level doesn't buy anything and the
workqueue's @max_active is set to 1 so that destruction work items are
executed one by one on each CPU.

Hugh Dickins: Because cgroup_init() is run before init_workqueues(),
cgroup_destroy_wq can't be allocated from cgroup_init().  Do it from a
separate core_initcall().  In the future, we probably want to reorder
so that workqueue init happens before cgroup_init().

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Reported-by: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Link: http://lkml.kernel.org/r/20131111220626.GA7509@sbohrermbp13-local.rgmadvisors.com
Link: http://lkml.kernel.org/g/alpine.LNX.2.00.1310301606080.2333@eggly.anvils
Cc: stable@vger.kernel.org # v3.9+
</content>
</entry>
<entry>
<title>consolidate simple -&gt;d_delete() instances</title>
<updated>2013-11-16T03:04:17Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2013-10-25T22:47:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b26d4cd385fc51e8844e2cdf9ba2051f5bba11a5'/>
<id>urn:sha1:b26d4cd385fc51e8844e2cdf9ba2051f5bba11a5</id>
<content type='text'>
Rename simple_delete_dentry() to always_delete_dentry() and export it.
Export simple_dentry_operations, while we are at it, and get rid of
their duplicates.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2013-11-13T06:21:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-11-13T06:21:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a9986464564609dd0962e6023513f7d3d313dc80'/>
<id>urn:sha1:a9986464564609dd0962e6023513f7d3d313dc80</id>
<content type='text'>
Pull cgroup changes from Tejun Heo:
 "Not too much activity this time around.  css_id is finally killed and
  a minor update to device_cgroup"

* 'for-3.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  device_cgroup: remove can_attach
  cgroup: kill css_id
  memcg: stop using css id
  memcg: fail to create cgroup if the cgroup id is too big
  memcg: convert to use cgroup id
  memcg: convert to use cgroup_is_descendant()
</content>
</entry>
<entry>
<title>Merge branch 'for-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2013-10-22T07:20:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-10-22T07:20:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ee7eafc907db64ef4cbe8a17da3a1089cbe50617'/>
<id>urn:sha1:ee7eafc907db64ef4cbe8a17da3a1089cbe50617</id>
<content type='text'>
Pull cgroup fixes from Tejun Heo:
 "Two late fixes for cgroup.

  One fixes descendant walk introduced during this rc1 cycle.  The other
  fixes a post 3.9 bug during task attach which can lead to hang.  Both
  fixes are critical and the fixes are relatively straight-forward"

* 'for-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix to break the while loop in cgroup_attach_task() correctly
  cgroup: fix cgroup post-order descendant walk of empty subtree
</content>
</entry>
<entry>
<title>cgroup: fix to break the while loop in cgroup_attach_task() correctly</title>
<updated>2013-10-13T20:07:10Z</updated>
<author>
<name>Anjana V Kumar</name>
<email>anjanavk12@gmail.com</email>
</author>
<published>2013-10-12T02:59:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ea84753c98a7ac6b74e530b64c444a912b3835ca'/>
<id>urn:sha1:ea84753c98a7ac6b74e530b64c444a912b3835ca</id>
<content type='text'>
Both Anjana and Eunki reported a stall in the while_each_thread loop
in cgroup_attach_task().

It's because, when we attach a single thread to a cgroup, if the task
is exiting or is already in that cgroup, we won't break the loop.

If the task is already in the cgroup, the bug can lead to another thread
being attached to the cgroup unexpectedly:

  # echo 5207 &gt; tasks
  # cat tasks
  5207
  # echo 5207 &gt; tasks
  # cat tasks
  5207
  5215

What's worse, if the task to be attached isn't the leader of the thread
group, we might never exit the loop, hence the cpu stall. Thanks to
Oleg for the analysis.
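
The unexpected attach can be reproduced with a small Python model.
This is a sketch under assumptions: the function names are
hypothetical, the thread group is a plain list, and the walk is
bounded, so it shows the skipped break but not the stall.

```python
def attach_buggy(members, threads, start, threadgroup=False):
    first = threads.index(start)
    for step in range(len(threads)):
        tid = threads[(first + step) % len(threads)]
        if tid in members:
            continue  # jumps to the next thread, skipping the break below
        members.append(tid)
        if not threadgroup:
            break


def attach_fixed(members, threads, start, threadgroup=False):
    first = threads.index(start)
    for step in range(len(threads)):
        tid = threads[(first + step) % len(threads)]
        if tid not in members:
            members.append(tid)
        if not threadgroup:
            break  # always reached for a single-task attach


threads = [5207, 5215]
buggy, fixed = [], []
for _ in range(2):
    attach_buggy(buggy, threads, 5207)
    attach_fixed(fixed, threads, 5207)
```

After the second attach of 5207, the buggy list also holds 5215 while
the fixed list still holds only 5207.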

This bug was introduced by commit 081aa458c38ba576bdd4265fc807fa95b48b9e79
("cgroup: consolidate cgroup_attach_task() and cgroup_attach_proc()").

[ lizf: - fixed the first continue, pointed out by Oleg,
        - rewrote changelog. ]

Cc: &lt;stable@vger.kernel.org&gt; # 3.9+
Reported-by: Eunki Kim &lt;eunki_kim@samsung.com&gt;
Reported-by: Anjana V Kumar &lt;anjanavk12@gmail.com&gt;
Signed-off-by: Anjana V Kumar &lt;anjanavk12@gmail.com&gt;
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: kill css_id</title>
<updated>2013-09-24T01:44:16Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2013-09-23T08:57:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2ff2a7d03bbe472ed44a8380dbdbea490d81c59d'/>
<id>urn:sha1:2ff2a7d03bbe472ed44a8380dbdbea490d81c59d</id>
<content type='text'>
The only user of css_id was memcg, and it has been converted to use
cgroup-&gt;id, so kill css_id.

Signed-off-by: Li Zefan &lt;lizefan@huwei.com&gt;
Reviewed-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: fix cgroup post-order descendant walk of empty subtree</title>
<updated>2013-09-10T13:41:00Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-09-06T19:31:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=58b79a91f57efec9457de8ff93a4cc4fb8daf753'/>
<id>urn:sha1:58b79a91f57efec9457de8ff93a4cc4fb8daf753</id>
<content type='text'>
bd8815a6d8 ("cgroup: make css_for_each_descendant() and friends
include the origin css in the iteration") updated cgroup descendant
iterators to include the origin css; unfortunately, it forgot to drop
special case handling in css_next_descendant_post() for empty subtree
leading to failure to visit the origin css without any child.

Fix it by dropping the special case handling and always returning the
leftmost descendant on the first iteration.
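
A minimal Python sketch of a post-order walk that includes the origin.
The Node class and the helper names are hypothetical; they only mirror
the rule stated above and are not the kernel iterator.

```python
class Node:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def leftmost(node):
    while node.children:
        node = node.children[0]
    return node


def next_post(pos, root):
    if pos is None:
        # First step: leftmost descendant, which is root itself if childless.
        return leftmost(root)
    if pos is root:
        return None  # the origin is visited last
    siblings = pos.parent.children
    idx = siblings.index(pos)
    if idx + 1 != len(siblings):
        return leftmost(siblings[idx + 1])
    return pos.parent


def walk_post(root):
    out, pos = [], next_post(None, root)
    while pos is not None:
        out.append(pos.name)
        pos = next_post(pos, root)
    return out
```

Because the first step has no special case for an empty subtree, a
childless origin is returned as its own leftmost descendant and is
still visited.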

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
</content>
</entry>
</feed>
