<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/cgroup.c, branch v4.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-06-23T21:23:12Z</updated>
<entry>
<title>cgroup: Disable IRQs while holding css_set_lock</title>
<updated>2016-06-23T21:23:12Z</updated>
<author>
<name>Daniel Bristot de Oliveira</name>
<email>bristot@redhat.com</email>
</author>
<published>2016-06-22T20:28:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=82d6489d0fed2ec8a8c48c19e8d8a04ac8e5bb26'/>
<id>urn:sha1:82d6489d0fed2ec8a8c48c19e8d8a04ac8e5bb26</id>
<content type='text'>
While testing the deadline scheduler + cgroup setup I hit this
warning.

[  132.612935] ------------[ cut here ]------------
[  132.612951] WARNING: CPU: 5 PID: 0 at kernel/softirq.c:150 __local_bh_enable_ip+0x6b/0x80
[  132.612952] Modules linked in: (a ton of modules...)
[  132.612981] CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.7.0-rc2 #2
[  132.612981] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.2-20150714_191134- 04/01/2014
[  132.612982]  0000000000000086 45c8bb5effdd088b ffff88013fd43da0 ffffffff813d229e
[  132.612984]  0000000000000000 0000000000000000 ffff88013fd43de0 ffffffff810a652b
[  132.612985]  00000096811387b5 0000000000000200 ffff8800bab29d80 ffff880034c54c00
[  132.612986] Call Trace:
[  132.612987]  &lt;IRQ&gt;  [&lt;ffffffff813d229e&gt;] dump_stack+0x63/0x85
[  132.612994]  [&lt;ffffffff810a652b&gt;] __warn+0xcb/0xf0
[  132.612997]  [&lt;ffffffff810e76a0&gt;] ? push_dl_task.part.32+0x170/0x170
[  132.612999]  [&lt;ffffffff810a665d&gt;] warn_slowpath_null+0x1d/0x20
[  132.613000]  [&lt;ffffffff810aba5b&gt;] __local_bh_enable_ip+0x6b/0x80
[  132.613008]  [&lt;ffffffff817d6c8a&gt;] _raw_write_unlock_bh+0x1a/0x20
[  132.613010]  [&lt;ffffffff817d6c9e&gt;] _raw_spin_unlock_bh+0xe/0x10
[  132.613015]  [&lt;ffffffff811388ac&gt;] put_css_set+0x5c/0x60
[  132.613016]  [&lt;ffffffff8113dc7f&gt;] cgroup_free+0x7f/0xa0
[  132.613017]  [&lt;ffffffff810a3912&gt;] __put_task_struct+0x42/0x140
[  132.613018]  [&lt;ffffffff810e776a&gt;] dl_task_timer+0xca/0x250
[  132.613027]  [&lt;ffffffff810e76a0&gt;] ? push_dl_task.part.32+0x170/0x170
[  132.613030]  [&lt;ffffffff8111371e&gt;] __hrtimer_run_queues+0xee/0x270
[  132.613031]  [&lt;ffffffff81113ec8&gt;] hrtimer_interrupt+0xa8/0x190
[  132.613034]  [&lt;ffffffff81051a58&gt;] local_apic_timer_interrupt+0x38/0x60
[  132.613035]  [&lt;ffffffff817d9b0d&gt;] smp_apic_timer_interrupt+0x3d/0x50
[  132.613037]  [&lt;ffffffff817d7c5c&gt;] apic_timer_interrupt+0x8c/0xa0
[  132.613038]  &lt;EOI&gt;  [&lt;ffffffff81063466&gt;] ? native_safe_halt+0x6/0x10
[  132.613043]  [&lt;ffffffff81037a4e&gt;] default_idle+0x1e/0xd0
[  132.613044]  [&lt;ffffffff810381cf&gt;] arch_cpu_idle+0xf/0x20
[  132.613046]  [&lt;ffffffff810e8fda&gt;] default_idle_call+0x2a/0x40
[  132.613047]  [&lt;ffffffff810e92d7&gt;] cpu_startup_entry+0x2e7/0x340
[  132.613048]  [&lt;ffffffff81050235&gt;] start_secondary+0x155/0x190
[  132.613049] ---[ end trace f91934d162ce9977 ]---

The warning is triggered by spin_(lock|unlock)_bh(&amp;css_set_lock) being
called in interrupt context.  Convert spin_lock_bh to
spin_lock_irq(save) to avoid this problem, and other problems of
sharing a spinlock with an interrupt handler.

Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Juri Lelli &lt;juri.lelli@arm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: cgroups@vger.kernel.org
Cc: stable@vger.kernel.org # 4.5+
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Reviewed-by: "Luis Claudio R. Goncalves" &lt;lgoncalv@redhat.com&gt;
Signed-off-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Acked-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: set css-&gt;id to -1 during init</title>
<updated>2016-06-16T21:59:35Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-05-26T19:42:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8fa3b8d689a54d6d04ff7803c724fb7aca6ce98e'/>
<id>urn:sha1:8fa3b8d689a54d6d04ff7803c724fb7aca6ce98e</id>
<content type='text'>
If percpu_ref initialization fails during css_create(), the free path
can end up trying to free css-&gt;id of zero.  As ID 0 is unused, it
doesn't cause a critical breakage but it does trigger a warning
message.  Fix it by setting css-&gt;id to -1 from init_and_link_css().
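
The effect of the sentinel can be sketched outside the kernel.  This is a
minimal Python model, not the kernel's idr API: IdTable, NO_ID and
failed_create() are hypothetical names, and skipping the sentinel on
removal is an assumption of the sketch.

```python
# Hypothetical model of an ID table; not the kernel's idr API.
NO_ID = -1  # sentinel meaning "no ID allocated yet"

class IdTable:
    def __init__(self):
        self.used = set()
        self.warnings = []

    def alloc(self):
        # ID 0 is left unused, mirroring the commit message.
        new_id = 1
        while new_id in self.used:
            new_id += 1
        self.used.add(new_id)
        return new_id

    def remove(self, id_):
        if id_ == NO_ID:
            return  # assumption: the sentinel is skipped silently
        if id_ not in self.used:
            self.warnings.append("freeing unallocated id %d" % id_)
            return
        self.used.remove(id_)

def failed_create(table, initial_id):
    # Creation fails before an ID is allocated; the free path then
    # runs on whatever value initialization left behind.
    css_id = initial_id
    table.remove(css_id)
    return list(table.warnings)
```

In this model, failed_create(IdTable(), 0) records one warning, while
failed_create(IdTable(), NO_ID) records none.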

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Wenwei Tao &lt;ww.tao0320@gmail.com&gt;
Fixes: 01e586598b22 ("cgroup: release css-&gt;id after css_free")
Cc: stable@vger.kernel.org # v4.0+
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: remove redundant cleanup in css_create</title>
<updated>2016-05-26T19:09:23Z</updated>
<author>
<name>Wenwei Tao</name>
<email>ww.tao0320@gmail.com</email>
</author>
<published>2016-05-13T14:59:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b00c52dae6d9ee8d0f2407118ef6544ae5524781'/>
<id>urn:sha1:b00c52dae6d9ee8d0f2407118ef6544ae5524781</id>
<content type='text'>
When css creation fails, we remove the css id and exit the percpu_ref
before calling css_free_rcu_fn, but css_free_work_fn does both again,
so the earlier cleanup is redundant.  Removing the css id twice is
especially problematic, since the id may be assigned to another css
after the first removal.

tj: This was broken by two commits updating the free path without
    synchronizing the creation failure path.  This can be easily
    triggered by trying to create more than 64k memory cgroups.
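
The double-removal hazard can be illustrated with a small Python model.
This is only a sketch: IdAllocator and double_remove_demo() are
hypothetical names, and the lowest-free-ID reuse policy is an assumption
made to keep the example deterministic.

```python
# Hypothetical ID allocator that reuses the lowest free ID.
class IdAllocator:
    def __init__(self):
        self.owner = {}  # id -> owner name

    def alloc(self, owner):
        new_id = 1
        while new_id in self.owner:
            new_id += 1
        self.owner[new_id] = owner
        return new_id

    def remove(self, id_):
        # Returns the owner that lost the ID, or None if it was free.
        return self.owner.pop(id_, None)

def double_remove_demo():
    ids = IdAllocator()
    failed = ids.alloc("css_failed")
    ids.remove(failed)              # cleanup in the creation failure path
    other = ids.alloc("css_other")  # the same ID is handed out again
    victim = ids.remove(failed)     # second cleanup in the free path
    return failed == other, victim
```

Here the second removal takes the ID away from "css_other", which never
failed.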

Signed-off-by: Wenwei Tao &lt;ww.tao0320@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Fixes: 9a1049da9bd2 ("percpu-refcount: require percpu_ref to be exited explicitly")
Fixes: 01e586598b22 ("cgroup: release css-&gt;id after css_free")
Cc: stable@vger.kernel.org # v3.17+
</content>
</entry>
<entry>
<title>cgroup: fix compile warning</title>
<updated>2016-05-12T15:05:27Z</updated>
<author>
<name>Felipe Balbi</name>
<email>felipe.balbi@linux.intel.com</email>
</author>
<published>2016-05-12T09:34:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=09be4c824ebdbf3c043a07d2d9537a0164a1ecfe'/>
<id>urn:sha1:09be4c824ebdbf3c043a07d2d9537a0164a1ecfe</id>
<content type='text'>
Commit 4f41fc59620f ("cgroup, kernfs: make mountinfo show properly
scoped path for cgroup namespaces") added the following compile
warning:

kernel/cgroup.c: In function ‘cgroup_show_path’:
kernel/cgroup.c:1634:15: warning: unused variable ‘ret’ [-Wunused-variable]
  int len = 0, ret = 0;
               ^
Fix it.

Fixes: 4f41fc59620f ("cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces")
Signed-off-by: Felipe Balbi &lt;felipe.balbi@linux.intel.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup, kernfs: make mountinfo show properly scoped path for cgroup namespaces</title>
<updated>2016-05-09T16:15:03Z</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serge.hallyn@ubuntu.com</email>
</author>
<published>2016-05-09T14:59:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4f41fc59620fcedaa97cbdf3d7d2956d80fcd922'/>
<id>urn:sha1:4f41fc59620fcedaa97cbdf3d7d2956d80fcd922</id>
<content type='text'>
Patch summary:

When showing a cgroupfs entry in mountinfo, show the path of the mount
root dentry relative to the reader's cgroup namespace root.

Short explanation (courtesy of mkerrisk):

If we create a new cgroup namespace, then we want both /proc/self/cgroup
and /proc/self/mountinfo to show cgroup paths that are correctly
virtualized with respect to the cgroup mount point.  Prior to this
patch, /proc/self/cgroup shows the right info, but /proc/self/mountinfo
does not.

Long version:

When a uid 0 task in freezer cgroup /a/b unshares a new cgroup
namespace and then mounts a new instance of the freezer cgroup, the new
mount will be rooted at /a/b.  The root dentry field of the mountinfo
entry will show '/a/b'.

 cat &gt; /tmp/do1 &lt;&lt; EOF
 mount -t cgroup -o freezer freezer /mnt
 grep freezer /proc/self/mountinfo
 EOF

 unshare -Gm  bash /tmp/do1
 &gt; 330 160 0:34 / /sys/fs/cgroup/freezer rw,nosuid,nodev,noexec,relatime - cgroup cgroup rw,freezer
 &gt; 355 133 0:34 /a/b /mnt rw,relatime - cgroup freezer rw,freezer

The task's freezer cgroup entry in /proc/self/cgroup will simply show
'/':

 grep freezer /proc/self/cgroup
 9:freezer:/

If instead the same task simply bind mounts the /a/b cgroup directory,
the resulting mountinfo entry will again show /a/b for the dentry root.
However in this case the task will find its own cgroup at /mnt/a/b,
not at /mnt:

 mount --bind /sys/fs/cgroup/freezer/a/b /mnt
 130 25 0:34 /a/b /mnt rw,nosuid,nodev,noexec,relatime shared:21 - cgroup cgroup rw,freezer

In other words, there is no way for the task to know, based on what is
in mountinfo, which cgroup directory is its own.

Example (by mkerrisk):

First, a little script to save some typing and verbiage:

echo -e "\t/proc/self/cgroup:\t$(cat /proc/self/cgroup | grep freezer)"
cat /proc/self/mountinfo | grep freezer |
        awk '{print "\tmountinfo:\t\t" $4 "\t" $5}'

Create cgroup, place this shell into the cgroup, and look at the state
of the /proc files:

2653
2653                         # Our shell
14254                        # cat(1)
        /proc/self/cgroup:      10:freezer:/a/b
        mountinfo:              /       /sys/fs/cgroup/freezer

Create a shell in new cgroup and mount namespaces. The act of creating
a new cgroup namespace causes the process's current cgroups directories
to become its cgroup root directories. (Here, I'm using my own version
of the "unshare" utility, which takes the same options as the util-linux
version):

Look at the state of the /proc files:

        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /       /sys/fs/cgroup/freezer

The third entry in /proc/self/cgroup (the pathname of the cgroup inside
the hierarchy) is correctly virtualized w.r.t. the cgroup namespace, which
is rooted at /a/b in the outer namespace.

However, the info in /proc/self/mountinfo is not for this cgroup
namespace, since we are seeing a duplicate of the mount from the
old mount namespace, and the info there does not correspond to the
new cgroup namespace. Creating a new mount still doesn't show us
the right information in mountinfo either:

                                      # propagating to other mountns
        /proc/self/cgroup:      7:freezer:/
        mountinfo:              /a/b    /mnt/freezer

The act of creating a new cgroup namespace caused the process's
current freezer directory, "/a/b", to become its cgroup freezer root
directory. In other words, the pathname of the root directory within
the newly mounted cgroup filesystem should be "/", but mountinfo
wrongly shows us "/a/b". The consequence of this is
that the process in the cgroup namespace cannot correctly construct
the pathname of its cgroup root directory from the information in
/proc/PID/mountinfo.

With this patch, the dentry root field in mountinfo is shown relative
to the reader's cgroup namespace.  So the same steps as above:

        /proc/self/cgroup:      10:freezer:/a/b
        mountinfo:              /       /sys/fs/cgroup/freezer
        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /../..  /sys/fs/cgroup/freezer
        /proc/self/cgroup:      10:freezer:/
        mountinfo:              /       /mnt/freezer

cgroup.clone_children  freezer.parent_freezing  freezer.state      tasks
cgroup.procs           freezer.self_freezing    notify_on_release
3164
2653                   # First shell that was placed in this cgroup
3164                   # Shell started by 'unshare'
14197                  # cat(1)
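
The "relative to the reader's cgroup namespace" rule can be sketched in
Python.  This is not the kernel implementation; scoped_root() is a
hypothetical helper that takes the mount's root dentry path and the
reader's namespace root, both as absolute cgroup paths.

```python
import posixpath

def scoped_root(mount_root, ns_root):
    # Path of the mount's root dentry as seen from the reader's
    # cgroup namespace root; both arguments are absolute cgroup paths.
    rel = posixpath.relpath(mount_root, ns_root)
    if rel == ".":
        return "/"
    return "/" + rel
```

With a namespace rooted at /a/b, scoped_root("/", "/a/b") gives "/../.."
and scoped_root("/a/b", "/a/b") gives "/", matching the mountinfo fields
shown above.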

Signed-off-by: Serge Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Tested-by: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Acked-by: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup, cpuset: replace cpuset_post_attach_flush() with cgroup_subsys-&gt;post_attach callback</title>
<updated>2016-04-25T19:45:14Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-04-21T23:06:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5cf1cacb49aee39c3e02ae87068fc3c6430659b0'/>
<id>urn:sha1:5cf1cacb49aee39c3e02ae87068fc3c6430659b0</id>
<content type='text'>
Since e93ad19d0564 ("cpuset: make mm migration asynchronous"), cpuset
kicks off asynchronous NUMA node migration if necessary during task
migration and flushes it from cpuset_post_attach_flush() which is
called at the end of __cgroup_procs_write().  This is to avoid
performing migration with cgroup_threadgroup_rwsem write-locked which
can lead to deadlock through dependency on kworker creation.

memcg has a similar issue with charge moving, so let's convert it to
an official callback rather than the current one-off cpuset specific
function.  This patch adds cgroup_subsys-&gt;post_attach callback and
makes cpuset register cpuset_post_attach_flush() as its -&gt;post_attach.

The conversion is mostly one-to-one except that the new callback is
called under cgroup_mutex.  This is to guarantee that no other
migration operations are started before -&gt;post_attach callbacks are
finished.  cgroup_mutex is one of the outermost mutexes in the system
and has never been and shouldn't be a problem.  We can add specialized
synchronization around __cgroup_procs_write() but I don't think
there's any noticeable benefit.
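
The callback arrangement can be sketched as follows.  This is a Python
model under stated assumptions, not kernel code: threading.Lock stands in
for cgroup_mutex, procs_write() stands in for __cgroup_procs_write(), and
the events list passed to the callback is a hypothetical addition used
only to record ordering.

```python
import threading

cgroup_mutex = threading.Lock()  # stand-in for the kernel's cgroup_mutex

class Subsys:
    def __init__(self, name, post_attach=None):
        self.name = name
        self.post_attach = post_attach  # optional callback

def cpuset_post_attach_flush(events):
    events.append("cpuset flush")

def procs_write(subsystems, events):
    # Callbacks run after the migration itself but before the mutex
    # is released, so no other migration can start in between.
    with cgroup_mutex:
        events.append("migrate")
        for ss in subsystems:
            if ss.post_attach is not None:
                ss.post_attach(events)
```

Subsystems that do not register a callback are simply skipped.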

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: &lt;stable@vger.kernel.org&gt; # 4.4+ prerequisite for the next patch
</content>
</entry>
<entry>
<title>Merge branch 'for-4.6-ns' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2016-03-21T17:05:13Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-03-21T17:05:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5518f66b5a64b76fd602a7baf60590cd838a2ca0'/>
<id>urn:sha1:5518f66b5a64b76fd602a7baf60590cd838a2ca0</id>
<content type='text'>
Pull cgroup namespace support from Tejun Heo:
 "These are changes to implement namespace support for cgroup which has
  been pending for quite some time now.  It is very straight-forward and
  only affects what part of cgroup hierarchies are visible.

  After unsharing, mounting a cgroup fs will be scoped to the cgroups
  the task belonged to at the time of unsharing and the cgroup paths
  exposed to userland would be adjusted accordingly"

* 'for-4.6-ns' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix and restructure error handling in copy_cgroup_ns()
  cgroup: fix alloc_cgroup_ns() error handling in copy_cgroup_ns()
  Add FS_USERNS_FLAG to cgroup fs
  cgroup: Add documentation for cgroup namespaces
  cgroup: mount cgroupns-root when inside non-init cgroupns
  kernfs: define kernfs_node_dentry
  cgroup: cgroup namespace setns support
  cgroup: introduce cgroup namespaces
  sched: new clone flag CLONE_NEWCGROUP for cgroup namespace
  kernfs: Add API to generate relative kernfs path
</content>
</entry>
<entry>
<title>cgroup: avoid false positive gcc-6 warning</title>
<updated>2016-03-16T20:32:23Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2016-03-14T23:21:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cfe02a8a973e7e5f66926b8ae38dfce404b19e29'/>
<id>urn:sha1:cfe02a8a973e7e5f66926b8ae38dfce404b19e29</id>
<content type='text'>
When all subsystems are disabled, gcc notices that cgroup_subsys_enabled_key
is a zero-length array and that any access to it must be out of bounds:

In file included from ../include/linux/cgroup.h:19:0,
                 from ../kernel/cgroup.c:31:
../kernel/cgroup.c: In function 'cgroup_add_cftypes':
../kernel/cgroup.c:261:53: error: array subscript is above array bounds [-Werror=array-bounds]
  return static_key_enabled(cgroup_subsys_enabled_key[ssid]);
                            ~~~~~~~~~~~~~~~~~~~~~~~~~^~~~~~
../include/linux/jump_label.h:271:40: note: in definition of macro 'static_key_enabled'
  static_key_count((struct static_key *)x) &gt; 0;    \
                                        ^

We should never call the function in this particular case, so this is
not a bug. In order to silence the warning, this adds an explicit check
for the CGROUP_SUBSYS_COUNT==0 case.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: ignore css_sets associated with dead cgroups during migration</title>
<updated>2016-03-16T20:31:46Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-03-16T00:43:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2b021cbf3cb6208f0d40fd2f1869f237934340ed'/>
<id>urn:sha1:2b021cbf3cb6208f0d40fd2f1869f237934340ed</id>
<content type='text'>
Before 2e91fa7f6d45 ("cgroup: keep zombies associated with their
original cgroups"), all dead tasks were associated with init_css_set.
If a zombie task is requested for migration, while migration prep
operations would still be performed on init_css_set, the actual
migration would ignore zombie tasks.  As init_css_set is always valid,
this worked fine.

However, after 2e91fa7f6d45, a zombie task stays with the css_set it
was associated with at the time of death.  Let's say a task T is
associated with cgroup A on hierarchy H-1 and cgroup B on hierarchy
H-2.  After T
becomes a zombie, it would still remain associated with A and B.  If A
only contains zombie tasks, it can be removed.  On removal, A gets
marked offline but stays pinned until all zombies are drained.  At
this point, if migration is initiated on T to a cgroup C on hierarchy
H-2, migration path would try to prepare T's css_set for migration and
trigger the following.

 WARNING: CPU: 0 PID: 1576 at kernel/cgroup.c:474 cgroup_get+0x121/0x160()
 CPU: 0 PID: 1576 Comm: bash Not tainted 4.4.0-work+ #289
 ...
 Call Trace:
  [&lt;ffffffff8127e63c&gt;] dump_stack+0x4e/0x82
  [&lt;ffffffff810445e8&gt;] warn_slowpath_common+0x78/0xb0
  [&lt;ffffffff810446d5&gt;] warn_slowpath_null+0x15/0x20
  [&lt;ffffffff810c33e1&gt;] cgroup_get+0x121/0x160
  [&lt;ffffffff810c349b&gt;] link_css_set+0x7b/0x90
  [&lt;ffffffff810c4fbc&gt;] find_css_set+0x3bc/0x5e0
  [&lt;ffffffff810c5269&gt;] cgroup_migrate_prepare_dst+0x89/0x1f0
  [&lt;ffffffff810c7547&gt;] cgroup_attach_task+0x157/0x230
  [&lt;ffffffff810c7a17&gt;] __cgroup_procs_write+0x2b7/0x470
  [&lt;ffffffff810c7bdc&gt;] cgroup_tasks_write+0xc/0x10
  [&lt;ffffffff810c4790&gt;] cgroup_file_write+0x30/0x1b0
  [&lt;ffffffff811c68fc&gt;] kernfs_fop_write+0x13c/0x180
  [&lt;ffffffff81151673&gt;] __vfs_write+0x23/0xe0
  [&lt;ffffffff81152494&gt;] vfs_write+0xa4/0x1a0
  [&lt;ffffffff811532d4&gt;] SyS_write+0x44/0xa0
  [&lt;ffffffff814af2d7&gt;] entry_SYSCALL_64_fastpath+0x12/0x6f

It doesn't make sense to prepare migration for css_sets pointing to
dead cgroups as they are guaranteed to contain only zombies which are
ignored later during migration.  This patch makes cgroup destruction
path mark all affected css_sets as dead and updates the migration path
to ignore them during preparation.
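
The approach can be sketched in Python.  This is a simplified model, not
the kernel's data structures: CssSet, destroy_cgroup() and
prepare_migration() are hypothetical names chosen for illustration.

```python
class CssSet:
    def __init__(self, name, cgroups):
        self.name = name
        self.cgroups = set(cgroups)
        self.dead = False

def destroy_cgroup(cgroup, css_sets):
    # Destruction marks every css_set that points to the cgroup as dead.
    for cset in css_sets:
        if cgroup in cset.cgroups:
            cset.dead = True

def prepare_migration(css_sets):
    # Dead css_sets hold only zombies, so preparation skips them.
    return [cset.name for cset in css_sets if not cset.dead]
```

After destroy_cgroup("A", ...), a css_set spanning A and B is skipped
during preparation, while a css_set that only references B is still
prepared.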

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 2e91fa7f6d45 ("cgroup: keep zombies associated with their original cgroups")
Cc: stable@vger.kernel.org # v4.4+
</content>
</entry>
<entry>
<title>cgroup: implement cgroup_subsys-&gt;implicit_on_dfl</title>
<updated>2016-03-08T16:51:26Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2016-03-08T16:51:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f6d635ad341d5cc0b9c7ab46adfbf3bf5886cee4'/>
<id>urn:sha1:f6d635ad341d5cc0b9c7ab46adfbf3bf5886cee4</id>
<content type='text'>
Some controllers, perf_event for now and possibly freezer in the
future, don't really make sense to control explicitly through
"cgroup.subtree_control".  For example, the primary role of perf_event
is identifying the cgroups of tasks; however, because the controller
also keeps a small amount of state per cgroup, it can't be replaced
with simple cgroup membership tests.

This patch implements cgroup_subsys-&gt;implicit_on_dfl flag.  When set,
the controller is implicitly enabled on all cgroups on the v2
hierarchy so that utility type controllers such as perf_event can be
enabled and function transparently.

An implicit controller doesn't show up in "cgroup.controllers" or
"cgroup.subtree_control", is exempt from the no-internal-process rule
and can be stolen from the default hierarchy even if there are
non-root csses.
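
The visibility rule can be sketched in Python.  This is a toy model, not
the kernel's logic: Subsys, visible_controllers() and
enabled_controllers() are hypothetical names, and only the listing and
enablement behaviour described above is modelled.

```python
class Subsys:
    def __init__(self, name, implicit_on_dfl=False):
        self.name = name
        self.implicit_on_dfl = implicit_on_dfl

def visible_controllers(subsystems):
    # Names that would be listed in "cgroup.controllers".
    return [s.name for s in subsystems if not s.implicit_on_dfl]

def enabled_controllers(subsystems, subtree_control):
    # Implicit controllers are on everywhere; others need opt-in.
    return [s.name for s in subsystems
            if s.implicit_on_dfl or s.name in subtree_control]
```

With memory, io and an implicit perf_event, only memory and io are
listed, but perf_event is enabled regardless of what is written to
"cgroup.subtree_control".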

v2: Reimplemented on top of the recent updates to css handling and
    subsystem rebinding.  Rebinding implicit subsystems is now a
    simple matter of exempting it from the busy subsystem check.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
