<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/cgroup/cgroup.c, branch v5.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2018-12-29T18:57:20Z</updated>
<entry>
<title>Merge branch 'for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2018-12-29T18:57:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-29T18:57:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6f9d71c9c759b1e7d31189a4de228983192c7dc7'/>
<id>urn:sha1:6f9d71c9c759b1e7d31189a4de228983192c7dc7</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:

 - Waiman's cgroup2 cpuset support has been finally merged closing one
   of the last remaining feature gaps.

 - cgroup.procs could show non-leader threads when cgroup2 threaded mode
   was used in certain ways. I forgot to push the fix during the last
   cycle.

 - A patch to fix mount option parsing when all mount options have been
   consumed by someone else (LSM).

 - cgroup_no_v1 boot param can now block named cgroup1 hierarchies too.

* 'for-4.21' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Add named hierarchy disabling to cgroup_no_v1 boot param
  cgroup: fix parsing empty mount option string
  cpuset: Remove set but not used variable 'cs'
  cgroup: fix CSS_TASK_ITER_PROCS
  cgroup: Add .__DEBUG__. prefix to debug file names
  cpuset: Minor cgroup2 interface updates
  cpuset: Expose cpuset.cpus.subpartitions with cgroup_debug
  cpuset: Add documentation about the new "cpuset.sched.partition" flag
  cpuset: Use descriptive text when reading/writing cpuset.sched.partition
  cpuset: Expose cpus.effective and mems.effective on cgroup v2 root
  cpuset: Make generate_sched_domains() work with partition
  cpuset: Make CPU hotplug work with partition
  cpuset: Track cpusets that use parent's effective_cpus
  cpuset: Add an error state to cpuset.sched.partition
  cpuset: Add new v2 cpuset.sched.partition flag
  cpuset: Simply allocation and freeing of cpumasks
  cpuset: Define data structures to support scheduling partition
  cpuset: Enable cpuset controller in default hierarchy
  cgroup: remove unnecessary unlikely()
</content>
</entry>
<entry>
<title>Merge tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block</title>
<updated>2018-12-28T21:19:59Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-28T21:19:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0e9da3fbf7d81f0f913b491c8de1ba7883d4f217'/>
<id>urn:sha1:0e9da3fbf7d81f0f913b491c8de1ba7883d4f217</id>
<content type='text'>
Pull block updates from Jens Axboe:
 "This is the main pull request for block/storage for 4.21.

  Larger than usual, it was a busy round with lots of goodies queued up.
  Most notable is the removal of the old IO stack, which has been a long
  time coming. No new features for a while, everything coming in this
  week has all been fixes for things that were previously merged.

  This contains:

   - Use atomic counters instead of semaphores for mtip32xx (Arnd)

   - Cleanup of the mtip32xx request setup (Christoph)

   - Fix for circular locking dependency in loop (Jan, Tetsuo)

   - bcache (Coly, Guoju, Shenghui)
      * Optimizations for writeback caching
      * Various fixes and improvements

   - nvme (Chaitanya, Christoph, Sagi, Jay, me, Keith)
      * host and target support for NVMe over TCP
      * Error log page support
      * Support for separate read/write/poll queues
      * Much improved polling
      * discard OOM fallback
      * Tracepoint improvements

   - lightnvm (Hans, Hua, Igor, Matias, Javier)
      * Igor added packed metadata to pblk. Now drives without metadata
        per LBA can be used as well.
      * Fix from Geert on uninitialized value on chunk metadata reads.
      * Fixes from Hans and Javier to pblk recovery and write path.
      * Fix from Hua Su to fix a race condition in the pblk recovery
        code.
      * Scan optimization added to pblk recovery from Zhoujie.
      * Small geometry cleanup from me.

   - Conversion of the last few drivers that used the legacy path to
     blk-mq (me)

   - Removal of legacy IO path in SCSI (me, Christoph)

   - Removal of legacy IO stack and schedulers (me)

   - Support for much better polling, now without interrupts at all.
     blk-mq adds support for multiple queue maps, which enables us to
     have a map per type. This in turn enables nvme to have separate
     completion queues for polling, which can then be interrupt-less.
     Also means we're ready for async polled IO, which is hopefully
     coming in the next release.

   - Killing of (now) unused block exports (Christoph)

   - Unification of the blk-rq-qos and blk-wbt wait handling (Josef)

   - Support for zoned testing with null_blk (Masato)

   - sx8 conversion to per-host tag sets (Christoph)

   - IO priority improvements (Damien)

   - mq-deadline zoned fix (Damien)

   - Ref count blkcg series (Dennis)

   - Lots of blk-mq improvements and speedups (me)

   - sbitmap scalability improvements (me)

   - Make core inflight IO accounting per-cpu (Mikulas)

   - Export timeout setting in sysfs (Weiping)

   - Cleanup the direct issue path (Jianchao)

   - Export blk-wbt internals in block debugfs for easier debugging
     (Ming)

   - Lots of other fixes and improvements"

* tag 'for-4.21/block-20181221' of git://git.kernel.dk/linux-block: (364 commits)
  kyber: use sbitmap add_wait_queue/list_del wait helpers
  sbitmap: add helpers for add/del wait queue handling
  block: save irq state in blkg_lookup_create()
  dm: don't reuse bio for flushes
  nvme-pci: trace SQ status on completions
  nvme-rdma: implement polling queue map
  nvme-fabrics: allow user to pass in nr_poll_queues
  nvme-fabrics: allow nvmf_connect_io_queue to poll
  nvme-core: optionally poll sync commands
  block: make request_to_qc_t public
  nvme-tcp: fix spelling mistake "attepmpt" -&gt; "attempt"
  nvme-tcp: fix endianess annotations
  nvmet-tcp: fix endianess annotations
  nvme-pci: refactor nvme_poll_irqdisable to make sparse happy
  nvme-pci: only set nr_maps to 2 if poll queues are supported
  nvmet: use a macro for default error location
  nvmet: fix comparison of a u16 with -1
  blk-mq: enable IO poll if .nr_queues of type poll &gt; 0
  blk-mq: change blk_mq_queue_busy() to blk_mq_queue_inflight()
  blk-mq: skip zero-queue maps in blk_mq_map_swqueue
  ...
</content>
</entry>
<entry>
<title>cgroup: fix parsing empty mount option string</title>
<updated>2018-12-28T18:32:57Z</updated>
<author>
<name>Ondrej Mosnacek</name>
<email>omosnace@redhat.com</email>
</author>
<published>2018-12-13T14:17:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e250d91d65750a0c0c62483ac4f9f357e7317617'/>
<id>urn:sha1:e250d91d65750a0c0c62483ac4f9f357e7317617</id>
<content type='text'>
This fixes the case where all mount options specified are consumed by an
LSM and all that's left is an empty string. In this case cgroupfs should
accept the string and not fail.

How to reproduce (with SELinux enabled):

    # umount /sys/fs/cgroup/unified
    # mount -o context=system_u:object_r:cgroup_t:s0 -t cgroup2 cgroup2 /sys/fs/cgroup/unified
    mount: /sys/fs/cgroup/unified: wrong fs type, bad option, bad superblock on cgroup2, missing codepage or helper program, or other error.
    # dmesg | tail -n 1
    [   31.575952] cgroup: cgroup2: unknown option ""

Fixes: 67e9c74b8a87 ("cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type")
[NOTE: should apply on top of commit 5136f6365ce3 ("cgroup: implement "nsdelegate" mount option"), older versions need manual rebase]
Suggested-by: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Signed-off-by: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.20-fixes' into for-4.21</title>
<updated>2018-12-28T02:05:30Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2018-12-28T02:05:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4d71c6f8771a6bccb844244f09831fa4624b22c1'/>
<id>urn:sha1:4d71c6f8771a6bccb844244f09831fa4624b22c1</id>
<content type='text'>
</content>
</entry>
<entry>
<title>blkcg: remove additional reference to the css</title>
<updated>2018-12-08T05:26:37Z</updated>
<author>
<name>Dennis Zhou</name>
<email>dennis@kernel.org</email>
</author>
<published>2018-12-05T17:10:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fc5a828bfad628c1092194f2814604943561c52d'/>
<id>urn:sha1:fc5a828bfad628c1092194f2814604943561c52d</id>
<content type='text'>
The previous patch in this series removed carrying around a pointer to
the css in blkg. However, the blkg association logic still relied on
taking a reference on the css to ensure we wouldn't fail in getting a
reference for the blkg.

Here the implicit dependency on the css is removed. The association
continues to rely on the tryget logic walking up the blkg tree. This
streamlines the three ways that association can happen: normal, swap,
and writeback.

Signed-off-by: Dennis Zhou &lt;dennis@kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>cgroups: Replace synchronize_sched() with synchronize_rcu()</title>
<updated>2018-12-01T20:38:49Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.ibm.com</email>
</author>
<published>2018-11-07T22:11:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2af3024cd78f120d027cb44b454186ba9d7dab24'/>
<id>urn:sha1:2af3024cd78f120d027cb44b454186ba9d7dab24</id>
<content type='text'>
Now that synchronize_rcu() waits for preempt-disable regions of code
as well as RCU read-side critical sections, synchronize_sched() can be
replaced by synchronize_rcu().  This commit therefore makes this change,
even though it is but a comment.

Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Dennis Zhou &lt;dennis@kernel.org&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: "Dennis Zhou (Facebook)" &lt;dennisszhou@gmail.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: fix CSS_TASK_ITER_PROCS</title>
<updated>2018-11-20T16:12:20Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2018-11-08T20:15:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e9d81a1bc2c48ea9782e3e8b53875f419766ef47'/>
<id>urn:sha1:e9d81a1bc2c48ea9782e3e8b53875f419766ef47</id>
<content type='text'>
CSS_TASK_ITER_PROCS implements process-only iteration by making
css_task_iter_advance() skip tasks which aren't threadgroup leaders;
however, when an iteration is started css_task_iter_start() calls the
inner helper function css_task_iter_advance_css_set() instead of
css_task_iter_advance().  As the helper doesn't have the skip logic,
when the first task to visit is a non-leader thread, it doesn't get
skipped correctly as shown in the following example.

  # ps -L 2030
    PID   LWP TTY      STAT   TIME COMMAND
   2030  2030 pts/0    Sl+    0:00 ./test-thread
   2030  2031 pts/0    Sl+    0:00 ./test-thread
  # mkdir -p /sys/fs/cgroup/x/a/b
  # echo threaded &gt; /sys/fs/cgroup/x/a/cgroup.type
  # echo threaded &gt; /sys/fs/cgroup/x/a/b/cgroup.type
  # echo 2030 &gt; /sys/fs/cgroup/x/a/cgroup.procs
  # cat /sys/fs/cgroup/x/a/cgroup.threads
  2030
  2031
  # cat /sys/fs/cgroup/x/cgroup.procs
  2030
  # echo 2030 &gt; /sys/fs/cgroup/x/a/b/cgroup.threads
  # cat /sys/fs/cgroup/x/cgroup.procs
  2031
  2030

The last read of cgroup.procs is incorrectly showing non-leader 2031
in cgroup.procs output.

This can be fixed by updating css_task_iter_advance() to handle the
first advance and css_task_iters_tart() to call
css_task_iter_advance() instead of the inner helper.  After the fix,
the same commands result in the following (correct) result:

  # ps -L 2062
    PID   LWP TTY      STAT   TIME COMMAND
   2062  2062 pts/0    Sl+    0:00 ./test-thread
   2062  2063 pts/0    Sl+    0:00 ./test-thread
  # mkdir -p /sys/fs/cgroup/x/a/b
  # echo threaded &gt; /sys/fs/cgroup/x/a/cgroup.type
  # echo threaded &gt; /sys/fs/cgroup/x/a/b/cgroup.type
  # echo 2062 &gt; /sys/fs/cgroup/x/a/cgroup.procs
  # cat /sys/fs/cgroup/x/a/cgroup.threads
  2062
  2063
  # cat /sys/fs/cgroup/x/cgroup.procs
  2062
  # echo 2062 &gt; /sys/fs/cgroup/x/a/b/cgroup.threads
  # cat /sys/fs/cgroup/x/cgroup.procs
  2062

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: "Michael Kerrisk (man-pages)" &lt;mtk.manpages@gmail.com&gt;
Fixes: 8cfd8147df67 ("cgroup: implement cgroup v2 thread support")
Cc: stable@vger.kernel.org # v4.14+
</content>
</entry>
<entry>
<title>cgroup: Add .__DEBUG__. prefix to debug file names</title>
<updated>2018-11-13T20:16:01Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2018-11-13T20:06:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1bbd933e5fae83f96acd3f748bb01a0300b315d'/>
<id>urn:sha1:c1bbd933e5fae83f96acd3f748bb01a0300b315d</id>
<content type='text'>
Clearly mark the debug files and hide them by default by prefixing
".__DEBUG__.".

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
</content>
</entry>
<entry>
<title>cpuset: Expose cpuset.cpus.subpartitions with cgroup_debug</title>
<updated>2018-11-08T20:27:32Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2018-11-08T15:08:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5cf8114d6e90b3822be5eb6a2faedf99d1c08f77'/>
<id>urn:sha1:5cf8114d6e90b3822be5eb6a2faedf99d1c08f77</id>
<content type='text'>
For debugging purpose, it will be useful to expose the content of the
subparts_cpus as a read-only file to see if the code work correctly.
However, subparts_cpus will not be used at all in most use cases. So
adding a new cpuset file that clutters the cgroup directory may not be
desirable.  This is now being done by using the hidden "cgroup_debug"
kernel command line option to expose a new "cpuset.cpus.subpartitions"
file.

That option was originally used by the debug controller to expose
itself when configured into the kernel. This is now extended to set an
internal flag used by cgroup_addrm_files(). A new CFTYPE_DEBUG flag
can now be used to specify that a cgroup file should only be created
when the "cgroup_debug" option is specified.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: remove unnecessary unlikely()</title>
<updated>2018-11-05T16:28:11Z</updated>
<author>
<name>Yangtao Li</name>
<email>tiny.windzz@gmail.com</email>
</author>
<published>2018-11-04T02:27:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4d9ebbe2b061a9c25e12ba8539ba172533132eb6'/>
<id>urn:sha1:4d9ebbe2b061a9c25e12ba8539ba172533132eb6</id>
<content type='text'>
WARN_ON() already contains an unlikely(), so it's not necessary to use
unlikely.

Signed-off-by: Yangtao Li &lt;tiny.windzz@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
