<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/cgroup, branch v6.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-02-29T20:30:35Z</updated>
<entry>
<title>cgroup/cpuset: Fix retval in update_cpumask()</title>
<updated>2024-02-29T20:30:35Z</updated>
<author>
<name>Kamalesh Babulal</name>
<email>kamalesh.babulal@oracle.com</email>
</author>
<published>2024-02-29T10:11:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=25125a4762835d62ba1e540c1351d447fc1f6c7c'/>
<id>urn:sha1:25125a4762835d62ba1e540c1351d447fc1f6c7c</id>
<content type='text'>
The update_cpumask(), checks for newly requested cpumask by calling
validate_change(), which returns an error on passing an invalid set
of cpu(s). Independent of the error returned, update_cpumask() always
returns zero, suppressing the error and returning success to the user
on writing an invalid cpu range for a cpuset. Fix it by returning
retval instead, which is returned by validate_change().

Fixes: 99fe36ba6fc1 ("cgroup/cpuset: Improve temporary cpumasks handling")
Signed-off-by: Kamalesh Babulal &lt;kamalesh.babulal@oracle.com&gt;
Reviewed-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: stable@vger.kernel.org # v6.6+
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup/cpuset: Fix a memory leak in update_exclusive_cpumask()</title>
<updated>2024-02-28T18:02:55Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2024-02-28T00:58:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=66f40b926dd249f74334a22162c09e7ec1ec5b07'/>
<id>urn:sha1:66f40b926dd249f74334a22162c09e7ec1ec5b07</id>
<content type='text'>
Fix a possible memory leak in update_exclusive_cpumask() by moving the
alloc_cpumasks() down after the validate_change() check which can fail
and still before the temporary cpumasks are needed.

Fixes: e2ffe502ba45 ("cgroup/cpuset: Add cpuset.cpus.exclusive for v2")
Reported-and-tested-by: Mirsad Todorovac &lt;mirsad.todorovac@alu.hr&gt;
Closes: https://lore.kernel.org/lkml/14915689-27a3-4cd8-80d2-9c30d0c768b6@alu.unizg.hr
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: stable@vger.kernel.org # v6.7+
</content>
</entry>
<entry>
<title>Merge tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core</title>
<updated>2024-01-18T17:48:40Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-18T17:48:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=80955ae955d15ea5c11d55cd50032a5243a6dfd6'/>
<id>urn:sha1:80955ae955d15ea5c11d55cd50032a5243a6dfd6</id>
<content type='text'>
Pull driver core updates from Greg KH:
 "Here are the set of driver core and kernfs changes for 6.8-rc1.
  Nothing major in here this release cycle, just lots of small cleanups
  and some tweaks on kernfs that in the very end, got reverted and will
  come back in a safer way next release cycle.

  Included in here are:

   - more driver core 'const' cleanups and fixes

   - fw_devlink=rpm is now the default behavior

   - kernfs tiny changes to remove some string functions

   - cpu handling in the driver core is updated to work better on many
     systems that add topologies and cpus after booting

   - other minor changes and cleanups

  All of the cpu handling patches have been acked by the respective
  maintainers and are coming in here in one series. Everything has been
  in linux-next for a while with no reported issues"

* tag 'driver-core-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (51 commits)
  Revert "kernfs: convert kernfs_idr_lock to an irq safe raw spinlock"
  kernfs: convert kernfs_idr_lock to an irq safe raw spinlock
  class: fix use-after-free in class_register()
  PM: clk: make pm_clk_add_notifier() take a const pointer
  EDAC: constantify the struct bus_type usage
  kernfs: fix reference to renamed function
  driver core: device.h: fix Excess kernel-doc description warning
  driver core: class: fix Excess kernel-doc description warning
  driver core: mark remaining local bus_type variables as const
  driver core: container: make container_subsys const
  driver core: bus: constantify subsys_register() calls
  driver core: bus: make bus_sort_breadthfirst() take a const pointer
  kernfs: d_obtain_alias(NULL) will do the right thing...
  driver core: Better advertise dev_err_probe()
  kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy()
  kernfs: Convert kernfs_name_locked() from strlcpy() to strscpy()
  kernfs: Convert kernfs_walk_ns() from strlcpy() to strscpy()
  initramfs: Expose retained initrd as sysfs file
  fs/kernfs/dir: obey S_ISGID
  kernel/cgroup: use kernfs_create_dir_ns()
  ...
</content>
</entry>
<entry>
<title>Merge tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2024-01-09T04:04:02Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-01-09T04:04:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9f8413c4a66f2fb776d3dc3c9ed20bf435eb305e'/>
<id>urn:sha1:9f8413c4a66f2fb776d3dc3c9ed20bf435eb305e</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:

 - Yafang Shao added task_get_cgroup1() helper to enable a similar BPF
   helper so that BPF progs can be more useful on cgroup1 hierarchies.
   While cgroup1 is mostly in maintenance mode, this addition is very
   small while having an outsized usefulness for users who are still on
   cgroup1. Yafang also optimized root cgroup list access by making it
   RCU protected in the process.

 - Waiman Long optimized rstat operation leading to substantially lower
   and more consistent lock hold time while flushing the hierarchical
   statistics. As the lock can be acquired briefly in various hot paths,
   this reduction has cascading benefits.

 - Waiman also improved the quality of isolation for cpuset's isolated
   partitions. CPUs which are allocated to isolated partitions are now
   excluded from running unbound work items and cpu_is_isolated() test
   which is used by vmstat and memcg to reduce interference now includes
   cpuset isolated CPUs. While it isn't there yet, the hope is
   eventually reaching parity with the isolation level provided by the
   `isolcpus` boot param but in a dynamic manner.

* tag 'cgroup-for-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: Move rcu_head up near the top of cgroup_root
  cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check
  cgroup: Avoid false cacheline sharing of read mostly rstat_cpu
  cgroup/rstat: Optimize cgroup_rstat_updated_list()
  cgroup: Fix documentation for cpu.idle
  cgroup/cpuset: Expose cpuset.cpus.isolated
  workqueue: Move workqueue_set_unbound_cpumask() and its helpers inside CONFIG_SYSFS
  cgroup/rstat: Reduce cpu_lock hold time in cgroup_rstat_flush_locked()
  cgroup/cpuset: Take isolated CPUs out of workqueue unbound cpumask
  cgroup/cpuset: Keep track of CPUs in isolated partitions
  selftests/cgroup: Minor code cleanup and reorganization of test_cpuset_prs.sh
  workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask
  selftests: cgroup: Fixes a typo in a comment
  cgroup: Add a new helper for cgroup1 hierarchy
  cgroup: Add annotation for holding namespace_sem in current_cgns_cgroup_from_root()
  cgroup: Eliminate the need for cgroup_mutex in proc_cgroup_show()
  cgroup: Make operations on the cgroup root_list RCU safe
  cgroup: Remove unnecessary list_empty()
</content>
</entry>
<entry>
<title>kernfs: Convert kernfs_path_from_node_locked() from strlcpy() to strscpy()</title>
<updated>2023-12-15T16:25:10Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2023-12-12T21:17:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ff6d413b0b59466e5acf2e42f294b1842ae130a1'/>
<id>urn:sha1:ff6d413b0b59466e5acf2e42f294b1842ae130a1</id>
<content type='text'>
One of the last remaining users of strlcpy() in the kernel is
kernfs_path_from_node_locked(), which passes back the problematic "length
we _would_ have copied" return value to indicate truncation.  Convert the
chain of all callers to use the negative return value (some of which
already doing this explicitly). All callers were already also checking
for negative return values, so the risk to missed checks looks very low.

In this analysis, it was found that cgroup1_release_agent() actually
didn't handle the "too large" condition, so this is technically also a
bug fix. :)

Here's the chain of callers, and resolution identifying each one as now
handling the correct return value:

kernfs_path_from_node_locked()
        kernfs_path_from_node()
                pr_cont_kernfs_path()
                        returns void
                kernfs_path()
                        sysfs_warn_dup()
                                return value ignored
                        cgroup_path()
                                blkg_path()
                                        bfq_bic_update_cgroup()
                                                return value ignored
                                TRACE_IOCG_PATH()
                                        return value ignored
                                TRACE_CGROUP_PATH()
                                        return value ignored
                                perf_event_cgroup()
                                        return value ignored
                                task_group_path()
                                        return value ignored
                                damon_sysfs_memcg_path_eq()
                                        return value ignored
                                get_mm_memcg_path()
                                        return value ignored
                                lru_gen_seq_show()
                                        return value ignored
                        cgroup_path_from_kernfs_id()
                                return value ignored
                cgroup_show_path()
                        already converted "too large" error to negative value
                cgroup_path_ns_locked()
                        cgroup_path_ns()
                                bpf_iter_cgroup_show_fdinfo()
                                        return value ignored
                                cgroup1_release_agent()
                                        wasn't checking "too large" error
                        proc_cgroup_show()
                                already converted "too large" to negative value

Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Zefan Li &lt;lizefan.x@bytedance.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc:  &lt;cgroups@vger.kernel.org&gt;
Co-developed-by: Azeem Shaikh &lt;azeemshaikh38@gmail.com&gt;
Signed-off-by: Azeem Shaikh &lt;azeemshaikh38@gmail.com&gt;
Link: https://lore.kernel.org/r/20231116192127.1558276-3-keescook@chromium.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20231212211741.164376-3-keescook@chromium.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/cgroup: use kernfs_create_dir_ns()</title>
<updated>2023-12-15T16:22:40Z</updated>
<author>
<name>Max Kellermann</name>
<email>max.kellermann@ionos.com</email>
</author>
<published>2023-12-08T09:33:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe3de0102bc830e78d80dbf6e2dd29c520d68575'/>
<id>urn:sha1:fe3de0102bc830e78d80dbf6e2dd29c520d68575</id>
<content type='text'>
By passing the fsugid to kernfs_create_dir_ns(), we don't need
cgroup_kn_set_ugid() any longer.  That function was added for exactly
this purpose by commit 49957f8e2a43 ("cgroup: newly created dirs and
files should be owned by the creator").

Eliminating this piece of duplicate code means we benefit from future
improvements to kernfs_create_dir_ns(); for example, both are lacking
S_ISGID support currently, which my next patch will add to
kernfs_create_dir_ns().  It cannot (easily) be added to
cgroup_kn_set_ugid() because we can't dereference struct kernfs_iattrs
from there.

--
v1 -&gt; v2: 12-digit commit id

Signed-off-by: Max Kellermann &lt;max.kellermann@ionos.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lore.kernel.org/r/20231208093310.297233-1-max.kellermann@ionos.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'cgroup-for-6.7-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2023-12-07T20:42:40Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-12-07T20:42:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9ace34a8e446c1a566f3b0a3e0c4c483987e39a6'/>
<id>urn:sha1:9ace34a8e446c1a566f3b0a3e0c4c483987e39a6</id>
<content type='text'>
Pull cgroup fix from Tejun Heo:
 "Just one fix.

  Commit f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
  changed how freezing state is recorded which made cgroup_freezing()
  disagree with the actual state of the task while thawing triggering a
  warning. Fix it by updating cgroup_freezing()"

* tag 'cgroup-for-6.7-rc4-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup_freezer: cgroup_freezing: Check if not frozen
</content>
</entry>
<entry>
<title>cgroup/cpuset: Include isolated cpuset CPUs in cpu_is_isolated() check</title>
<updated>2023-12-06T19:37:28Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2023-12-05T22:21:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3232e7aad11e541da86bbb1fa5ea5737b30bd006'/>
<id>urn:sha1:3232e7aad11e541da86bbb1fa5ea5737b30bd006</id>
<content type='text'>
Currently, the cpu_is_isolated() function checks only the statically
isolated CPUs specified via the "isolcpus" and "nohz_full" kernel
command line options. This function is used by vmstat and memcg to
reduce interference with isolated CPUs by not doing stat flushing
or scheduling works on those CPUs.

Workloads running on isolated CPUs within isolated cpuset
partitions should receive the same treatment to reduce unnecessary
interference. This patch introduces a new cpuset_cpu_is_isolated()
function to be called by cpu_is_isolated() so that the set of dynamically
created cpuset isolated CPUs will be included in the check.

Assuming that testing a bit in a cpumask is atomic, no synchronization
primitive is currently used to synchronize access to the cpuset's
isolated_cpus mask.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup/rstat: Optimize cgroup_rstat_updated_list()</title>
<updated>2023-12-01T17:40:04Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2023-11-30T20:43:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d499fd418fa15949d86d28bb5442ab88203fc513'/>
<id>urn:sha1:d499fd418fa15949d86d28bb5442ab88203fc513</id>
<content type='text'>
The current design of cgroup_rstat_cpu_pop_updated() is to traverse
the updated tree in a way to pop out the leaf nodes first before
their parents. This can cause traversal of multiple nodes before a
leaf node can be found and popped out. IOW, a given node in the tree
can be visited multiple times before the whole operation is done. So
it is not very efficient and the code can be hard to read.

With the introduction of cgroup_rstat_updated_list() to build a list
of cgroups to be flushed first before any flushing operation is being
done, we can optimize the way the updated tree nodes are being popped
by pushing the parents first to the tail end of the list before their
children. In this way, most updated tree nodes will be visited only
once with the exception of the subtree root as we still need to go
back to its parent and popped it out of its updated_children list.
This also makes the code easier to read.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup_freezer: cgroup_freezing: Check if not frozen</title>
<updated>2023-11-28T18:04:03Z</updated>
<author>
<name>Tim Van Patten</name>
<email>timvp@google.com</email>
</author>
<published>2023-11-15T16:20:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cff5f49d433fcd0063c8be7dd08fa5bf190c6c37'/>
<id>urn:sha1:cff5f49d433fcd0063c8be7dd08fa5bf190c6c37</id>
<content type='text'>
__thaw_task() was recently updated to warn if the task being thawed was
part of a freezer cgroup that is still currently freezing:

	void __thaw_task(struct task_struct *p)
	{
	...
		if (WARN_ON_ONCE(freezing(p)))
			goto unlock;

This has exposed a bug in cgroup1 freezing where when CGROUP_FROZEN is
asserted, the CGROUP_FREEZING bits are not also cleared at the same
time. Meaning, when a cgroup is marked FROZEN it continues to be marked
FREEZING as well. This causes the WARNING to trigger, because
cgroup_freezing() thinks the cgroup is still freezing.

There are two ways to fix this:

1. Whenever FROZEN is set, clear FREEZING for the cgroup and all
children cgroups.
2. Update cgroup_freezing() to also verify that FROZEN is not set.

This patch implements option (2), since it's smaller and more
straightforward.

Signed-off-by: Tim Van Patten &lt;timvp@google.com&gt;
Tested-by: Mark Hasemeyer &lt;markhas@chromium.org&gt;
Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
Cc: stable@vger.kernel.org # v6.1+
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
