<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/core.c, branch v4.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-12-07T15:09:03Z</updated>
<entry>
<title>Merge branch 'master' into for-4.4-fixes</title>
<updated>2015-12-07T15:09:03Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2015-12-07T15:09:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0b98f0c04245877ae0b625a7f0aa55b8ff98e0c4'/>
<id>urn:sha1:0b98f0c04245877ae0b625a7f0aa55b8ff98e0c4</id>
<content type='text'>
The following commit which went into mainline through networking tree

  3b13758f51de ("cgroups: Allow dynamically changing net_classid")

conflicts in net/core/netclassid_cgroup.c with the following pending
fix in cgroup/for-4.4-fixes.

  1f7dd3e5a6e4 ("cgroup: fix handling of multi-destination migration from subtree_control enabling")

The former separates out update_classid() from cgrp_attach() and
updates it to walk all fds of all tasks in the target css so that it
can be used from both migration and config change paths.  The latter
drops @css from cgrp_attach().

Resolve the conflict by making cgrp_attach() call update_classid()
with the css from the first task.  We can revive @tset walking in
cgrp_attach() but given that net_cls is v1 only where there always is
only one target css during migration, this is fine.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Nina Schiff &lt;ninasc@fb.com&gt;
</content>
</entry>
<entry>
<title>sched/core: Fix an SMP ordering race in try_to_wake_up() vs. schedule()</title>
<updated>2015-12-04T09:26:43Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-10-07T12:14:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ecf7d01c229d11a44609c0067889372c91fb4f36'/>
<id>urn:sha1:ecf7d01c229d11a44609c0067889372c91fb4f36</id>
<content type='text'>
Oleg noticed that its possible to falsely observe p-&gt;on_cpu == 0 such
that we'll prematurely continue with the wakeup and effectively run p on
two CPUs at the same time.

Even though the overlap is very limited; the task is in the middle of
being scheduled out; it could still result in corruption of the
scheduler data structures.

        CPU0                            CPU1

        set_current_state(...)

        &lt;preempt_schedule&gt;
          context_switch(X, Y)
            prepare_lock_switch(Y)
              Y-&gt;on_cpu = 1;
            finish_lock_switch(X)
              store_release(X-&gt;on_cpu, 0);

                                        try_to_wake_up(X)
                                          LOCK(p-&gt;pi_lock);

                                          t = X-&gt;on_cpu; // 0

          context_switch(Y, X)
            prepare_lock_switch(X)
              X-&gt;on_cpu = 1;
            finish_lock_switch(Y)
              store_release(Y-&gt;on_cpu, 0);
        &lt;/preempt_schedule&gt;

        schedule();
          deactivate_task(X);
          X-&gt;on_rq = 0;

                                          if (X-&gt;on_rq) // false

                                          if (t) while (X-&gt;on_cpu)
                                            cpu_relax();

          context_switch(X, ..)
            finish_lock_switch(X)
              store_release(X-&gt;on_cpu, 0);

Avoid the load of X-&gt;on_cpu being hoisted over the X-&gt;on_rq load.

Reported-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Better document the try_to_wake_up() barriers</title>
<updated>2015-12-04T09:26:42Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-10-06T12:36:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b75a22531588e77aa8c2daf228c9723916ae2cd0'/>
<id>urn:sha1:b75a22531588e77aa8c2daf228c9723916ae2cd0</id>
<content type='text'>
Explain how the control dependency and smp_rmb() end up providing
ACQUIRE semantics and pair with smp_store_release() in
finish_lock_switch().

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Clear the root_domain cpumasks in init_rootdomain()</title>
<updated>2015-12-04T09:16:21Z</updated>
<author>
<name>Xunlei Pang</name>
<email>xlpang@redhat.com</email>
</author>
<published>2015-12-02T11:52:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8295c69925ad53ec32ca54ac9fc194ff21bc40e2'/>
<id>urn:sha1:8295c69925ad53ec32ca54ac9fc194ff21bc40e2</id>
<content type='text'>
root_domain::rto_mask allocated through alloc_cpumask_var()
contains garbage data, this may cause problems. For instance,
When doing pull_rt_task(), it may do useless iterations if
rto_mask retains some extra garbage bits. Worse still, this
violates the isolated domain rule for clustered scheduling
using cpuset, because the tasks(with all the cpus allowed)
belongs to one root domain can be pulled away into another
root domain.

The patch cleans the garbage by using zalloc_cpumask_var()
instead of alloc_cpumask_var() for root_domain::rto_mask
allocation, thereby addressing the issues.

Do the same thing for root_domain's other cpumask memembers:
dlo_mask, span, and online.

Signed-off-by: Xunlei Pang &lt;xlpang@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1449057179-29321-1-git-send-email-xlpang@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Remove false-positive warning from wake_up_process()</title>
<updated>2015-12-04T09:10:16Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2015-12-01T01:34:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=119d6f6a3be8b424b200dcee56e74484d5445f7e'/>
<id>urn:sha1:119d6f6a3be8b424b200dcee56e74484d5445f7e</id>
<content type='text'>
Because wakeups can (fundamentally) be late, a task might not be in
the expected state. Therefore testing against a task's state is racy,
and can yield false positives.

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: oleg@redhat.com
Fixes: 9067ac85d533 ("wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task")
Link: http://lkml.kernel.org/r/1448933660-23082-1-git-send-email-sasha.levin@oracle.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: fix handling of multi-destination migration from subtree_control enabling</title>
<updated>2015-12-03T15:18:21Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2015-12-03T15:18:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1f7dd3e5a6e4f093017fff12232572ee1aa4639b'/>
<id>urn:sha1:1f7dd3e5a6e4f093017fff12232572ee1aa4639b</id>
<content type='text'>
Consider the following v2 hierarchy.

  P0 (+memory) --- P1 (-memory) --- A
                                 \- B
       
P0 has memory enabled in its subtree_control while P1 doesn't.  If
both A and B contain processes, they would belong to the memory css of
P1.  Now if memory is enabled on P1's subtree_control, memory csses
should be created on both A and B and A's processes should be moved to
the former and B's processes the latter.  IOW, enabling controllers
can cause atomic migrations into different csses.

The core cgroup migration logic has been updated accordingly but the
controller migration methods haven't and still assume that all tasks
migrate to a single target css; furthermore, the methods were fed the
css in which subtree_control was updated which is the parent of the
target csses.  pids controller depends on the migration methods to
move charges and this made the controller attribute charges to the
wrong csses often triggering the following warning by driving a
counter negative.

 WARNING: CPU: 1 PID: 1 at kernel/cgroup_pids.c:97 pids_cancel.constprop.6+0x31/0x40()
 Modules linked in:
 CPU: 1 PID: 1 Comm: systemd Not tainted 4.4.0-rc1+ #29
 ...
  ffffffff81f65382 ffff88007c043b90 ffffffff81551ffc 0000000000000000
  ffff88007c043bc8 ffffffff810de202 ffff88007a752000 ffff88007a29ab00
  ffff88007c043c80 ffff88007a1d8400 0000000000000001 ffff88007c043bd8
 Call Trace:
  [&lt;ffffffff81551ffc&gt;] dump_stack+0x4e/0x82
  [&lt;ffffffff810de202&gt;] warn_slowpath_common+0x82/0xc0
  [&lt;ffffffff810de2fa&gt;] warn_slowpath_null+0x1a/0x20
  [&lt;ffffffff8118e031&gt;] pids_cancel.constprop.6+0x31/0x40
  [&lt;ffffffff8118e0fd&gt;] pids_can_attach+0x6d/0xf0
  [&lt;ffffffff81188a4c&gt;] cgroup_taskset_migrate+0x6c/0x330
  [&lt;ffffffff81188e05&gt;] cgroup_migrate+0xf5/0x190
  [&lt;ffffffff81189016&gt;] cgroup_attach_task+0x176/0x200
  [&lt;ffffffff8118949d&gt;] __cgroup_procs_write+0x2ad/0x460
  [&lt;ffffffff81189684&gt;] cgroup_procs_write+0x14/0x20
  [&lt;ffffffff811854e5&gt;] cgroup_file_write+0x35/0x1c0
  [&lt;ffffffff812e26f1&gt;] kernfs_fop_write+0x141/0x190
  [&lt;ffffffff81265f88&gt;] __vfs_write+0x28/0xe0
  [&lt;ffffffff812666fc&gt;] vfs_write+0xac/0x1a0
  [&lt;ffffffff81267019&gt;] SyS_write+0x49/0xb0
  [&lt;ffffffff81bcef32&gt;] entry_SYSCALL_64_fastpath+0x12/0x76

This patch fixes the bug by removing @css parameter from the three
migration methods, -&gt;can_attach, -&gt;cancel_attach() and -&gt;attach() and
updating cgroup_taskset iteration helpers also return the destination
css in addition to the task being migrated.  All controllers are
updated accordingly.

* Controllers which don't care whether there are one or multiple
  target csses can be converted trivially.  cpu, io, freezer, perf,
  netclassid and netprio fall in this category.

* cpuset's current implementation assumes that there's single source
  and destination and thus doesn't support v2 hierarchy already.  The
  only change made by this patchset is how that single destination css
  is obtained.

* memory migration path already doesn't do anything on v2.  How the
  single destination css is obtained is updated and the prep stage of
  mem_cgroup_can_attach() is reordered to accomodate the change.

* pids is the only controller which was affected by this bug.  It now
  correctly handles multi-destination migrations and no longer causes
  counter underflow from incorrect accounting.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-and-tested-by: Daniel Wagner &lt;daniel.wagner@bmw-carit.de&gt;
Cc: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2015-11-05T22:51:32Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-11-05T22:51:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=69234acee54407962a20bedf90ef9c96326994b5'/>
<id>urn:sha1:69234acee54407962a20bedf90ef9c96326994b5</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:
 "The cgroup core saw several significant updates this cycle:

   - percpu_rwsem for threadgroup locking is reinstated.  This was
     temporarily dropped due to down_write latency issues.  Oleg's
     rework of percpu_rwsem which is scheduled to be merged in this
     merge window resolves the issue.

   - On the v2 hierarchy, when controllers are enabled and disabled, all
     operations are atomic and can fail and revert cleanly.  This allows
     -&gt;can_attach() failure which is necessary for cpu RT slices.

   - Tasks now stay associated with the original cgroups after exit
     until released.  This allows tracking resources held by zombies
     (e.g.  pids) and makes it easy to find out where zombies came from
     on the v2 hierarchy.  The pids controller was broken before these
     changes as zombies escaped the limits; unfortunately, updating this
     behavior required too many invasive changes and I don't think it's
     a good idea to backport them, so the pids controller on 4.3, the
     first version which included the pids controller, will stay broken
     at least until I'm sure about the cgroup core changes.

   - Optimization of a couple common tests using static_key"

* 'for-4.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (38 commits)
  cgroup: fix race condition around termination check in css_task_iter_next()
  blkcg: don't create "io.stat" on the root cgroup
  cgroup: drop cgroup__DEVEL__legacy_files_on_dfl
  cgroup: replace error handling in cgroup_init() with WARN_ON()s
  cgroup: add cgroup_subsys-&gt;free() method and use it to fix pids controller
  cgroup: keep zombies associated with their original cgroups
  cgroup: make css_set_rwsem a spinlock and rename it to css_set_lock
  cgroup: don't hold css_set_rwsem across css task iteration
  cgroup: reorganize css_task_iter functions
  cgroup: factor out css_set_move_task()
  cgroup: keep css_set and task lists in chronological order
  cgroup: make cgroup_destroy_locked() test cgroup_is_populated()
  cgroup: make css_sets pin the associated cgroups
  cgroup: relocate cgroup_[try]get/put()
  cgroup: move check_for_release() invocation
  cgroup: replace cgroup_has_tasks() with cgroup_is_populated()
  cgroup: make cgroup-&gt;nr_populated count the number of populated css_sets
  cgroup: remove an unused parameter from cgroup_task_migrate()
  cgroup: fix too early usage of static_branch_disable()
  cgroup: make cgroup_update_dfl_csses() migrate all target processes atomically
  ...
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-11-04T02:03:50Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-11-04T02:03:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53528695ff6d8b77011bc818407c13e30914a946'/>
<id>urn:sha1:53528695ff6d8b77011bc818407c13e30914a946</id>
<content type='text'>
Pull scheduler changes from Ingo Molnar:
 "The main changes in this cycle were:

   - sched/fair load tracking fixes and cleanups (Byungchul Park)

   - Make load tracking frequency scale invariant (Dietmar Eggemann)

   - sched/deadline updates (Juri Lelli)

   - stop machine fixes, cleanups and enhancements for bugs triggered by
     CPU hotplug stress testing (Oleg Nesterov)

   - scheduler preemption code rework: remove PREEMPT_ACTIVE and related
     cleanups (Peter Zijlstra)

   - Rework the sched_info::run_delay code to fix races (Peter Zijlstra)

   - Optimize per entity utilization tracking (Peter Zijlstra)

   - ... misc other fixes, cleanups and smaller updates"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (57 commits)
  sched: Don't scan all-offline -&gt;cpus_allowed twice if !CONFIG_CPUSETS
  sched: Move cpu_active() tests from stop_two_cpus() into migrate_swap_stop()
  sched: Start stopper early
  stop_machine: Kill cpu_stop_threads-&gt;setup() and cpu_stop_unpark()
  stop_machine: Kill smp_hotplug_thread-&gt;pre_unpark, introduce stop_machine_unpark()
  stop_machine: Change cpu_stop_queue_two_works() to rely on stopper-&gt;enabled
  stop_machine: Introduce __cpu_stop_queue_work() and cpu_stop_queue_two_works()
  stop_machine: Ensure that a queued callback will be called before cpu_stop_park()
  sched/x86: Fix typo in __switch_to() comments
  sched/core: Remove a parameter in the migrate_task_rq() function
  sched/core: Drop unlikely behind BUG_ON()
  sched/core: Fix task and run queue sched_info::run_delay inconsistencies
  sched/numa: Fix task_tick_fair() from disabling numa_balancing
  sched/core: Add preempt_count invariant check
  sched/core: More notrace annotations
  sched/core: Kill PREEMPT_ACTIVE
  sched/core, sched/x86: Kill thread_info::saved_preempt_count
  sched/core: Simplify preempt_count tests
  sched/core: Robustify preemption leak checks
  sched/core: Stop setting PREEMPT_ACTIVE
  ...
</content>
</entry>
<entry>
<title>Merge branch 'linus' into core/rcu, to fix up a semantic conflict</title>
<updated>2015-10-28T12:17:20Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2015-10-28T12:17:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e4340bbb07dd38339c0773543dd928886e512a57'/>
<id>urn:sha1:e4340bbb07dd38339c0773543dd928886e512a57</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Add missing lockdep_unpin() annotations</title>
<updated>2015-10-23T10:02:10Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-10-23T09:50:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0aaafaabfcba8aa991913cd3280a5dbf7f111a2a'/>
<id>urn:sha1:0aaafaabfcba8aa991913cd3280a5dbf7f111a2a</id>
<content type='text'>
Luca and Wanpeng reported two missing annotations that led to
false lockdep complaints. Add the missing annotations.

Reported-by: Luca Abeni &lt;luca.abeni@unitn.it&gt;
Reported-by: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Juri Lelli &lt;juri.lelli@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: cbce1a686700 ("sched,lockdep: Employ lock pinning")
Link: http://lkml.kernel.org/r/20151023095008.GY17308@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
