<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched, branch v4.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-07-13T12:58:20Z</updated>
<entry>
<title>sched/core: Correct off by one bug in load migration calculation</title>
<updated>2016-07-13T12:58:20Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-07-12T16:33:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d60585c5766e9620d5d83e2b25dc042c7bdada2c'/>
<id>urn:sha1:d60585c5766e9620d5d83e2b25dc042c7bdada2c</id>
<content type='text'>
The move of calc_load_migrate() from CPU_DEAD to CPU_DYING did not take into
account that the function is now called from a thread running on the outgoing
CPU. As a result, a CPU unplug leaks a load of 1 into the global load
accounting mechanism.

Fix it by adjusting for the currently running thread which calls
calc_load_migrate().
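
Roughly, the fix looks like this (a sketch, assuming the adjustment is
threaded through calc_load_fold_active(); not the verbatim patch):

  /* Let callers compensate for threads that are still counted in
   * nr_running but known to be going away: */
  static long calc_load_fold_active(struct rq *this_rq, long adjust)
  {
          long nr_active, delta = 0;

          nr_active = this_rq-&gt;nr_running - adjust;
          nr_active += (long)this_rq-&gt;nr_uninterruptible;

          if (nr_active != this_rq-&gt;calc_load_active) {
                  delta = nr_active - this_rq-&gt;calc_load_active;
                  this_rq-&gt;calc_load_active = nr_active;
          }
          return delta;
  }

  void calc_load_migrate(struct rq *rq)
  {
          /* Called from a thread on the outgoing CPU, so subtract 1
           * for the caller itself: */
          long delta = calc_load_fold_active(rq, 1);

          if (delta)
                  atomic_long_add(delta, &amp;calc_load_tasks);
  }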

Reported-by: Anton Blanchard &lt;anton@samba.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Cc: rt@linutronix.de
Cc: shreyas@linux.vnet.ibm.com
Fixes: e9cd8fa4fcfd ("sched/migration: Move calc_load_migrate() into CPU_DYING")
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1607121744350.4083@nanos
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Fix calc_cfs_shares() fixed point arithmetics width confusion</title>
<updated>2016-06-27T09:18:37Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-24T14:11:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ea1dc6fc6242f991656e35e2ed3d90ec1cd13418'/>
<id>urn:sha1:ea1dc6fc6242f991656e35e2ed3d90ec1cd13418</id>
<content type='text'>
Commit:

  fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities")

did something non-obvious, and did it in a way that was buggy yet latent.

The problem was exposed for real by a later commit in the v4.7 merge window:

  2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels")

... after which tg-&gt;load_avg and cfs_rq-&gt;load.weight had different
units (10-bit and 20-bit fixed point, respectively).

Add a comment to explain the use of cfs_rq-&gt;load.weight over the
'natural' cfs_rq-&gt;avg.load_avg and add scale_load_down() to correct
for the difference in units.

Since this is (now, as per a previous commit) the only user of
calc_tg_weight(), collapse it.

The effect of this bug should be randomly inconsistent SMP balancing
of cgroup workloads.
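
Roughly, the collapsed calc_cfs_shares() then looks like this (a
sketch reconstructed from the description above, not the verbatim
patch):

  static long calc_cfs_shares(struct cfs_rq *cfs_rq, struct task_group *tg)
  {
          long tg_weight, load, shares;

          /*
           * This really should be: cfs_rq-&gt;avg.load_avg, but instead
           * we use cfs_rq-&gt;load.weight, which is its upper bound.
           * scale_load_down() brings the 20-bit fixed point weight
           * back to the 10-bit unit of tg-&gt;load_avg.
           */
          load = scale_load_down(cfs_rq-&gt;load.weight);

          tg_weight = atomic_long_read(&amp;tg-&gt;load_avg);

          /* Ensure tg_weight &gt;= load (the old calc_tg_weight()): */
          tg_weight -= cfs_rq-&gt;tg_load_avg_contrib;
          tg_weight += load;

          shares = tg-&gt;shares * load;
          if (tg_weight)
                  shares /= tg_weight;

          if (shares &lt; MIN_SHARES)
                  shares = MIN_SHARES;
          if (shares &gt; tg-&gt;shares)
                  shares = tg-&gt;shares;

          return shares;
  }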

Reported-by: Jirka Hladky &lt;jhladky@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 2159197d6677 ("sched/core: Enable increased load resolution on 64-bit kernels")
Fixes: fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Fix effective_load() to consistently use smoothed load</title>
<updated>2016-06-27T09:18:36Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-24T13:53:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7dd4912594daf769a46744848b05bd5bc6d62469'/>
<id>urn:sha1:7dd4912594daf769a46744848b05bd5bc6d62469</id>
<content type='text'>
Starting with the following commit:

  fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities")

calc_tg_weight() doesn't compute the right value as expected by effective_load().

The difference is in the 'correction' term. In order to ensure \Sum
rw_j &gt;= rw_i we cannot use tg-&gt;load_avg directly, since it might lag
behind the current cfs_rq-&gt;avg.load_avg value. Therefore we use
tg-&gt;load_avg - cfs_rq-&gt;tg_load_avg_contrib +
cfs_rq-&gt;avg.load_avg.

Now, per the referenced commit, calc_tg_weight() doesn't use
cfs_rq-&gt;avg.load_avg, which is what is later used as @w, but uses
cfs_rq-&gt;load.weight instead.

So stop using calc_tg_weight() and do it explicitly.

The effect of this bug is wake_affine() making randomly poor choices
in cgroup-intensive workloads.
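
Roughly, the explicit computation in effective_load() looks like this
(a sketch; variable names follow the text above):

  struct cfs_rq *cfs_rq = se-&gt;my_q;
  long W, w = cfs_rq_load_avg(cfs_rq);

  /*
   * W = @wg + \Sum rw_j
   */
  W = wg + atomic_long_read(&amp;tg-&gt;load_avg);

  /* Ensure \Sum rw_j &gt;= rw_i: swap in this cfs_rq's current
   * avg.load_avg for its possibly stale contribution: */
  W -= cfs_rq-&gt;tg_load_avg_contrib;
  W += w;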

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt; # v4.3+
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: fde7d22e01aa ("sched/fair: Fix overly small weight for interactive group entities")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Allow kthreads to fall back to online &amp;&amp; !active cpus</title>
<updated>2016-06-24T06:26:53Z</updated>
<author>
<name>Tejun Heo</name>
<email>htejun@gmail.com</email>
</author>
<published>2016-06-16T19:35:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=feb245e304f343cf5e4f9123db36354144dce8a4'/>
<id>urn:sha1:feb245e304f343cf5e4f9123db36354144dce8a4</id>
<content type='text'>
During CPU hotplug, CPU_ONLINE callbacks are run while the CPU is
online but not active.  A CPU_ONLINE callback may create or bind a
kthread so that its cpus_allowed mask only allows the CPU which is
being brought online.  The kthread may start executing before the CPU
is made active and can end up in select_fallback_rq().

In such cases, the expected behavior is selecting the CPU which is
coming online; however, because select_fallback_rq() only chooses from
active CPUs, it determines that the task doesn't have any viable CPU
in its allowed mask and ends up overriding it to cpu_possible_mask.

CPU_ONLINE callbacks should be able to put kthreads on the CPU which
is coming online.  Update select_fallback_rq() so that it follows
cpu_online() rather than cpu_active() for kthreads.
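
Roughly, the updated check looks like this (a sketch of the assumed
shape, not the verbatim patch):

  for_each_cpu(dest_cpu, tsk_cpus_allowed(p)) {
          /* Kthreads may fall back to any online CPU; everything
           * else must still wait for the CPU to become active: */
          if (!(p-&gt;flags &amp; PF_KTHREAD) &amp;&amp; !cpu_active(dest_cpu))
                  continue;
          if (!cpu_online(dest_cpu))
                  continue;
          goto out;
  }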

Reported-by: Gautham R Shenoy &lt;ego@linux.vnet.ibm.com&gt;
Tested-by: Gautham R. Shenoy &lt;ego@linux.vnet.ibm.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Abdul Haleem &lt;abdhalee@linux.vnet.ibm.com&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: kernel-team@fb.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20160616193504.GB3262@mtj.duckdns.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Do not announce throttled next buddy in dequeue_task_fair()</title>
<updated>2016-06-24T06:26:45Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2016-06-16T12:57:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=754bd598be9bbc953bc709a9e8ed7f3188bfb9d7'/>
<id>urn:sha1:754bd598be9bbc953bc709a9e8ed7f3188bfb9d7</id>
<content type='text'>
The hierarchy could already be throttled at this point. A throttled
next buddy could trigger a NULL pointer dereference in
pick_next_task_fair().
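
Roughly, the guard in dequeue_task_fair() looks like this (a sketch of
the assumed form):

  /* Don't nominate a buddy out of a throttled hierarchy;
   * pick_next_task_fair() could otherwise trip over it: */
  if (task_sleep &amp;&amp; se &amp;&amp; !throttled_hierarchy(cfs_rq))
          set_next_buddy(se);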

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Ben Segall &lt;bsegall@google.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/146608183552.21905.15924473394414832071.stgit@buzz
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Initialize throttle_count for new task-groups lazily</title>
<updated>2016-06-24T06:26:44Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2016-06-16T12:57:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=094f469172e00d6ab0a3130b0e01c83b3cf3a98d'/>
<id>urn:sha1:094f469172e00d6ab0a3130b0e01c83b3cf3a98d</id>
<content type='text'>
A cgroup created inside a throttled group must inherit the current
throttle_count. A broken throttle_count allows nominating throttled
entries as the next buddy; later this leads to a NULL pointer
dereference in pick_next_task_fair().

This patch initializes cfs_rq-&gt;throttle_count at the first enqueue:
the laziness allows us to skip locking all runqueues at group creation.
The lazy approach also allows skipping a full sub-tree scan when
throttling the hierarchy (not in this patch).
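
Roughly, the lazy initialization looks like this (a sketch; the
throttle_uptodate flag is an assumed name):

  /* Synchronize the hierarchical throttle counter on first enqueue: */
  if (unlikely(!cfs_rq-&gt;throttle_uptodate)) {
          struct rq *rq = rq_of(cfs_rq);
          struct cfs_rq *pcfs_rq;
          struct task_group *tg;

          cfs_rq-&gt;throttle_uptodate = 1;

          /* Find the closest up-to-date ancestor, since leaves are
           * enqueued first: */
          for (tg = cfs_rq-&gt;tg-&gt;parent; tg; tg = tg-&gt;parent) {
                  pcfs_rq = tg-&gt;cfs_rq[cpu_of(rq)];
                  if (pcfs_rq-&gt;throttle_uptodate)
                          break;
          }
          if (tg) {
                  cfs_rq-&gt;throttle_count = pcfs_rq-&gt;throttle_count;
                  cfs_rq-&gt;throttled_clock_task = rq_clock_task(rq);
          }
  }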

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: bsegall@google.com
Link: http://lkml.kernel.org/r/146608182119.21870.8439834428248129633.stgit@buzz
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Fix cfs_rq avg tracking underflow</title>
<updated>2016-06-20T09:29:09Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-16T08:50:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8974189222159154c55f24ddad33e3613960521a'/>
<id>urn:sha1:8974189222159154c55f24ddad33e3613960521a</id>
<content type='text'>
As per commit:

  b7fa30c9cc48 ("sched/fair: Fix post_init_entity_util_avg() serialization")

&gt; the code generated from update_cfs_rq_load_avg():
&gt;
&gt; 	if (atomic_long_read(&amp;cfs_rq-&gt;removed_load_avg)) {
&gt; 		s64 r = atomic_long_xchg(&amp;cfs_rq-&gt;removed_load_avg, 0);
&gt; 		sa-&gt;load_avg = max_t(long, sa-&gt;load_avg - r, 0);
&gt; 		sa-&gt;load_sum = max_t(s64, sa-&gt;load_sum - r * LOAD_AVG_MAX, 0);
&gt; 		removed_load = 1;
&gt; 	}
&gt;
&gt; turns into:
&gt;
&gt; ffffffff81087064:       49 8b 85 98 00 00 00    mov    0x98(%r13),%rax
&gt; ffffffff8108706b:       48 85 c0                test   %rax,%rax
&gt; ffffffff8108706e:       74 40                   je     ffffffff810870b0 &lt;update_blocked_averages+0xc0&gt;
&gt; ffffffff81087070:       4c 89 f8                mov    %r15,%rax
&gt; ffffffff81087073:       49 87 85 98 00 00 00    xchg   %rax,0x98(%r13)
&gt; ffffffff8108707a:       49 29 45 70             sub    %rax,0x70(%r13)
&gt; ffffffff8108707e:       4c 89 f9                mov    %r15,%rcx
&gt; ffffffff81087081:       bb 01 00 00 00          mov    $0x1,%ebx
&gt; ffffffff81087086:       49 83 7d 70 00          cmpq   $0x0,0x70(%r13)
&gt; ffffffff8108708b:       49 0f 49 4d 70          cmovns 0x70(%r13),%rcx
&gt;
&gt; Which you'll note ends up with sa-&gt;load_avg -= r in memory at
&gt; ffffffff8108707a.

So I _should_ have looked at other unserialized users of -&gt;load_avg,
but alas. Luckily nikbor reported a similar divide-by-zero (/0) from
task_h_load(), which instantly triggered recollection of this problem.

Aside from the intermediate value hitting memory and causing problems,
there's another problem: the underflow detection relies on the signed
bit. This reduces the effective width of the variables; IOW, it's
effectively the same as having these variables be of a signed type.

This patch changes to a different means of unsigned underflow
detection to not rely on the signed bit. This allows the variables to
use the 'full' unsigned range. And it does so with explicit LOAD -
STORE to ensure any intermediate value will never be visible in
memory, allowing these unserialized loads.
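
Roughly, the new underflow check looks like this (a sketch; the helper
name sub_positive is an assumption):

  /*
   * Unsigned subtraction with an explicit load and store: clamp at
   * zero by checking for wrap-around, and never let an intermediate
   * value become visible in memory.
   */
  #define sub_positive(_ptr, _val) do {                           \
          typeof(_ptr) ptr = (_ptr);                              \
          typeof(*ptr) val = (_val);                              \
          typeof(*ptr) res, var = READ_ONCE(*ptr);                \
          res = var - val;                                        \
          if (res &gt; var)                                          \
                  res = 0;                                        \
          WRITE_ONCE(*ptr, res);                                  \
  } while (0)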

Note: GCC generates crap code for this, might warrant a look later.

Note2: I say 'full' above, if we end up at U*_MAX we'll still explode;
       maybe we should do clamping on add too.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Yuyang Du &lt;yuyang.du@intel.com&gt;
Cc: bsegall@google.com
Cc: kernel@kyup.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Fixes: 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")
Link: http://lkml.kernel.org/r/20160617091948.GJ30927@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kernel/sysrq, watchdog, sched/core: Reset watchdog on all CPUs while processing sysrq-w</title>
<updated>2016-06-14T10:48:38Z</updated>
<author>
<name>Andrey Ryabinin</name>
<email>aryabinin@virtuozzo.com</email>
</author>
<published>2016-06-09T12:20:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=57675cb976eff977aefb428e68e4e0236d48a9ff'/>
<id>urn:sha1:57675cb976eff977aefb428e68e4e0236d48a9ff</id>
<content type='text'>
The lengthy output of sysrq-w may take a lot of time on a slow serial
console.

Currently we reset the NMI watchdog on the current CPU to avoid spurious
lockup messages. Sometimes this doesn't work, since the softlockup
watchdog might trigger on another CPU which is waiting for an IPI to
proceed. We reset the softlockup watchdogs on all CPUs, but we do this
only after listing all tasks, and this may be too late on a busy system.

So, reset the watchdogs on all CPUs earlier, in the
for_each_process_thread() loop.
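
Roughly, the loop then looks like this (a sketch of the assumed shape):

  for_each_process_thread(g, p) {
          /*
           * Reset the NMI watchdog: listing all tasks on a slow
           * console can take a lot of time. Also reset the softlockup
           * watchdogs on all CPUs, since another CPU may be blocked
           * waiting for us to process an IPI:
           */
          touch_nmi_watchdog();
          touch_all_softlockup_watchdogs();
          if (!state_filter || (p-&gt;state &amp; state_filter))
                  sched_show_task(p);
  }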

Signed-off-by: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1465474805-14641-1-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/debug: Fix deadlock when enabling sched events</title>
<updated>2016-06-14T10:47:21Z</updated>
<author>
<name>Josh Poimboeuf</name>
<email>jpoimboe@redhat.com</email>
</author>
<published>2016-06-13T07:32:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eda8dca519269c92a0771668b3d5678792de7b78'/>
<id>urn:sha1:eda8dca519269c92a0771668b3d5678792de7b78</id>
<content type='text'>
I see a hang when enabling sched events:

  echo 1 &gt; /sys/kernel/debug/tracing/events/sched/enable

The printk buffer shows:

  BUG: spinlock recursion on CPU#1, swapper/1/0
   lock: 0xffff88007d5d8c00, .magic: dead4ead, .owner: swapper/1/0, .owner_cpu: 1
  CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.7.0-rc2+ #1
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
  ...
  Call Trace:
   &lt;IRQ&gt;  [&lt;ffffffff8143d663&gt;] dump_stack+0x85/0xc2
   [&lt;ffffffff81115948&gt;] spin_dump+0x78/0xc0
   [&lt;ffffffff81115aea&gt;] do_raw_spin_lock+0x11a/0x150
   [&lt;ffffffff81891471&gt;] _raw_spin_lock+0x61/0x80
   [&lt;ffffffff810e5466&gt;] ? try_to_wake_up+0x256/0x4e0
   [&lt;ffffffff810e5466&gt;] try_to_wake_up+0x256/0x4e0
   [&lt;ffffffff81891a0a&gt;] ? _raw_spin_unlock_irqrestore+0x4a/0x80
   [&lt;ffffffff810e5705&gt;] wake_up_process+0x15/0x20
   [&lt;ffffffff810cebb4&gt;] insert_work+0x84/0xc0
   [&lt;ffffffff810ced7f&gt;] __queue_work+0x18f/0x660
   [&lt;ffffffff810cf9a6&gt;] queue_work_on+0x46/0x90
   [&lt;ffffffffa00cd95b&gt;] drm_fb_helper_dirty.isra.11+0xcb/0xe0 [drm_kms_helper]
   [&lt;ffffffffa00cdac0&gt;] drm_fb_helper_sys_imageblit+0x30/0x40 [drm_kms_helper]
   [&lt;ffffffff814babcd&gt;] soft_cursor+0x1ad/0x230
   [&lt;ffffffff814ba379&gt;] bit_cursor+0x649/0x680
   [&lt;ffffffff814b9d30&gt;] ? update_attr.isra.2+0x90/0x90
   [&lt;ffffffff814b5e6a&gt;] fbcon_cursor+0x14a/0x1c0
   [&lt;ffffffff81555ef8&gt;] hide_cursor+0x28/0x90
   [&lt;ffffffff81558b6f&gt;] vt_console_print+0x3bf/0x3f0
   [&lt;ffffffff81122c63&gt;] call_console_drivers.constprop.24+0x183/0x200
   [&lt;ffffffff811241f4&gt;] console_unlock+0x3d4/0x610
   [&lt;ffffffff811247f5&gt;] vprintk_emit+0x3c5/0x610
   [&lt;ffffffff81124bc9&gt;] vprintk_default+0x29/0x40
   [&lt;ffffffff811e965b&gt;] printk+0x57/0x73
   [&lt;ffffffff810f7a9e&gt;] enqueue_entity+0xc2e/0xc70
   [&lt;ffffffff810f7b39&gt;] enqueue_task_fair+0x59/0xab0
   [&lt;ffffffff8106dcd9&gt;] ? kvm_sched_clock_read+0x9/0x20
   [&lt;ffffffff8103fb39&gt;] ? sched_clock+0x9/0x10
   [&lt;ffffffff810e3fcc&gt;] activate_task+0x5c/0xa0
   [&lt;ffffffff810e4514&gt;] ttwu_do_activate+0x54/0xb0
   [&lt;ffffffff810e5cea&gt;] sched_ttwu_pending+0x7a/0xb0
   [&lt;ffffffff810e5e51&gt;] scheduler_ipi+0x61/0x170
   [&lt;ffffffff81059e7f&gt;] smp_trace_reschedule_interrupt+0x4f/0x2a0
   [&lt;ffffffff81893ba6&gt;] trace_reschedule_interrupt+0x96/0xa0
   &lt;EOI&gt;  [&lt;ffffffff8106e0d6&gt;] ? native_safe_halt+0x6/0x10
   [&lt;ffffffff8110fb1d&gt;] ? trace_hardirqs_on+0xd/0x10
   [&lt;ffffffff81040ac0&gt;] default_idle+0x20/0x1a0
   [&lt;ffffffff8104147f&gt;] arch_cpu_idle+0xf/0x20
   [&lt;ffffffff81102f8f&gt;] default_idle_call+0x2f/0x50
   [&lt;ffffffff8110332e&gt;] cpu_startup_entry+0x37e/0x450
   [&lt;ffffffff8105af70&gt;] start_secondary+0x160/0x1a0

Note the hang only occurs when echoing the above from a physical serial
console, not from an ssh session.

The bug is caused by a deadlock where the task tries to grab the rq
lock twice, because printk() calls aren't safe in scheduler code.
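
Roughly, the fix amounts to deferring the offending printk() (a
sketch, assuming the culprit is the schedstats warning emitted from
scheduler context):

  /* printk_deferred_once() defers console output to irq_work, so it
   * is safe under the rq lock, unlike plain printk(): */
  printk_deferred_once("Scheduler tracepoints stat_sleep, stat_iowait, stat_blocked and stat_runtime require the kernel parameter schedstats=enable or kernel.sched_schedstats=1\n");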

Signed-off-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: stable@vger.kernel.org
Fixes: cb2517653fcc ("sched/debug: Make schedstats a runtime tunable that is disabled by default")
Link: http://lkml.kernel.org/r/20160613073209.gdvdybiruljbkn3p@treble
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Fix post_init_entity_util_avg() serialization</title>
<updated>2016-06-14T08:58:34Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-09T13:07:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b7fa30c9cc48c4f55663420472505d3b4f6e1705'/>
<id>urn:sha1:b7fa30c9cc48c4f55663420472505d3b4f6e1705</id>
<content type='text'>
Chris Wilson reported a divide by 0 at:

 post_init_entity_util_avg():

 &gt;    725	if (cfs_rq-&gt;avg.util_avg != 0) {
 &gt;    726		sa-&gt;util_avg  = cfs_rq-&gt;avg.util_avg * se-&gt;load.weight;
 &gt; -&gt; 727		sa-&gt;util_avg /= (cfs_rq-&gt;avg.load_avg + 1);
 &gt;    728
 &gt;    729		if (sa-&gt;util_avg &gt; cap)
 &gt;    730			sa-&gt;util_avg = cap;
 &gt;    731	} else {

Which, given the lack of serialization and the code generated from
update_cfs_rq_load_avg(), is entirely possible:

	if (atomic_long_read(&amp;cfs_rq-&gt;removed_load_avg)) {
		s64 r = atomic_long_xchg(&amp;cfs_rq-&gt;removed_load_avg, 0);
		sa-&gt;load_avg = max_t(long, sa-&gt;load_avg - r, 0);
		sa-&gt;load_sum = max_t(s64, sa-&gt;load_sum - r * LOAD_AVG_MAX, 0);
		removed_load = 1;
	}

turns into:

  ffffffff81087064:       49 8b 85 98 00 00 00    mov    0x98(%r13),%rax
  ffffffff8108706b:       48 85 c0                test   %rax,%rax
  ffffffff8108706e:       74 40                   je     ffffffff810870b0
  ffffffff81087070:       4c 89 f8                mov    %r15,%rax
  ffffffff81087073:       49 87 85 98 00 00 00    xchg   %rax,0x98(%r13)
  ffffffff8108707a:       49 29 45 70             sub    %rax,0x70(%r13)
  ffffffff8108707e:       4c 89 f9                mov    %r15,%rcx
  ffffffff81087081:       bb 01 00 00 00          mov    $0x1,%ebx
  ffffffff81087086:       49 83 7d 70 00          cmpq   $0x0,0x70(%r13)
  ffffffff8108708b:       49 0f 49 4d 70          cmovns 0x70(%r13),%rcx

Which you'll note ends up with 'sa-&gt;load_avg - r' in memory at
ffffffff8108707a.

By calling post_init_entity_util_avg() under rq-&gt;lock we're sure to be
fully serialized against PELT updates and cannot observe intermediate
state like this.
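
Roughly, the ordering in wake_up_new_task() becomes (a sketch of the
assumed shape; locking helper names may differ):

  struct rq_flags rf;
  struct rq *rq;

  /* Take the rq lock before folding in the initial util_avg, so the
   * divide above can never observe intermediate PELT state: */
  rq = __task_rq_lock(p, &amp;rf);
  post_init_entity_util_avg(&amp;p-&gt;se);
  activate_task(rq, p, 0);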

Reported-by: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Yuyang Du &lt;yuyang.du@intel.com&gt;
Cc: bsegall@google.com
Cc: morten.rasmussen@arm.com
Cc: pjt@google.com
Cc: steve.muckle@linaro.org
Fixes: 2b8c41daba32 ("sched/fair: Initiate a new task's util avg to a bounded value")
Link: http://lkml.kernel.org/r/20160609130750.GQ30909@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
