<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched, branch v4.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2017-04-27T07:08:26Z</updated>
<entry>
<title>sched/cputime: Fix ksoftirqd cputime accounting regression</title>
<updated>2017-04-27T07:08:26Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2017-04-25T14:10:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=25e2d8c1b9e327ed260edd13169cc22bc7a78bc6'/>
<id>urn:sha1:25e2d8c1b9e327ed260edd13169cc22bc7a78bc6</id>
<content type='text'>
irq_time_read() returns the irqtime minus the ksoftirqd time. This
is necessary because irq_time_read() is used to subtract the IRQ time
from the sum_exec_runtime of a task. If we were to include the softirq
time of ksoftirqd, this task would subtract its own CPU time every time
it updates ksoftirqd-&gt;sum_exec_runtime, which would therefore never
progress.

But this behaviour got broken by:

  a499a5a14db ("sched/cputime: Increment kcpustat directly on irqtime account")

... which now includes ksoftirqd softirq time in the time returned by
irq_time_read().

This resulted in incorrect ksoftirqd cputime being reported to userspace
through /proc/stat, and thus "top" not showing ksoftirqd when it should
after intense networking load.

ksoftirqd-&gt;stime happens to be correct but it gets scaled down by
sum_exec_runtime through task_cputime_adjusted().

To fix this, just account the strict IRQ time in a separate counter and
use it to report the IRQ time.
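
For illustration, a minimal user-space simulation of the failure mode (not
kernel code; the numbers and the account() helper are invented):

 --------------- %&lt; ---------------
  #include &lt;stdio.h&gt;

  /* Simulate 5 ticks of ksoftirqd running 1000 us of softirq work per tick
   * and charging itself only the non-IRQ part of that time. */
  static unsigned long long account(int softirq_in_irqtime)
  {
	unsigned long long irqtime = 0, last = 0, sum_exec_runtime = 0;
	int tick;

	for (tick = 0; tick &lt; 5; tick++) {
		unsigned long long raw_delta = 1000;	/* ksoftirqd CPU time */
		unsigned long long irq_delta;

		/* Broken variant: ksoftirqd's own softirq time is folded
		 * into the irqtime it later subtracts from itself. */
		if (softirq_in_irqtime)
			irqtime += raw_delta;

		irq_delta = irqtime - last;
		last = irqtime;

		/* update_curr()-style accounting: charge non-IRQ time only. */
		if (raw_delta &gt; irq_delta)
			sum_exec_runtime += raw_delta - irq_delta;
	}
	return sum_exec_runtime;
  }

  int main(void)
  {
	printf("softirq counted in irqtime: sum_exec_runtime = %llu\n",
	       account(1));	/* stays 0, never progresses */
	printf("softirq kept separate:      sum_exec_runtime = %llu\n",
	       account(0));	/* advances by 1000 us per tick */
	return 0;
  }
 --------------- &gt;% ---------------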

Reported-and-tested-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Acked-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stanislaw Gruszka &lt;sgruszka@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
Link: http://lkml.kernel.org/r/1493129448-5356-1-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2017-04-02T16:25:10Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-04-02T16:25:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=128c434a70996a3738c7ca4aa2ee91942e4c48f0'/>
<id>urn:sha1:128c434a70996a3738c7ca4aa2ee91942e4c48f0</id>
<content type='text'>
Pull scheduler fixes from Thomas Gleixner:
 "This update provides:

   - make the scheduler clock switch to unstable mode smooth so the
     timestamps stay at microsecond granularity instead of switching to
     tick granularity.

   - unbreak "perf test tsc" by taking into account the new offset that
     was added in order to provide better sched clock continuity

   - switching the sched clock to unstable mode runs all clock-related
     computations which affect the sched clock output itself from a
     workqueue. In case of preemption the sched clock uses half-updated
     data and provides wrong timestamps. Keep the math in the protected
     context and delegate only the static key switch to workqueue context.

   - remove a duplicate header include"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/headers: Remove duplicate #include &lt;linux/sched/debug.h&gt; line
  sched/clock: Fix broken stable to unstable transfer
  sched/clock, x86/perf: Fix "perf test tsc"
  sched/clock: Fix clear_sched_clock_stable() preempt wobbly
</content>
</entry>
<entry>
<title>sched/clock: Fix broken stable to unstable transfer</title>
<updated>2017-03-27T08:23:48Z</updated>
<author>
<name>Pavel Tatashin</name>
<email>pasha.tatashin@oracle.com</email>
</author>
<published>2017-03-22T20:24:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7b09cc5a9debc86c903c2eff8f8a1fdef773c649'/>
<id>urn:sha1:7b09cc5a9debc86c903c2eff8f8a1fdef773c649</id>
<content type='text'>
When it is determined that the clock is actually unstable, and
we switch from stable to unstable, the __clear_sched_clock_stable()
function is eventually called.

In this function we set gtod_offset so the following holds true:

  sched_clock() + raw_offset == ktime_get_ns() + gtod_offset

But instead of getting the latest timestamps, we use the last values
from scd, so instead of sched_clock() we use scd-&gt;tick_raw, and
instead of ktime_get_ns() we use scd-&gt;tick_gtod.

However, later, when we use gtod_offset in sched_clock_local(), we do
not add it to scd-&gt;tick_gtod when we determine the boundaries for the
min/max clock values.

This can result in tick granularity sched_clock() values, so fix it.
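
As a rough user-space sketch of that clamp (invented numbers, simplified
logic; TICK_NSEC assumed to be 1 ms):

 --------------- %&lt; ---------------
  #include &lt;stdio.h&gt;

  #define TICK_NSEC 1000000ULL	/* assumed 1 ms tick */

  typedef unsigned long long u64;

  static u64 max64(u64 a, u64 b) { return a &gt; b ? a : b; }
  static u64 min64(u64 a, u64 b) { return a &lt; b ? a : b; }

  int main(void)
  {
	u64 tick_gtod   = 4000000000ULL;	/* scd-&gt;tick_gtod at the last tick */
	u64 gtod_offset = 1000000000ULL;	/* set at the stable-to-unstable switch */
	u64 old_clock   = tick_gtod + gtod_offset;
	u64 delta       = 300000;		/* 300 us since the last tick */

	u64 clock = tick_gtod + gtod_offset + delta;

	/* Broken upper boundary: gtod_offset is missing, so the result gets
	 * pinned to old_clock until the next tick (tick granularity). */
	u64 bad_max  = max64(old_clock, tick_gtod + TICK_NSEC);
	/* Fixed upper boundary: offset included, sub-tick deltas survive. */
	u64 good_max = max64(old_clock, tick_gtod + gtod_offset + TICK_NSEC);

	printf("unclamped: %llu\n", clock);
	printf("broken   : %llu\n", min64(clock, bad_max));
	printf("fixed    : %llu\n", min64(clock, good_max));
	return 0;
  }
 --------------- &gt;% ---------------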

Signed-off-by: Pavel Tatashin &lt;pasha.tatashin@oracle.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: hpa@zytor.com
Fixes: 5680d8094ffa ("sched/clock: Provide better clock continuity")
Link: http://lkml.kernel.org/r/1490214265-899964-2-git-send-email-pasha.tatashin@oracle.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/clock, x86/perf: Fix "perf test tsc"</title>
<updated>2017-03-23T06:31:49Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-03-17T11:48:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=698eff6355f735d46d1b7113df8b422874cd7988'/>
<id>urn:sha1:698eff6355f735d46d1b7113df8b422874cd7988</id>
<content type='text'>
People reported that commit:

  5680d8094ffa ("sched/clock: Provide better clock continuity")

broke "perf test tsc".

That commit added another offset to the reported clock value, so
take that into account when computing the provided offset values.
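
For reference, this is roughly the TSC-to-nanoseconds conversion that
"perf test tsc" performs from the exported mmap page fields (paraphrased
from the perf_event_mmap_page documentation; the values below are
invented), which is why time_zero has to absorb any extra offset the
kernel adds to sched_clock():

 --------------- %&lt; ---------------
  #include &lt;stdio.h&gt;
  #include &lt;stdint.h&gt;

  static uint64_t tsc_to_ns(uint64_t cyc, uint64_t time_zero,
			    uint32_t time_mult, uint16_t time_shift)
  {
	uint64_t quot = cyc &gt;&gt; time_shift;
	uint64_t rem  = cyc &amp; (((uint64_t)1 &lt;&lt; time_shift) - 1);

	return time_zero + quot * time_mult + ((rem * time_mult) &gt;&gt; time_shift);
  }

  int main(void)
  {
	/* Invented values for a ~2.5 GHz TSC: mult/2^shift ~= 0.4 ns/cycle. */
	uint64_t time_zero  = 1000000000ULL;
	uint32_t time_mult  = 858993459;
	uint16_t time_shift = 31;

	printf("%llu ns\n", (unsigned long long)
	       tsc_to_ns(2500000000ULL, time_zero, time_mult, time_shift));
	return 0;
  }
 --------------- &gt;% ---------------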

Reported-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Reported-by: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Tested-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 5680d8094ffa ("sched/clock: Provide better clock continuity")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/clock: Fix clear_sched_clock_stable() preempt wobbly</title>
<updated>2017-03-23T06:31:48Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-03-13T12:46:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=71fdb70eb48784c1f28cdf2e67c4c587dd7f2594'/>
<id>urn:sha1:71fdb70eb48784c1f28cdf2e67c4c587dd7f2594</id>
<content type='text'>
Paul reported a problem with clear_sched_clock_stable(). Since we run
all of __clear_sched_clock_stable() from workqueue context, there's a
preemption problem.

Solve it by running only the static_key_disable() call from the workqueue.
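
A loose user-space analogy of that split (pthreads instead of a kernel
workqueue; names invented): compute everything synchronously in the
caller and defer only the final flag flip, so deferred context can never
publish half-updated offsets:

 --------------- %&lt; ---------------
  #include &lt;pthread.h&gt;
  #include &lt;stdatomic.h&gt;
  #include &lt;stdio.h&gt;

  static long long gtod_offset;		/* "the math" */
  static atomic_int clock_stable = 1;	/* stand-in for the static key */

  static void *disable_key(void *arg)
  {
	(void)arg;
	atomic_store(&amp;clock_stable, 0);	/* only the switch runs deferred */
	return NULL;
  }

  static void clear_clock_stable(long long tick_raw, long long tick_gtod)
  {
	pthread_t worker;

	/* Do all of the offset computation here, before deferring anything,
	 * so nobody can observe a stable clock with a half-updated offset. */
	gtod_offset = tick_raw - tick_gtod;

	pthread_create(&amp;worker, NULL, disable_key, NULL);
	pthread_join(&amp;worker, NULL);
  }

  int main(void)
  {
	clear_clock_stable(5000000000LL, 4000000000LL);
	printf("stable=%d gtod_offset=%lld\n",
	       atomic_load(&amp;clock_stable), gtod_offset);
	return 0;
  }
 --------------- &gt;% ---------------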

Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: fweisbec@gmail.com
Link: http://lkml.kernel.org/r/20170313124621.GA3328@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>cpufreq: schedutil: Fix per-CPU structure initialization in sugov_start()</title>
<updated>2017-03-21T00:03:08Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2017-03-19T13:30:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4296f23ed49a15d36949458adcc66ff993dee2a8'/>
<id>urn:sha1:4296f23ed49a15d36949458adcc66ff993dee2a8</id>
<content type='text'>
sugov_start() initializes the struct sugov_cpu per-CPU structures only
for shared policies, but it should do that for single-CPU policies too.

In particular, that makes the IO-wait boost mechanism work in the
cases when cpufreq policies correspond to individual CPUs.
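
A hedged sketch of the structural change (simplified names, not the actual
schedutil code): run the per-CPU initialization for every CPU in the policy
rather than only under the shared-policy branch:

 --------------- %&lt; ---------------
  #include &lt;stdio.h&gt;
  #include &lt;string.h&gt;

  #define NCPUS 4

  struct cpu_state { unsigned int cpu; unsigned int iowait_boost; };

  static struct cpu_state per_cpu[NCPUS];

  static void policy_start(const unsigned int *cpus, int ncpus, int shared)
  {
	int i;

	(void)shared;	/* before the fix, the loop below only ran if shared */
	for (i = 0; i &lt; ncpus; i++) {
		struct cpu_state *cs = &amp;per_cpu[cpus[i]];

		memset(cs, 0, sizeof(*cs));
		cs-&gt;cpu = cpus[i];	/* IO-wait boost state starts clean */
	}
  }

  int main(void)
  {
	unsigned int one_cpu[] = { 2 };

	policy_start(one_cpu, 1, 0);	/* single-CPU policy, not shared */
	printf("cpu %u initialized\n", per_cpu[2].cpu);
	return 0;
  }
 --------------- &gt;% ---------------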

Fixes: 21ca6d2c52f8 (cpufreq: schedutil: Add iowait boosting)
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: 4.9+ &lt;stable@vger.kernel.org&gt; # 4.9+
</content>
</entry>
<entry>
<title>sched/deadline: Use deadline instead of period when calculating overflow</title>
<updated>2017-03-16T08:37:38Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-03-02T14:10:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2317d5f1c34913bac5971d93d69fb6c31bb74670'/>
<id>urn:sha1:2317d5f1c34913bac5971d93d69fb6c31bb74670</id>
<content type='text'>
I was testing Daniel's changes with his test case, and tweaked it a
little. Instead of having the runtime equal to the deadline, I
increased the deadline tenfold.

Daniel's test case had:

	attr.sched_runtime  = 2 * 1000 * 1000;		/* 2 ms */
	attr.sched_deadline = 2 * 1000 * 1000;		/* 2 ms */
	attr.sched_period   = 2 * 1000 * 1000 * 1000;	/* 2 s */

To make it more interesting, I changed it to:

	attr.sched_runtime  =  2 * 1000 * 1000;		/* 2 ms */
	attr.sched_deadline = 20 * 1000 * 1000;		/* 20 ms */
	attr.sched_period   =  2 * 1000 * 1000 * 1000;	/* 2 s */

The results were rather surprising. The behavior that Daniel's patch
was fixing came back. The task started using much more than 0.1% of the
CPU. More like 20%.

Looking into this I found that it was due to the dl_entity_overflow()
constantly returning true. That's because it uses the relative period
against relative runtime vs the absolute deadline against absolute
runtime.

  runtime / (deadline - t) &gt; dl_runtime / dl_period

There's even a comment mentioning this, and saying that when the relative
deadline equals the relative period, the equation is the same as using
deadline instead of period. That comment is backwards! What we really
want is:

  runtime / (deadline - t) &gt; dl_runtime / dl_deadline

We care about whether the runtime can make its deadline, not its period.
And then we can say "when the deadline equals the period, the equation is
the same as using dl_period instead of dl_deadline".

After correcting this, now when the task gets enqueued, it can throttle
correctly, and Daniel's fix to the throttling of sleeping deadline
tasks works even when the runtime and deadline are not the same.
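
A worked comparison of the two checks using the parameters above (the
wakeup state is invented for illustration):

 --------------- %&lt; ---------------
  #include &lt;stdio.h&gt;

  int main(void)
  {
	/* Test-case parameters, in microseconds. */
	double dl_runtime  = 2000.0;		/* 2 ms */
	double dl_deadline = 20000.0;		/* 20 ms */
	double dl_period   = 2000000.0;		/* 2 s */

	/* Assumed state at wakeup: 1 ms of runtime left, 10 ms until the
	 * absolute deadline. */
	double runtime = 1000.0;
	double until_deadline = 10000.0;

	double density = runtime / until_deadline;	/* 0.1   */
	double old_bw  = dl_runtime / dl_period;	/* 0.001 */
	double new_bw  = dl_runtime / dl_deadline;	/* 0.1   */

	/* Old check: 0.1 &gt; 0.001, so nearly every wakeup is declared an
	 * "overflow" and the task gets a fresh runtime/deadline, letting it
	 * consume far more than its 0.1% reservation. */
	printf("period-based check overflows:   %d\n", density &gt; old_bw);
	/* Fixed check: 0.1 &gt; 0.1 is false, the current runtime/deadline
	 * are kept and the task throttles correctly. */
	printf("deadline-based check overflows: %d\n", density &gt; new_bw);
	return 0;
  }
 --------------- &gt;% ---------------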

Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Cc: Juri Lelli &lt;juri.lelli@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Luca Abeni &lt;luca.abeni@santannapisa.it&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Romulo Silva de Oliveira &lt;romulo.deoliveira@ufsc.br&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tommaso Cucinotta &lt;tommaso.cucinotta@sssup.it&gt;
Link: http://lkml.kernel.org/r/02135a27f1ae3fe5fd032568a5a2f370e190e8d7.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/deadline: Throttle a constrained deadline task activated after the deadline</title>
<updated>2017-03-16T08:37:38Z</updated>
<author>
<name>Daniel Bristot de Oliveira</name>
<email>bristot@redhat.com</email>
</author>
<published>2017-03-02T14:10:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=df8eac8cafce7d086be3bd5cf5a838fa37594dfb'/>
<id>urn:sha1:df8eac8cafce7d086be3bd5cf5a838fa37594dfb</id>
<content type='text'>
During the activation, CBS checks if it can reuse the current task's
runtime and period. If the deadline of the task is in the past, CBS
cannot use the runtime, and so it replenishes the task. This rule
works fine for implicit deadline tasks (deadline == period), and the
CBS was designed for implicit deadline tasks. However, a task with
constrained deadline (deadline &lt; period) might be awakened after the
deadline, but before the next period. In this case, replenishing the
task would allow it to run for runtime / deadline. As in this case
deadline &lt; period, CBS enables a task to run for more than the
runtime / period. In a very loaded system, this can cause a domino
effect, making other tasks miss their deadlines.

To avoid this problem, in the activation of a constrained deadline
task after the deadline but before the next period, throttle the
task and set the replenishing timer to the beginning of the next period,
unless it is boosted.

Reproducer:

 --------------- %&lt; ---------------
#include &lt;stdio.h&gt;
#include &lt;stdlib.h&gt;
#include &lt;string.h&gt;
#include &lt;stdint.h&gt;
#include &lt;time.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/syscall.h&gt;
#include &lt;linux/sched.h&gt;	/* SCHED_DEADLINE */

/*
 * glibc has no sched_setattr() wrapper or struct sched_attr definition,
 * so define them here (layout as documented in sched_setattr(2)) and go
 * through the raw syscall. Needs kernel headers &gt;= 3.14.
 */
struct sched_attr {
	uint32_t size;
	uint32_t sched_policy;
	uint64_t sched_flags;
	int32_t  sched_nice;
	uint32_t sched_priority;
	/* SCHED_DEADLINE parameters, in nanoseconds */
	uint64_t sched_runtime;
	uint64_t sched_deadline;
	uint64_t sched_period;
};

static int sched_setattr(pid_t pid, const struct sched_attr *attr,
			 unsigned int flags)
{
	return syscall(SYS_sched_setattr, pid, attr, flags);
}

int main(int argc, char **argv)
{
	int ret;
	int flags = 0;
	unsigned long l = 0;
	struct timespec ts;
	struct sched_attr attr;

	memset(&amp;attr, 0, sizeof(attr));
	attr.size = sizeof(attr);

	attr.sched_policy   = SCHED_DEADLINE;
	attr.sched_runtime  = 2 * 1000 * 1000;		/* 2 ms */
	attr.sched_deadline = 2 * 1000 * 1000;		/* 2 ms */
	attr.sched_period   = 2 * 1000 * 1000 * 1000;	/* 2 s */

	ts.tv_sec = 0;
	ts.tv_nsec = 2000 * 1000;			/* 2 ms */

	ret = sched_setattr(0, &amp;attr, flags);

	if (ret &lt; 0) {
		perror("sched_setattr");
		exit(-1);
	}

	for (;;) {
		/* XXX: you may need to adjust the loop */
		for (l = 0; l &lt; 150000; l++);
		/*
		 * The idea is to go to sleep right before the deadline
		 * and then wake up before the next period to receive
		 * a new replenishment.
		 */
		nanosleep(&amp;ts, NULL);
	}

	exit(0);
}
  --------------- &gt;% ---------------

On my box, this reproducer uses almost 50% of the CPU time, which is
obviously wrong for a task with 2/2000 reservation.

Signed-off-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Juri Lelli &lt;juri.lelli@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Luca Abeni &lt;luca.abeni@santannapisa.it&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Romulo Silva de Oliveira &lt;romulo.deoliveira@ufsc.br&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tommaso Cucinotta &lt;tommaso.cucinotta@sssup.it&gt;
Link: http://lkml.kernel.org/r/edf58354e01db46bf42df8d2dd32418833f68c89.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/deadline: Make sure the replenishment timer fires in the next period</title>
<updated>2017-03-16T08:37:37Z</updated>
<author>
<name>Daniel Bristot de Oliveira</name>
<email>bristot@redhat.com</email>
</author>
<published>2017-03-02T14:10:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5ac69d37784b237707a7b15d199cdb6c6fdb6780'/>
<id>urn:sha1:5ac69d37784b237707a7b15d199cdb6c6fdb6780</id>
<content type='text'>
Currently, the replenishment timer is set to fire at the deadline
of a task. Although that works for implicit deadline tasks because the
deadline is equal to the beginning of the next period, that is not correct
for constrained deadline tasks (deadline &lt; period).

For instance:

f.c:
 --------------- %&lt; ---------------
int main (void)
{
	for(;;);
}
 --------------- &gt;% ---------------

  # gcc -o f f.c

  # trace-cmd record -e sched:sched_switch                              \
				   -e syscalls:sys_exit_sched_setattr   \
   chrt -d --sched-runtime  490000000					\
           --sched-deadline 500000000					\
	   --sched-period  1000000000 0 ./f

  # trace-cmd report | grep "{pid of ./f}"

After setting the parameters, the task is replenished and continues
running until it is throttled:

         f-11295 [003] 13322.113776: sys_exit_sched_setattr: 0x0

The task is throttled after running for 492318 us, as expected:

         f-11295 [003] 13322.606094: sched_switch:   f:11295 [-1] R ==&gt; watchdog/3:32 [0]

But then, the task is replenished 500719 us after the first
replenishment:

    &lt;idle&gt;-0     [003] 13322.614495: sched_switch:   swapper/3:0 [120] R ==&gt; f:11295 [-1]

Running for 490277 us:

         f-11295 [003] 13323.104772: sched_switch:   f:11295 [-1] R ==&gt;  swapper/3:0 [120]

Hence, in the first period, the task runs 2 * runtime, and that is a bug.

During the first replenishment, the next deadline is set one period away,
so the runtime / period constraint starts to be respected. However, as the
second replenishment took place at the wrong instant, every subsequent
replenishment will also happen at the wrong instant: rather than occurring
n periods after the first activation, it takes place at (n periods -
relative deadline).

Signed-off-by: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Luca Abeni &lt;luca.abeni@santannapisa.it&gt;
Reviewed-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Reviewed-by: Juri Lelli &lt;juri.lelli@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Romulo Silva de Oliveira &lt;romulo.deoliveira@ufsc.br&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tommaso Cucinotta &lt;tommaso.cucinotta@sssup.it&gt;
Link: http://lkml.kernel.org/r/ac50d89887c25285b47465638354b63362f8adff.1488392936.git.bristot@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/loadavg: Use {READ,WRITE}_ONCE() for sample window</title>
<updated>2017-03-16T08:21:01Z</updated>
<author>
<name>Matt Fleming</name>
<email>matt@codeblueprint.co.uk</email>
</author>
<published>2017-02-17T12:07:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=caeb5882979bc6f3c8766fcf59c6269b38f521bc'/>
<id>urn:sha1:caeb5882979bc6f3c8766fcf59c6269b38f521bc</id>
<content type='text'>
'calc_load_update' is accessed without any kind of locking and there's
a clear assumption in the code that only a single value is read or
written.

Make this explicit by using READ_ONCE() and WRITE_ONCE(), and avoid
unintentionally seeing multiple values, or having the loads/stores
split.

Technically the loads in calc_global_*() don't require this since
those are the only functions that update 'calc_load_update', but I've
added the READ_ONCE() for consistency.
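
A minimal user-space sketch of the pattern (stand-in READ_ONCE()/WRITE_ONCE()
macros, invented window values; not the kernel's loadavg code):

 --------------- %&lt; ---------------
  #include &lt;stdio.h&gt;

  /* Simplified stand-ins for the kernel macros: force a single, untorn
   * access through a volatile cast. */
  #define READ_ONCE(x)		(*(volatile __typeof__(x) *)&amp;(x))
  #define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&amp;(x) = (val))

  #define LOAD_FREQ 500UL	/* assumed: 5 s at HZ=100 */

  static unsigned long calc_load_update;

  static void calc_global_load(unsigned long now)
  {
	unsigned long sample_window = READ_ONCE(calc_load_update);

	if (now &lt; sample_window)	/* not yet time for the next sample */
		return;

	/* ... fold the per-CPU deltas into the global load here ... */

	WRITE_ONCE(calc_load_update, sample_window + LOAD_FREQ);
  }

  int main(void)
  {
	WRITE_ONCE(calc_load_update, 1000);
	calc_global_load(1500);
	printf("next sample window: %lu\n", READ_ONCE(calc_load_update));
	return 0;
  }
 --------------- &gt;% ---------------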

Suggested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Morten Rasmussen &lt;morten.rasmussen@arm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Link: http://lkml.kernel.org/r/20170217120731.11868-3-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
