<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/rt.c, branch v3.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2014-02-27T11:29:41Z</updated>
<entry>
<title>sched/deadline: Prevent rt_time growth to infinity</title>
<updated>2014-02-27T11:29:41Z</updated>
<author>
<name>Juri Lelli</name>
<email>juri.lelli@gmail.com</email>
</author>
<published>2014-02-21T10:37:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=faa5993736d9b44b508cab4f1f3a77d66641c6f4'/>
<id>urn:sha1:faa5993736d9b44b508cab4f1f3a77d66641c6f4</id>
<content type='text'>
Kirill Tkhai noted:

  Since deadline tasks share rt bandwidth, we must care about
  bandwidth timer set. Otherwise rt_time may grow up to infinity
  in update_curr_dl(), if there are no other available RT tasks
  on top level bandwidth.

RT task were in fact throttled right after they got enqueued,
and never executed again (rt_time never again went below rt_runtime).

Peter then proposed to accrue DL execution on rt_time only when
rt timer is active, and proposed a patch (this patch is a slight
modification of that) to implement that behavior. While this
solves Kirill problem, it has a drawback.

Indeed, Kirill noted again:

  It looks we may get into a situation, when all CPU time is shared
  between RT and DL tasks:

  rt_runtime = n
  rt_period  = 2n

  | RT working, DL sleeping  | DL working, RT sleeping      |
  -----------------------------------------------------------
  | (1)     duration = n     | (2)     duration = n         | (repeat)
  |--------------------------|------------------------------|
  | (rt_bw timer is running) | (rt_bw timer is not running) |

  No time for fair tasks at all.

While this can happen during the first period, if rq is always backlogged,
RT tasks won't have the opportunity to execute anymore: rt_time reached
rt_runtime during (1), suppose after (2) RT is enqueued back, it gets
throttled since rt timer didn't fire, replenishment is from now on eaten up
by DL tasks that accrue their execution on rt_time (while rt timer is
active - we have an RT task waiting for replenishment). FAIR tasks are
not touched after this first period. Ok, this is not ideal, and the situation
is even worse!

What above (the nice case), practically never happens in reality, where
your rt timer is not aligned to tasks periods, tasks are in general not
periodic, etc.. Long story short, you always risk to overload your system.

This patch is based on Peter's idea, but exploits an additional fact:
if you don't have RT tasks enqueued, it makes little sense to continue
incrementing rt_time once you reached the upper limit (DL tasks have their
own mechanism for throttling).

This cures both problems:

 - no matter how many DL instances in the past, you'll have an rt_time
   slightly above rt_runtime when an RT task is enqueued, and from that
   point on (after the first replenishment), the task will normally execute;

 - you can still eat up all bandwidth during the first period, but not
   anymore after that, remember that DL execution will increment rt_time
   till the upper limit is reached.

The situation is still not perfect! But, we have a simple solution for now,
that limits how much you can jeopardize your system, as we keep working
towards the right answer: RT groups scheduled using deadline servers.

Reported-by: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Signed-off-by: Juri Lelli &lt;juri.lelli@gmail.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/20140225151515.617714e2f2cd6c558531ba61@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/deadline: Add SCHED_DEADLINE SMP-related data structures &amp; logic</title>
<updated>2014-01-13T12:41:07Z</updated>
<author>
<name>Juri Lelli</name>
<email>juri.lelli@gmail.com</email>
</author>
<published>2013-11-07T13:43:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1baca4ce16b8cc7d4f50be1f7914799af30a2861'/>
<id>urn:sha1:1baca4ce16b8cc7d4f50be1f7914799af30a2861</id>
<content type='text'>
Introduces data structures relevant for implementing dynamic
migration of -deadline tasks and the logic for checking if
runqueues are overloaded with -deadline tasks and for choosing
where a task should migrate, when it is the case.

Adds also dynamic migrations to SCHED_DEADLINE, so that tasks can
be moved among CPUs when necessary. It is also possible to bind a
task to a (set of) CPU(s), thus restricting its capability of
migrating, or forbidding migrations at all.

The very same approach used in sched_rt is utilised:
 - -deadline tasks are kept into CPU-specific runqueues,
 - -deadline tasks are migrated among runqueues to achieve the
   following:
    * on an M-CPU system the M earliest deadline ready tasks
      are always running;
    * affinity/cpusets settings of all the -deadline tasks is
      always respected.

Therefore, this very special form of "load balancing" is done with
an active method, i.e., the scheduler pushes or pulls tasks between
runqueues when they are woken up and/or (de)scheduled.
IOW, every time a preemption occurs, the descheduled task might be sent
to some other CPU (depending on its deadline) to continue executing
(push). On the other hand, every time a CPU becomes idle, it might pull
the second earliest deadline ready task from some other CPU.

To enforce this, a pull operation is always attempted before taking any
scheduling decision (pre_schedule()), as well as a push one after each
scheduling decision (post_schedule()). In addition, when a task arrives
or wakes up, the best CPU where to resume it is selected taking into
account its affinity mask, the system topology, but also its deadline.
E.g., from the scheduling point of view, the best CPU where to wake
up (and also where to push) a task is the one which is running the task
with the latest deadline among the M executing ones.

In order to facilitate these decisions, per-runqueue "caching" of the
deadlines of the currently running and of the first ready task is used.
Queued but not running tasks are also parked in another rb-tree to
speed-up pushes.

Signed-off-by: Juri Lelli &lt;juri.lelli@gmail.com&gt;
Signed-off-by: Dario Faggioli &lt;raistlin@linux.it&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1383831828-15501-5-git-send-email-juri.lelli@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities</title>
<updated>2013-12-17T14:08:44Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-11-27T15:59:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=757dfcaa41844595964f1220f1d33182dae49976'/>
<id>urn:sha1:757dfcaa41844595964f1220f1d33182dae49976</id>
<content type='text'>
This patch touches the RT group scheduling case.

Functions inc_rt_prio_smp() and dec_rt_prio_smp() change (global) rq's
priority, while rt_rq passed to them may be not the top-level rt_rq.
This is wrong, because changing of priority on a child level does not
guarantee that the priority is the highest all over the rq. So, this
leak makes RT balancing unusable.

The short example: the task having the highest priority among all rq's
RT tasks (no one other task has the same priority) are waking on a
throttle rt_rq.  The rq's cpupri is set to the task's priority
equivalent, but real rq-&gt;rt.highest_prio.curr is less.

The patch below fixes the problem.

Signed-off-by: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: Steven Rostedt &lt;rostedt@goodmis.org&gt;
CC: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/49231385567953@web4m.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Fix task_tick_rt() comment</title>
<updated>2013-10-26T10:25:21Z</updated>
<author>
<name>Li Bin</name>
<email>huawei.libin@huawei.com</email>
</author>
<published>2013-10-21T12:15:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e9aa39bb7c4415ca26484239cc3a6686d549bf4f'/>
<id>urn:sha1:e9aa39bb7c4415ca26484239cc3a6686d549bf4f</id>
<content type='text'>
This issue was introduced by 454c79999f7e ("sched/rt: Fix SCHED_RR
across cgroups") that missed the word 'not'. Fix it.

Signed-off-by: Li Bin &lt;huawei.libin@huawei.com&gt;
Cc: &lt;guohanjun@huawei.com&gt;
Cc: &lt;xiexiuqi@huawei.com&gt;
Cc: &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1382357743-54136-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Add missing rmb()</title>
<updated>2013-10-16T12:22:13Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-15T10:35:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7c3f2ab7b844f1a859afbc3d41925e8a0faba5fa'/>
<id>urn:sha1:7c3f2ab7b844f1a859afbc3d41925e8a0faba5fa</id>
<content type='text'>
While discussing the proposed SCHED_DEADLINE patches which in parts
mimic the existing FIFO code it was noticed that the wmb in
rt_set_overloaded() didn't have a matching barrier.

The only site using rt_overloaded() to test the rto_count is
pull_rt_task() and we should issue a matching rmb before then assuming
there's an rto_mask bit set.

Without that smp_rmb() in there we could actually miss seeing the
rto_mask bit.

Also, change to using smp_[wr]mb(), even though this is SMP only code;
memory barriers without smp_ always make me think they're against
hardware of some sort.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: vincent.guittot@linaro.org
Cc: luca.abeni@unitn.it
Cc: bruce.ashfield@windriver.com
Cc: dhaval.giani@gmail.com
Cc: rostedt@goodmis.org
Cc: hgu1972@gmail.com
Cc: oleg@redhat.com
Cc: fweisbec@gmail.com
Cc: darren@dvhart.com
Cc: johan.eker@ericsson.com
Cc: p.faure@akatech.ch
Cc: paulmck@linux.vnet.ibm.com
Cc: raistlin@linux.it
Cc: claudio@evidence.eu.com
Cc: insop.song@gmail.com
Cc: michael@amarulasolutions.com
Cc: liming.wang@windriver.com
Cc: fchecconi@gmail.com
Cc: jkacur@redhat.com
Cc: tommaso.cucinotta@sssup.it
Cc: Juri Lelli &lt;juri.lelli@gmail.com&gt;
Cc: harald.gustafsson@ericsson.com
Cc: nicola.manica@disi.unitn.it
Cc: tglx@linutronix.de
Link: http://lkml.kernel.org/r/20131015103507.GF10651@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/numa: Introduce migrate_swap()</title>
<updated>2013-10-09T10:40:46Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-10-07T10:29:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ac66f5477239ebd3c4e2cbf2f591ef387aa09884'/>
<id>urn:sha1:ac66f5477239ebd3c4e2cbf2f591ef387aa09884</id>
<content type='text'>
Use the new stop_two_cpus() to implement migrate_swap(), a function that
flips two tasks between their respective cpus.

I'm fairly sure there's a less crude way than employing the stop_two_cpus()
method, but everything I tried either got horribly fragile and/or complex. So
keep it simple for now.

The notable detail is how we 'migrate' tasks that aren't runnable
anymore. We'll make it appear like we migrated them before they went to
sleep. The sole difference is the previous cpu in the wakeup path, so we
override this.

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Link: http://lkml.kernel.org/r/1381141781-10992-39-git-send-email-mgorman@suse.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Remove redundant nr_cpus_allowed test</title>
<updated>2013-10-06T09:28:40Z</updated>
<author>
<name>Shawn Bohrer</name>
<email>sbohrer@rgmadvisors.com</email>
</author>
<published>2013-10-04T19:24:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6bfa687c19b7ab8adee03f0d43c197c2945dd869'/>
<id>urn:sha1:6bfa687c19b7ab8adee03f0d43c197c2945dd869</id>
<content type='text'>
In 76854c7e8f3f4172fef091e78d88b3b751463ac6 ("sched: Use
rt.nr_cpus_allowed to recover select_task_rq() cycles") an
optimization was added to select_task_rq_rt() that immediately
returns when p-&gt;nr_cpus_allowed == 1 at the beginning of the
function.

This makes the latter p-&gt;nr_cpus_allowed &gt; 1 check redundant,
which can now be removed.

Signed-off-by: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Mike Galbraith &lt;mgalbraith@suse.de&gt;
Cc: tomk@rgmadvisors.com
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1380914693-24634-1-git-send-email-shawn.bohrer@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Simplify pull_rt_task() logic and remove .leaf_rt_rq_list</title>
<updated>2013-06-19T10:58:40Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-06-07T19:37:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e23ee74777f389369431d77390c4b09332ce026a'/>
<id>urn:sha1:e23ee74777f389369431d77390c4b09332ce026a</id>
<content type='text'>
[ Peter, this is based off of some of my work, I ran it though a few
  tests and it passed. I also reviewed it, and added my SOB as I am
  somewhat a co-author to it. ]

Based on the patch by Steven Rostedt from previous year:

https://lkml.org/lkml/2012/4/18/517

1)Simplify pull_rt_task() logic: search in pushable tasks of dest runqueue.
The only pullable tasks are the tasks which are pushable in their local rq,
and no others.

2)Remove .leaf_rt_rq_list member of struct rt_rq and functions connected
with it: nobody uses it since now.

Signed-off-by: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/287571370557898@web7d.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Use an accessor to read the rq clock</title>
<updated>2013-05-28T07:40:27Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2013-04-11T23:51:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=78becc27097585c6aec7043834cadde950ae79f2'/>
<id>urn:sha1:78becc27097585c6aec7043834cadde950ae79f2</id>
<content type='text'>
Read the runqueue clock through an accessor. This
prepares for adding a debugging infrastructure to
detect missing or redundant calls to update_rq_clock()
between a scheduler's entry and exit point.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Li Zhong &lt;zhong@linux.vnet.ibm.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1365724262-20142-6-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Remove redundant update_runtime notifier</title>
<updated>2013-05-28T07:40:22Z</updated>
<author>
<name>Neil Zhang</name>
<email>zhangwm@marvell.com</email>
</author>
<published>2013-04-11T13:04:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c5405a495e88d93cf9b4f4cc91507c7f4afcb901'/>
<id>urn:sha1:c5405a495e88d93cf9b4f4cc91507c7f4afcb901</id>
<content type='text'>
migration_call() will do all the things that update_runtime() does.
So let's remove it.

Furthermore, there is potential risk that the current code will catch
BUG_ON at line 689 of rt.c when do cpu hotplug while there are realtime
threads running because of enabling runtime twice while the rt_runtime
may already changed.

Signed-off-by: Neil Zhang &lt;zhangwm@marvell.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1365685499-26515-1-git-send-email-zhangwm@marvell.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
