<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched, branch v5.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-04-30T05:57:23Z</updated>
<entry>
<title>sched/cpufreq: Fix kobject memleak</title>
<updated>2019-04-30T05:57:23Z</updated>
<author>
<name>Tobin C. Harding</name>
<email>tobin@kernel.org</email>
</author>
<published>2019-04-30T00:11:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a4f26cc98d81b67ecc23b890c28e2df324e29f3'/>
<id>urn:sha1:9a4f26cc98d81b67ecc23b890c28e2df324e29f3</id>
<content type='text'>
Currently the error return path from kobject_init_and_add() is not
followed by a call to kobject_put() - which means we are leaking
the kobject.

Fix it by adding a call to kobject_put() in the error path of
kobject_init_and_add().

Signed-off-by: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tobin C. Harding &lt;tobin@kernel.org&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Link: http://lkml.kernel.org/r/20190430001144.24890-1-tobin@kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/numa: Fix a possible divide-by-zero</title>
<updated>2019-04-25T17:58:54Z</updated>
<author>
<name>Xie XiuQi</name>
<email>xiexiuqi@huawei.com</email>
</author>
<published>2019-04-20T08:34:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a860fa7b96e1a1c974556327aa1aee852d434c21'/>
<id>urn:sha1:a860fa7b96e1a1c974556327aa1aee852d434c21</id>
<content type='text'>
sched_clock_cpu() may not be consistent between CPUs. If a task
migrates to another CPU, then se.exec_start is set to that CPU's
rq_clock_task() by update_stats_curr_start(). Specifically, the new
value might be before the old value due to clock skew.

So then if in numa_get_avg_runtime() the expression:

  'now - p-&gt;last_task_numa_placement'

ends up as -1, then the divider '*period + 1' in task_numa_placement()
is 0 and things go bang. Similar to update_curr(), check if time goes
backwards to avoid this.

[ peterz: Wrote new changelog. ]
[ mingo: Tweaked the code comment. ]

Signed-off-by: Xie XiuQi &lt;xiexiuqi@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: cj.chengjian@huawei.com
Cc: &lt;stable@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/20190425080016.GX11158@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/deadline: Correctly handle active 0-lag timers</title>
<updated>2019-04-16T14:54:58Z</updated>
<author>
<name>luca abeni</name>
<email>luca.abeni@santannapisa.it</email>
</author>
<published>2019-03-25T13:15:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1b02cd6a2d7f3e2a6a5262887d2cb2912083e42f'/>
<id>urn:sha1:1b02cd6a2d7f3e2a6a5262887d2cb2912083e42f</id>
<content type='text'>
syzbot reported the following warning:

   [ ] WARNING: CPU: 4 PID: 17089 at kernel/sched/deadline.c:255 task_non_contending+0xae0/0x1950

line 255 of deadline.c is:

	WARN_ON(hrtimer_active(&amp;dl_se-&gt;inactive_timer));

in task_non_contending().

Unfortunately, in some cases (for example, a deadline task
continuosly blocking and waking immediately) it can happen that
a task blocks (and task_non_contending() is called) while the
0-lag timer is still active.

In this case, the safest thing to do is to immediately decrease
the running bandwidth of the task, without trying to re-arm the 0-lag timer.

Signed-off-by: luca abeni &lt;luca.abeni@santannapisa.it&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: chengjian (D) &lt;cj.chengjian@huawei.com&gt;
Link: https://lkml.kernel.org/r/20190325131530.34706-1-luca.abeni@santannapisa.it
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Limit sched_cfs_period_timer() loop to avoid hard lockup</title>
<updated>2019-04-16T14:50:05Z</updated>
<author>
<name>Phil Auld</name>
<email>pauld@redhat.com</email>
</author>
<published>2019-03-19T13:00:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2e8e19226398db8265a8e675fcc0118b9e80c9e8'/>
<id>urn:sha1:2e8e19226398db8265a8e675fcc0118b9e80c9e8</id>
<content type='text'>
With extremely short cfs_period_us setting on a parent task group with a large
number of children the for loop in sched_cfs_period_timer() can run until the
watchdog fires. There is no guarantee that the call to hrtimer_forward_now()
will ever return 0.  The large number of children can make
do_sched_cfs_period_timer() take longer than the period.

 NMI watchdog: Watchdog detected hard LOCKUP on cpu 24
 RIP: 0010:tg_nop+0x0/0x10
  &lt;IRQ&gt;
  walk_tg_tree_from+0x29/0xb0
  unthrottle_cfs_rq+0xe0/0x1a0
  distribute_cfs_runtime+0xd3/0xf0
  sched_cfs_period_timer+0xcb/0x160
  ? sched_cfs_slack_timer+0xd0/0xd0
  __hrtimer_run_queues+0xfb/0x270
  hrtimer_interrupt+0x122/0x270
  smp_apic_timer_interrupt+0x6a/0x140
  apic_timer_interrupt+0xf/0x20
  &lt;/IRQ&gt;

To prevent this we add protection to the loop that detects when the loop has run
too many times and scales the period and quota up, proportionally, so that the timer
can complete before then next period expires.  This preserves the relative runtime
quota while preventing the hard lockup.

A warning is issued reporting this state and the new values.

Signed-off-by: Phil Auld &lt;pauld@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Anton Blanchard &lt;anton@ozlabs.org&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20190319130005.25492-1-pauld@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Do not re-read -&gt;h_load_next during hierarchical load calculation</title>
<updated>2019-04-03T07:50:22Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@techsingularity.net</email>
</author>
<published>2019-03-19T12:36:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0e9f02450da07fc7b1346c8c32c771555173e397'/>
<id>urn:sha1:0e9f02450da07fc7b1346c8c32c771555173e397</id>
<content type='text'>
A NULL pointer dereference bug was reported on a distribution kernel but
the same issue should be present on mainline kernel. It occured on s390
but should not be arch-specific.  A partial oops looks like:

  Unable to handle kernel pointer dereference in virtual kernel address space
  ...
  Call Trace:
    ...
    try_to_wake_up+0xfc/0x450
    vhost_poll_wakeup+0x3a/0x50 [vhost]
    __wake_up_common+0xbc/0x178
    __wake_up_common_lock+0x9e/0x160
    __wake_up_sync_key+0x4e/0x60
    sock_def_readable+0x5e/0x98

The bug hits any time between 1 hour to 3 days. The dereference occurs
in update_cfs_rq_h_load when accumulating h_load. The problem is that
cfq_rq-&gt;h_load_next is not protected by any locking and can be updated
by parallel calls to task_h_load. Depending on the compiler, code may be
generated that re-reads cfq_rq-&gt;h_load_next after the check for NULL and
then oops when reading se-&gt;avg.load_avg. The dissassembly showed that it
was possible to reread h_load_next after the check for NULL.

While this does not appear to be an issue for later compilers, it's still
an accident if the correct code is generated. Full locking in this path
would have high overhead so this patch uses READ_ONCE to read h_load_next
only once and check for NULL before dereferencing. It was confirmed that
there were no further oops after 10 days of testing.

As Peter pointed out, it is also necessary to use WRITE_ONCE() to avoid any
potential problems with store tearing.

Signed-off-by: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 685207963be9 ("sched: Move h_load calculation to task_h_load()")
Link: https://lkml.kernel.org/r/20190319123610.nsivgf3mjbjjesxb@techsingularity.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2019-03-24T18:42:10Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-24T18:42:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=231c807a60715312e2a93a001cc9be9b888bc350'/>
<id>urn:sha1:231c807a60715312e2a93a001cc9be9b888bc350</id>
<content type='text'>
Pull scheduler updates from Thomas Gleixner:
 "Third more careful attempt for this set of fixes:

   - Prevent a 32bit math overflow in the cpufreq code

   - Fix a buffer overflow when scanning the cgroup2 cpu.max property

   - A set of fixes for the NOHZ scheduler logic to prevent waking up
     CPUs even if the capacity of the busy CPUs is sufficient along with
     other tweaks optimizing the behaviour for asymmetric systems
     (big/little)"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/fair: Skip LLC NOHZ logic for asymmetric systems
  sched/fair: Tune down misfit NOHZ kicks
  sched/fair: Comment some nohz_balancer_kick() kick conditions
  sched/core: Fix buffer overflow in cgroup2 property cpu.max
  sched/cpufreq: Fix 32-bit math overflow
</content>
</entry>
<entry>
<title>sched/fair: Skip LLC NOHZ logic for asymmetric systems</title>
<updated>2019-03-19T11:06:15Z</updated>
<author>
<name>Valentin Schneider</name>
<email>valentin.schneider@arm.com</email>
</author>
<published>2019-02-11T17:59:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b9a7b8831600afc51c9ba52c05f12db2266f01c7'/>
<id>urn:sha1:b9a7b8831600afc51c9ba52c05f12db2266f01c7</id>
<content type='text'>
The LLC NOHZ condition will become true as soon as &gt;=2 CPUs in a
single LLC domain are busy. On big.LITTLE systems, this translates to
two or more CPUs of a "cluster" (big or LITTLE) being busy.

Issuing a NOHZ kick in these conditions isn't desired for asymmetric
systems, as if the busy CPUs can provide enough compute capacity to
the running tasks, then we can leave the NOHZ CPUs in peace.

Skip the LLC NOHZ condition for asymmetric systems, and rely on
nr_running &amp; capacity checks to trigger NOHZ kicks when the system
actually needs them.

Suggested-by: Morten Rasmussen &lt;morten.rasmussen@arm.com&gt;
Signed-off-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dietmar.Eggemann@arm.com
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190211175946.4961-4-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Tune down misfit NOHZ kicks</title>
<updated>2019-03-19T11:06:15Z</updated>
<author>
<name>Valentin Schneider</name>
<email>valentin.schneider@arm.com</email>
</author>
<published>2019-02-11T17:59:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a0fe2cf086aef213d1b4bca1b1291a3dee8357c9'/>
<id>urn:sha1:a0fe2cf086aef213d1b4bca1b1291a3dee8357c9</id>
<content type='text'>
In this commit:

  3b1baa6496e6 ("sched/fair: Add 'group_misfit_task' load-balance type")

we set rq-&gt;misfit_task_load whenever the current running task has a
utilization greater than 80% of rq-&gt;cpu_capacity. A non-zero value in
this field enables misfit load balancing.

However, if the task being looked at is already running on a CPU of
highest capacity, there's nothing more we can do for it. We can
currently spot this in update_sd_pick_busiest(), which prevents us
from selecting a sched_group of group_type == group_misfit_task as the
busiest group, but we don't do any of that in nohz_balancer_kick().

This means that we could repeatedly kick NOHZ CPUs when there's no
improvements in terms of load balance to be done.

Introduce a check_misfit_status() helper that returns true iff there
is a CPU in the system that could give more CPU capacity to a rq's
misfit task - IOW, there exists a CPU of higher capacity_orig or the
rq's CPU is severely pressured by rt/IRQ.

Signed-off-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dietmar.Eggemann@arm.com
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: morten.rasmussen@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190211175946.4961-3-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/fair: Comment some nohz_balancer_kick() kick conditions</title>
<updated>2019-03-19T11:06:15Z</updated>
<author>
<name>Valentin Schneider</name>
<email>valentin.schneider@arm.com</email>
</author>
<published>2019-02-11T17:59:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e25a7a944f1936b5134b7ee06bc432fc701e4aa3'/>
<id>urn:sha1:e25a7a944f1936b5134b7ee06bc432fc701e4aa3</id>
<content type='text'>
We now have a comment explaining the first sched_domain based NOHZ kick,
so might as well comment them all.

While at it, unwrap a line that fits under 80 characters.

Co-authored-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dietmar.Eggemann@arm.com
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: morten.rasmussen@arm.com
Cc: vincent.guittot@linaro.org
Link: https://lkml.kernel.org/r/20190211175946.4961-2-valentin.schneider@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Fix buffer overflow in cgroup2 property cpu.max</title>
<updated>2019-03-19T11:06:15Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2019-03-06T17:11:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4c47acd824aaaa8fc6dc519fb4e08d1522105b7a'/>
<id>urn:sha1:4c47acd824aaaa8fc6dc519fb4e08d1522105b7a</id>
<content type='text'>
Add limit into sscanf format string for on-stack buffer.

Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 0d5936344f30 ("sched: Implement interface for cgroup unified hierarchy")
Link: https://lkml.kernel.org/r/155189230232.2620.13120481613524200065.stgit@buzz
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
