<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/core.c, branch v5.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-04-22T21:10:13Z</updated>
<entry>
<title>sched/core: Fix reset-on-fork from RT with uclamp</title>
<updated>2020-04-22T21:10:13Z</updated>
<author>
<name>Quentin Perret</name>
<email>qperret@google.com</email>
</author>
<published>2020-04-16T08:59:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eaf5a92ebde5bca3bb2565616115bd6d579486cd'/>
<id>urn:sha1:eaf5a92ebde5bca3bb2565616115bd6d579486cd</id>
<content type='text'>
uclamp_fork() resets the uclamp values to their default when the
reset-on-fork flag is set. It also checks whether the task has an RT
policy, and sets its uclamp.min to 1024 accordingly. However, during
reset-on-fork, the task's policy is lowered to SCHED_NORMAL right after,
leading to an erroneous uclamp.min setting for the new task if it
was forked from RT.

Fix this by removing the unnecessary check on rt_task() in
uclamp_fork() as this doesn't make sense if the reset-on-fork flag is
set.
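
A sketch of the shape of the change (helper names as in the v5.7-era
uclamp_fork(); the exact hunk may differ):

    for_each_clamp_id(clamp_id) {
  -         unsigned int clamp_value = uclamp_none(clamp_id);
  -
  -         /* By default, RT tasks always get 100% boost */
  -         if (unlikely(rt_task(p) &amp;&amp; clamp_id == UCLAMP_MIN))
  -                 clamp_value = uclamp_none(UCLAMP_MAX);
  -
  -         uclamp_se_set(&amp;p-&gt;uclamp_req[clamp_id], clamp_value, false);
  +         uclamp_se_set(&amp;p-&gt;uclamp_req[clamp_id],
  +                       uclamp_none(clamp_id), false);
    }

The rt_task() branch only ran when sched_reset_on_fork was set, i.e.
exactly when the policy is about to be reset to SCHED_NORMAL anyway.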

Fixes: 1a00d999971c ("sched/uclamp: Set default clamps for RT tasks")
Reported-by: Chitti Babu Theegala &lt;ctheegal@codeaurora.org&gt;
Signed-off-by: Quentin Perret &lt;qperret@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Patrick Bellasi &lt;patrick.bellasi@matbug.net&gt;
Reviewed-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Link: https://lkml.kernel.org/r/20200416085956.217587-1-qperret@google.com
</content>
</entry>
<entry>
<title>sched/core: Remove unused rq::last_load_update_tick</title>
<updated>2020-04-08T09:35:23Z</updated>
<author>
<name>Vincent Donnefort</name>
<email>vincent.donnefort@arm.com</email>
</author>
<published>2020-03-20T13:21:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=275b2f6723ab9173484e1055ae138d4c2dd9d7c5'/>
<id>urn:sha1:275b2f6723ab9173484e1055ae138d4c2dd9d7c5</id>
<content type='text'>
The following commit:

  5e83eafbfd3b ("sched/fair: Remove the rq-&gt;cpu_load[] update code")

eliminated the last use case for rq-&gt;last_load_update_tick, so remove
the field as well.

Reviewed-by: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Reviewed-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Vincent Donnefort &lt;vincent.donnefort@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/1584710495-308969-1-git-send-email-vincent.donnefort@arm.com
</content>
</entry>
<entry>
<title>workqueue: Remove the warning in wq_worker_sleeping()</title>
<updated>2020-04-08T09:35:20Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2020-03-27T23:29:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=62849a9612924a655c67cf6962920544aa5c20db'/>
<id>urn:sha1:62849a9612924a655c67cf6962920544aa5c20db</id>
<content type='text'>
The kernel test robot triggered a warning with the following race:
   task-ctx A                            interrupt-ctx B
 worker
  -&gt; process_one_work()
    -&gt; work_item()
      -&gt; schedule();
         -&gt; sched_submit_work()
           -&gt; wq_worker_sleeping()
             -&gt; -&gt;sleeping = 1
               atomic_dec_and_test(nr_running)
         __schedule();                *interrupt*
                                       async_page_fault()
                                       -&gt; local_irq_enable();
                                       -&gt; schedule();
                                          -&gt; sched_submit_work()
                                            -&gt; wq_worker_sleeping()
                                               -&gt; if (WARN_ON(-&gt;sleeping)) return
                                          -&gt; __schedule()
                                            -&gt;  sched_update_worker()
                                              -&gt; wq_worker_running()
                                                 -&gt; atomic_inc(nr_running);
                                                 -&gt; -&gt;sleeping = 0;

      -&gt;  sched_update_worker()
        -&gt; wq_worker_running()
          if (!-&gt;sleeping) return

In this context the warning is pointless; everything is fine.
An interrupt before wq_worker_sleeping() will perform the -&gt;sleeping
assignment (0 -&gt; 1 -&gt; 0) twice.
An interrupt after wq_worker_sleeping() will trigger the warning:
nr_running will be decremented once (by A) and incremented once (only by
B; A will skip the increment). This remains the case until -&gt;sleeping
is zeroed again in wq_worker_running().

Remove the WARN statement because this condition may happen. Document
that preemption around wq_worker_sleeping() needs to be disabled to
protect -&gt;sleeping and not just as an optimisation.
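
After the change, the guard is a plain early return instead of a WARN
(a sketch of the relevant part of wq_worker_sleeping(); details elided):

  void wq_worker_sleeping(struct task_struct *task)
  {
          struct worker *worker = kthread_data(task);

          if (worker-&gt;flags &amp; WORKER_NOT_RUNNING)
                  return;

          /*
           * An interrupt-context schedule() may already have run this
           * function; that is legitimate, so return quietly rather
           * than warn. Callers must have preemption disabled.
           */
          if (worker-&gt;sleeping)
                  return;

          worker-&gt;sleeping = 1;
          ...
  }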

Fixes: 6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock")
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian
</content>
</entry>
<entry>
<title>sched/fair: Align rq-&gt;avg_idle and rq-&gt;avg_scan_cost</title>
<updated>2020-04-08T09:35:18Z</updated>
<author>
<name>Valentin Schneider</name>
<email>valentin.schneider@arm.com</email>
</author>
<published>2020-03-30T09:01:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d76343c6b2b79f5e89c392bc9ce9dabc4c9e90cb'/>
<id>urn:sha1:d76343c6b2b79f5e89c392bc9ce9dabc4c9e90cb</id>
<content type='text'>
sched/core.c uses update_avg() for rq-&gt;avg_idle and sched/fair.c uses an
open-coded version (with the exact same decay factor) for
rq-&gt;avg_scan_cost. On top of that, select_idle_cpu() expects to be able to
compare these two fields.

The only difference between the two is that rq-&gt;avg_scan_cost is computed
using a pure division rather than a shift. It turns out this actually
matters, first of all because the shifted value can be negative, and the
C standard has this to say about it:

  """
  The result of E1 &gt;&gt; E2 is E1 right-shifted E2 bit positions. [...] If E1
  has a signed type and a negative value, the resulting value is
  implementation-defined.
  """

Not only this, but (arithmetic) right shifting a negative value (using 2's
complement) is *not* equivalent to dividing it by the corresponding power
of 2. Let's look at a few examples:

  -4      -&gt; 0xF..FC
  -4 &gt;&gt; 3 -&gt; 0xF..FF == -1 != -4 / 8

  -8      -&gt; 0xF..F8
  -8 &gt;&gt; 3 -&gt; 0xF..FF == -1 == -8 / 8

  -9      -&gt; 0xF..F7
  -9 &gt;&gt; 3 -&gt; 0xF..FE == -2 != -9 / 8
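
These can be verified with a few lines of C (a throwaway check, not
part of the patch):

  #include &lt;stdio.h&gt;

  int main(void)
  {
          int v[] = { -4, -8, -9 };

          for (int i = 0; i &lt; 3; i++)
                  printf("%d: shift %d, div %d\n",
                         v[i], v[i] &gt;&gt; 3, v[i] / 8);

          return 0; /* prints -1 vs 0, -1 vs -1, -2 vs -1 */
  }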

Make update_avg() use a division, and export it to the private scheduler
header to reuse it where relevant. Note that this still lets compilers use
a shift here, but should prevent any unwanted surprise. The disassembly of
select_idle_cpu() remains unchanged on arm64, and ttwu_do_wakeup() gains 2
instructions; the diff sort of looks like this:

  - sub x1, x1, x0
  + subs x1, x1, x0 // set condition codes
  + add x0, x1, #0x7
  + csel x0, x0, x1, mi // x0 = x1 &lt; 0 ? x0 : x1
    add x0, x3, x0, asr #3

which does the right thing (i.e. gives us the expected result while still
using an arithmetic shift).
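
For reference, the division-based helper is tiny; something like the
following (the patch moves it into the private scheduler header):

  static inline void update_avg(u64 *avg, u64 sample)
  {
          s64 diff = sample - *avg;

          *avg += diff / 8;
  }

The signed s64 division is what gives the round-toward-zero semantics
the open-coded shift lacked for negative values.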

Signed-off-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200330090127.16294-1-valentin.schneider@arm.com
</content>
</entry>
<entry>
<title>Merge tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-03-31T01:06:39Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-03-31T01:06:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=992a1a3b45b5c0b6e69ecc2a3f32b0d02da28d58'/>
<id>urn:sha1:992a1a3b45b5c0b6e69ecc2a3f32b0d02da28d58</id>
<content type='text'>
Pull core SMP updates from Thomas Gleixner:
 "CPU (hotplug) updates:

   - Support for locked CSD objects in smp_call_function_single_async(),
     which allows simplifying callsites in the scheduler core and MIPS

   - Treewide consolidation of CPU hotplug functions, which ensures
     consistency between the sysfs interface and kernel state. The
     low-level functions cpu_up/down() are now confined to the core code
     and no longer accessible from random code"

* tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus()
  cpu/hotplug: Hide cpu_up/down()
  cpu/hotplug: Move bringup of secondary CPUs out of smp_init()
  torture: Replace cpu_up/down() with add/remove_cpu()
  firmware: psci: Replace cpu_up/down() with add/remove_cpu()
  xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()
  parisc: Replace cpu_up/down() with add/remove_cpu()
  sparc: Replace cpu_up/down() with add/remove_cpu()
  powerpc: Replace cpu_up/down() with add/remove_cpu()
  x86/smp: Replace cpu_up/down() with add/remove_cpu()
  arm64: hibernate: Use bringup_hibernate_cpu()
  cpu/hotplug: Provide bringup_hibernate_cpu()
  arm64: Use reboot_cpu instead of hardcoding it to 0
  arm64: Don't use disable_nonboot_cpus()
  ARM: Use reboot_cpu instead of hardcoding it to 0
  ARM: Don't use disable_nonboot_cpus()
  ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus()
  cpu/hotplug: Create a new function to shutdown nonboot cpus
  cpu/hotplug: Add new {add,remove}_cpu() functions
  sched/core: Remove rq.hrtick_csd_pending
  ...
</content>
</entry>
<entry>
<title>psi: Fix cpu.pressure for cpu.max and competing cgroups</title>
<updated>2020-03-20T12:06:18Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2020-03-16T19:13:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b05e75d611380881e73edc58a20fd8c6bb71720b'/>
<id>urn:sha1:b05e75d611380881e73edc58a20fd8c6bb71720b</id>
<content type='text'>
For simplicity, cpu pressure is defined as having more than one
runnable task on a given CPU. This works at the system level, but it
has limitations in a cgrouped reality: When cpu.max is in use, it
doesn't capture the time in which a task is not executing on the CPU
due to throttling. Likewise, it doesn't capture the time in which a
competing cgroup is occupying the CPU - meaning it only reflects
cgroup-internal competitive pressure, not outside pressure.

Enable tracking of currently executing tasks, and then change the
definition of cpu pressure in a cgroup from

	NR_RUNNING &gt; 1

to

	NR_RUNNING &gt; ON_CPU

which will capture the effects of cpu.max as well as competition from
outside the cgroup.

After this patch, a cgroup running `stress -c 1` with a cpu.max
setting of 5000 10000 shows ~50% continuous CPU pressure.
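
In psi's state test, the new definition boils down to a one-line
condition (a sketch of the relevant case; surrounding code elided):

  case PSI_CPU_SOME:
          /* Runnable tasks beyond those actually on the CPU: */
          return tasks[NR_RUNNING] &gt; tasks[NR_ONCPU];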

Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20200316191333.115523-2-hannes@cmpxchg.org
</content>
</entry>
<entry>
<title>sched/core: Distribute tasks within affinity masks</title>
<updated>2020-03-20T12:06:18Z</updated>
<author>
<name>Paul Turner</name>
<email>pjt@google.com</email>
</author>
<published>2020-03-11T01:01:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=46a87b3851f0d6eb05e6d83d5c5a30df0eca8f76'/>
<id>urn:sha1:46a87b3851f0d6eb05e6d83d5c5a30df0eca8f76</id>
<content type='text'>
Currently, when updating the affinity of tasks via either cpuset.cpus
or sched_setaffinity(), tasks not currently running within the newly
specified mask will be arbitrarily assigned to the first CPU within the
mask.

This (particularly in the case that we are restricting masks) can
result in many tasks being assigned to the first CPUs of their new
masks.

This:
 1) Can induce scheduling delays while the load-balancer has a chance to
    spread them between their new CPUs.
 2) Can aggravate poor load-balancer behavior where it has a
    difficult time recognizing that a cross-socket imbalance has been
    forced by an affinity mask.

This change adds a new cpumask interface to allow iterated calls to
distribute within the intersection of the provided masks; a sketch of
the helper follows the list below.

The cases that this mainly affects are:
 - modifying cpuset.cpus
 - when tasks join a cpuset
 - when modifying a task's affinity via sched_setaffinity(2)
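
The new helper keeps a per-CPU rotor so that successive calls pick
different CPUs within the intersection; roughly (a sketch, see the
patch for the exact implementation):

  static DEFINE_PER_CPU(int, distribute_cpu_mask_prev);

  int cpumask_any_and_distribute(const struct cpumask *src1p,
                                 const struct cpumask *src2p)
  {
          int next, prev;

          /* Continue the scan where the previous call left off. */
          prev = __this_cpu_read(distribute_cpu_mask_prev);

          next = cpumask_next_and(prev, src1p, src2p);
          if (next &gt;= nr_cpu_ids)
                  next = cpumask_first_and(src1p, src2p);

          if (next &lt; nr_cpu_ids)
                  __this_cpu_write(distribute_cpu_mask_prev, next);

          return next;
  }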

Signed-off-by: Paul Turner &lt;pjt@google.com&gt;
Signed-off-by: Josh Don &lt;joshdon@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Tested-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Link: https://lkml.kernel.org/r/20200311010113.136465-1-joshdon@google.com
</content>
</entry>
<entry>
<title>thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code</title>
<updated>2020-03-06T13:26:31Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2020-03-06T13:26:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=14533a16c46db70b8a75eda8fa633c25ac446d81'/>
<id>urn:sha1:14533a16c46db70b8a75eda8fa633c25ac446d81</id>
<content type='text'>
drivers/base/arch_topology.c is only built if CONFIG_GENERIC_ARCH_TOPOLOGY=y,
resulting in build failures such as:

  cpufreq_cooling.c:(.text+0x1e7): undefined reference to `arch_set_thermal_pressure'

Move it to sched/core.c instead, and keep it enabled on x86 despite
us not having an arch_scale_thermal_pressure() facility there, to
build-test this thing.

Cc: Thara Gopinath &lt;thara.gopinath@linaro.org&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Remove rq.hrtick_csd_pending</title>
<updated>2020-03-06T12:42:28Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2019-12-16T21:31:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fd3eafda8f146d4ad8f95f91a8c2b9a5319ff6b2'/>
<id>urn:sha1:fd3eafda8f146d4ad8f95f91a8c2b9a5319ff6b2</id>
<content type='text'>
Now that smp_call_function_single_async() returns -EBUSY when the csd
object is still pending, we no longer need rq.hrtick_csd_pending.
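
The caller can now fire the csd unconditionally; a sketch of the
simplified tail of hrtick_start() (context elided):

  if (rq == this_rq())
          __hrtick_restart(rq);
  else
          /* A still-pending csd now yields -EBUSY internally: */
          smp_call_function_single_async(cpu_of(rq), &amp;rq-&gt;hrtick_csd);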

Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20191216213125.9536-4-peterx@redhat.com
</content>
</entry>
<entry>
<title>sched/fair: Enable tuning of decay period</title>
<updated>2020-03-06T11:57:21Z</updated>
<author>
<name>Thara Gopinath</name>
<email>thara.gopinath@linaro.org</email>
</author>
<published>2020-02-22T00:52:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=05289b90c2e40ae80f5c70431cd0be4cc8a6038d'/>
<id>urn:sha1:05289b90c2e40ae80f5c70431cd0be4cc8a6038d</id>
<content type='text'>
Thermal pressure follows PELT signals, which means the decay period for
thermal pressure is the default PELT decay period. Depending on SoC
characteristics and thermal activity, it might be beneficial to decay
thermal pressure more slowly, but still in tune with the PELT signals. One
way to achieve this is to provide a command line parameter to set a decay
shift parameter to an integer between 0 and 10.
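
A sketch of the plumbing (helper and parameter names per the patch; the
shift is applied where the thermal clock is read):

  static int sched_thermal_decay_shift;

  static int __init setup_sched_thermal_decay_shift(char *str)
  {
          int _shift = 0;

          if (kstrtoint(str, 0, &amp;_shift))
                  pr_warn("Unable to set scheduler thermal pressure decay shift parameter\n");

          sched_thermal_decay_shift = clamp(_shift, 0, 10);
          return 1;
  }
  __setup("sched_thermal_decay_shift=", setup_sched_thermal_decay_shift);

  /* A larger shift makes the thermal pressure signal decay more slowly. */
  static inline u64 rq_clock_thermal(struct rq *rq)
  {
          return rq_clock_task(rq) &gt;&gt; sched_thermal_decay_shift;
  }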

Signed-off-by: Thara Gopinath &lt;thara.gopinath@linaro.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200222005213.3873-10-thara.gopinath@linaro.org
</content>
</entry>
</feed>
