<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/cpuidle, branch v4.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-07-04T12:17:34Z</updated>
<entry>
<title>cpuidle: Fix last_residency division</title>
<updated>2016-07-04T12:17:34Z</updated>
<author>
<name>Shreyas B. Prabhu</name>
<email>shreyas@linux.vnet.ibm.com</email>
</author>
<published>2016-07-01T14:24:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dbd1b8ea43b17e2ed4acda72f83ea17f69408682'/>
<id>urn:sha1:dbd1b8ea43b17e2ed4acda72f83ea17f69408682</id>
<content type='text'>
Snooze is a poll idle state in powernv and pseries platforms. Snooze
has a timeout so that if a CPU stays in snooze for more than target
residency of the next available idle state, then it would exit
thereby giving chance to the cpuidle governor to re-evaluate and
promote the CPU to a deeper idle state. Therefore whenever snooze
exits due to this timeout, its last_residency will be target_residency
of the next deeper state.

Commit e93e59ce5b85 "cpuidle: Replace ktime_get() with local_clock()"
changed the math around last_residency calculation. Specifically,
while converting last_residency value from nano- to microseconds, it
carries out right shift by 10. Because of that, in snooze timeout
exit scenarios last_residency calculated is roughly 2.3% less than
target_residency of the next available state. This pattern is picked
up by get_typical_interval() in the menu governor and therefore
expected_interval in menu_select() is frequently less than the
target_residency of any state other than snooze.

Due to this we are entering snooze at a higher rate, thereby
affecting the single thread performance.

Fix this by using more precise division via ktime_us_delta().

Fixes: e93e59ce5b85 "cpuidle: Replace ktime_get() with local_clock()"
Reported-by: Anton Blanchard &lt;anton@samba.org&gt;
Bisected-by: Shilpasri G Bhat &lt;shilpa.bhat@linux.vnet.ibm.com&gt;
Signed-off-by: Shreyas B. Prabhu &lt;shreyas@linux.vnet.ibm.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Acked-by: Balbir Singh &lt;bsingharora@gmail.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: Fix cpuidle_state_is_coupled() argument in cpuidle_enter()</title>
<updated>2016-05-18T00:48:37Z</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2016-05-17T14:54:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e7387da52028b072489c45efeb7a916c0205ebd2'/>
<id>urn:sha1:e7387da52028b072489c45efeb7a916c0205ebd2</id>
<content type='text'>
Commit 0b89e9aa2856 (cpuidle: delay enabling interrupts until all
coupled CPUs leave idle) rightfully fixed a regression by letting
the coupled idle state framework to handle local interrupt enabling
when the CPU is exiting an idle state.

The current code checks if the idle state is coupled and, if so, it
will let the coupled code to enable interrupts. This way, it can
decrement the ready-count before handling the interrupt. This
mechanism prevents the other CPUs from waiting for a CPU which is
handling interrupts.

But the check is done against the state index returned by the back
end driver's -&gt;enter functions which could be different from the
initial index passed as parameter to the cpuidle_enter_state()
function.

 entered_state = target_state-&gt;enter(dev, drv, index);

 [ ... ]

 if (!cpuidle_state_is_coupled(drv, entered_state))
	local_irq_enable();

 [ ... ]

If the 'index' is referring to a coupled idle state but the
'entered_state' is *not* coupled, then the interrupts are enabled
again. All CPUs blocked on the sync barrier may busy loop longer
if the CPU has interrupts to handle before decrementing the
ready-count. That's consuming more energy than saving.

Fixes: 0b89e9aa2856 (cpuidle: delay enabling interrupts until all coupled CPUs leave idle)
Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: 3.15+ &lt;stable@vger.kernel.org&gt; # 3.15+
[ rjw: Subject &amp; changelog ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>Merge back new cpuidle material for v4.7.</title>
<updated>2016-05-06T20:08:46Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2016-05-06T20:08:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fd7c3c29f9abab50a84fd4fd6811129641c53b2f'/>
<id>urn:sha1:fd7c3c29f9abab50a84fd4fd6811129641c53b2f</id>
<content type='text'>
</content>
</entry>
<entry>
<title>ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value</title>
<updated>2016-04-28T13:15:14Z</updated>
<author>
<name>James Morse</name>
<email>james.morse@arm.com</email>
</author>
<published>2016-04-26T11:15:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=625fe4f8ffc1b915248558481bb94249f6bd411c'/>
<id>urn:sha1:625fe4f8ffc1b915248558481bb94249f6bd411c</id>
<content type='text'>
arm_cpuidle_suspend() may return -EOPNOTSUPP, or any value returned
by the cpu_ops/cpuidle_ops suspend call. arm_enter_idle_state() doesn't
update 'ret' with this value, meaning we always signal success to
cpuidle_enter_state(), causing it to update the usage counters as if we
succeeded.

Fixes: 191de17aa3c1 ("ARM64: cpuidle: Replace cpu_suspend by the common ARM/ARM64 function")
Signed-off-by: James Morse &lt;james.morse@arm.com&gt;
Acked-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: 4.1+ &lt;stable@vger.kernel.org&gt; # 4.1+
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: Replace ktime_get() with local_clock()</title>
<updated>2016-04-26T00:38:49Z</updated>
<author>
<name>Daniel Lezcano</name>
<email>daniel.lezcano@linaro.org</email>
</author>
<published>2016-04-21T08:56:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e93e59ce5b85e6c2b444f09fd1f707274ec066dc'/>
<id>urn:sha1:e93e59ce5b85e6c2b444f09fd1f707274ec066dc</id>
<content type='text'>
The ktime_get() can have a non negligeable overhead, use local_clock()
instead.

In order to test the difference between ktime_get() and local_clock(),
a quick hack has been added to trigger, via debugfs, 10000 times a
call to ktime_get() and local_clock() and measure the elapsed time.

Then the average value, the min and max is computed for each call.

From userspace, the test above was called 100 times every 2 seconds.

So, ktime_get() and local_clock() have been called 1000000 times in
total.

The results are:

ktime_get():
============
 * average: 101 ns (stddev: 27.4)
 * maximum: 38313 ns
 * minimum: 65 ns

local_clock():
==============
 * average: 60 ns (stddev: 9.8)
 * maximum: 13487 ns
 * minimum: 46 ns

The local_clock() is faster and more stable.

Even if it is a drop in the ocean, changing the ktime_get() by the
local_clock() allows to save 80ns at idle time (entry + exit). And
in some circumstances, especially when there are several CPUs racing
for the clock access, we save tens of microseconds.

The idle duration resulting from a diff is converted from nanosec to
microsec. This could be done with integer division (div 1000) - which is
an expensive operation or by 10 bits shifting (div 1024) - which is fast
but unprecise.

The following table gives some results at the limits.

 ------------------------------------------
|   nsec   |   div(1000)   |   div(1024)   |
 ------------------------------------------
|   1e3    |        1 usec |      976 nsec |
 ------------------------------------------
|   1e6    |     1000 usec |      976 usec |
 ------------------------------------------
|   1e9    |  1000000 usec |   976562 usec |
 ------------------------------------------

There is a linear deviation of 2.34%. This loss of precision is acceptable
in the context of the resulting diff which is used for statistics. These
ones are processed to guess estimate an approximation of the duration of the
next idle period which ends up into an idle state selection. The selection
criteria takes into account the next duration based on large intervals,
represented by the idle state's target residency.

The 2^10 division is enough because the approximation regarding the 1e3
division is lost in all the approximations done for the next idle duration
computation.

Signed-off-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
[ rjw: Subject ]
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: Indicate when a device has been unregistered</title>
<updated>2016-04-09T00:14:39Z</updated>
<author>
<name>Dave Gerlach</name>
<email>d-gerlach@ti.com</email>
</author>
<published>2016-04-05T19:05:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c998c07836f985b24361629dc98506ec7893e7a0'/>
<id>urn:sha1:c998c07836f985b24361629dc98506ec7893e7a0</id>
<content type='text'>
Currently the 'registered' member of the cpuidle_device struct is set
to 1 during cpuidle_register_device. In this same function there are
checks to see if the device is already registered to prevent duplicate
calls to register the device, but this value is never set to 0 even on
unregister of the device. Because of this, any attempt to call
cpuidle_register_device after a call to cpuidle_unregister_device will
fail which shouldn't be the case.

To prevent this, set registered to 0 when the device is unregistered.

Fixes: c878a52d3c7c (cpuidle: Check if device is already registered)
Signed-off-by: Dave Gerlach &lt;d-gerlach@ti.com&gt;
Acked-by: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: All applicable &lt;stable@vger.kernel.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: menu: Fall back to polling if next timer event is near</title>
<updated>2016-03-21T14:50:28Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2016-03-20T00:33:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0c313cb207326f759a58f486214288411b25d4cf'/>
<id>urn:sha1:0c313cb207326f759a58f486214288411b25d4cf</id>
<content type='text'>
Commit a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable
polling) changed the behavior of the fallback state selection part
of menu_select() so it looks at interactivity_req instead of
data-&gt;next_timer_us when it makes its decision.  That effectively
caused polling to be used more often as fallback idle which led to
significant increases of energy consumption in some cases.

Commit e132b9b3bc7f (cpuidle: menu: use high confidence factors
only when considering polling) changed that logic again to be more
predictable, but that didn't help with the increased energy
consumption problem.

For this reason, go back to making decisions on which state to fall
back to based on data-&gt;next_timer_us which is the time we know for
sure something will happen rather than a prediction (which may be
inaccurate and turns out to be so often enough to be problematic).
However, take the target residency of the first proper idle state
(C1) into account, so that state is not used as the fallback one
if its target residency is greater than data-&gt;next_timer_us.

Fixes: a9ceb78bc75c (cpuidle,menu: use interactivity_req to disable polling)
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Reported-and-tested-by: Doug Smythies &lt;dsmythies@telus.net&gt;
</content>
</entry>
<entry>
<title>cpuidle: menu: use high confidence factors only when considering polling</title>
<updated>2016-03-17T01:40:32Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@redhat.com</email>
</author>
<published>2016-03-16T16:14:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e132b9b3bc7f19e9b158e42b323881d5dee5ecf3'/>
<id>urn:sha1:e132b9b3bc7f19e9b158e42b323881d5dee5ecf3</id>
<content type='text'>
The menu governor uses five different factors to pick the
idle state:
 - the user configured latency_req
 - the time until the next timer (next_timer_us)
 - the typical sleep interval, as measured recently
 - an estimate of sleep time by dividing next_timer_us by an observed factor
 - a load corrected version of the above, divided again by load

Only the first three items are known with enough confidence that
we can use them to consider polling, instead of an actual CPU
idle state, because the cost of being wrong about polling can be
excessive power use.

The latter two are used in the menu governor's main selection
loop, and can result in choosing a shallower idle state when
the system is expected to be busy again soon.

This pushes a busy system in the "performance" direction of
the performance&lt;&gt;power tradeoff, when choosing between idle
states, but stays more strictly on the "power" state when
deciding between polling and C1.

Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: menu: help gcc generate slightly better code</title>
<updated>2016-02-16T23:28:15Z</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2016-02-16T19:19:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3b99669b75db04e411bb298591224a9e8e4f57fb'/>
<id>urn:sha1:3b99669b75db04e411bb298591224a9e8e4f57fb</id>
<content type='text'>
We know that the avg variable actually ends up holding a 32 bit
quantity, since it's an average of such numbers. It is only a u64
because it is temporarily used to hold the sum. Making it an actual
u32 allows gcc to generate slightly better code, e.g. when computing
the square, it can do a 32x32-&gt;64 multiply.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>cpuidle: menu: avoid expensive square root computation</title>
<updated>2016-02-16T23:27:16Z</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2016-02-16T19:19:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7024b18ca461bab45e5fb329f6e3d904d5109401'/>
<id>urn:sha1:7024b18ca461bab45e5fb329f6e3d904d5109401</id>
<content type='text'>
Computing the integer square root is a rather expensive operation, at
least compared to doing a 64x64 -&gt; 64 multiply (avg*avg) and, on 64
bit platforms, doing an extra comparison to a constant (variance &lt;=
U64_MAX/36).

On 64 bit platforms, this does mean that we add a restriction on the
range of the variance where we end up using the estimate (since
previously the stddev &lt;= ULONG_MAX was a tautology), but on the other
hand, we extend the range quite substantially on 32 bit platforms - in
both cases, we now allow standard deviations up to 715 seconds, which
is for example guaranteed if all observations are less than 1430
seconds.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
</feed>
