<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched/rt.c, branch v4.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-01-30T18:38:49Z</updated>
<entry>
<title>sched/rt: Reduce rq lock contention by eliminating locking of non-feasible target</title>
<updated>2015-01-30T18:38:49Z</updated>
<author>
<name>Tim Chen</name>
<email>tim.c.chen@linux.intel.com</email>
</author>
<published>2014-12-12T23:38:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=80e3d87b2c5582db0ab5e39610ce3707d97ba409'/>
<id>urn:sha1:80e3d87b2c5582db0ab5e39610ce3707d97ba409</id>
<content type='text'>
This patch adds checks that prevens futile attempts to move rt tasks
to a CPU with active tasks of equal or higher priority.

This reduces run queue lock contention and improves the performance of
a well known OLTP benchmark by 0.7%.

Signed-off-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Cc: Suruchi Kadu &lt;suruchi.a.kadu@intel.com&gt;
Cc: Doug Nelson&lt;doug.nelson@intel.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1421430374.2399.27.camel@schen9-desk2.jf.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Rework rq-&gt;clock update skips</title>
<updated>2015-01-14T12:34:20Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-01-05T10:18:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9edfbfed3f544a7830d99b341f0c175995a02950'/>
<id>urn:sha1:9edfbfed3f544a7830d99b341f0c175995a02950</id>
<content type='text'>
The original purpose of rq::skip_clock_update was to avoid 'costly' clock
updates for back to back wakeup-preempt pairs. The big problem with it
has always been that the rq variable is unaware of the context and
causes indiscrimiate clock skips.

Rework the entire thing and create a sense of context by only allowing
schedule() to skip clock updates. (XXX can we measure the cost of the
added store?)

By ensuring only schedule can ever skip an update, we guarantee we're
never more than 1 tick behind on the update.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: umgwanakikbuti@gmail.com
Link: http://lkml.kernel.org/r/20150105103554.432381549@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched: Move p-&gt;nr_cpus_allowed check to select_task_rq()</title>
<updated>2014-11-16T09:58:55Z</updated>
<author>
<name>Wanpeng Li</name>
<email>wanpeng.li@linux.intel.com</email>
</author>
<published>2014-11-05T01:14:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6c1d9410f007a26d13173cf17204cfd965f49b83'/>
<id>urn:sha1:6c1d9410f007a26d13173cf17204cfd965f49b83</id>
<content type='text'>
Move the p-&gt;nr_cpus_allowed check into kernel/sched/core.c: select_task_rq().
This change will make fair.c, rt.c, and deadline.c all start with the
same logic.

Suggested-and-Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Wanpeng Li &lt;wanpeng.li@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: "pang.xunlei" &lt;pang.xunlei@linaro.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1415150077-59053-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched/urgent' into sched/core, to pick up fixes before applying more changes</title>
<updated>2014-11-16T09:50:25Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2014-11-16T09:50:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e9ac5f0fa8549dffe2a15870217a9c2e7cd557ec'/>
<id>urn:sha1:e9ac5f0fa8549dffe2a15870217a9c2e7cd557ec</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/cputime: Fix clock_nanosleep()/clock_gettime() inconsistency</title>
<updated>2014-11-16T09:04:20Z</updated>
<author>
<name>Stanislaw Gruszka</name>
<email>sgruszka@redhat.com</email>
</author>
<published>2014-11-12T15:58:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6e998916dfe327e785e7c2447959b2c1a3ea4930'/>
<id>urn:sha1:6e998916dfe327e785e7c2447959b2c1a3ea4930</id>
<content type='text'>
Commit d670ec13178d0 "posix-cpu-timers: Cure SMP wobbles" fixes one glibc
test case in cost of breaking another one. After that commit, calling
clock_nanosleep(TIMER_ABSTIME, X) and then clock_gettime(&amp;Y) can result
of Y time being smaller than X time.

Reproducer/tester can be found further below, it can be compiled and ran by:

	gcc -o tst-cpuclock2 tst-cpuclock2.c -pthread
	while ./tst-cpuclock2 ; do : ; done

This reproducer, when running on a buggy kernel, will complain
about "clock_gettime difference too small".

Issue happens because on start in thread_group_cputimer() we initialize
sum_exec_runtime of cputimer with threads runtime not yet accounted and
then add the threads runtime to running cputimer again on scheduler
tick, making it's sum_exec_runtime bigger than actual threads runtime.

KOSAKI Motohiro posted a fix for this problem, but that patch was never
applied: https://lkml.org/lkml/2013/5/26/191 .

This patch takes different approach to cure the problem. It calls
update_curr() when cputimer starts, that assure we will have updated
stats of running threads and on the next schedule tick we will account
only the runtime that elapsed from cputimer start. That also assure we
have consistent state between cpu times of individual threads and cpu
time of the process consisted by those threads.

Full reproducer (tst-cpuclock2.c):

	#define _GNU_SOURCE
	#include &lt;unistd.h&gt;
	#include &lt;sys/syscall.h&gt;
	#include &lt;stdio.h&gt;
	#include &lt;time.h&gt;
	#include &lt;pthread.h&gt;
	#include &lt;stdint.h&gt;
	#include &lt;inttypes.h&gt;

	/* Parameters for the Linux kernel ABI for CPU clocks.  */
	#define CPUCLOCK_SCHED          2
	#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
		((~(clockid_t) (pid) &lt;&lt; 3) | (clockid_t) (clock))

	static pthread_barrier_t barrier;

	/* Help advance the clock.  */
	static void *chew_cpu(void *arg)
	{
		pthread_barrier_wait(&amp;barrier);
		while (1) ;

		return NULL;
	}

	/* Don't use the glibc wrapper.  */
	static int do_nanosleep(int flags, const struct timespec *req)
	{
		clockid_t clock_id = MAKE_PROCESS_CPUCLOCK(0, CPUCLOCK_SCHED);

		return syscall(SYS_clock_nanosleep, clock_id, flags, req, NULL);
	}

	static int64_t tsdiff(const struct timespec *before, const struct timespec *after)
	{
		int64_t before_i = before-&gt;tv_sec * 1000000000ULL + before-&gt;tv_nsec;
		int64_t after_i = after-&gt;tv_sec * 1000000000ULL + after-&gt;tv_nsec;

		return after_i - before_i;
	}

	int main(void)
	{
		int result = 0;
		pthread_t th;

		pthread_barrier_init(&amp;barrier, NULL, 2);

		if (pthread_create(&amp;th, NULL, chew_cpu, NULL) != 0) {
			perror("pthread_create");
			return 1;
		}

		pthread_barrier_wait(&amp;barrier);

		/* The test.  */
		struct timespec before, after, sleeptimeabs;
		int64_t sleepdiff, diffabs;
		const struct timespec sleeptime = {.tv_sec = 0,.tv_nsec = 100000000 };

		/* The relative nanosleep.  Not sure why this is needed, but its presence
		   seems to make it easier to reproduce the problem.  */
		if (do_nanosleep(0, &amp;sleeptime) != 0) {
			perror("clock_nanosleep");
			return 1;
		}

		/* Get the current time.  */
		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &amp;before) &lt; 0) {
			perror("clock_gettime[2]");
			return 1;
		}

		/* Compute the absolute sleep time based on the current time.  */
		uint64_t nsec = before.tv_nsec + sleeptime.tv_nsec;
		sleeptimeabs.tv_sec = before.tv_sec + nsec / 1000000000;
		sleeptimeabs.tv_nsec = nsec % 1000000000;

		/* Sleep for the computed time.  */
		if (do_nanosleep(TIMER_ABSTIME, &amp;sleeptimeabs) != 0) {
			perror("absolute clock_nanosleep");
			return 1;
		}

		/* Get the time after the sleep.  */
		if (clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &amp;after) &lt; 0) {
			perror("clock_gettime[3]");
			return 1;
		}

		/* The time after sleep should always be equal to or after the absolute sleep
		   time passed to clock_nanosleep.  */
		sleepdiff = tsdiff(&amp;sleeptimeabs, &amp;after);
		if (sleepdiff &lt; 0) {
			printf("absolute clock_nanosleep woke too early: %" PRId64 "\n", sleepdiff);
			result = 1;

			printf("Before %llu.%09llu\n", before.tv_sec, before.tv_nsec);
			printf("After  %llu.%09llu\n", after.tv_sec, after.tv_nsec);
			printf("Sleep  %llu.%09llu\n", sleeptimeabs.tv_sec, sleeptimeabs.tv_nsec);
		}

		/* The difference between the timestamps taken before and after the
		   clock_nanosleep call should be equal to or more than the duration of the
		   sleep.  */
		diffabs = tsdiff(&amp;before, &amp;after);
		if (diffabs &lt; sleeptime.tv_nsec) {
			printf("clock_gettime difference too small: %" PRId64 "\n", diffabs);
			result = 1;
		}

		pthread_cancel(th);

		return result;
	}

Signed-off-by: Stanislaw Gruszka &lt;sgruszka@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/20141112155843.GA24803@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Clean up check_preempt_equal_prio()</title>
<updated>2014-11-04T06:17:52Z</updated>
<author>
<name>Wanpeng Li</name>
<email>wanpeng.li@linux.intel.com</email>
</author>
<published>2014-10-30T22:39:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=308a623a40ce168eb234ea82c2bd13ff85a098d9'/>
<id>urn:sha1:308a623a40ce168eb234ea82c2bd13ff85a098d9</id>
<content type='text'>
This patch checks if current can be pushed/pulled somewhere else
in advance to make logic clear, the same behavior as dl class.

- If current can't be migrated, useless to reschedule, let's hope
  task can move out.
- If task is migratable, so let's not schedule it and see if it
  can be pushed or pulled somewhere else.

Signed-off-by: Wanpeng Li &lt;wanpeng.li@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Juri Lelli &lt;juri.lelli@arm.com&gt;
Cc: Kirill Tkhai &lt;ktkhai@parallels.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1414708776-124078-1-git-send-email-wanpeng.li@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu</title>
<updated>2014-10-15T05:48:18Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-15T05:48:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0429fbc0bdc297d64188483ba029a23773ae07b0'/>
<id>urn:sha1:0429fbc0bdc297d64188483ba029a23773ae07b0</id>
<content type='text'>
Pull percpu consistent-ops changes from Tejun Heo:
 "Way back, before the current percpu allocator was implemented, static
  and dynamic percpu memory areas were allocated and handled separately
  and had their own accessors.  The distinction has been gone for many
  years now; however, the now duplicate two sets of accessors remained
  with the pointer based ones - this_cpu_*() - evolving various other
  operations over time.  During the process, we also accumulated other
  inconsistent operations.

  This pull request contains Christoph's patches to clean up the
  duplicate accessor situation.  __get_cpu_var() uses are replaced with
  with this_cpu_ptr() and __this_cpu_ptr() with raw_cpu_ptr().

  Unfortunately, the former sometimes is tricky thanks to C being a bit
  messy with the distinction between lvalues and pointers, which led to
  a rather ugly solution for cpumask_var_t involving the introduction of
  this_cpu_cpumask_var_ptr().

  This converts most of the uses but not all.  Christoph will follow up
  with the remaining conversions in this merge window and hopefully
  remove the obsolete accessors"

* 'for-3.18-consistent-ops' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu: (38 commits)
  irqchip: Properly fetch the per cpu offset
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t -fix
  ia64: sn_nodepda cannot be assigned to after this_cpu conversion. Use __this_cpu_write.
  percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t
  Revert "powerpc: Replace __get_cpu_var uses"
  percpu: Remove __this_cpu_ptr
  clocksource: Replace __this_cpu_ptr with raw_cpu_ptr
  sparc: Replace __get_cpu_var uses
  avr32: Replace __get_cpu_var with __this_cpu_write
  blackfin: Replace __get_cpu_var uses
  tile: Use this_cpu_ptr() for hardware counters
  tile: Replace __get_cpu_var uses
  powerpc: Replace __get_cpu_var uses
  alpha: Replace __get_cpu_var
  ia64: Replace __get_cpu_var uses
  s390: cio driver &amp;__get_cpu_var replacements
  s390: Replace __get_cpu_var uses
  mips: Replace __get_cpu_var uses
  MIPS: Replace __get_cpu_var uses in FPU emulator.
  arm: Replace __this_cpu_ptr with raw_cpu_ptr
  ...
</content>
</entry>
<entry>
<title>sched/rt: Use resched_curr() in task_tick_rt()</title>
<updated>2014-09-24T12:47:12Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@parallels.com</email>
</author>
<published>2014-09-22T18:36:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8aa6f0ebf41b5fdd186276394bf07e7bd6884d94'/>
<id>urn:sha1:8aa6f0ebf41b5fdd186276394bf07e7bd6884d94</id>
<content type='text'>
Some time ago PREEMPT_NEED_RESCHED was implemented,
so reschedule technics is a little more difficult now.

Signed-off-by: Kirill Tkhai &lt;ktkhai@parallels.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/20140922183642.11015.66039.stgit@localhost
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/rt: Remove useless if from cleanup pick_next_task_rt()</title>
<updated>2014-09-19T10:35:20Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@parallels.com</email>
</author>
<published>2014-09-12T13:42:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f3f1768f89d601ad29f4701deef91caaa82b9f57'/>
<id>urn:sha1:f3f1768f89d601ad29f4701deef91caaa82b9f57</id>
<content type='text'>
_pick_next_task_rt() never returns NULL.

Signed-off-by: Kirill Tkhai &lt;ktkhai@parallels.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1410529321.3569.26.camel@tkhai
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>percpu: Resolve ambiguities in __get_cpu_var/cpumask_var_t</title>
<updated>2014-08-28T12:58:57Z</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux.com</email>
</author>
<published>2014-08-27T00:12:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4ba2968420fa9d0604b6a6a5c61bfa8d0fa84ae0'/>
<id>urn:sha1:4ba2968420fa9d0604b6a6a5c61bfa8d0fa84ae0</id>
<content type='text'>
__get_cpu_var can paper over differences in the definitions of
cpumask_var_t and either use the address of the cpumask variable
directly or perform a fetch of the address of the struct cpumask
allocated elsewhere. This is important particularly when using per cpu
cpumask_var_t declarations because in one case we have an offset into
a per cpu area to handle and in the other case we need to fetch a
pointer from the offset.

This patch introduces a new macro

this_cpu_cpumask_var_ptr()

that is defined where cpumask_var_t is defined and performs the proper
actions. All use cases where __get_cpu_var is used with cpumask_var_t
are converted to the use of this_cpu_cpumask_var_ptr().

Signed-off-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
