<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sched, branch v4.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-04-07T23:45:33Z</updated>
<entry>
<title>mm: numa: disable change protection for vma(VM_HUGETLB)</title>
<updated>2015-04-07T23:45:33Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>n-horiguchi@ah.jp.nec.com</email>
</author>
<published>2015-04-07T21:26:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6b79c57b92cdd90853002980609af516d14c4f9c'/>
<id>urn:sha1:6b79c57b92cdd90853002980609af516d14c4f9c</id>
<content type='text'>
Currently when a process accesses a hugetlb range protected with
PROTNONE, unexpected COWs are triggered, which finally puts the hugetlb
subsystem into a broken/uncontrollable state, where for example
h-&gt;resv_huge_pages is subtracted too much and wraps around to a very
large number, and the free hugepage pool is no longer maintainable.

This patch simply stops changing protection for vma(VM_HUGETLB) to fix
the problem.  And this also allows us to avoid useless overhead of minor
faults.

Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Suggested-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-03-28T18:17:32Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-03-28T18:17:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=19dba4f3e9f7fef811cf446e15f578887762528a'/>
<id>urn:sha1:19dba4f3e9f7fef811cf446e15f578887762528a</id>
<content type='text'>
Pull scheduler fix from Ingo Molnar:
 "A single sched/rt corner case fix for RLIMIT_RTIME correctness"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched: Fix RLIMIT_RTTIME when PI-boosting to RT
</content>
</entry>
<entry>
<title>mm: numa: slow PTE scan rate if migration failures occur</title>
<updated>2015-03-25T23:20:31Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2015-03-25T22:55:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=074c238177a75f5e79af3b2cb6a84e54823ef950'/>
<id>urn:sha1:074c238177a75f5e79af3b2cb6a84e54823ef950</id>
<content type='text'>
Dave Chinner reported the following on https://lkml.org/lkml/2015/3/1/226

  Across the board the 4.0-rc1 numbers are much slower, and the degradation
  is far worse when using the large memory footprint configs. Perf points
  straight at the cause - this is from 4.0-rc1 on the "-o bhash=101073" config:

   -   56.07%    56.07%  [kernel]            [k] default_send_IPI_mask_sequence_phys
      - default_send_IPI_mask_sequence_phys
         - 99.99% physflat_send_IPI_mask
            - 99.37% native_send_call_func_ipi
                 smp_call_function_many
               - native_flush_tlb_others
                  - 99.85% flush_tlb_page
                       ptep_clear_flush
                       try_to_unmap_one
                       rmap_walk
                       try_to_unmap
                       migrate_pages
                       migrate_misplaced_page
                     - handle_mm_fault
                        - 99.73% __do_page_fault
                             trace_do_page_fault
                             do_async_page_fault
                           + async_page_fault
              0.63% native_send_call_func_single_ipi
                 generic_exec_single
                 smp_call_function_single

This is showing excessive migration activity even though excessive
migrations are meant to get throttled.  Normally, the scan rate is tuned
on a per-task basis depending on the locality of faults.  However, if
migrations fail for any reason then the PTE scanner may scan faster if
the faults continue to be remote.  This means there is higher system CPU
overhead and fault trapping at exactly the time we know that migrations
cannot happen.  This patch tracks when migration failures occur and
slows the PTE scanner.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Dave Chinner &lt;david@fromorbit.com&gt;
Tested-by: Dave Chinner &lt;david@fromorbit.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Aneesh Kumar &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched: Fix RLIMIT_RTTIME when PI-boosting to RT</title>
<updated>2015-03-23T09:47:55Z</updated>
<author>
<name>Brian Silverman</name>
<email>brian@peloton-tech.com</email>
</author>
<published>2015-02-19T00:23:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=746db9443ea57fd9c059f62c4bfbf41cf224fe13'/>
<id>urn:sha1:746db9443ea57fd9c059f62c4bfbf41cf224fe13</id>
<content type='text'>
When non-realtime tasks get priority-inheritance boosted to a realtime
scheduling class, RLIMIT_RTTIME starts to apply to them. However, the
counter used for checking this (the same one used for SCHED_RR
timeslices) was not getting reset. This meant that tasks running with a
non-realtime scheduling class which are repeatedly boosted to a realtime
one, but never block while they are running realtime, eventually hit the
timeout without ever running for a time over the limit. This patch
resets the realtime timeslice counter when un-PI-boosting from an RT to
a non-RT scheduling class.

I have some test code with two threads and a shared PTHREAD_PRIO_INHERIT
mutex which induces priority boosting and spins while boosted that gets
killed by a SIGXCPU on non-fixed kernels but doesn't with this patch
applied. It happens much faster with a CONFIG_PREEMPT_RT kernel, and
does happen eventually with PREEMPT_VOLUNTARY kernels.

Signed-off-by: Brian Silverman &lt;brian@peloton-tech.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: austin@peloton-tech.com
Cc: &lt;stable@vger.kernel.org&gt;
Link: http://lkml.kernel.org/r/1424305436-6716-1-git-send-email-brian@peloton-tech.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>cpuidle / sleep: Use broadcast timer for states that stop local timer</title>
<updated>2015-03-05T22:13:19Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2015-03-02T21:26:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ef2b22ac540c018bd574d1846ab95b9bfcf38702'/>
<id>urn:sha1:ef2b22ac540c018bd574d1846ab95b9bfcf38702</id>
<content type='text'>
Commit 381063133246 (PM / sleep: Re-implement suspend-to-idle handling)
overlooked the fact that entering some sufficiently deep idle states
by CPUs may cause their local timers to stop and in those cases it
is necessary to switch over to a broadcast timer prior to entering
the idle state.  If the cpuidle driver in use does not provide
the new -&gt;enter_freeze callback for any of the idle states, that
problem affects suspend-to-idle too, but it is not taken into account
after the changes made by commit 381063133246.

Fix that by changing the definition of cpuidle_enter_freeze() and
re-arranging of the code in cpuidle_idle_call(), so the former does
not call cpuidle_enter() any more and the fallback case is handled
by cpuidle_idle_call() directly.

Fixes: 381063133246 (PM / sleep: Re-implement suspend-to-idle handling)
Reported-and-tested-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>cpuidle: Clean up fallback handling in cpuidle_idle_call()</title>
<updated>2015-03-02T21:25:37Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2015-03-02T21:25:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dfcacc154fb38fdb2c243c3dbbdc1f26a64cedc8'/>
<id>urn:sha1:dfcacc154fb38fdb2c243c3dbbdc1f26a64cedc8</id>
<content type='text'>
Move the fallback code path in cpuidle_idle_call() to the end of the
function to avoid jumping to a label in an if () branch.

Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>idle / sleep: Avoid excessive disabling and enabling interrupts</title>
<updated>2015-02-28T22:46:24Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2015-02-26T23:39:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=01e04f466e12e883907937eb04a9010533363f55'/>
<id>urn:sha1:01e04f466e12e883907937eb04a9010533363f55</id>
<content type='text'>
Disabling interrupts at the end of cpuidle_enter_freeze() is not
useful, because its caller, cpuidle_idle_call(), re-enables them
right away after invoking it.

To avoid that unnecessary back and forth dance with interrupts,
make cpuidle_enter_freeze() enable interrupts after calling
enter_freeze_proper() and drop the local_irq_disable() at its
end, so that all of the code paths in it end up with interrupts
enabled.  Then, cpuidle_idle_call() will not need to re-enable
interrupts after calling cpuidle_enter_freeze() any more, because
the latter will return with interrupts enabled, in analogy with
cpuidle_enter().

Reported-by: Lorenzo Pieralisi &lt;lorenzo.pieralisi@arm.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-02-21T18:40:02Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-02-21T18:40:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e2defd02717ebc54ae2f4862271a3093665b426a'/>
<id>urn:sha1:e2defd02717ebc54ae2f4862271a3093665b426a</id>
<content type='text'>
Pull scheduler fixes from Ingo Molnar:
 "Thiscontains misc fixes: preempt_schedule_common() and io_schedule()
  recursion fixes, sched/dl fixes, a completion_done() revert, two
  sched/rt fixes and a comment update patch"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/rt: Avoid obvious configuration fail
  sched/autogroup: Fix failure to set cpu.rt_runtime_us
  sched/dl: Do update_rq_clock() in yield_task_dl()
  sched: Prevent recursion in io_schedule()
  sched/completion: Serialize completion_done() with complete()
  sched: Fix preempt_schedule_common() triggering tracing recursion
  sched/dl: Prevent enqueue of a sleeping task in dl_task_timer()
  sched: Make dl_task_time() use task_rq_lock()
  sched: Clarify ordering between task_rq_lock() and move_queued_task()
</content>
</entry>
<entry>
<title>sched/rt: Avoid obvious configuration fail</title>
<updated>2015-02-18T15:17:23Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-02-09T11:23:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2636ed5f8d15ff9395731593537b4b3fdf2af24d'/>
<id>urn:sha1:2636ed5f8d15ff9395731593537b4b3fdf2af24d</id>
<content type='text'>
Setting the root group's cpu.rt_runtime_us to 0 is a bad thing; it
would disallow the kernel creating RT tasks.

One can of course still set it to 1, which will (likely) still wreck
your kernel, but at least make it clear that setting it to 0 is not
good.

Collect both sanity checks into the one place while we're there.

Suggested-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/20150209112715.GO24151@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/autogroup: Fix failure to set cpu.rt_runtime_us</title>
<updated>2015-02-18T15:17:20Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2015-02-09T10:53:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1fe89e1b6d270aa0d3452c60d38461ea589594e3'/>
<id>urn:sha1:1fe89e1b6d270aa0d3452c60d38461ea589594e3</id>
<content type='text'>
Because task_group() uses a cache of autogroup_task_group(), whose
output depends on sched_class, switching classes can generate
problems.

In particular, when started as fair, the cache points to the
autogroup, so when switching to RT the tg_rt_schedulable() test fails
for every cpu.rt_{runtime,period}_us change because now the autogroup
has tasks and no runtime.

Furthermore, going back to the previous semantics of varying
task_group() with sched_class has the down-side that the sched_debug
output varies as well, even though the task really is in the
autogroup.

Therefore add an autogroup exception to tg_has_rt_tasks() -- such that
both (all) task_group() usages in sched/core now have one. And remove
all the remnants of the variable task_group() output.

Reported-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Stefan Bader &lt;stefan.bader@canonical.com&gt;
Fixes: 8323f26ce342 ("sched: Fix race in task_group()")
Link: http://lkml.kernel.org/r/20150209112237.GR5029@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
