<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/softirq.c, branch v3.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-08-01T01:42:45Z</updated>
<entry>
<title>mm: allow PF_MEMALLOC from softirq context</title>
<updated>2012-08-01T01:42:45Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2012-07-31T23:44:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=907aed48f65efeecf91575397e3d79335d93a466'/>
<id>urn:sha1:907aed48f65efeecf91575397e3d79335d93a466</id>
<content type='text'>
This is needed to allow network softirq packet processing to make use of
PF_MEMALLOC.

Currently softirq context cannot use PF_MEMALLOC because it is not
associated with a task and therefore has no task flags to fiddle with.
Thus the gfp to alloc flag mapping ignores the task flags when in
interrupt (hard or soft) context.

Allowing softirqs to make use of PF_MEMALLOC therefore requires some
trickery.  This patch borrows the task flags from whatever process happens
to be preempted by the softirq.  It then modifies the gfp to alloc flags
mapping to not exclude task flags in softirq context, and modifies the
softirq code to save, clear and restore the PF_MEMALLOC flag.

The save and clear ensure the preempted task's PF_MEMALLOC flag doesn't
leak into the softirq.  The restore ensures a softirq's PF_MEMALLOC flag
cannot leak back into the preempted process.  This should be safe for
the following reasons:

Softirqs can run on multiple CPUs, but the same task should not be
	executing the same softirq code. Neither should the softirq
	handler be preempted by any other softirq handler so the flags
	should not leak to an unrelated softirq.

Softirqs re-enable hardware interrupts in __do_softirq(), so they can
	be preempted by hardware interrupts and PF_MEMALLOC is inherited
	by the hard IRQ. However, this is similar to a process in
	reclaim being preempted by a hardirq. While PF_MEMALLOC is
	set, gfp_to_alloc_flags() distinguishes between hard and
	soft irqs and avoids giving a hardirq the ALLOC_NO_WATERMARKS
	flag.

If the softirq is deferred to ksoftirqd then its flags may be used
	instead of a normal task's, but as the softirq cannot be preempted,
	the PF_MEMALLOC flag does not leak to other code by accident.
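
The save, clear and restore steps described above can be modelled in
user space.  What follows is a minimal sketch, not the kernel change
itself: struct task, restore_flags(), run_softirq() and the two example
handlers are hypothetical names, and the PF_MEMALLOC value is used for
illustration only.

<![CDATA[
```c
#include <assert.h>

/* Flag value used for illustration in this sketch. */
#define PF_MEMALLOC 0x00000800u

/* Hypothetical stand-in for the task that the softirq interrupted. */
struct task {
	unsigned int flags;
};

/* Hypothetical handler type; a handler may set PF_MEMALLOC on the task. */
typedef void (*softirq_handler)(struct task *t);

/* Put back only the bits in 'mask'; every other flag keeps its current value. */
void restore_flags(struct task *t, unsigned int saved, unsigned int mask)
{
	t->flags &= ~mask;
	t->flags |= saved & mask;
}

/* Model of the save, clear and restore steps around softirq processing. */
void run_softirq(struct task *borrowed, softirq_handler handler)
{
	unsigned int old_flags = borrowed->flags;         /* save */

	borrowed->flags &= ~PF_MEMALLOC;                  /* clear: nothing leaks in */
	handler(borrowed);
	restore_flags(borrowed, old_flags, PF_MEMALLOC);  /* restore: nothing leaks out */
}

/* Example handlers that exercise both leak directions. */
unsigned int seen_flags;

void handler_records_flags(struct task *t)
{
	seen_flags = t->flags;
}

void handler_sets_memalloc(struct task *t)
{
	t->flags |= PF_MEMALLOC;
}

/* Checks that the flag leaks neither into nor out of the handler. */
void check_no_leak(void)
{
	struct task in_reclaim = { .flags = PF_MEMALLOC | 0x1u };
	struct task ordinary = { .flags = 0x1u };

	run_softirq(&in_reclaim, handler_records_flags);
	assert((seen_flags & PF_MEMALLOC) == 0);            /* handler did not inherit it */
	assert(in_reclaim.flags == (PF_MEMALLOC | 0x1u));   /* task got it back */

	run_softirq(&ordinary, handler_sets_memalloc);
	assert(ordinary.flags == 0x1u);                     /* handler's flag was dropped */
}
```
]]>

Restoring through a mask rather than assigning the saved word back means
only PF_MEMALLOC is rewound, so any other flag the handler changed in
this model is left alone.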

[davem@davemloft.net: Document why PF_MEMALLOC is safe]
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Mike Christie &lt;michaelc@cs.wisc.edu&gt;
Cc: Eric B Munson &lt;emunson@mgebm.net&gt;
Cc: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Sebastian Andrzej Siewior &lt;sebastian@breakpoint.cc&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-20T17:32:09Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-20T17:32:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=161f7a7161191ab9c2e97f787829ef8dd2b95771'/>
<id>urn:sha1:161f7a7161191ab9c2e97f787829ef8dd2b95771</id>
<content type='text'>
Pull timer changes for v3.4 from Ingo Molnar

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
  ntp: Fix integer overflow when setting time
  math: Introduce div64_long
  cs5535-clockevt: Allow the MFGPT IRQ to be shared
  cs5535-clockevt: Don't ignore MFGPT on SMP-capable kernels
  x86/time: Eliminate unused irq0_irqs counter
  clocksource: scx200_hrt: Fix the build
  x86/tsc: Reduce the TSC sync check time for core-siblings
  timer: Fix bad idle check on irq entry
  nohz: Remove ts-&gt;Einidle checks before restarting the tick
  nohz: Remove update_ts_time_stat from tick_nohz_start_idle
  clockevents: Leave the broadcast device in shutdown mode when not needed
  clocksource: Load the ACPI PM clocksource asynchronously
  clocksource: scx200_hrt: Convert scx200 to use clocksource_register_hz
  clocksource: Get rid of clocksource_calc_mult_shift()
  clocksource: dbx500: convert to clocksource_register_hz()
  clocksource: scx200_hrt:  use pr_&lt;level&gt; instead of printk
  time: Move common updates to a function
  time: Reorder so the hot data is together
  time: Remove most of xtime_lock usage in timekeeping.c
  ntp: Add ntp_lock to replace xtime_locking
  ...
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-20T17:31:44Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-20T17:31:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2ba68940c893c8f0bfc8573c041254251bb6aeab'/>
<id>urn:sha1:2ba68940c893c8f0bfc8573c041254251bb6aeab</id>
<content type='text'>
Pull scheduler changes for v3.4 from Ingo Molnar

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  printk: Make it compile with !CONFIG_PRINTK
  sched/x86: Fix overflow in cyc2ns_offset
  sched: Fix nohz load accounting -- again!
  sched: Update yield() docs
  printk/sched: Introduce special printk_sched() for those awkward moments
  sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
  sched: Cleanup cpu_active madness
  sched: Fix load-balance wreckage
  sched: Clean up parameter passing of proc_sched_autogroup_set_nice()
  sched: Ditch per cgroup task lists for load-balancing
  sched: Rename load-balancing fields
  sched: Move load-balancing arguments into helper struct
  sched/rt: Do not submit new work when PI-blocked
  sched/rt: Prevent idle task boosting
  sched/wait: Add __wake_up_all_locked() API
  sched/rt: Document scheduler related skip-resched-check sites
  sched/rt: Use schedule_preempt_disabled()
  sched/rt: Add schedule_preempt_disabled()
  sched/rt: Do not throttle when PI boosting
  sched/rt: Keep period timer ticking when rt throttling is active
  ...
</content>
</entry>
<entry>
<title>Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-20T17:29:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-20T17:29:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9c2b957db1772ebf942ae7a9346b14eba6c8ca66'/>
<id>urn:sha1:9c2b957db1772ebf942ae7a9346b14eba6c8ca66</id>
<content type='text'>
Pull perf events changes for v3.4 from Ingo Molnar:

 - New "hardware based branch profiling" feature both on the kernel and
   the tooling side, on CPUs that support it.  (modern x86 Intel CPUs
   with the 'LBR' hardware feature currently.)

   This new feature is basically a sophisticated 'magnifying glass' for
   branch execution - something that is pretty difficult to extract from
   regular, function histogram centric profiles.

   The simplest mode is activated via 'perf record -b', and the result
   looks like this in perf report:

	$ perf record -b any_call,u -e cycles:u branchy

	$ perf report -b --sort=symbol
	    52.34%  [.] main                   [.] f1
	    24.04%  [.] f1                     [.] f3
	    23.60%  [.] f1                     [.] f2
	     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
	     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
	     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
	     0.01%  [k] __printf               [k] _IO_vfprintf_internal
	     0.01%  [k] main                   [k] __printf

   This output shows from/to branch columns and shows the highest
   percentage (from,to) jump combinations - i.e.  the most likely taken
   branches in the system.  "branches" can also include function calls
   and any other synchronous and asynchronous transitions of the
   instruction pointer that are not 'next instruction' - such as system
   calls, traps, interrupts, etc.

   This feature comes with (hopefully intuitive) flat ascii and TUI
   support in perf report.

 - Various 'perf annotate' visual improvements for us assembly junkies.
   It will now recognize function calls in the TUI and by hitting enter
   you can follow the call (recursively) and back, amongst other
   improvements.

 - Multiple threads/processes recording support in perf record, perf
   stat, perf top - which is activated via a comma-list of PIDs:

	perf top -p 21483,21485
	perf stat -p 21483,21485 -ddd
	perf record -p 21483,21485

 - Support for per UID views, via the --uid parameter to perf top, perf
   report, etc.  For example 'perf top --uid mingo' will only show the
   tasks that I am running, excluding other users, root, etc.

 - Jump label restructurings and improvements - this includes the
   factoring out of the (hopefully much clearer) include/linux/static_key.h
   generic facility:

	struct static_key key = STATIC_KEY_INIT_FALSE;

	...

	if (static_key_false(&amp;key))
	        do unlikely code
	else
	        do likely code

	...
	static_key_slow_inc(&amp;key);
	...
	static_key_slow_dec(&amp;key);
	...

   The static_key_false() branch will be generated into the code with as
   little impact to the likely code path as possible.  The
   static_key_slow_*() APIs flip the branch via live kernel code patching.

   This facility can now be used more widely within the kernel to
   micro-optimize hot branches whose likelihood matches the static-key
   usage and fast/slow cost patterns.

 - SW function tracer improvements: perf support and filtering support.

 - Various hardenings of the perf.data ABI, to make older perf.data's
   smoother on newer tool versions, to make new features integrate more
   smoothly, to support cross-endian recording/analyzing workflows
   better, etc.

 - Restructuring of the kprobes code, the splitting out of 'optprobes',
   and a corner case bugfix.

 - Allow the tracing of kernel console output (printk).

 - Improvements/fixes to user-space RDPMC support, allowing user-space
   self-profiling code to extract PMU counts without performing any
   system calls, while playing nice with the kernel side.

 - 'perf bench' improvements

 - ... and lots of internal restructurings, cleanups and fixes that made
   these features possible.  And, as usual this list is incomplete as
   there were also lots of other improvements.

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (120 commits)
  perf report: Fix annotate double quit issue in branch view mode
  perf report: Remove duplicate annotate choice in branch view mode
  perf/x86: Prettify pmu config literals
  perf report: Enable TUI in branch view mode
  perf report: Auto-detect branch stack sampling mode
  perf record: Add HEADER_BRANCH_STACK tag
  perf record: Provide default branch stack sampling mode option
  perf tools: Make perf able to read files from older ABIs
  perf tools: Fix ABI compatibility bug in print_event_desc()
  perf tools: Enable reading of perf.data files from different ABI rev
  perf: Add ABI reference sizes
  perf report: Add support for taken branch sampling
  perf record: Add support for sampling taken branch
  perf tools: Add code to support PERF_SAMPLE_BRANCH_STACK
  x86/kprobes: Split out optprobe related code to kprobes-opt.c
  x86/kprobes: Fix a bug which can modify kernel code permanently
  x86/kprobes: Fix instruction recovery on optimized path
  perf: Add callback to flush branch_stack on context switch
  perf: Disable PERF_SAMPLE_BRANCH_* when not supported
  perf/x86: Add LBR software filter support for Intel CPUs
  ...
</content>
</entry>
<entry>
<title>softirq: Reduce invoke_softirq() code duplication</title>
<updated>2012-03-06T12:33:27Z</updated>
<author>
<name>Heiko Carstens</name>
<email>heiko.carstens@de.ibm.com</email>
</author>
<published>2012-03-05T23:07:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b2a00178614e2cdd981a708d22a05c1ce4eadfd7'/>
<id>urn:sha1:b2a00178614e2cdd981a708d22a05c1ce4eadfd7</id>
<content type='text'>
The two invoke_softirq() variants are identical except for a single
line. So move the #ifdef __ARCH_IRQ_EXIT_IRQS_DISABLED inside one of
the functions and get rid of the other one.

Signed-off-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>sched/rt: Document scheduler related skip-resched-check sites</title>
<updated>2012-03-01T09:28:04Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2011-03-21T12:32:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ba74c1448f127649046615ec017bded7b2a76f29'/>
<id>urn:sha1:ba74c1448f127649046615ec017bded7b2a76f29</id>
<content type='text'>
Create a distinction between scheduler related preempt_enable_no_resched()
calls and the nearly one hundred other places in the kernel that do not
want to reschedule, for one reason or another.

This distinction matters for -rt, where the scheduler and the non-scheduler
preempt models (and checks) are different. For upstream it's purely
documentational.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/n/tip-gs88fvx2mdv5psnzxnv575ke@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched/rt: Use schedule_preempt_disabled()</title>
<updated>2012-03-01T09:28:03Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2011-03-21T11:33:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bd2f55361f18347e890d52ff9cfd8895455ec11b'/>
<id>urn:sha1:bd2f55361f18347e890d52ff9cfd8895455ec11b</id>
<content type='text'>
Coccinelle based conversion.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-24swm5zut3h9c4a6s46x8rws@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>timer: Fix bad idle check on irq entry</title>
<updated>2012-02-15T14:23:09Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2012-01-24T17:59:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0a8a2e78b7eece7c65884fcff9f98dc0fce89ee4'/>
<id>urn:sha1:0a8a2e78b7eece7c65884fcff9f98dc0fce89ee4</id>
<content type='text'>
idle_cpu() is called on irq entry to guess if we need to call
tick_check_idle(). This way we can catch up with jiffies if the tick
was stopped, stop accounting idle time during the interrupt and
maintain the sched clock if it is unstable.

But if we are going to exit the idle loop to schedule a new task (ie:
if we have a task in the runqueue or a remotely enqueued ttwu to
perform), the idle_cpu() check will return 0 such that we miss the
call to tick_check_idle() for all interrupts happening before we
schedule the new task.

As a result these interrupts and the softirqs coming along may deal
with stale jiffies values, bad sched clock values, and won't subtract
their time from the idle time accounting.

Fix this by using is_idle_task() instead, which strictly checks that
we are running the idle task, without caring about the fact we are
going to schedule a task soon.
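
The difference between the two checks can be shown with a small user
space model.  This is a sketch under stated assumptions: struct
runqueue, cpu_looks_idle() and running_idle_task() are hypothetical
simplifications that only mirror the behaviour described above, and
they are not the kernel's idle_cpu() or is_idle_task().

<![CDATA[
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily simplified task and per-CPU run queue. */
struct task {
	int pid;
};

struct runqueue {
	const struct task *curr;    /* task running right now */
	const struct task *idle;    /* this CPU's idle task */
	unsigned int nr_running;    /* tasks queued locally */
	bool remote_wakeup_queued;  /* a remotely enqueued wakeup is pending */
};

/* Models the old check: reports "not idle" as soon as any work is queued. */
bool cpu_looks_idle(const struct runqueue *rq)
{
	return rq->curr == rq->idle &&
	       rq->nr_running == 0 &&
	       !rq->remote_wakeup_queued;
}

/* Models the new check: only asks which task is running at this moment. */
bool running_idle_task(const struct runqueue *rq)
{
	return rq->curr == rq->idle;
}

/* The window from the description: idle task still running, work queued. */
void check_idle_window(void)
{
	struct task idle = { .pid = 0 };
	struct runqueue rq = {
		.curr = &idle,
		.idle = &idle,
		.nr_running = 1,
		.remote_wakeup_queued = false,
	};

	assert(!cpu_looks_idle(&rq));    /* old check would skip the idle handling */
	assert(running_idle_task(&rq));  /* new check still catches the interrupt */
}
```
]]>

In this model an interrupt arriving inside that window is still treated
as interrupting the idle task, which is the case the fix is meant to
cover.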

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Link: http://lkml.kernel.org/r/1327427984-23282-3-git-send-email-fweisbec@gmail.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>tracing/softirq: Move __raise_softirq_irqoff() out of header</title>
<updated>2012-02-03T14:48:19Z</updated>
<author>
<name>Steven Rostedt</name>
<email>srostedt@redhat.com</email>
</author>
<published>2012-01-26T01:18:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f069686e4bdc60a637d210ea3eea25fcdb82df88'/>
<id>urn:sha1:f069686e4bdc60a637d210ea3eea25fcdb82df88</id>
<content type='text'>
__raise_softirq_irqoff() contains a tracepoint. As tracepoints in headers
can cause issues and, not to mention, bloat the kernel when they are
in a static inline, it is best to move the function that contains the
tracepoint out of the header and into softirq.c.

Link: http://lkml.kernel.org/r/20120118120711.GB14863@elte.hu

Suggested-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>rcu: Fix early call to rcu_idle_enter()</title>
<updated>2011-12-11T18:31:38Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2011-10-07T23:31:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=416eb33cd60ef405e2860a186364e57bcb2d89f6'/>
<id>urn:sha1:416eb33cd60ef405e2860a186364e57bcb2d89f6</id>
<content type='text'>
On the irq exit path, tick_nohz_irq_exit() may raise a softirq,
which leads to the wake up path and select_task_rq_fair(),
which makes use of RCU to iterate the domains.

This is an illegal use of RCU because we may be in RCU
extended quiescent state if we interrupted an RCU-idle
window in the idle loop:

[  132.978883] ===============================
[  132.978883] [ INFO: suspicious RCU usage. ]
[  132.978883] -------------------------------
[  132.978883] kernel/sched_fair.c:1707 suspicious rcu_dereference_check() usage!
[  132.978883]
[  132.978883] other info that might help us debug this:
[  132.978883]
[  132.978883]
[  132.978883] rcu_scheduler_active = 1, debug_locks = 0
[  132.978883] RCU used illegally from extended quiescent state!
[  132.978883] 2 locks held by swapper/0:
[  132.978883]  #0:  (&amp;p-&gt;pi_lock){-.-.-.}, at: [&lt;ffffffff8105a729&gt;] try_to_wake_up+0x39/0x2f0
[  132.978883]  #1:  (rcu_read_lock){.+.+..}, at: [&lt;ffffffff8105556a&gt;] select_task_rq_fair+0x6a/0xec0
[  132.978883]
[  132.978883] stack backtrace:
[  132.978883] Pid: 0, comm: swapper Tainted: G        W   3.0.0+ #178
[  132.978883] Call Trace:
[  132.978883]  &lt;IRQ&gt;  [&lt;ffffffff810a01f6&gt;] lockdep_rcu_suspicious+0xe6/0x100
[  132.978883]  [&lt;ffffffff81055c49&gt;] select_task_rq_fair+0x749/0xec0
[  132.978883]  [&lt;ffffffff8105556a&gt;] ? select_task_rq_fair+0x6a/0xec0
[  132.978883]  [&lt;ffffffff812fe494&gt;] ? do_raw_spin_lock+0x54/0x150
[  132.978883]  [&lt;ffffffff810a1f2d&gt;] ? trace_hardirqs_on+0xd/0x10
[  132.978883]  [&lt;ffffffff8105a7c3&gt;] try_to_wake_up+0xd3/0x2f0
[  132.978883]  [&lt;ffffffff81094f98&gt;] ? ktime_get+0x68/0xf0
[  132.978883]  [&lt;ffffffff8105aa35&gt;] wake_up_process+0x15/0x20
[  132.978883]  [&lt;ffffffff81069dd5&gt;] raise_softirq_irqoff+0x65/0x110
[  132.978883]  [&lt;ffffffff8108eb65&gt;] __hrtimer_start_range_ns+0x415/0x5a0
[  132.978883]  [&lt;ffffffff812fe3ee&gt;] ? do_raw_spin_unlock+0x5e/0xb0
[  132.978883]  [&lt;ffffffff8108ed08&gt;] hrtimer_start+0x18/0x20
[  132.978883]  [&lt;ffffffff8109c9c3&gt;] tick_nohz_stop_sched_tick+0x393/0x450
[  132.978883]  [&lt;ffffffff810694f2&gt;] irq_exit+0xd2/0x100
[  132.978883]  [&lt;ffffffff81829e96&gt;] do_IRQ+0x66/0xe0
[  132.978883]  [&lt;ffffffff81820d53&gt;] common_interrupt+0x13/0x13
[  132.978883]  &lt;EOI&gt;  [&lt;ffffffff8103434b&gt;] ? native_safe_halt+0xb/0x10
[  132.978883]  [&lt;ffffffff810a1f2d&gt;] ? trace_hardirqs_on+0xd/0x10
[  132.978883]  [&lt;ffffffff810144ea&gt;] default_idle+0xba/0x370
[  132.978883]  [&lt;ffffffff810147fe&gt;] amd_e400_idle+0x5e/0x130
[  132.978883]  [&lt;ffffffff8100a9f6&gt;] cpu_idle+0xb6/0x120
[  132.978883]  [&lt;ffffffff817f217f&gt;] rest_init+0xef/0x150
[  132.978883]  [&lt;ffffffff817f20e2&gt;] ? rest_init+0x52/0x150
[  132.978883]  [&lt;ffffffff81ed9cf3&gt;] start_kernel+0x3da/0x3e5
[  132.978883]  [&lt;ffffffff81ed9346&gt;] x86_64_start_reservations+0x131/0x135
[  132.978883]  [&lt;ffffffff81ed944d&gt;] x86_64_start_kernel+0x103/0x112

Fix this by calling rcu_idle_enter() after tick_nohz_irq_exit().

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
</content>
</entry>
</feed>
