<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/watchdog.c, branch v4.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2017-03-02T07:42:34Z</updated>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/debug.h&gt;</title>
<updated>2017-03-02T07:42:34Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-08T17:51:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b17b01533b719e9949e437abf66436a875739b40'/>
<id>urn:sha1:b17b01533b719e9949e437abf66436a875739b40</id>
<content type='text'>
We are going to split &lt;linux/sched/debug.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder &lt;linux/sched/debug.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;uapi/linux/sched/types.h&gt;</title>
<updated>2017-03-02T07:42:27Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-01T17:07:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ae7e81c077d60507dcec139e40a6d10cf932cf4b'/>
<id>urn:sha1:ae7e81c077d60507dcec139e40a6d10cf932cf4b</id>
<content type='text'>
We are going to move scheduler ABI details to &lt;uapi/linux/sched/types.h&gt;,
which will be used from a number of .c files.

Create empty placeholder header that maps to &lt;linux/types.h&gt;.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/clock.h&gt;</title>
<updated>2017-03-02T07:42:27Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-01T15:36:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e601757102cfd3eeae068f53b3bc1234f3a2b2e9'/>
<id>urn:sha1:e601757102cfd3eeae068f53b3bc1234f3a2b2e9</id>
<content type='text'>
We are going to split &lt;linux/sched/clock.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and .c files.

Create a trivial placeholder &lt;linux/sched/clock.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kernel/watchdog: prevent false hardlockup on overloaded system</title>
<updated>2017-01-25T00:26:14Z</updated>
<author>
<name>Don Zickus</name>
<email>dzickus@redhat.com</email>
</author>
<published>2017-01-24T23:17:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b94f51183b0617e7b9b4fb4137d4cf1cab7547c2'/>
<id>urn:sha1:b94f51183b0617e7b9b4fb4137d4cf1cab7547c2</id>
<content type='text'>
On an overloaded system, it is possible that a change in the watchdog
threshold can be delayed long enough to trigger a false positive.

This can easily be achieved by having a cpu spinning indefinitely on a
task, while another cpu updates watchdog threshold.

What happens is while trying to park the watchdog threads, the hrtimers
on the other cpus trigger and reprogram themselves with the new slower
watchdog threshold.  Meanwhile, the nmi watchdog is still programmed
with the old faster threshold.

Because the one cpu is blocked, it prevents the thread parking on the
other cpus from completing, which is needed to shutdown the nmi watchdog
and reprogram it correctly.  As a result, a false positive from the nmi
watchdog is reported.

Fix this by setting a park_in_progress flag to block all lockups until
the parking is complete.

Fix provided by Ulrich Obergfell.

[akpm@linux-foundation.org: s/park_in_progress/watchdog_park_in_progress/]
Link: http://lkml.kernel.org/r/1481041033-192236-1-git-send-email-dzickus@redhat.com
Signed-off-by: Don Zickus &lt;dzickus@redhat.com&gt;
Reviewed-by: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/watchdog.c: move hardlockup detector to separate file</title>
<updated>2016-12-15T00:04:08Z</updated>
<author>
<name>Babu Moger</name>
<email>babu.moger@oracle.com</email>
</author>
<published>2016-12-14T23:06:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=73ce0511c43686095efd2f65ef564aab952e07bc'/>
<id>urn:sha1:73ce0511c43686095efd2f65ef564aab952e07bc</id>
<content type='text'>
Separate hardlockup code from watchdog.c and move it to watchdog_hld.c.
It is mostly straight forward.  Remove everything inside
CONFIG_HARDLOCKUP_DETECTORS.  This code will go to file watchdog_hld.c.
Also update the makefile accordigly.

Link: http://lkml.kernel.org/r/1478034826-43888-3-git-send-email-babu.moger@oracle.com
Signed-off-by: Babu Moger &lt;babu.moger@oracle.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Yaowei Bai &lt;baiyaowei@cmss.chinamobile.com&gt;
Cc: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Hidehiro Kawai &lt;hidehiro.kawai.ez@hitachi.com&gt;
Cc: Josh Hunt &lt;johunt@akamai.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/watchdog.c: move shared definitions to nmi.h</title>
<updated>2016-12-15T00:04:08Z</updated>
<author>
<name>Babu Moger</name>
<email>babu.moger@oracle.com</email>
</author>
<published>2016-12-14T23:06:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=249e52e35580fcfe5dad53a7dcd7c1252788749c'/>
<id>urn:sha1:249e52e35580fcfe5dad53a7dcd7c1252788749c</id>
<content type='text'>
Patch series "Clean up watchdog handlers", v2.

This is an attempt to cleanup watchdog handlers.  Right now,
kernel/watchdog.c implements both softlockup and hardlockup detectors.
Softlockup code is generic.  Hardlockup code is arch specific.  Some
architectures don't use hardlockup detectors.  They use their own
watchdog detectors.  To make both these combination work, we have
numerous #ifdefs in kernel/watchdog.c.

We are trying here to make these handlers independent of each other.
Also provide an interface for architectures to implement their own
handlers.  watchdog_nmi_enable and watchdog_nmi_disable will be defined
as weak such that architectures can override its definitions.

Thanks to Don Zickus for his suggestions.
Here are our previous discussions
http://www.spinics.net/lists/sparclinux/msg16543.html
http://www.spinics.net/lists/sparclinux/msg16441.html

This patch (of 3):

Move shared macros and definitions to nmi.h so that watchdog.c, new file
watchdog_hld.c or any other architecture specific handler can use those
definitions.

Link: http://lkml.kernel.org/r/1478034826-43888-2-git-send-email-babu.moger@oracle.com
Signed-off-by: Babu Moger &lt;babu.moger@oracle.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Yaowei Bai &lt;baiyaowei@cmss.chinamobile.com&gt;
Cc: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Hidehiro Kawai &lt;hidehiro.kawai.ez@hitachi.com&gt;
Cc: Josh Hunt &lt;johunt@akamai.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/watchdog: use nmi registers snapshot in hardlockup handler</title>
<updated>2016-12-15T00:04:07Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2016-12-14T23:04:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4d1f0fb096aedea7bb5489af93498a82e467c480'/>
<id>urn:sha1:4d1f0fb096aedea7bb5489af93498a82e467c480</id>
<content type='text'>
NMI handler doesn't call set_irq_regs(), it's set only by normal IRQ.
Thus get_irq_regs() returns NULL or stale registers snapshot with IP/SP
pointing to the code interrupted by IRQ which was interrupted by NMI.
NULL isn't a problem: in this case watchdog calls dump_stack() and
prints full stack trace including NMI.  But if we're stuck in IRQ
handler then NMI watchlog will print stack trace without IRQ part at
all.

This patch uses registers snapshot passed into NMI handler as arguments:
these registers point exactly to the instruction interrupted by NMI.

Fixes: 55537871ef66 ("kernel/watchdog.c: perform all-CPU backtrace in case of hard lockup")
Link: http://lkml.kernel.org/r/146771764784.86724.6006627197118544150.stgit@buzz
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.4+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: don't run proc_watchdog_update if new value is same as old</title>
<updated>2016-03-17T22:09:34Z</updated>
<author>
<name>Joshua Hunt</name>
<email>johunt@akamai.com</email>
</author>
<published>2016-03-17T21:17:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a1ee1932aa6bea0bb074f5e3ced112664e4637ed'/>
<id>urn:sha1:a1ee1932aa6bea0bb074f5e3ced112664e4637ed</id>
<content type='text'>
While working on a script to restore all sysctl params before a series of
tests I found that writing any value into the
/proc/sys/kernel/{nmi_watchdog,soft_watchdog,watchdog,watchdog_thresh}
causes them to call proc_watchdog_update().

  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
  NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.

There doesn't appear to be a reason for doing this work every time a write
occurs, so only do it when the values change.

Signed-off-by: Josh Hunt &lt;johunt@akamai.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Reviewed-by: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.1.x+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2016-01-12T02:53:13Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-01-12T02:53:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0f8c7901039f8b1366ae364462743c8f4125822e'/>
<id>urn:sha1:0f8c7901039f8b1366ae364462743c8f4125822e</id>
<content type='text'>
Pull workqueue update from Tejun Heo:
 "Workqueue changes for v4.5.  One cleanup patch and three to improve
  the debuggability.

  Workqueue now has a stall detector which dumps workqueue state if any
  worker pool hasn't made forward progress over a certain amount of time
  (30s by default) and also triggers a warning if a workqueue which can
  be used in memory reclaim path tries to wait on something which can't
  be.

  These should make workqueue hangs a lot easier to debug."

* 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: simplify the apply_workqueue_attrs_locked()
  workqueue: implement lockup detector
  watchdog: introduce touch_softlockup_watchdog_sched()
  workqueue: warn if memory reclaim tries to flush !WQ_MEM_RECLAIM workqueue
</content>
</entry>
<entry>
<title>panic, x86: Allow CPUs to save registers even if looping in NMI context</title>
<updated>2015-12-19T10:07:01Z</updated>
<author>
<name>Hidehiro Kawai</name>
<email>hidehiro.kawai.ez@hitachi.com</email>
</author>
<published>2015-12-14T10:19:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=58c5661f2144c089bbc2e5d87c9ec1dc1d2964fe'/>
<id>urn:sha1:58c5661f2144c089bbc2e5d87c9ec1dc1d2964fe</id>
<content type='text'>
Currently, kdump_nmi_shootdown_cpus(), a subroutine of crash_kexec(),
sends an NMI IPI to CPUs which haven't called panic() to stop them,
save their register information and do some cleanups for crash dumping.
However, if such a CPU is infinitely looping in NMI context, we fail to
save its register information into the crash dump.

For example, this can happen when unknown NMIs are broadcast to all
CPUs as follows:

  CPU 0                             CPU 1
  ===========================       ==========================
  receive an unknown NMI
  unknown_nmi_error()
    panic()                         receive an unknown NMI
      spin_trylock(&amp;panic_lock)     unknown_nmi_error()
      crash_kexec()                   panic()
                                        spin_trylock(&amp;panic_lock)
                                        panic_smp_self_stop()
                                          infinite loop
        kdump_nmi_shootdown_cpus()
          issue NMI IPI -----------&gt; blocked until IRET
                                          infinite loop...

Here, since CPU 1 is in NMI context, the second NMI from CPU 0 is
blocked until CPU 1 executes IRET. However, CPU 1 never executes IRET,
so the NMI is not handled and the callback function to save registers is
never called.

In practice, this can happen on some servers which broadcast NMIs to all
CPUs when the NMI button is pushed.

To save registers in this case, we need to:

  a) Return from NMI handler instead of looping infinitely
  or
  b) Call the callback function directly from the infinite loop

Inherently, a) is risky because NMI is also used to prevent corrupted
data from being propagated to devices.  So, we chose b).

This patch does the following:

1. Move the infinite looping of CPUs which haven't called panic() in NMI
   context (actually done by panic_smp_self_stop()) outside of panic() to
   enable us to refer pt_regs. Please note that panic_smp_self_stop() is
   still used for normal context.

2. Call a callback of kdump_nmi_shootdown_cpus() directly to save
   registers and do some cleanups after setting waiting_for_crash_ipi which
   is used for counting down the number of CPUs which handled the callback

Signed-off-by: Hidehiro Kawai &lt;hidehiro.kawai.ez@hitachi.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@ezchip.com&gt;
Cc: Dave Young &lt;dyoung@redhat.com&gt;
Cc: David Hildenbrand &lt;dahi@linux.vnet.ibm.com&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Gobinda Charan Maji &lt;gobinda.cemk07@gmail.com&gt;
Cc: HATAYAMA Daisuke &lt;d.hatayama@jp.fujitsu.com&gt;
Cc: Hidehiro Kawai &lt;hidehiro.kawai.ez@hitachi.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Javi Merino &lt;javi.merino@arm.com&gt;
Cc: Jiang Liu &lt;jiang.liu@linux.intel.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: kexec@lists.infradead.org
Cc: linux-doc@vger.kernel.org
Cc: lkml &lt;linux-kernel@vger.kernel.org&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Nicolas Iooss &lt;nicolas.iooss_linux@m4x.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Seth Jennings &lt;sjenning@redhat.com&gt;
Cc: Stefan Lippers-Hollmann &lt;s.l-h@gmx.de&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ulrich Obergfell &lt;uobergfe@redhat.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Link: http://lkml.kernel.org/r/20151210014628.25437.75256.stgit@softrs
[ Cleanup comments, fixup formatting. ]
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
</feed>
