<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/watchdog.c, branch v3.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-12-19T20:10:33Z</updated>
<entry>
<title>watchdog: Fix disable/enable regression</title>
<updated>2012-12-19T20:10:33Z</updated>
<author>
<name>Bjørn Mork</name>
<email>bjorn@mork.no</email>
</author>
<published>2012-12-19T19:51:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3935e89505a1c3ab3f3b0c7ef0eae54124f48905'/>
<id>urn:sha1:3935e89505a1c3ab3f3b0c7ef0eae54124f48905</id>
<content type='text'>
Commit 8d4516904b39 ("watchdog: Fix CPU hotplug regression") causes an
oops or hard lockup when doing

 echo 0 &gt; /proc/sys/kernel/nmi_watchdog
 echo 1 &gt; /proc/sys/kernel/nmi_watchdog

and the kernel is booted with nmi_watchdog=1 (default)

Running laptop-mode-tools and disconnecting/connecting AC power will
cause this to trigger, making it a common failure scenario on laptops.

Instead of bailing out of watchdog_disable() when !watchdog_enabled we
can initialize the hrtimer regardless of watchdog_enabled status.  This
makes it safe to call watchdog_disable() in the nmi_watchdog=0 case,
without the negative effect on the enabled =&gt; disabled =&gt; enabled case.

All these tests pass with this patch:
- nmi_watchdog=1
  echo 0 &gt; /proc/sys/kernel/nmi_watchdog
  echo 1 &gt; /proc/sys/kernel/nmi_watchdog

- nmi_watchdog=0
  echo 0 &gt; /sys/devices/system/cpu/cpu1/online

- nmi_watchdog=0
  echo mem &gt; /sys/power/state

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51661

Cc: &lt;stable@vger.kernel.org&gt; # v3.7
Cc: Norbert Warmuth &lt;nwarmuth@t-online.de&gt;
Cc: Joseph Salisbury &lt;joseph.salisbury@canonical.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Bjørn Mork &lt;bjorn@mork.no&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: store the watchdog sample period as a variable</title>
<updated>2012-12-18T01:15:13Z</updated>
<author>
<name>Chuansheng Liu</name>
<email>chuansheng.liu@intel.com</email>
</author>
<published>2012-12-17T23:59:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0f34c400914f165b7b3812459be2d77b8aa1f1e4'/>
<id>urn:sha1:0f34c400914f165b7b3812459be2d77b8aa1f1e4</id>
<content type='text'>
Currently getting the sample period is always thru a complex
calculation: get_softlockup_thresh() * ((u64)NSEC_PER_SEC / 5).

We can store the sample period as a variable, and set it as __read_mostly
type.

Signed-off-by: liu chuansheng &lt;chuansheng.liu@intel.com&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: Fix CPU hotplug regression</title>
<updated>2012-12-04T18:56:59Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2012-12-04T17:59:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8d4516904b39507458bee8115793528e12b1d8dd'/>
<id>urn:sha1:8d4516904b39507458bee8115793528e12b1d8dd</id>
<content type='text'>
Norbert reported:
"3.7-rc6 booted with nmi_watchdog=0 fails to suspend to RAM or
 offline CPUs. It's reproducable with a KVM guest and physical
 system."

The reason is that commit bcd951cf(watchdog: Use hotplug thread
infrastructure) missed to take this into account. So the cpu offline
code gets stuck in the teardown function because it accesses non
initialized data structures.

Add a check for watchdog_enabled into that path to cure the issue.

Reported-and-tested-by: Norbert Warmuth &lt;nwarmuth@t-online.de&gt;
Tested-by: Joseph Salisbury &lt;joseph.salisbury@canonical.com&gt;
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1211231033230.2701@ionos
Link: http://bugs.launchpad.net/bugs/1079534
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>watchdog: using u64 in get_sample_period()</title>
<updated>2012-11-27T01:41:24Z</updated>
<author>
<name>Chuansheng Liu</name>
<email>chuansheng.liu@intel.com</email>
</author>
<published>2012-11-27T00:29:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8ffeb9b0e6369135bf03a073514f571ef10606b9'/>
<id>urn:sha1:8ffeb9b0e6369135bf03a073514f571ef10606b9</id>
<content type='text'>
In get_sample_period(), unsigned long is not enough:

  watchdog_thresh * 2 * (NSEC_PER_SEC / 5)

case1:
  watchdog_thresh is 10 by default, the sample value will be: 0xEE6B2800

case2:
 set watchdog_thresh is 20, the sample value will be: 0x1 DCD6 5000

In case2, we need use u64 to express the sample period.  Otherwise,
changing the threshold thru proc often can not be successful.

Signed-off-by: liu chuansheng &lt;chuansheng.liu@intel.com&gt;
Acked-by: Don Zickus &lt;dzickus@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: Use hotplug thread infrastructure</title>
<updated>2012-08-13T15:01:07Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2012-07-16T10:42:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bcd951cf10f24e341defcd002c15a1f4eea13ddb'/>
<id>urn:sha1:bcd951cf10f24e341defcd002c15a1f4eea13ddb</id>
<content type='text'>
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: http://lkml.kernel.org/r/20120716103948.563736676@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>Revert "NMI watchdog: fix for lockup detector breakage on resume"</title>
<updated>2012-08-08T18:49:45Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rjw@sisk.pl</email>
</author>
<published>2012-08-07T11:50:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=300d3739e873d50d4c6e3656f89007a217fb1d29'/>
<id>urn:sha1:300d3739e873d50d4c6e3656f89007a217fb1d29</id>
<content type='text'>
Revert commit 45226e9 (NMI watchdog: fix for lockup detector breakage
on resume) which breaks resume from system suspend on my SH7372
Mackerel board (by causing a NULL pointer dereference to happen) and
is generally wrong, because it abuses the CPU hotplug functionality
in a shamelessly blatant way.

The original issue should be addressed through appropriate syscore
resume callback instead.

Signed-off-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
</content>
</entry>
<entry>
<title>NMI watchdog: fix for lockup detector breakage on resume</title>
<updated>2012-07-31T00:25:13Z</updated>
<author>
<name>Sameer Nanda</name>
<email>snanda@chromium.org</email>
</author>
<published>2012-07-30T21:40:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=45226e944ce071d0231949f2fea90969437cd2dc'/>
<id>urn:sha1:45226e944ce071d0231949f2fea90969437cd2dc</id>
<content type='text'>
On the suspend/resume path the boot CPU does not go though an
offline-&gt;online transition.  This breaks the NMI detector post-resume
since it depends on PMU state that is lost when the system gets
suspended.

Fix this by forcing a CPU offline-&gt;online transition for the lockup
detector on the boot CPU during resume.

To provide more context, we enable NMI watchdog on Chrome OS.  We have
seen several reports of systems freezing up completely which indicated
that the NMI watchdog was not firing for some reason.

Debugging further, we found a simple way of repro'ing system freezes --
issuing the command 'tasket 1 sh -c "echo nmilockup &gt; /proc/breakme"'
after the system has been suspended/resumed one or more times.

With this patch in place, the system freeze result in panics, as
expected.

These panics provide a nice stack trace for us to debug the actual issue
causing the freeze.

[akpm@linux-foundation.org: fiddle with code comment]
[akpm@linux-foundation.org: make lockup_detector_bootcpu_resume() conditional on CONFIG_SUSPEND]
[akpm@linux-foundation.org: fix section errors]
Signed-off-by: Sameer Nanda &lt;snanda@chromium.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@sisk.pl&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Mandeep Singh Baines &lt;msb@chromium.org&gt;
Cc: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Cc: Anshuman Khandual &lt;khandual@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>watchdog: Quiet down the boot messages</title>
<updated>2012-06-14T10:20:50Z</updated>
<author>
<name>Don Zickus</name>
<email>dzickus@redhat.com</email>
</author>
<published>2012-06-13T13:35:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a70270468234749741c5893ae78e5bb524771402'/>
<id>urn:sha1:a70270468234749741c5893ae78e5bb524771402</id>
<content type='text'>
A bunch of bugzillas have complained how noisy the nmi_watchdog
is during boot-up especially with its expected failure cases
(like virt and bios resource contention).

This is my attempt to quiet them down and keep it less confusing
for the end user.  What I did is print the message for cpu0 and
save it for future comparisons.  If future cpus have an
identical message as cpu0, then don't print the redundant info.
However, if a future cpu has a different message, happily print
that loudly.

Before the change, you would see something like:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog enabled, takes one hw-pmu counter.
    Booting Node   0, Processors  #1
    NMI watchdog enabled, takes one hw-pmu counter.
     #2
    NMI watchdog enabled, takes one hw-pmu counter.
     #3 Ok.
    NMI watchdog enabled, takes one hw-pmu counter.
    Brought up 4 CPUs
    Total of 4 processors activated (22607.24 BogoMIPS).

After the change, it is simplified to:

    ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
    CPU0: Intel(R) Core(TM)2 Quad CPU    Q9550  @ 2.83GHz stepping 0a
    Performance Events: PEBS fmt0+, Core2 events, Intel PMU driver.
    ... version:                2
    ... bit width:              40
    ... generic registers:      2
    ... value mask:             000000ffffffffff
    ... max period:             000000007fffffff
    ... fixed-purpose events:   3
    ... event mask:             0000000700000003
    NMI watchdog: enabled on all CPUs, permanently consumes one hw-PMU counter.
    Booting Node   0, Processors  #1 #2 #3 Ok.
    Brought up 4 CPUs

V2: little changes based on Joe Perches' feedback
V3: printk cleanup based on Ingo's feedback; checkpatch fix
V4: keep printk as one long line
V5: Ingo fix ups

Reported-and-tested-by: Nathan Zimmer &lt;nzimmer@sgi.com&gt;
Signed-off-by: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: nzimmer@sgi.com
Cc: joe@perches.com
Link: http://lkml.kernel.org/r/1339594548-17227-1-git-send-email-dzickus@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>watchdog: add check for suspended vm in softlockup detector</title>
<updated>2012-04-08T09:49:03Z</updated>
<author>
<name>Eric B Munson</name>
<email>emunson@mgebm.net</email>
</author>
<published>2012-03-10T19:37:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5d1c0f4a80a6df73395fb3fc2c302510f8f09d36'/>
<id>urn:sha1:5d1c0f4a80a6df73395fb3fc2c302510f8f09d36</id>
<content type='text'>
A suspended VM can cause spurious soft lockup warnings.  To avoid these, the
watchdog now checks if the kernel knows it was stopped by the host and skips
the warning if so.  When the watchdog is reset successfully, clear the guest
paused flag.

Signed-off-by: Eric B Munson &lt;emunson@mgebm.net&gt;
Signed-off-by: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Signed-off-by: Avi Kivity &lt;avi@redhat.com&gt;

</content>
</entry>
<entry>
<title>kernel/watchdog.c: add comment to watchdog() exit path</title>
<updated>2012-03-23T23:58:32Z</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2012-03-23T22:01:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b60f796c4ca72545327a069f12938360d833cce7'/>
<id>urn:sha1:b60f796c4ca72545327a069f12938360d833cce7</id>
<content type='text'>
Revelation from Peter.

Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Thomas Gleixner &lt;tglx@tglx.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
