linux/kernel/softlockup.c, branch v2.6.28

check_hung_task(): unsigned sysctl_hung_task_warnings cannot be less than 0

2008-12-03T09:11:51Z

Impact: fix warnings-limit cutoff check for debug feature unsigned sysctl_hung_task_warnings cannot be less than 0 Signed-off-by: Roel Kluin Signed-off-by: Ingo Molnar

Make the taint flags reliable

2008-10-16T18:21:31Z

It's somewhat unlikely that it happens, but right now a race window between interrupts or machine checks or oopses could corrupt the tainted bitmap because it is modified in a non atomic fashion. Convert the taint variable to an unsigned long and use only atomic bit operations on it. Unfortunately this means the intvec sysctl functions cannot be used on it anymore. It turned out the taint sysctl handler could actually be simplified a bit (since it only increases capabilities) so this patch actually removes code. [akpm@linux-foundation.org: remove unneeded include] Signed-off-by: Andi Kleen Cc: Ingo Molnar Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

softlockup: minor cleanup, don't check task->state twice

2008-09-02T17:49:51Z

The recent commit 16d9679f33caf7e683471647d1472bfe133d858 changed check_hung_task() to filter out the TASK_KILLABLE tasks. We can move this check to the caller which has to test t->state anyway. Signed-off-by: Oleg Nesterov Acked-by: Andi Kleen Signed-off-by: Linus Torvalds

Don't trigger softlockup detector on network fs blocked tasks

2008-08-29T21:46:29Z

Pulling the ethernet cable on a 2.6.27-rc system with NFS mounts currently leads to an ongoing flood of soft lockup detector backtraces for all tasks blocked on the NFS mounts when the hickup takes longer than 120s. I don't think NFS problems should be all that noisy. Luckily there's a reasonably easy way to distingush this case. Don't report task softlockup warnings for tasks in TASK_KILLABLE state, which is used by the network file systems. I believe this patch is a 2.6.27 candidate. Signed-off-by: Andi Kleen Signed-off-by: Linus Torvalds

Full conversion to early_initcall() interface, remove old interface

2008-07-26T19:00:04Z

A previous patch added the early_initcall(), to allow a cleaner hooking of pre-SMP initcalls. Now we remove the older interface, converting all existing users to the new one. [akpm@linux-foundation.org: cleanups] [akpm@linux-foundation.org: build fix] [kosaki.motohiro@jp.fujitsu.com: warning fix] [kosaki.motohiro@jp.fujitsu.com: warning fix] Signed-off-by: Eduard - Gabriel Munteanu Cc: Tom Zanussi Signed-off-by: KOSAKI Motohiro Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

softlockup: fix watchdog task wakeup frequency

2008-07-01T07:22:49Z

The print_timestamp can never be bigger than the touch_timestamp, at maximum it can be equal. And if it is, the second check for touch_timestamp + 1 bigger print_timestamp is always true, too. The check for equality is sufficient as we proceed in one-second-steps and are at least one second away from the last print-out if we have another timestamp. Signed-off-by: Johannes Weiner Cc: Peter Zijlstra Signed-off-by: Ingo Molnar

softlockup: fix watchdog task wakeup frequency

2008-06-30T13:03:08Z

Updating the timestamp more often is pointless as we print the warnings only if we exceed the threshold. And the check for hung tasks relies on the last timestamp, so it will keep working correctly, too. Signed-off-by: Johannes Weiner Cc: Peter Zijlstra Signed-off-by: Ingo Molnar

softlockup: show irqtrace

2008-06-25T15:49:48Z

This patch adds some information about when interrupts were last enabled and disabled to the output of the softlockup detector. Signed-off-by: Vegard Nossum Cc: Peter Zijlstra Cc: Johannes Weiner Cc: Arjan van de Ven Signed-off-by: Ingo Molnar

softlockup: print a module list on being stuck

2008-06-18T13:26:54Z

Most places in the kernel that go BUG: print a module list (which is very useful for doing statistics and finding patterns), however the softlockup detector does not do this yet. This patch adds the one line change to fix this gap. Signed-off-by: Arjan van de Ven Signed-off-by: Ingo Molnar

softlockup: fix NMI hangs due to lock race - 2.6.26-rc regression

2008-06-02T10:38:27Z

The touch_nmi_watchdog() routine on x86 ultimately calls touch_softlockup_watchdog(). The problem is that to touch the softlockup watchdog, the cpu_clock code has to be called which could involve multiple cpu locks and can lead to a hard hang if one of the locks is held by a processor that is not going to return anytime soon (such as could be the case with kgdb or perhaps even with some other kind of exception). This patch causes the public version of the touch_softlockup_watchdog() to defer the cpu clock access to a later point. The test case for this problem is to use the following kernel config options: CONFIG_KGDB_TESTS=y CONFIG_KGDB_TESTS_ON_BOOT=y CONFIG_KGDB_TESTS_BOOT_STRING="V1F100I100000" It should be noted that kgdb test suite and these options were not available until 2.6.26-rc2, so it was necessary to patch the kgdb test suite during the bisection. I would consider this patch a regression fix because the problem first appeared in commit 27ec4407790d075c325e1f4da0a19c56953cce23 when some logic was added to try to periodically sync the clocks. It was possible to work around this particular problem by simply not performing the sync anytime the system was in a critical context. This was ok until commit 3e51f33fcc7f55e6df25d15b55ed10c8b4da84cd, which added config option CONFIG_HAVE_UNSTABLE_SCHED_CLOCK and some multi-cpu locks to sync the clocks. It became clear that accessing this code from an nmi was the source of the lockups. Avoiding the access to the low level clock code from an code inside the NMI processing also fixed the problem with the 27ec44... commit. Signed-off-by: Jason Wessel Signed-off-by: Ingo Molnar