<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/hung_task.c, branch v4.20</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.20</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.20'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2018-10-25T16:45:08Z</updated>
<entry>
<title>kernel: hung_task.c: disable on suspend</title>
<updated>2018-10-25T16:45:08Z</updated>
<author>
<name>Vitaly Kuznetsov</name>
<email>vkuznets@redhat.com</email>
</author>
<published>2018-10-17T11:23:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a1c6ca3c6de763459a6e93b644ec6518c890ba1c'/>
<id>urn:sha1:a1c6ca3c6de763459a6e93b644ec6518c890ba1c</id>
<content type='text'>
It is possible to observe hung_task complaints when system goes to
suspend-to-idle state:

 # echo freeze &gt; /sys/power/state

 PM: Syncing filesystems ... done.
 Freezing user space processes ... (elapsed 0.001 seconds) done.
 OOM killer disabled.
 Freezing remaining freezable tasks ... (elapsed 0.002 seconds) done.
 sd 0:0:0:0: [sda] Synchronizing SCSI cache
 INFO: task bash:1569 blocked for more than 120 seconds.
       Not tainted 4.19.0-rc3_+ #687
 "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 bash            D    0  1569    604 0x00000000
 Call Trace:
  ? __schedule+0x1fe/0x7e0
  schedule+0x28/0x80
  suspend_devices_and_enter+0x4ac/0x750
  pm_suspend+0x2c0/0x310

Register a PM notifier to disable the detector on suspend and re-enable
back on wakeup.

Signed-off-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>kernel/hung_task.c: allow to set checking interval separately from timeout</title>
<updated>2018-08-22T17:52:47Z</updated>
<author>
<name>Dmitry Vyukov</name>
<email>dvyukov@google.com</email>
</author>
<published>2018-08-22T04:55:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a2e514453861dd39b53b7a50b6771bd3f9852078'/>
<id>urn:sha1:a2e514453861dd39b53b7a50b6771bd3f9852078</id>
<content type='text'>
Currently task hung checking interval is equal to timeout, as the result
hung is detected anywhere between timeout and 2*timeout.  This is fine for
most interactive environments, but this hurts automated testing setups
(syzbot).  In an automated setup we need to strictly order CPU lockup &lt;
RCU stall &lt; workqueue lockup &lt; task hung &lt; silent loss, so that RCU stall
is not detected as task hung and task hung is not detected as silent
machine loss.  The large variance in task hung detection timeout requires
setting silent machine loss timeout to a very large value (e.g.  if task
hung is 3 mins, then silent loss need to be set to ~7 mins).  The
additional 3 minutes significantly reduce testing efficiency because
usually we crash kernel within a minute, and this can add hours to bug
localization process as it needs to do dozens of tests.

Allow setting checking interval separately from timeout.  This allows to
set timeout to, say, 3 minutes, but checking interval to 10 secs.

The interval is controlled via a new hung_task_check_interval_secs sysctl,
similar to the existing hung_task_timeout_secs sysctl.  The default value
of 0 results in the current behavior: checking interval is equal to
timeout.

[akpm@linux-foundation.org: update hung_task_timeout_max's comment]
Link: http://lkml.kernel.org/r/20180611111004.203513-1-dvyukov@google.com
Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/hung_task.c: show all hung tasks before panic</title>
<updated>2018-06-08T00:34:39Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2018-06-08T00:10:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=401c636a0eeb0d51862fce222da1bf08e3a0ffd0'/>
<id>urn:sha1:401c636a0eeb0d51862fce222da1bf08e3a0ffd0</id>
<content type='text'>
When we get a hung task it can often be valuable to see _all_ the hung
tasks on the system before calling panic().

Quoting from https://syzkaller.appspot.com/text?tag=CrashReport&amp;id=5316056503549952
----------------------------------------
INFO: task syz-executor0:6540 blocked for more than 120 seconds.
      Not tainted 4.16.0+ #13
"echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor0   D23560  6540   4521 0x80000004
Call Trace:
 context_switch kernel/sched/core.c:2848 [inline]
 __schedule+0x8fb/0x1ef0 kernel/sched/core.c:3490
 schedule+0xf5/0x430 kernel/sched/core.c:3549
 schedule_preempt_disabled+0x10/0x20 kernel/sched/core.c:3607
 __mutex_lock_common kernel/locking/mutex.c:833 [inline]
 __mutex_lock+0xb7f/0x1810 kernel/locking/mutex.c:893
 mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:908
 lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355
 __blkdev_driver_ioctl block/ioctl.c:303 [inline]
 blkdev_ioctl+0x1759/0x1e00 block/ioctl.c:601
 ioctl_by_bdev+0xa5/0x110 fs/block_dev.c:2060
 isofs_get_last_session fs/isofs/inode.c:567 [inline]
 isofs_fill_super+0x2ba9/0x3bc0 fs/isofs/inode.c:660
 mount_bdev+0x2b7/0x370 fs/super.c:1119
 isofs_mount+0x34/0x40 fs/isofs/inode.c:1560
 mount_fs+0x66/0x2d0 fs/super.c:1222
 vfs_kern_mount.part.26+0xc6/0x4a0 fs/namespace.c:1037
 vfs_kern_mount fs/namespace.c:2514 [inline]
 do_new_mount fs/namespace.c:2517 [inline]
 do_mount+0xea4/0x2b90 fs/namespace.c:2847
 ksys_mount+0xab/0x120 fs/namespace.c:3063
 SYSC_mount fs/namespace.c:3077 [inline]
 SyS_mount+0x39/0x50 fs/namespace.c:3074
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7
(...snipped...)
Showing all locks held in the system:
(...snipped...)
2 locks held by syz-executor0/6540:
 #0: 00000000566d4c39 (&amp;type-&gt;s_umount_key#49/1){+.+.}, at: alloc_super fs/super.c:211 [inline]
 #0: 00000000566d4c39 (&amp;type-&gt;s_umount_key#49/1){+.+.}, at: sget_userns+0x3b2/0xe60 fs/super.c:502 /* down_write_nested(&amp;s-&gt;s_umount, SINGLE_DEPTH_NESTING); */
 #1: 0000000043ca8836 (&amp;lo-&gt;lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&amp;lo-&gt;lo_ctl_mutex, 1); */
(...snipped...)
3 locks held by syz-executor7/6541:
 #0: 0000000043ca8836 (&amp;lo-&gt;lo_ctl_mutex/1){+.+.}, at: lo_ioctl+0x8b/0x1b70 drivers/block/loop.c:1355 /* mutex_lock_nested(&amp;lo-&gt;lo_ctl_mutex, 1); */
 #1: 000000007bf3d3f9 (&amp;bdev-&gt;bd_mutex){+.+.}, at: blkdev_reread_part+0x1e/0x40 block/ioctl.c:192
 #2: 00000000566d4c39 (&amp;type-&gt;s_umount_key#50){.+.+}, at: __get_super.part.10+0x1d3/0x280 fs/super.c:663 /* down_read(&amp;sb-&gt;s_umount); */
----------------------------------------

When reporting an AB-BA deadlock like shown above, it would be nice if
trace of PID=6541 is printed as well as trace of PID=6540 before calling
panic().

Showing hung tasks up to /proc/sys/kernel/hung_task_warnings could delay
calling panic() but normally there should not be so many hung tasks.

Link: http://lkml.kernel.org/r/201804050705.BHE57833.HVFOFtSOMQJFOL@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Acked-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Cc: Mandeep Singh Baines &lt;msb@chromium.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/hung_task.c: defer showing held locks</title>
<updated>2017-05-09T00:15:10Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2017-05-08T22:55:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=780cbcf28781511d2cb235c375127265209796a8'/>
<id>urn:sha1:780cbcf28781511d2cb235c375127265209796a8</id>
<content type='text'>
When I was running my testcase which may block hundreds of threads on fs
locks, I got lockup due to output from debug_show_all_locks() added by
commit b2d4c2edb2e4 ("locking/hung_task: Show all locks").

For example, if 1000 threads were blocked in TASK_UNINTERRUPTIBLE state
and 500 out of 1000 threads hold some lock, debug_show_all_locks() from
for_each_process_thread() loop will report locks held by 500 threads for
1000 times.  This is a too much noise.

In order to make sure rcu_lock_break() is called frequently, we should
avoid calling debug_show_all_locks() from for_each_process_thread() loop
because debug_show_all_locks() effectively calls for_each_process_thread()
loop.  Let's defer calling debug_show_all_locks() till before panic() or
leaving for_each_process_thread() loop.

Link: http://lkml.kernel.org/r/1489296834-60436-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Reviewed-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/debug.h&gt;</title>
<updated>2017-03-02T07:42:34Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-08T17:51:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b17b01533b719e9949e437abf66436a875739b40'/>
<id>urn:sha1:b17b01533b719e9949e437abf66436a875739b40</id>
<content type='text'>
We are going to split &lt;linux/sched/debug.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder &lt;linux/sched/debug.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/signal.h&gt;</title>
<updated>2017-03-02T07:42:29Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-08T17:51:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3f07c0144132e4f59d88055ac8ff3e691a5fa2b8'/>
<id>urn:sha1:3f07c0144132e4f59d88055ac8ff3e691a5fa2b8</id>
<content type='text'>
We are going to split &lt;linux/sched/signal.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder &lt;linux/sched/signal.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>hung_task: decrement sysctl_hung_task_warnings only if it is positive</title>
<updated>2016-12-13T02:55:09Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2016-12-13T00:45:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4ca5ede07c9871c13ae422c96d6d08dbd0df5eda'/>
<id>urn:sha1:4ca5ede07c9871c13ae422c96d6d08dbd0df5eda</id>
<content type='text'>
Since sysctl_hung_task_warnings == -1 is allowed (infinite warnings),
commit 48a6d64edadb ("hung_task: allow hung_task_panic when
hung_task_warnings is 0") should decrement it only when it is not -1.

This prevents the kernel from ceasing warnings after the first
4294967295 ;)

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: John Siddle &lt;jsiddle@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hung_task: allow hung_task_panic when hung_task_warnings is 0</title>
<updated>2016-10-11T22:06:33Z</updated>
<author>
<name>John Siddle</name>
<email>jsiddle@redhat.com</email>
</author>
<published>2016-10-11T20:55:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=48a6d64edadbd40fa5185a890023e9b331d64a48'/>
<id>urn:sha1:48a6d64edadbd40fa5185a890023e9b331d64a48</id>
<content type='text'>
Previously hung_task_panic would not be respected if enabled after
hung_task_warnings had already been decremented to 0.

Permit the kernel to panic if hung_task_panic is enabled after
hung_task_warnings has already been decremented to 0 and another task
hangs for hung_task_timeout_secs seconds.

Check if hung_task_panic is enabled so we don't return prematurely, and
check if hung_task_warnings is non-zero so we don't print the warning
unnecessarily.

[akpm@linux-foundation.org: fix off-by-one]
Link: http://lkml.kernel.org/r/1473450214-4049-1-git-send-email-jsiddle@redhat.com
Signed-off-by: John Siddle &lt;jsiddle@redhat.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@i-love.sakura.ne.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>locking/hung_task: Show all locks</title>
<updated>2016-08-24T10:16:13Z</updated>
<author>
<name>Vegard Nossum</name>
<email>vegard.nossum@oracle.com</email>
</author>
<published>2016-08-18T16:41:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b2d4c2edb2e4f89aaf85449dee3b87fbf0f8a4d4'/>
<id>urn:sha1:b2d4c2edb2e4f89aaf85449dee3b87fbf0f8a4d4</id>
<content type='text'>
When we get a hung task it can often be valuable to see _all_ the held
locks on the system (in case we are being blocked on trying to acquire
one), e.g. with this patch we can immediately see where the problem is
below:

    INFO: task trinity-c3:14933 blocked for more than 120 seconds.
	  Not tainted 4.8.0-rc1+ #135
    "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
    trinity-c3      D ffff88010c16fc88     0 14933      1 0x00080004
     ffff88010c16fc88 000000003b9aca00 0000000000000000 0000000000000296
     00000000776cdf88 ffff88011a520ae0 ffff88011a520b08 ffff88011a520198
     ffffffff867d7f00 ffff88011942c080 ffff880116841580 ffff88010c168000
    Call Trace:
     [&lt;ffffffff845e9d37&gt;] schedule+0x77/0x230
     [&lt;ffffffff833cb8b9&gt;] __lock_sock+0x129/0x250
     [&lt;ffffffff833cb790&gt;] ? __sk_destruct+0x450/0x450
     [&lt;ffffffff81408ac0&gt;] ? wake_bit_function+0x2e0/0x2e0
     [&lt;ffffffff833d832b&gt;] lock_sock_nested+0xeb/0x120
     [&lt;ffffffff83bad815&gt;] irda_setsockopt+0x65/0xb40
     [&lt;ffffffff833c6c09&gt;] SyS_setsockopt+0x139/0x230
     [&lt;ffffffff833c6ad0&gt;] ? SyS_recv+0x20/0x20
     [&lt;ffffffff81004660&gt;] ? trace_event_raw_event_sys_enter+0xb90/0xb90
     [&lt;ffffffff823c7023&gt;] ? __this_cpu_preempt_check+0x13/0x20
     [&lt;ffffffff8162ee60&gt;] ? __context_tracking_exit.part.3+0x30/0x1b0
     [&lt;ffffffff833c6ad0&gt;] ? SyS_recv+0x20/0x20
     [&lt;ffffffff81007bd3&gt;] do_syscall_64+0x1b3/0x4b0
     [&lt;ffffffff845f84aa&gt;] entry_SYSCALL64_slow_path+0x25/0x25

    Showing all locks held in the system:
    2 locks held by khungtaskd/563:
     #0:  (rcu_read_lock){......}, at: [&lt;ffffffff81534ce6&gt;] watchdog+0x106/0x910
     #1:  (tasklist_lock){......}, at: [&lt;ffffffff8141b3c4&gt;] debug_show_all_locks+0x74/0x360
    1 lock held by trinity-c0/19280:
     #0:  (sk_lock-AF_IRDA){......}, at: [&lt;ffffffff83bab7c6&gt;] irda_accept+0x176/0x10f0
    1 lock held by trinity-c0/12865:
     #0:  (sk_lock-AF_IRDA){......}, at: [&lt;ffffffff83bab7c6&gt;] irda_accept+0x176/0x10f0

Signed-off-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mandeep Singh Baines &lt;msb@chromium.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1471538460-7505-1-git-send-email-vegard.nossum@oracle.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kernel/hung_task.c: use timeout diff when timeout is updated</title>
<updated>2016-03-22T22:36:02Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2016-03-22T21:24:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b4aa14a63cb3194d8eab355fcee194838ab09121'/>
<id>urn:sha1:b4aa14a63cb3194d8eab355fcee194838ab09121</id>
<content type='text'>
When new timeout is written to /proc/sys/kernel/hung_task_timeout_secs,
khungtaskd is interrupted and again sleeps for full timeout duration.

This means that hang task will not be checked if new timeout is written
periodically within old timeout duration and/or checking of hang task
will be delayed for up to previous timeout duration.  Fix this by
remembering last time khungtaskd checked hang task.

This change will allow other watchdog tasks (if any) to share khungtaskd
by sleeping for minimal timeout diff of all watchdog tasks.  Doing more
watchdog tasks from khungtaskd will reduce the possibility of printk()
collisions by multiple watchdog threads.

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
