<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/smp.c, branch v6.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-08-14T18:36:48Z</updated>
<entry>
<title>smp: print only local CPU info when sched_clock goes backward</title>
<updated>2024-08-14T18:36:48Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@surriel.com</email>
</author>
<published>2024-07-15T17:49:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9fbaa44114ca6b69074a488295e86a9fa7e685f9'/>
<id>urn:sha1:9fbaa44114ca6b69074a488295e86a9fa7e685f9</id>
<content type='text'>
About 40% of all csd_lock warnings observed in our fleet appear to
be due to sched_clock() going backward in time (usually only a little
bit), resulting in ts0 being larger than ts2.

When the local CPU is at fault, we should print out a message reflecting
that, rather than trying to get the remote CPU's stack trace.
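<![CDATA[
As a rough illustration of the condition described above, the following userspace sketch flags the case where the "now" timestamp is earlier than the "start" timestamp. The helper name is hypothetical and this is not the kernel's code; it only assumes that ts0 and ts2 are nanosecond readings taken from the same local clock.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's code: ts0 is the timestamp taken
 * when the wait began and ts2 is the timestamp taken now, both in
 * nanoseconds from the local CPU's clock.
 */
bool local_clock_went_backward(uint64_t ts0, uint64_t ts2)
{
	/* A signed delta below zero means "now" is earlier than "start". */
	int64_t delta = (int64_t)(ts2 - ts0);

	return delta < 0;
}
```

When such a helper returns true, the local CPU's clock is the suspect, so a remote stack trace would add nothing.
]]>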

Signed-off-by: Rik van Riel &lt;riel@surriel.com&gt;
Tested-by: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/csd-lock: Use backoff for repeated reports of same incident</title>
<updated>2024-08-14T18:36:48Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2024-07-02T21:32:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d40760d6811d55172ba6b90ebd2a60a75f88bffe'/>
<id>urn:sha1:d40760d6811d55172ba6b90ebd2a60a75f88bffe</id>
<content type='text'>
Currently, the CSD-lock diagnostics in CONFIG_CSD_LOCK_WAIT_DEBUG=y
kernels are emitted at five-second intervals.  Although this has proven
to be a good time interval for the first diagnostic, if the target CPU
keeps interrupts disabled for much longer than five seconds, the ratio
of useful new information to pointless repetition decreases considerably.

Therefore, back off the time period for repeated reports of the same
incident, increasing linearly with the number of reports and logarithmically
with the number of online CPUs.
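<![CDATA[
One way to picture such a backoff is the userspace sketch below. The function names and the exact "/ 2 + 1" scaling are illustrative assumptions rather than a statement of what the patch implements; the sketch only shows an interval that grows linearly with the report count and logarithmically with the CPU count.

```c
#include <stdint.h>

/* Integer floor(log2(n)) for n >= 1; stands in for the kernel's ilog2(). */
unsigned int floor_log2(unsigned int n)
{
	unsigned int log = 0;

	while (n >>= 1)
		log++;
	return log;
}

/*
 * Hypothetical backoff rule: the first report uses the base interval; each
 * later report waits longer, scaled linearly by the number of reports
 * already printed and logarithmically by the number of online CPUs.
 * The "/ 2 + 1" scaling is an illustrative assumption.
 */
uint64_t report_interval_ns(uint64_t base_ns, unsigned int nreports,
			    unsigned int ncpus)
{
	uint64_t cpu_factor = nreports ? floor_log2(ncpus) / 2 + 1 : 1;

	return base_ns * (nreports + 1) * cpu_factor;
}
```

With a five-second base and 64 online CPUs, this sketch waits 5 s before the first report, 40 s before the second, and 60 s before the third.
]]>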

[ paulmck: Apply Dan Carpenter feedback. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Imran Khan &lt;imran.f.khan@oracle.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras@redhat.com&gt;
Cc: "Peter Zijlstra (Intel)" &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Reviewed-by: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/csd_lock: Provide an indication of ongoing CSD-lock stall</title>
<updated>2024-08-14T18:35:39Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2024-07-01T20:33:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ac9d45544cd571decca395715d0b0a3b617d02f4'/>
<id>urn:sha1:ac9d45544cd571decca395715d0b0a3b617d02f4</id>
<content type='text'>
If a CSD-lock stall goes on long enough, it will cause an RCU CPU
stall warning.  This additional warning provides much additional
console-log traffic and little additional information.  Therefore,
provide a new csd_lock_is_stuck() function that returns true if there
is an ongoing CSD-lock stall.  This function will be used by the RCU
CPU stall warnings to provide a one-line indication of the stall when
this function returns true.
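<![CDATA[
A query of this kind can be sketched in userspace with a counter of stalls in progress. The names below are hypothetical and C11 atomics stand in for the kernel's atomic_t; how the actual patch tracks stalls is not shown here.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical counter of stalls currently in progress. */
static atomic_int n_stalls_in_progress;

/* Called when a stall is first detected and when it finally resolves. */
void stall_begin(void) { atomic_fetch_add(&n_stalls_in_progress, 1); }
void stall_end(void)   { atomic_fetch_sub(&n_stalls_in_progress, 1); }

/* Cheap query that other diagnostics can call before printing a full report. */
bool stall_is_ongoing(void)
{
	return atomic_load(&n_stalls_in_progress) != 0;
}
```

A caller that sees true can print a single line pointing at the existing stall instead of repeating the full diagnostics.
]]>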

[ neeraj.upadhyay: Apply Rik van Riel feedback. ]
[ neeraj.upadhyay: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Imran Khan &lt;imran.f.khan@oracle.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras@redhat.com&gt;
Cc: "Peter Zijlstra (Intel)" &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/csd_lock: Print large numbers as negatives</title>
<updated>2024-07-29T02:14:38Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2024-07-01T16:49:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1972c8dc987769eac8e5dede536d3cff489c6ee'/>
<id>urn:sha1:c1972c8dc987769eac8e5dede536d3cff489c6ee</id>
<content type='text'>
The CSD-lock-hold diagnostics from CONFIG_CSD_LOCK_WAIT_DEBUG are
printed in nanoseconds as unsigned long longs, which is a bit opaque for
human readers when timing bugs result in negative CSD-lock hold times.
Yes, there are some people to whom it is immediately obvious that
18446744073709551615 is really -1, but for the rest of us...

Therefore, print these numbers as signed long longs, making the negative
hold times immediately apparent.
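<![CDATA[
The difference is easy to reproduce in userspace. The sketch below uses hypothetical helper names and assumes a two's-complement platform, where converting an out-of-range unsigned value to long long wraps as expected.

```c
#include <stdio.h>

/* Format a nanosecond count the old way (unsigned) into buf. */
int format_unsigned_ns(char *buf, size_t len, unsigned long long ns)
{
	return snprintf(buf, len, "%llu ns", ns);
}

/* Format the same bits as a signed value so wrapped results read as negatives. */
int format_signed_ns(char *buf, size_t len, unsigned long long ns)
{
	/* Out-of-range conversion is implementation-defined before C23. */
	return snprintf(buf, len, "%lld ns", (long long)ns);
}
```

For a hold time computed as 100 - 101 in unsigned arithmetic, the first helper produces "18446744073709551615 ns" and the second produces "-1 ns".
]]>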

Reported-by: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Imran Khan &lt;imran.f.khan@oracle.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Leonardo Bras &lt;leobras@redhat.com&gt;
Cc: "Peter Zijlstra (Intel)" &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Reviewed-by: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu()</title>
<updated>2024-07-10T20:40:39Z</updated>
<author>
<name>Zqiang</name>
<email>qiang.zhang1211@gmail.com</email>
</author>
<published>2024-07-04T06:52:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=77aeb1b685f9db73d276bad4bb30d48505a6fd23'/>
<id>urn:sha1:77aeb1b685f9db73d276bad4bb30d48505a6fd23</id>
<content type='text'>
For CONFIG_DEBUG_OBJECTS_WORK=y kernels, sscs.work, defined by
INIT_WORK_ONSTACK(), is initialized by debug_object_init_on_stack() so that
the debug check in __init_work() works correctly.

But this lacks the counterpart to remove the tracked object from debug
objects again, which will cause a debug object warning once the stack is
freed.

Add the missing destroy_work_on_stack() invocation to cure that.

[ tglx: Massaged changelog ]

Signed-off-by: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Link: https://lore.kernel.org/r/20240704065213.13559-1-qiang.zhang1211@gmail.com

</content>
</entry>
<entry>
<title>smp: Use str_plural() to fix Coccinelle warnings</title>
<updated>2024-06-17T13:17:44Z</updated>
<author>
<name>Thorsten Blum</name>
<email>thorsten.blum@toblux.com</email>
</author>
<published>2024-05-08T15:42:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c4df15931cb72556fea93bd763ada88e56cbd8e5'/>
<id>urn:sha1:c4df15931cb72556fea93bd763ada88e56cbd8e5</id>
<content type='text'>
Fixes the following two Coccinelle/coccicheck warnings reported by
string_choices.cocci:

	opportunity for str_plural(num_cpus)
	opportunity for str_plural(num_nodes)

Signed-off-by: Thorsten Blum &lt;thorsten.blum@toblux.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Link: https://lore.kernel.org/r/20240508154225.309703-2-thorsten.blum@toblux.com

</content>
</entry>
<entry>
<title>Merge tag 'csd-lock.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu</title>
<updated>2023-10-31T03:56:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-10-31T03:56:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a0f53e0cfc2ef262c05b8e4ab89e7f2accaf96c'/>
<id>urn:sha1:9a0f53e0cfc2ef262c05b8e4ab89e7f2accaf96c</id>
<content type='text'>
Pull CSD lock update from Paul McKenney:
 "This adds a kernel boot parameter that causes the kernel to panic if
  one of the smp_call_function*() APIs is stalled for more than the
  specified duration.

  This is useful in deployments in which a clean panic is preferable to
  an indefinite stall"

* tag 'csd-lock.2023.10.23a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  smp,csd: Throw an error if a CSD lock is stuck for too long
</content>
</entry>
<entry>
<title>Merge branch 'linus' into smp/core</title>
<updated>2023-10-17T19:40:46Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2023-10-17T19:40:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a940daa52167e9db8ecce82213813b735a9d9f23'/>
<id>urn:sha1:a940daa52167e9db8ecce82213813b735a9d9f23</id>
<content type='text'>
Pull in upstream to get the fixes so depending changes can be applied.
</content>
</entry>
<entry>
<title>smp,csd: Throw an error if a CSD lock is stuck for too long</title>
<updated>2023-10-16T23:06:37Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@surriel.com</email>
</author>
<published>2023-08-21T20:04:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94b3f0b5af2c7af69e3d6e0cdd9b0ea535f22186'/>
<id>urn:sha1:94b3f0b5af2c7af69e3d6e0cdd9b0ea535f22186</id>
<content type='text'>
The CSD lock seems to get stuck in two "modes". When it gets stuck
temporarily, it usually gets released within a few seconds, and sometimes
only after one or two minutes.

If the CSD lock stays stuck for more than several minutes, it never
seems to get unstuck, and gradually more and more things in the system
end up also getting stuck.

In the latter case, we should just give up, so the system can dump out
a little more information about what went wrong, and, with panic_on_oops
and a kdump kernel loaded, dump a whole bunch more information about what
might have gone wrong.  In addition, there is an smp.panic_on_ipistall
kernel boot parameter that by default retains the old behavior, but when
set enables the panic after the CSD lock has been stuck for more than
the specified number of milliseconds, as in 300,000 for five minutes.
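<![CDATA[
The threshold check can be pictured with the userspace sketch below. The function name is hypothetical, and treating a threshold of 0 as "disabled" is an assumption made here to model the default that retains the old behavior.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_MSEC 1000000ULL

/*
 * Hypothetical check: threshold_ms mirrors the value given to
 * smp.panic_on_ipistall. Treating 0 as "disabled" is an assumption
 * made here to model "retains the old behavior" by default.
 */
bool stall_should_panic(uint64_t stuck_ns, uint64_t threshold_ms)
{
	if (threshold_ms == 0)
		return false;
	return stuck_ns > threshold_ms * NSEC_PER_MSEC;
}
```

On the kernel command line, the five-minute example corresponds to smp.panic_on_ipistall=300000.
]]>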

[ paulmck: Apply Imran Khan feedback. ]
[ paulmck: Apply Leonardo Bras feedback. ]

Link: https://lore.kernel.org/lkml/bc7cc8b0-f587-4451-8bcd-0daae627bcc7@paulmck-laptop/
Signed-off-by: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Imran Khan &lt;imran.f.khan@oracle.com&gt;
Reviewed-by: Leonardo Bras &lt;leobras@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Valentin Schneider &lt;vschneid@redhat.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
</content>
</entry>
<entry>
<title>smp: Change function signatures to use call_single_data_t</title>
<updated>2023-09-13T12:59:24Z</updated>
<author>
<name>Leonardo Bras</name>
<email>leobras@redhat.com</email>
</author>
<published>2023-08-31T06:31:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d090ec0df81e56556af3a2bf04a7e89347ae5784'/>
<id>urn:sha1:d090ec0df81e56556af3a2bf04a7e89347ae5784</id>
<content type='text'>
call_single_data_t is a size-aligned typedef of struct __call_single_data.

This alignment is desirable in order to have smp_call_function*() avoid
bouncing an extra cacheline in case of an unaligned csd, given this
would hurt performance.

Since the removal of struct request-&gt;csd in commit 660e802c76c8
("blk-mq: use percpu csd to remote complete instead of per-rq csd") there
are no current users of smp_call_function*() with unaligned csd.

Change every 'struct __call_single_data' function parameter to
'call_single_data_t', so we have warnings if any new code tries to
introduce an smp_call_function*() call with unaligned csd.
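<![CDATA[
The effect of a size-aligned typedef can be seen in a small userspace sketch. The struct below is a hypothetical stand-in with four 8-byte fields, not the kernel's definition, and the aligned attribute is a GCC/Clang extension.

```c
#include <stdint.h>

/* Hypothetical stand-in for the real structure; four 8-byte fields give 32 bytes. */
struct csd_sketch {
	uint64_t node;
	uint64_t flags;
	uint64_t func;
	uint64_t info;
};

/* GCC/Clang extension: raise the typedef's alignment to the struct's size. */
typedef struct csd_sketch csd_sketch_t
	__attribute__((aligned(sizeof(struct csd_sketch))));

/* Objects declared through the typedef cannot straddle a 64-byte cache line. */
_Static_assert(_Alignof(csd_sketch_t) == sizeof(struct csd_sketch),
	       "typedef should be size-aligned");
```

A 32-byte object that starts on a 32-byte boundary always fits inside one 64-byte cache line (an assumed line size), which is the property the typedef is meant to guarantee.
]]>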

Signed-off-by: Leonardo Bras &lt;leobras@redhat.com&gt;
Reviewed-by: Guo Ren &lt;guoren@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230831063129.335425-1-leobras@redhat.com
</content>
</entry>
</feed>
