<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/rcu/srcutree.c, branch v6.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-01-04T01:49:23Z</updated>
<entry>
<title>srcu: Update comment after the index flip</title>
<updated>2023-01-04T01:49:23Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-12-21T16:32:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dafc4d1603c27671adc2b41eb7e7827f8cc18961'/>
<id>urn:sha1:dafc4d1603c27671adc2b41eb7e7827f8cc18961</id>
<content type='text'>
Because there is not guaranteed to be a full memory barrier between
the -&gt;srcu_unlock_count increment of an srcu_read_unlock() and the
-&gt;srcu_lock_count increment of the next srcu_read_lock(), this next
srcu_read_lock() is not guaranteed to see the effect of the index flip
just prior to this comment.  However, this next srcu_read_lock() will
execute a full memory barrier, so the srcu_read_lock() after that is
guaranteed to see that index flip.

This guarantee is illustrated by the following diagram of events and
the litmus test following that.

------------------------------------------------------------------------

READER                  UPDATER
-------------           ----------
                           // idx is initially 0.

                           srcu_flip() {
                              smp_mb();
// RSCS

srcu_read_unlock() {
  smp_mb();
                              idx++;    // P
                              smp_mb(); // QQ
                           }

                           srcu_readers_unlock_idx(0) {
        ,--counted------------ count all unlock[0]; // Q
        |
  unlock[0]++;  // X

}
                               smp_mb();
srcu_read_lock() {
  READ(idx) = 0;         ,---- count all lock[0]; // contributes imbalance of 1.
  lock[0]++;  ----counted              |
  smp_mb(); // PP          }           |
}                                      |
                                       |
// RSCS                             not going to effect above scan
                                       |
srcu_read_unlock() {                   |
  smp_mb();                            |
  unlock[0]++;                         |
}                                      |
                                      /
                                     /
srcu_read_lock() {                  |
  READ(idx);  // Y  -----cannot be counted because of P (has to sample idx as 1)
  lock[1]++;
  ...
}

------------------------------------------------------------------------

This makes it similar to the store buffer pattern. Using X, Y, P and Q
annotated above, we get:

------------------------------------------------------------------------

READER                    UPDATER
X (write)                 P (write)

smp_mb(); //PP            smp_mb(); //QQ

Y (read)                  Q (read)

------------------------------------------------------------------------

ASCII art courtesy of Joel Fernandes.

Reported-by: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Reported-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Reported-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reported-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Yet more detail for srcu_readers_active_idx_check() comments</title>
<updated>2023-01-04T01:49:23Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-12-14T18:50:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0cd4b50b12d96d668b0627c149b19b5784ad4898'/>
<id>urn:sha1:0cd4b50b12d96d668b0627c149b19b5784ad4898</id>
<content type='text'>
The comment in srcu_readers_active_idx_check() following the smp_mb()
is out of date, hailing from a simpler time when preemption was disabled
across the bulk of __srcu_read_lock().  The fact that preemption was
disabled meant that the number of tasks that had fetched the old index
but not yet incremented counters was limited by the number of CPUs.

In our more complex modern times, the number of CPUs is no longer a limit.
This commit therefore updates this comment, additionally giving more
memory-ordering detail.

[ paulmck: Apply Nt-&gt;Nc feedback from Joel Fernandes. ]

Reported-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Reported-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reported-by: "Joel Fernandes (Google)" &lt;joel@joelfernandes.org&gt;
Reported-by: Neeraj Upadhyay &lt;neeraj.iitr10@gmail.com&gt;
Reported-by: Uladzislau Rezki &lt;urezki@gmail.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Remove needless rcu_seq_done() check while holding read lock</title>
<updated>2023-01-04T01:49:23Z</updated>
<author>
<name>Pingfan Liu</name>
<email>kernelfans@gmail.com</email>
</author>
<published>2022-11-23T13:56:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1bafbfb3e1a18af7f404977ed0d218dc4f176f8e'/>
<id>urn:sha1:1bafbfb3e1a18af7f404977ed0d218dc4f176f8e</id>
<content type='text'>
The srcu_gp_start_if_needed() function now read-holds the srcu_struct
whose grace period is being started, which means that the corresponding
SRCU grace period cannot end.  This in turn means that the SRCU
grace-period sequence number returned by rcu_seq_snap() cannot expire
during this time.  And that means that the calls to rcu_seq_done() in
srcu_funnel_exp_start() and srcu_funnel_gp_start() can never return true.

This commit therefore removes these rcu_seq_done() checks, but adds checks
in kernels built with CONFIG_PROVE_RCU=y that splats if rcu_seq_done()
does somehow return true.

[ paulmck: Rearrange checks to handle kernels built with lockdep. ]

Signed-off-by: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
To: rcu@vger.kernel.org
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Fix the comparision in srcu_invl_snp_seq()</title>
<updated>2023-01-04T01:49:22Z</updated>
<author>
<name>Pingfan Liu</name>
<email>kernelfans@gmail.com</email>
</author>
<published>2022-11-16T01:52:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=50be0c0439fc1d8bda733ff26f6526e49970857a'/>
<id>urn:sha1:50be0c0439fc1d8bda733ff26f6526e49970857a</id>
<content type='text'>
A grace-period sequence number contains two fields: counter and
state.  SRCU_SNP_INIT_SEQ provides a guaranteed invalid value for
grace-period sequence numbers in newly allocated srcu_node structures'
-&gt;srcu_have_cbs[] and -&gt;srcu_gp_seq_needed_exp fields.  The point of the
comparison in srcu_invl_snp_seq() is not to detect invalid grace-period
sequence numbers in general, but rather to detect a newly allocated
srcu_node structure whose -&gt;srcu_have_cbs[] and -&gt;srcu_gp_seq_needed_exp
fields need to be brought into line with the srcu_struct structure's
-&gt;srcu_gp_seq field.

This commit therefore causes srcu_invl_snp_seq() to compare both fields
of the specified grace-period sequence number.

Signed-off-by: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
Cc: &lt;rcu@vger.kernel.org&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Delegate work to the boot cpu if using SRCU_SIZE_SMALL</title>
<updated>2023-01-04T01:49:22Z</updated>
<author>
<name>Pingfan Liu</name>
<email>kernelfans@gmail.com</email>
</author>
<published>2022-10-31T01:52:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7f24626d6dd844bfc6d1f492d214d29c86d02550'/>
<id>urn:sha1:7f24626d6dd844bfc6d1f492d214d29c86d02550</id>
<content type='text'>
Commit 994f706872e6 ("srcu: Make Tree SRCU able to operate without
snp_node array") assumes that cpu 0 is always online.  However, there
really are situations when some other CPU is the boot CPU, for example,
when booting a kdump kernel with the maxcpus=1 boot parameter.

On PowerPC, the kdump kernel can hang as follows:
...
[    1.740036] systemd[1]: Hostname set to &lt;xyz.com&gt;
[  243.686240] INFO: task systemd:1 blocked for more than 122 seconds.
[  243.686264]       Not tainted 6.1.0-rc1 #1
[  243.686272] "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.686281] task:systemd         state:D stack:0     pid:1     ppid:0      flags:0x00042000
[  243.686296] Call Trace:
[  243.686301] [c000000016657640] [c000000016657670] 0xc000000016657670 (unreliable)
[  243.686317] [c000000016657830] [c00000001001dec0] __switch_to+0x130/0x220
[  243.686333] [c000000016657890] [c000000010f607b8] __schedule+0x1f8/0x580
[  243.686347] [c000000016657940] [c000000010f60bb4] schedule+0x74/0x140
[  243.686361] [c0000000166579b0] [c000000010f699b8] schedule_timeout+0x168/0x1c0
[  243.686374] [c000000016657a80] [c000000010f61de8] __wait_for_common+0x148/0x360
[  243.686387] [c000000016657b20] [c000000010176bb0] __flush_work.isra.0+0x1c0/0x3d0
[  243.686401] [c000000016657bb0] [c0000000105f2768] fsnotify_wait_marks_destroyed+0x28/0x40
[  243.686415] [c000000016657bd0] [c0000000105f21b8] fsnotify_destroy_group+0x68/0x160
[  243.686428] [c000000016657c40] [c0000000105f6500] inotify_release+0x30/0xa0
[  243.686440] [c000000016657cb0] [c0000000105751a8] __fput+0xc8/0x350
[  243.686452] [c000000016657d00] [c00000001017d524] task_work_run+0xe4/0x170
[  243.686464] [c000000016657d50] [c000000010020e94] do_notify_resume+0x134/0x140
[  243.686478] [c000000016657d80] [c00000001002eb18] interrupt_exit_user_prepare_main+0x198/0x270
[  243.686493] [c000000016657de0] [c00000001002ec60] syscall_exit_prepare+0x70/0x180
[  243.686505] [c000000016657e10] [c00000001000bf7c] system_call_vectored_common+0xfc/0x280
[  243.686520] --- interrupt: 3000 at 0x7fffa47d5ba4
[  243.686528] NIP:  00007fffa47d5ba4 LR: 0000000000000000 CTR: 0000000000000000
[  243.686538] REGS: c000000016657e80 TRAP: 3000   Not tainted  (6.1.0-rc1)
[  243.686548] MSR:  800000000000d033 &lt;SF,EE,PR,ME,IR,DR,RI,LE&gt;  CR: 42044440  XER: 00000000
[  243.686572] IRQMASK: 0
[  243.686572] GPR00: 0000000000000006 00007ffffa606710 00007fffa48e7200 0000000000000000
[  243.686572] GPR04: 0000000000000002 000000000000000a 0000000000000000 0000000000000001
[  243.686572] GPR08: 000001000c172dd0 0000000000000000 0000000000000000 0000000000000000
[  243.686572] GPR12: 0000000000000000 00007fffa4ff4bc0 0000000000000000 0000000000000000
[  243.686572] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  243.686572] GPR20: 0000000132dfdc50 000000000000000e 0000000000189375 0000000000000000
[  243.686572] GPR24: 00007ffffa606ae0 0000000000000005 000001000c185490 000001000c172570
[  243.686572] GPR28: 000001000c172990 000001000c184850 000001000c172e00 00007fffa4fedd98
[  243.686683] NIP [00007fffa47d5ba4] 0x7fffa47d5ba4
[  243.686691] LR [0000000000000000] 0x0
[  243.686698] --- interrupt: 3000
[  243.686708] INFO: task kworker/u16:1:24 blocked for more than 122 seconds.
[  243.686717]       Not tainted 6.1.0-rc1 #1
[  243.686724] "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  243.686733] task:kworker/u16:1   state:D stack:0     pid:24    ppid:2      flags:0x00000800
[  243.686747] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[  243.686758] Call Trace:
[  243.686762] [c0000000166736e0] [c00000004fd91000] 0xc00000004fd91000 (unreliable)
[  243.686775] [c0000000166738d0] [c00000001001dec0] __switch_to+0x130/0x220
[  243.686788] [c000000016673930] [c000000010f607b8] __schedule+0x1f8/0x580
[  243.686801] [c0000000166739e0] [c000000010f60bb4] schedule+0x74/0x140
[  243.686814] [c000000016673a50] [c000000010f699b8] schedule_timeout+0x168/0x1c0
[  243.686827] [c000000016673b20] [c000000010f61de8] __wait_for_common+0x148/0x360
[  243.686840] [c000000016673bc0] [c000000010210840] __synchronize_srcu.part.0+0xa0/0xe0
[  243.686855] [c000000016673c30] [c0000000105f2c64] fsnotify_mark_destroy_workfn+0xc4/0x1a0
[  243.686868] [c000000016673ca0] [c000000010174ea8] process_one_work+0x2a8/0x570
[  243.686882] [c000000016673d40] [c000000010175208] worker_thread+0x98/0x5e0
[  243.686895] [c000000016673dc0] [c0000000101828d4] kthread+0x124/0x130
[  243.686908] [c000000016673e10] [c00000001000cd40] ret_from_kernel_thread+0x5c/0x64
[  366.566274] INFO: task systemd:1 blocked for more than 245 seconds.
[  366.566298]       Not tainted 6.1.0-rc1 #1
[  366.566305] "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  366.566314] task:systemd         state:D stack:0     pid:1     ppid:0      flags:0x00042000
[  366.566329] Call Trace:
...

The above splat occurs because PowerPC really does use maxcpus=1
instead of nr_cpus=1 in the kernel command line.  Consequently, the
(quite possibly non-zero) kdump CPU is the only online CPU in the kdump
kernel.  SRCU unconditionally queues a sdp-&gt;work on cpu 0, for which no
worker thread has been created, so sdp-&gt;work will be never executed and
__synchronize_srcu() will never be completed.

This commit therefore replaces CPU ID 0 with get_boot_cpu_id() in key
places in Tree SRCU.  Since the CPU indicated by get_boot_cpu_id()
is guaranteed to be online, this avoids the above splat.

Signed-off-by: Pingfan Liu &lt;kernelfans@gmail.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Mathieu Desnoyers &lt;mathieu.desnoyers@efficios.com&gt;
To: rcu@vger.kernel.org
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Debug NMI safety even on archs that don't require it</title>
<updated>2022-10-21T17:44:11Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2022-10-13T17:22:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e29a4915db1480f96e0bc2e928699d086a71f43c'/>
<id>urn:sha1:e29a4915db1480f96e0bc2e928699d086a71f43c</id>
<content type='text'>
Currently the NMI safety debugging is only performed on architectures
that don't support NMI-safe this_cpu_inc().

Reorder the code so that other architectures like x86 also detect bad
uses.

[ paulmck: Apply kernel test robot, Stephen Rothwell, and Zqiang feedback. ]

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Explain the reason behind the read side critical section on GP start</title>
<updated>2022-10-21T17:16:15Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2022-10-13T17:22:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ae3c0706160b60ac5e7d36aac428ae6e572dc932'/>
<id>urn:sha1:ae3c0706160b60ac5e7d36aac428ae6e572dc932</id>
<content type='text'>
Tell about the need to protect against concurrent updaters who may
overflow the GP counter behind the current update.

Reported-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Warn when NMI-unsafe API is used in NMI</title>
<updated>2022-10-21T17:15:53Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2022-10-13T17:22:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6b77bb9b99c66c6596c58e7a25169bc2ea6b82dd'/>
<id>urn:sha1:6b77bb9b99c66c6596c58e7a25169bc2ea6b82dd</id>
<content type='text'>
Using the NMI-unsafe reader API from within an NMI handler is very likely
to be buggy for three reasons:

1) NMIs aren't strictly re-entrant (a pending nested NMI will execute at
   the end of the current one) so it should be fine to use a non-atomic
   increment here. However, breakpoints can still interrupt NMIs and if
   a breakpoint callback has a reader on that same ssp, a racy increment
   can happen.

2) If the only reader site for a given srcu_struct structure is in an
   NMI handler, then RCU should be used instead of SRCU.

3) Because of the previous reason (2), an srcu_struct structure having
   an SRCU read side critical section in an NMI handler is likely to
   have another one from a task context.

For all these reasons, warn if an NMI-unsafe reader API is used from an
NMI handler.

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Check for consistent global per-srcu_struct NMI safety</title>
<updated>2022-10-20T22:02:27Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-09-20T21:54:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=36f65f1d1553e35cd9e6b281271f40d639a128c3'/>
<id>urn:sha1:36f65f1d1553e35cd9e6b281271f40d639a128c3</id>
<content type='text'>
This commit adds runtime checks to verify that a given srcu_struct uses
consistent NMI-safe (or not) read-side primitives globally, but based
on the per-CPU data.  These global checks are made by the grace-period
code that must scan the srcu_data structures anyway, and are done only
in kernels built with CONFIG_PROVE_RCU=y.

Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: John Ogness &lt;john.ogness@linutronix.de&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
</content>
</entry>
<entry>
<title>srcu: Check for consistent per-CPU per-srcu_struct NMI safety</title>
<updated>2022-10-20T22:02:15Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-09-19T21:03:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=27120e7d2c4d5c438b76f9c6330037a52ad0722e'/>
<id>urn:sha1:27120e7d2c4d5c438b76f9c6330037a52ad0722e</id>
<content type='text'>
This commit adds runtime checks to verify that a given srcu_struct uses
consistent NMI-safe (or not) read-side primitives on a per-CPU basis.

Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/

Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: John Ogness &lt;john.ogness@linutronix.de&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
</content>
</entry>
</feed>
