<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/rcu, branch v6.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-01-24T17:16:17Z</updated>
<entry>
<title>rcu: Defer RCU kthreads wakeup when CPU is dying</title>
<updated>2024-01-24T17:16:17Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2023-12-18T23:19:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e787644caf7628ad3269c1fbd321c3255cf51710'/>
<id>urn:sha1:e787644caf7628ad3269c1fbd321c3255cf51710</id>
<content type='text'>
When the CPU goes idle for the last time during the CPU down hotplug
process, RCU reports a final quiescent state for the current CPU. If
this quiescent state propagates up to the top, some tasks may then be
woken up to complete the grace period: the main grace period kthread
and/or the expedited main workqueue (or kworker).

If those kthreads have a SCHED_FIFO policy, the wake up can indirectly
arm the RT bandwith timer to the local offline CPU. Since this happens
after hrtimers have been migrated at CPUHP_AP_HRTIMERS_DYING stage, the
timer gets ignored. Therefore if the RCU kthreads are waiting for RT
bandwidth to be available, they may never be actually scheduled.

This triggers TREE03 rcutorture hangs:

	 rcu: INFO: rcu_preempt self-detected stall on CPU
	 rcu:     4-...!: (1 GPs behind) idle=9874/1/0x4000000000000000 softirq=0/0 fqs=20 rcuc=21071 jiffies(starved)
	 rcu:     (t=21035 jiffies g=938281 q=40787 ncpus=6)
	 rcu: rcu_preempt kthread starved for 20964 jiffies! g938281 f0x0 RCU_GP_WAIT_FQS(5) -&gt;state=0x0 -&gt;cpu=0
	 rcu:     Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
	 rcu: RCU grace-period kthread stack dump:
	 task:rcu_preempt     state:R  running task     stack:14896 pid:14    tgid:14    ppid:2      flags:0x00004000
	 Call Trace:
	  &lt;TASK&gt;
	  __schedule+0x2eb/0xa80
	  schedule+0x1f/0x90
	  schedule_timeout+0x163/0x270
	  ? __pfx_process_timeout+0x10/0x10
	  rcu_gp_fqs_loop+0x37c/0x5b0
	  ? __pfx_rcu_gp_kthread+0x10/0x10
	  rcu_gp_kthread+0x17c/0x200
	  kthread+0xde/0x110
	  ? __pfx_kthread+0x10/0x10
	  ret_from_fork+0x2b/0x40
	  ? __pfx_kthread+0x10/0x10
	  ret_from_fork_asm+0x1b/0x30
	  &lt;/TASK&gt;

The situation can't be solved with just unpinning the timer. The hrtimer
infrastructure and the nohz heuristics involved in finding the best
remote target for an unpinned timer would then also need to handle
enqueues from an offline CPU in the most horrendous way.

So fix this on the RCU side instead and defer the wake up to an online
CPU if it's too late for the local one.

Reported-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Fixes: 5c0930ccaad5 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>Merge branches 'doc.2023.12.13a', 'torture.2023.11.23a', 'fixes.2023.12.13a', 'rcu-tasks.2023.12.12b' and 'srcu.2023.12.13a' into rcu-merge.2023.12.13a</title>
<updated>2023-12-13T19:51:31Z</updated>
<author>
<name>Neeraj Upadhyay (AMD)</name>
<email>neeraj.iitr10@gmail.com</email>
</author>
<published>2023-12-13T19:51:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7dfb03dd24d43b9e7a725e70d2e8a83bb29df294'/>
<id>urn:sha1:7dfb03dd24d43b9e7a725e70d2e8a83bb29df294</id>
<content type='text'>
</content>
</entry>
<entry>
<title>rcu: Force quiescent states only for ongoing grace period</title>
<updated>2023-12-13T19:49:02Z</updated>
<author>
<name>Zqiang</name>
<email>qiang.zhang1211@gmail.com</email>
</author>
<published>2023-11-01T03:35:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dee39c0c1e9624f925da4ca0bece46bdc7427257'/>
<id>urn:sha1:dee39c0c1e9624f925da4ca0bece46bdc7427257</id>
<content type='text'>
If an rcutorture test scenario creates an fqs_task kthread, it will
periodically invoke rcu_force_quiescent_state() in order to start
force-quiescent-state (FQS) operations.  However, an FQS operation
will be started even if there is no RCU grace period in progress.
Although testing FQS operations startup when there is no grace period in
progress is necessary, it need not happen all that often.  This commit
therefore causes rcu_force_quiescent_state() to take an early exit
if there is no grace period in progress.

Note that there will still be attempts to start an FQS scan in the
absence of a grace period because the grace period might end right
after the rcu_force_quiescent_state() function's check.  In actual
testing, this happens about once every ten minutes, which should
provide adequate testing.

Signed-off-by: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>srcu: Explain why callbacks invocations can't run concurrently</title>
<updated>2023-12-11T21:11:17Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2023-10-03T23:29:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c21357e4461f3f9c8ff93302906b5372411ee108'/>
<id>urn:sha1:c21357e4461f3f9c8ff93302906b5372411ee108</id>
<content type='text'>
If an SRCU barrier is queued while callbacks are running and a new
callbacks invocator for the same sdp were to run concurrently, the
RCU barrier might execute too early. As this requirement is non-obvious,
make sure to keep a record.

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>srcu: No need to advance/accelerate if no callback enqueued</title>
<updated>2023-12-11T21:11:16Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2023-10-03T23:29:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94c55b9e21979daa88e190bf971c47432a818ebe'/>
<id>urn:sha1:94c55b9e21979daa88e190bf971c47432a818ebe</id>
<content type='text'>
While in grace period start, there is nothing to accelerate and
therefore no need to advance the callbacks either if no callback is
to be enqueued.

Spare these needless operations in this case.

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>srcu: Remove superfluous callbacks advancing from srcu_gp_start()</title>
<updated>2023-12-11T21:10:45Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2023-10-03T23:29:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=20eb4142397cf3ec221de43f10ea149af462c572'/>
<id>urn:sha1:20eb4142397cf3ec221de43f10ea149af462c572</id>
<content type='text'>
Callbacks advancing on SRCU must be performed on two specific places:

1) On enqueue time in order to make room for the acceleration of the
   new callback.

2) On invocation time in order to move the callbacks ready to invoke.

Any other callback advancing callsite is needless. Remove the remaining
one in srcu_gp_start().

Co-developed-by: Yong He &lt;zhuangel570@gmail.com&gt;
Signed-off-by: Yong He &lt;zhuangel570@gmail.com&gt;
Co-developed-by: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Signed-off-by: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Co-developed-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>rcu: Restrict access to RCU CPU stall notifiers</title>
<updated>2023-12-11T21:01:22Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-11-02T01:28:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e58aaeebb3c27993c734c99eae6881b196b1ddb'/>
<id>urn:sha1:4e58aaeebb3c27993c734c99eae6881b196b1ddb</id>
<content type='text'>
Although the RCU CPU stall notifiers can be useful for dumping state when
tracking down delicate forward-progress bugs where NUMA effects cause
cache lines to be delivered to a given CPU regularly, but always in a
state that prevents that CPU from making forward progress.  These bugs can
be detected by the RCU CPU stall-warning mechanism, but in some cases,
the stall-warnings printk()s disrupt the forward-progress bug before
any useful state can be obtained.

Unfortunately, the notifier mechanism added by commit 5b404fdabacf ("rcu:
Add RCU CPU stall notifier") can make matters worse if used at all
carelessly. For example, if the stall warning was caused by a lock not
being released, then any attempt to acquire that lock in the notifier
will hang. This will prevent not only the notifier from producing any
useful output, but it will also prevent the stall-warning message from
ever appearing.

This commit therefore hides this new RCU CPU stall notifier
mechanism under a new RCU_CPU_STALL_NOTIFIER Kconfig option that
depends on both DEBUG_KERNEL and RCU_EXPERT.  In addition, the
rcupdate.rcu_cpu_stall_notifiers=1 kernel boot parameter must also
be specified.  The RCU_CPU_STALL_NOTIFIER Kconfig option's help text
contains a warning and explains the dangers of careless use, recommending
lockless notifier code.  In addition, a WARN() is triggered each time
that an attempt is made to register a stall-warning notifier in kernels
built with CONFIG_RCU_CPU_STALL_NOTIFIER=y.

This combination of measures will keep use of this mechanism confined to
debug kernels and away from routine deployments.

[ paulmck: Apply Dan Carpenter feedback. ]

Fixes: 5b404fdabacf ("rcu: Add RCU CPU stall notifier")
Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>rcu-tasks: Mark RCU Tasks accesses to current-&gt;rcu_tasks_idle_cpu</title>
<updated>2023-12-11T20:52:47Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2023-10-11T16:45:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=18966f7b9458d3b19412fe9dfb421ab59401bfe1'/>
<id>urn:sha1:18966f7b9458d3b19412fe9dfb421ab59401bfe1</id>
<content type='text'>
The task_struct structure's -&gt;rcu_tasks_idle_cpu can be concurrently
read and written from the RCU Tasks grace-period kthread and from the
CPU on which the task_struct structure's task is running.  This commit
therefore marks the accesses appropriately.

Reported-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>rcutorture: Add fqs_holdoff check before fqs_task is created</title>
<updated>2023-11-23T06:28:18Z</updated>
<author>
<name>Zqiang</name>
<email>qiang.zhang1211@gmail.com</email>
</author>
<published>2023-11-03T07:26:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=90f1015dfee3d33f8ca7bfe03296d100d465e385'/>
<id>urn:sha1:90f1015dfee3d33f8ca7bfe03296d100d465e385</id>
<content type='text'>
For rcutorture tests on RCU implementations that support
force-quiescent-state operations and that set the fqs_duration module
parameter greater than zero, the fqs_task kthread will be created.
However, if the fqs_holdoff module parameter is not set, then its default
value of zero will cause fqs_task enter a long-term busy loop until
stopped by kthread_stop().  This commit therefore adds a fqs_holdoff
check before the fqs_task is created, making sure that whenever the
fqs_task is created, the fqs_holdoff will be greater than zero.

Signed-off-by: Zqiang &lt;qiang.zhang1211@gmail.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.iitr10@gmail.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'rcu-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks</title>
<updated>2023-11-08T17:47:52Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-08T17:47:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=90450a06162e6c71ab813ea22a83196fe7cff4bc'/>
<id>urn:sha1:90450a06162e6c71ab813ea22a83196fe7cff4bc</id>
<content type='text'>
Pull RCU fixes from Frederic Weisbecker:

 - Fix a lock inversion between scheduler and RCU introduced in
   v6.2-rc4. The scenario could trigger on any user of RCU_NOCB
   (mostly Android but also nohz_full)

 - Fix PF_IDLE semantic changes introduced in v6.6-rc3 breaking
   some RCU-Tasks and RCU-Tasks-Trace expectations as to what
   exactly is an idle task. This resulted in potential spurious
   stalls and warnings.

* tag 'rcu-fixes-v6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks:
  rcu/tasks-trace: Handle new PF_IDLE semantics
  rcu/tasks: Handle new PF_IDLE semantics
  rcu: Introduce rcu_cpu_online()
  rcu: Break rcu_node_0 --&gt; &amp;rq-&gt;__lock order
</content>
</entry>
</feed>
