<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/rcu/srcutree.c, branch v6.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-07-19T18:39:59Z</updated>
<entry>
<title>srcu: Make expedited RCU grace periods block even less frequently</title>
<updated>2022-07-19T18:39:59Z</updated>
<author>
<name>Neeraj Upadhyay</name>
<email>quic_neeraju@quicinc.com</email>
</author>
<published>2022-07-01T03:15:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4f2bfd9494a072d58203600de6bedd72680e612a'/>
<id>urn:sha1:4f2bfd9494a072d58203600de6bedd72680e612a</id>
<content type='text'>
The purpose of commit 282d8998e997 ("srcu: Prevent expedited GPs
and blocking readers from consuming CPU") was to prevent a long
series of never-blocking expedited SRCU grace periods from blocking
kernel-live-patching (KLP) progress.  Although it was successful, it also
resulted in excessive boot times on certain embedded workloads running
under qemu with the "-bios QEMU_EFI.fd" command line.  Here "excessive"
means increasing the boot time up into the three-to-four minute range.
This increase in boot time was due to the more than 6000 back-to-back
invocations of synchronize_srcu_expedited() within the KVM host OS, which
in turn resulted from qemu's emulation of a long series of MMIO accesses.

Commit 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace
periods") did not significantly help this particular use case.

Zhangfei Gao and Shameerali Kolothum Thodi did experiments varying the
value of SRCU_MAX_NODELAY_PHASE with HZ=250 and with various values
of non-sleeping per-phase counts on a system with preemption enabled,
and observed the following boot times:

+──────────────────────────+────────────────+
| SRCU_MAX_NODELAY_PHASE   | Boot time (s)  |
+──────────────────────────+────────────────+
| 100                      | 30.053         |
| 150                      | 25.151         |
| 200                      | 20.704         |
| 250                      | 15.748         |
| 500                      | 11.401         |
| 1000                     | 11.443         |
| 10000                    | 11.258         |
| 1000000                  | 11.154         |
+──────────────────────────+────────────────+

Analysis of the experimental results shows additional improvements with
CPU-bound delays approaching one jiffy in duration. This improvement was
also seen when the number of per-phase iterations was scaled to one jiffy.

This commit therefore scales the per-grace-period-phase number of
non-sleeping polls so that they extend for about one jiffy. In addition,
the delay-calculation call to srcu_get_delay() in srcu_gp_end() is
replaced with a simple check for an expedited grace period.  This change
schedules callback invocation immediately after expedited grace periods
complete, which results in greatly improved boot times.  Testing done
by Marc and Zhangfei confirms that this change recovers most of the
boot-time performance degradation; for the CONFIG_HZ_250 configuration,
specifically, boot times improve from 3m50s to 41s on Marc's setup;
and from 2m40s to ~9.7s on Zhangfei's setup.
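
As a rough illustration of the scaling described above, the following
Python sketch computes how many non-sleeping polls fit in about one
jiffy.  The 5-microsecond retry delay and the clamp bounds of 3 and 1000
are assumptions made for this example and are not stated in this commit
log; the function name is hypothetical.

```python
USEC_PER_SEC = 1_000_000


def nodelay_polls_per_phase(hz, retry_delay_us=5, low=3, high=1000):
    """Number of back-to-back polls that together span about one jiffy.

    retry_delay_us, low, and high are assumed values for illustration.
    """
    jiffy_us = USEC_PER_SEC // hz
    polls = jiffy_us // retry_delay_us
    # Clamp so the count stays within the assumed bounds.
    return max(low, min(polls, high))


for hz in (100, 250, 1000):
    print(hz, nodelay_polls_per_phase(hz))
```

With HZ=250, one jiffy is 4000 microseconds, so under these assumptions
about 800 polls fit before the first sleep.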

In addition to the changes to the default per-phase delays, this
change adds three new kernel parameters: srcutree.srcu_max_nodelay,
srcutree.srcu_max_nodelay_phase, and srcutree.srcu_retry_check_delay.
These allow users to configure the SRCU grace-period scanning delays in
order to more quickly react to additional use cases.

Fixes: 640a7d37c3f4 ("srcu: Block less aggressively for expedited grace periods")
Fixes: 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Reported-by: Zhangfei Gao &lt;zhangfei.gao@linaro.org&gt;
Reported-by: yueluck &lt;yueluck@163.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Tested-by: Marc Zyngier &lt;maz@kernel.org&gt;
Tested-by: Zhangfei Gao &lt;zhangfei.gao@linaro.org&gt;
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Block less aggressively for expedited grace periods</title>
<updated>2022-07-19T18:39:59Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-06-12T22:00:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8f870e6eb8c0c3f9869bf3fcf9db39f86cfcea49'/>
<id>urn:sha1:8f870e6eb8c0c3f9869bf3fcf9db39f86cfcea49</id>
<content type='text'>
Commit 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers
from consuming CPU") fixed a problem where a long-running expedited SRCU
grace period could block kernel live patching.  It did so by giving up
on expediting once a given SRCU expedited grace period grew too old.

Unfortunately, this added excessive delays to boots of virtual embedded
systems specifying "-bios QEMU_EFI.fd" to qemu.  This commit therefore
makes the transition away from expediting less aggressive, increasing
the per-grace-period phase number of non-sleeping polls of readers from
one to three and increasing the required grace-period age from one jiffy
(actually from zero to one jiffies) to two jiffies (actually from one
to two jiffies).

Fixes: 282d8998e997 ("srcu: Prevent expedited GPs and blocking readers from consuming CPU")
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Reported-by: Zhangfei Gao &lt;zhangfei.gao@linaro.org&gt;
Reported-by: "chenxiang (M)" &lt;chenxiang66@hisilicon.com&gt;
Cc: Shameerali Kolothum Thodi &lt;shameerali.kolothum.thodi@huawei.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Reviewed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Link: https://lore.kernel.org/all/20615615-0013-5adc-584f-2b1d5c03ebfc@linaro.org/
</content>
</entry>
<entry>
<title>srcu: Drop needless initialization of sdp in srcu_gp_start()</title>
<updated>2022-05-03T17:20:57Z</updated>
<author>
<name>Lukas Bulwahn</name>
<email>lukas.bulwahn@gmail.com</email>
</author>
<published>2022-03-15T08:55:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=586e31d59c436cda65a2e8ac04ff954bed247023'/>
<id>urn:sha1:586e31d59c436cda65a2e8ac04ff954bed247023</id>
<content type='text'>
Commit 9c7ef4c30f12 ("srcu: Make Tree SRCU able to operate without
snp_node array") initializes the local variable sdp differently depending
on the srcu's state in srcu_gp_start().  Either way, this initialization
overwrites the value used when sdp is defined.

This commit therefore drops this pointless definition-time initialization.
Although there is no functional change, compiler code generation may
be affected.

Signed-off-by: Lukas Bulwahn &lt;lukas.bulwahn@gmail.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Prevent expedited GPs and blocking readers from consuming CPU</title>
<updated>2022-05-03T17:20:57Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-03-08T23:45:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=282d8998e9979c2186af7f7d22366f2fc3149838'/>
<id>urn:sha1:282d8998e9979c2186af7f7d22366f2fc3149838</id>
<content type='text'>
If an SRCU reader blocks while a synchronize_srcu_expedited() waits for
that same reader, then that grace period will spawn an endless series of
workqueue handlers, consuming a full CPU.  This quickly gets pointless
because consuming more CPU isn't going to make that reader get done
faster, especially if it is blocked waiting for an external event.

This commit therefore spawns at most one pair of back-to-back workqueue
handlers per expedited grace period phase, instead inserting increasing
delays as that grace period phase grows older, but capped at 10 jiffies.
In any case, if there have been at least 100 back-to-back workqueue
handlers within a single jiffy, regardless of grace period or grace-period
phase, then a one-jiffy delay is inserted.
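
The policy described above might be sketched in Python as follows.  The
limits of 2, 10, and 100 come from the text; the linear growth of the
delay with phase age, the minimum delay of one jiffy once the pair is
used up, and all names are assumptions of this sketch rather than the
kernel's actual code.

```python
MAX_DELAY_JIFFIES = 10       # cap stated in the commit message
MAX_NODELAY_PER_JIFFY = 100  # back-to-back limit stated in the commit message
NODELAY_PER_PHASE = 2        # "one pair" of back-to-back handlers per phase


def requeue_delay(phase_age_jiffies, nodelay_in_phase, nodelay_in_jiffy):
    """Return the delay, in jiffies, before the next workqueue handler runs.

    Linear growth with phase age is an assumption of this sketch.
    """
    if nodelay_in_phase >= NODELAY_PER_PHASE:
        # The phase has used up its back-to-back pair: back off, capped.
        delay = min(max(phase_age_jiffies, 1), MAX_DELAY_JIFFIES)
    else:
        delay = 0
    if delay == 0 and nodelay_in_jiffy >= MAX_NODELAY_PER_JIFFY:
        # Global throttle, regardless of grace period or phase.
        delay = 1
    return delay
```

For example, a phase that is 25 jiffies old and has already run its
pair of back-to-back handlers is requeued with the capped delay of 10
jiffies.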

[ paulmck:  Apply feedback from kernel test robot. ]

Cc: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Reported-by: Song Liu &lt;song@kernel.org&gt;
Tested-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Add contention check to call_srcu() srcu_data -&gt;lock acquisition</title>
<updated>2022-05-03T17:20:57Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-01-31T21:27:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c2445d38785086422e56dcbe049b73a53b2ba81f'/>
<id>urn:sha1:c2445d38785086422e56dcbe049b73a53b2ba81f</id>
<content type='text'>
This commit increases the sensitivity of contention detection by adding
checks to the acquisition of the srcu_data structure's lock on the
call_srcu() code path.

Co-developed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Automatically determine size-transition strategy at boot</title>
<updated>2022-05-03T17:19:39Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-01-31T19:21:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a57ffb3c6b67e59e8632f731414b792eacc6cca0'/>
<id>urn:sha1:a57ffb3c6b67e59e8632f731414b792eacc6cca0</id>
<content type='text'>
This commit adds a srcutree.convert_to_big option of 0x03 that causes
SRCU to decide at boot whether to wait for contention (small systems) or
immediately expand to large (large systems).  A new srcutree.big_cpu_lim
(defaulting to 128) defines how many CPUs constitute a large system.
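
A minimal Python sketch of this boot-time decision follows.  The
default of 128 comes from the text; the strategy names and the choice
of a greater-than-or-equal comparison against the limit are assumptions
made for illustration.

```python
BIG_CPU_LIM = 128  # default of srcutree.big_cpu_lim per the commit message


def size_strategy_at_boot(nr_cpus, big_cpu_lim=BIG_CPU_LIM):
    """Pick a strategy name; names and comparison are assumptions."""
    if nr_cpus >= big_cpu_lim:
        # Large system: expand to the big form right away.
        return "expand-at-init"
    # Small system: stay small until contention is observed.
    return "wait-for-contention"


print(size_strategy_at_boot(8))
print(size_strategy_at_boot(256))
```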

Co-developed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Add contention-triggered addition of srcu_node tree</title>
<updated>2022-04-11T22:52:30Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-01-28T04:32:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9f2e91d94c91558e3764fe4e01c5e6281a90f239'/>
<id>urn:sha1:9f2e91d94c91558e3764fe4e01c5e6281a90f239</id>
<content type='text'>
This commit instruments the acquisitions of the srcu_struct structure's
-&gt;lock, enabling the initiation of a transition from SRCU_SIZE_SMALL
to SRCU_SIZE_BIG when sufficient contention is experienced.  The
instrumentation counts the number of trylock failures within the confines
of a single jiffy.  If that number exceeds the value specified by the
srcutree.small_contention_lim kernel boot parameter (which defaults to
100), and if the value specified by the srcutree.convert_to_big kernel
boot parameter has the 0x10 bit set (defaults to 0), then a transition
will be automatically initiated.

By default, there will never be any transitions, so that none of the
srcu_struct structures ever gains an srcu_node array.

The useful values for srcutree.convert_to_big are:

0x00:  Never convert.
0x01:  Always convert at init_srcu_struct() time.
0x02:  Convert when rcutorture prints its first round of statistics.
0x03:  Decide conversion approach at boot given system size.
0x10:  Convert if contention is encountered.
0x12:  Convert if contention is encountered or when rcutorture prints
        its first round of statistics, whichever comes first.

The value 0x11 acts the same as 0x01 because the conversion happens
before there is any chance of contention.
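
The per-jiffy counting described above can be sketched in Python as
shown below.  The limit of 100 and the 0x10 bit come from the text; the
class, its method, and the way the current jiffy is passed in are
hypothetical and are not the kernel's implementation.

```python
SIZING_CONTEND = 0x10  # contention bit from the list of values above


class ContentionTracker:
    """Counts trylock failures seen within the current jiffy."""

    def __init__(self, convert_to_big=0x00, small_contention_lim=100):
        self.convert_to_big = convert_to_big
        self.limit = small_contention_lim
        self.jiffy = None
        self.failures = 0

    def trylock_failed(self, now_jiffies):
        """Record one failure; return True if a transition should start."""
        if now_jiffies != self.jiffy:
            # New jiffy: restart the count.
            self.jiffy = now_jiffies
            self.failures = 0
        self.failures += 1
        # OR-ing the bit in leaves the value unchanged only if it is set.
        bit_set = (self.convert_to_big | SIZING_CONTEND) == self.convert_to_big
        return bit_set and self.failures > self.limit


tracker = ContentionTracker(convert_to_big=0x10)
results = [tracker.trylock_failed(7) for _ in range(101)]
print(results[99], results[100])
```

With the default convert_to_big of zero the tracker never requests a
transition, which matches the default behavior stated above.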

[ paulmck: Apply "static" feedback from kernel test robot. ]

Co-developed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Create concurrency-safe helper for initiating size transition</title>
<updated>2022-04-11T22:52:30Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-01-27T22:56:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=99659f64b14e55cfa48980f5396f83820bafd028'/>
<id>urn:sha1:99659f64b14e55cfa48980f5396f83820bafd028</id>
<content type='text'>
Once there are contention-initiated size transitions, it will be
possible for rcutorture to initiate a transition at the same time
as a contention-initiated transition.  This commit therefore creates
a concurrency-safe helper function named srcu_transition_to_big() to
safely initiate size transitions.

Co-developed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Explain srcu_funnel_gp_start() call to list_add() is safe</title>
<updated>2022-04-11T22:52:30Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-01-27T21:47:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ee5e2448bceb9400aa27207f0c0220f9dedd85eb'/>
<id>urn:sha1:ee5e2448bceb9400aa27207f0c0220f9dedd85eb</id>
<content type='text'>
This commit adds a comment explaining why an unprotected call to
list_add() from srcu_funnel_gp_start() can be safe.  TL;DR: It is only
called during very early boot when we don't have no steeking concurrency!

Co-developed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>srcu: Prevent cleanup_srcu_struct() from freeing non-dynamic -&gt;sda</title>
<updated>2022-04-11T22:52:30Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2022-01-27T21:20:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=46470cf85d2b61abd37c6f66c4dacc1bc510d10f'/>
<id>urn:sha1:46470cf85d2b61abd37c6f66c4dacc1bc510d10f</id>
<content type='text'>
When an srcu_struct structure is created (but not in a kernel module)
by DEFINE_SRCU() and friends, the per-CPU srcu_data structure is
statically allocated.  In all other cases, that structure is obtained
from alloc_percpu(), in which case cleanup_srcu_struct() must invoke
free_percpu() on the resulting -&gt;sda pointer in the srcu_struct structure.

Which it does.

Except that it also invokes free_percpu() on the -&gt;sda pointer
referencing the statically allocated per-CPU srcu_data structures.
Which free_percpu() is surprisingly OK with.

This commit nevertheless stops cleanup_srcu_struct() from freeing
statically allocated per-CPU srcu_data structures.

Co-developed-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Neeraj Upadhyay &lt;quic_neeraju@quicinc.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
</feed>
