<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/asm-generic/percpu.h, branch v3.16</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.16</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.16'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2014-04-07T23:36:13Z</updated>
<entry>
<title>percpu: add raw_cpu_ops</title>
<updated>2014-04-07T23:36:13Z</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux.com</email>
</author>
<published>2014-04-07T22:39:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b3ca1c10d7b32fdfdfaf5484eda486323f52d9be'/>
<id>urn:sha1:b3ca1c10d7b32fdfdfaf5484eda486323f52d9be</id>
<content type='text'>
The kernel has never been audited to ensure that this_cpu operations are
consistently used throughout the kernel.  The code generated in many
places can be improved through the use of this_cpu operations (which
uses a segment register for relocation of per cpu offsets instead of
performing address calculations).

The patch set also addresses various consistency issues in general with
the per cpu macros.

A. The semantics of __this_cpu_ptr() differs from this_cpu_ptr only
   because checks are skipped. This is typically shown through a raw_
   prefix. So this patch set changes the places where __this_cpu_ptr()
   is used to raw_cpu_ptr().

B. There has been the long term wish by some that __this_cpu operations
   would check for preemption. However, there are cases where preemption
   checks need to be skipped. This patch set adds raw_cpu operations that
   do not check for preemption and then adds preemption checks to the
   __this_cpu operations.

C. The use of __get_cpu_var is always a reference to a percpu variable
   that can also be handled via a this_cpu operation. This patch set
   replaces all uses of __get_cpu_var with this_cpu operations.

D. We can then use this_cpu RMW operations in various places replacing
   sequences of instructions by a single one.

E. The use of this_cpu operations throughout will allow other arches than
   x86 to implement optimized references and RMV operations to work with
   per cpu local data.

F. The use of this_cpu operations opens up the possibility to
   further optimize code that relies on synchronization through
   per cpu data.

The patch set works in a couple of stages:

I. Patch 1 adds the additional raw_cpu operations and raw_cpu_ptr().
    Also converts the existing __this_cpu_xx_# primitive in the x86
    code to raw_cpu_xx_#.

II. Patch 2-4 use the raw_cpu operations in places that would give
     us false positives once they are enabled.

III. Patch 5 adds preemption checks to __this_cpu operations to allow
    checking if preemption is properly disabled when these functions
    are used.

IV. Patches 6-20 are patches that simply replace uses of __get_cpu_var
   with this_cpu_ptr. They do not depend on any changes to the percpu
   code. No preemption tests are skipped if they are applied.

V. Patches 21-46 are conversion patches that use this_cpu operations
   in various kernel subsystems/drivers or arch code.

VI.  Patches 47/48 (not included in this series) remove no longer used
    functions (__this_cpu_ptr and __get_cpu_var).  These should only be
    applied after all the conversion patches have made it and after we
    have done additional passes through the kernel to ensure that none of
    the uses of these functions remain.

This patch (of 46):

The patches following this one will add preemption checks to __this_cpu
ops so we need to have an alternative way to use this_cpu operations
without preemption checks.

raw_cpu_ops will be the basis for all other ops since these will be the
operations that do not implement any checks.

Primitive operations are renamed by this patch from __this_cpu_xxx to
raw_cpu_xxxx.

Also change the uses of the x86 percpu primitives in preempt.h.
These depend directly on asm/percpu.h (header #include nesting issue).

Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: "James E.J. Bottomley" &lt;jejb@parisc-linux.org&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Alex Shi &lt;alex.shi@intel.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Bryan Wu &lt;cooloney@gmail.com&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@tilera.com&gt;
Cc: Daniel Lezcano &lt;daniel.lezcano@linaro.org&gt;
Cc: David Daney &lt;david.daney@cavium.com&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Dimitri Sivanich &lt;sivanich@sgi.com&gt;
Cc: Dipankar Sarma &lt;dipankar@in.ibm.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Fenghua Yu &lt;fenghua.yu@intel.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: H. Peter Anvin &lt;hpa@linux.intel.com&gt;
Cc: Haavard Skinnemoen &lt;hskinnemoen@gmail.com&gt;
Cc: Hans-Christian Egtvedt &lt;egtvedt@samfundet.no&gt;
Cc: Hedi Berriche &lt;hedi@sgi.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: James Hogan &lt;james.hogan@imgtec.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Matt Turner &lt;mattst88@gmail.com&gt;
Cc: Mike Frysinger &lt;vapier@gentoo.org&gt;
Cc: Mike Travis &lt;travis@sgi.com&gt;
Cc: Neil Brown &lt;neilb@suse.de&gt;
Cc: Nicolas Pitre &lt;nicolas.pitre@linaro.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Paul Mundt &lt;lethal@linux-sh.org&gt;
Cc: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: Robert Richter &lt;rric@kernel.org&gt;
Cc: Russell King &lt;linux@arm.linux.org.uk&gt;
Cc: Russell King &lt;rmk+kernel@arm.linux.org.uk&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Wim Van Sebroeck &lt;wim@iguana.be&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>percpu: Optimize __get_cpu_var()</title>
<updated>2010-09-10T08:56:51Z</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2010-09-09T16:17:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=677243d7494d09bfa782425f063a6013de53c35b'/>
<id>urn:sha1:677243d7494d09bfa782425f063a6013de53c35b</id>
<content type='text'>
Redefine __get_cpu_var() using this_cpu_ptr() which can be
arch-optimized.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86, percpu: Optimize this_cpu_ptr</title>
<updated>2010-09-10T08:56:47Z</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2010-09-09T16:17:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=db7829c6cc32f3c0c9a324118d743acb1abff081'/>
<id>urn:sha1:db7829c6cc32f3c0c9a324118d743acb1abff081</id>
<content type='text'>
Allow arches to implement __this_cpu_ptr, and provide an x86 version.

Before:
	movq $foo, %rax
	movq %gs:this_cpu_off, %rdx
	addq %rdx, %rax

After:
	movq $foo, %rax
	addq %gs:this_cpu_off, %rax

The benefit is doing it in one less instruction and not clobbering
a temporary register.

tj: * Beefed up the comment a bit and renamed in-macro temp variable
      to match neighboring macros.

    * Folded fix for const pointer case found in linux-next.

    * Fixed sparse notation.

Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>percpu: handle __percpu notations in UP accessors</title>
<updated>2010-08-07T12:20:53Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2010-08-06T18:26:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=18cb2aef91b37dbce2bec2f39bb1dddd0e9dd838'/>
<id>urn:sha1:18cb2aef91b37dbce2bec2f39bb1dddd0e9dd838</id>
<content type='text'>
UP accessors didn't take care of __percpu notations leading to a lot
of spurious sparse warnings on UP configurations.  Fix it.

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-35' of git://repo.or.cz/linux-kbuild</title>
<updated>2010-06-01T15:55:52Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2010-06-01T15:55:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1f73897861b8ef0be64ff4b801f8d6f830f683b5'/>
<id>urn:sha1:1f73897861b8ef0be64ff4b801f8d6f830f683b5</id>
<content type='text'>
* 'for-35' of git://repo.or.cz/linux-kbuild: (81 commits)
  kbuild: Revert part of e8d400a to resolve a conflict
  kbuild: Fix checking of scm-identifier variable
  gconfig: add support to show hidden options that have prompts
  menuconfig: add support to show hidden options which have prompts
  gconfig: remove show_debug option
  gconfig: remove dbg_print_ptype() and dbg_print_stype()
  kconfig: fix zconfdump()
  kconfig: some small fixes
  add random binaries to .gitignore
  kbuild: Include gen_initramfs_list.sh and the file list in the .d file
  kconfig: recalc symbol value before showing search results
  .gitignore: ignore *.lzo files
  headerdep: perlcritic warning
  scripts/Makefile.lib: Align the output of LZO
  kbuild: Generate modules.builtin in make modules_install
  Revert "kbuild: specify absolute paths for cscope"
  kbuild: Do not unnecessarily regenerate modules.builtin
  headers_install: use local file handles
  headers_check: fix perl warnings
  export_report: fix perl warnings
  ...
</content>
</entry>
<entry>
<title>Rename .data[.percpu][.XXX] to .data[..percpu][..XXX].</title>
<updated>2010-03-03T10:26:00Z</updated>
<author>
<name>Denys Vlasenko</name>
<email>vda.linux@googlemail.com</email>
</author>
<published>2010-02-20T00:03:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3d9a854c2dac3e888393b23ba7adafcce4d6d4b9'/>
<id>urn:sha1:3d9a854c2dac3e888393b23ba7adafcce4d6d4b9</id>
<content type='text'>
Signed-off-by: Denys Vlasenko &lt;vda.linux@googlemail.com&gt;
Signed-off-by: Michal Marek &lt;mmarek@suse.cz&gt;
</content>
</entry>
<entry>
<title>percpu: make accessors check for percpu pointer in sparse</title>
<updated>2009-10-29T13:34:15Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2009-10-29T13:34:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=545695fb41da117928ab946067a42d9e15fd009d'/>
<id>urn:sha1:545695fb41da117928ab946067a42d9e15fd009d</id>
<content type='text'>
The previous patch made sparse warn about percpu variables being used
directly without going through percpu accessors.  This patch
implements the other half - checking whether non percpu variable is
passed into percpu accessors.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>percpu: add __percpu for sparse.</title>
<updated>2009-10-29T13:34:15Z</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2009-10-29T13:34:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e0fdb0e050eae331046385643618f12452aa7e73'/>
<id>urn:sha1:e0fdb0e050eae331046385643618f12452aa7e73</id>
<content type='text'>
We have to make __kernel "__attribute__((address_space(0)))" so we can
cast to it.

tj: * put_cpu_var() update.

    * Annotations added to dynamic allocator interface.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>percpu: remove per_cpu__ prefix.</title>
<updated>2009-10-29T13:34:15Z</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2009-10-29T13:34:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dd17c8f72993f9461e9c19250e3f155d6d99df22'/>
<id>urn:sha1:dd17c8f72993f9461e9c19250e3f155d6d99df22</id>
<content type='text'>
Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables.  To make this sane, we remove the per_cpu__ prefix we used
created to stop people accidentally using these vars directly.

Now we have sparse, we can use that (next patch).

tj: * Updated to convert stuff which were missed by or added after the
      original patch.

    * Kill per_cpu_var() macro.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>this_cpu: Introduce this_cpu_ptr() and generic this_cpu_* operations</title>
<updated>2009-10-03T10:48:22Z</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux-foundation.org</email>
</author>
<published>2009-10-03T10:48:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7340a0b15280c9d902c7dd0608b8e751b5a7c403'/>
<id>urn:sha1:7340a0b15280c9d902c7dd0608b8e751b5a7c403</id>
<content type='text'>
This patch introduces two things: First this_cpu_ptr and then per cpu
atomic operations.

this_cpu_ptr
------------

A common operation when dealing with cpu data is to get the instance of the
cpu data associated with the currently executing processor. This can be
optimized by

this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id).

The problem with per_cpu_ptr(x, smp_processor_id) is that it requires
an array lookup to find the offset for the cpu. Processors typically
have the offset for the current cpu area in some kind of (arch dependent)
efficiently accessible register or memory location.

We can use that instead of doing the array lookup to speed up the
determination of the address of the percpu variable. This is particularly
significant because these lookups occur in performance critical paths
of the core kernel. this_cpu_ptr() can avoid memory accesses and

this_cpu_ptr comes in two flavors. The preemption context matters since we
are referring the the currently executing processor. In many cases we must
insure that the processor does not change while a code segment is executed.

__this_cpu_ptr 	-&gt; Do not check for preemption context
this_cpu_ptr	-&gt; Check preemption context

The parameter to these operations is a per cpu pointer. This can be the
address of a statically defined per cpu variable (&amp;per_cpu_var(xxx)) or
the address of a per cpu variable allocated with the per cpu allocator.

per cpu atomic operations: this_cpu_*(var, val)
-----------------------------------------------
this_cpu_* operations (like this_cpu_add(struct-&gt;y, value) operate on
abitrary scalars that are members of structures allocated with the new
per cpu allocator. They can also operate on static per_cpu variables
if they are passed to per_cpu_var() (See patch to use this_cpu_*
operations for vm statistics).

These operations are guaranteed to be atomic vs preemption when modifying
the scalar. The calculation of the per cpu offset is also guaranteed to
be atomic at the same time. This means that a this_cpu_* operation can be
safely used to modify a per cpu variable in a context where interrupts are
enabled and preemption is allowed. Many architectures can perform such
a per cpu atomic operation with a single instruction.

Note that the atomicity here is different from regular atomic operations.
Atomicity is only guaranteed for data accessed from the currently executing
processor. Modifications from other processors are still possible. There
must be other guarantees that the per cpu data is not modified from another
processor when using these instruction. The per cpu atomicity is created
by the fact that the processor either executes and instruction or not.
Embedded in the instruction is the relocation of the per cpu address to
the are reserved for the current processor and the RMW action. Therefore
interrupts or preemption cannot occur in the mids of this processing.

Generic fallback functions are used if an arch does not define optimized
this_cpu operations. The functions come also come in the two flavors used
for this_cpu_ptr().

The firstparameter is a scalar that is a member of a structure allocated
through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The
operations are similar to what percpu_add() and friends do.

this_cpu_read(scalar)
this_cpu_write(scalar, value)
this_cpu_add(scale, value)
this_cpu_sub(scalar, value)
this_cpu_inc(scalar)
this_cpu_dec(scalar)
this_cpu_and(scalar, value)
this_cpu_or(scalar, value)
this_cpu_xor(scalar, value)

Arch code can override the generic functions and provide optimized atomic
per cpu operations. These atomic operations must provide both the relocation
(x86 does it through a segment override) and the operation on the data in a
single instruction. Otherwise preempt needs to be disabled and there is no
gain from providing arch implementations.

A third variant is provided prefixed by irqsafe_. These variants are safe
against hardware interrupts on the *same* processor (all per cpu atomic
primitives are *always* *only* providing safety for code running on the
*same* processor!). The increment needs to be implemented by the hardware
in such a way that it is a single RMW instruction that is either processed
before or after an interrupt.

cc: David Howells &lt;dhowells@redhat.com&gt;
cc: Ingo Molnar &lt;mingo@elte.hu&gt;
cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
cc: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Signed-off-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
