<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/asm-generic/percpu.h, branch v2.6.34</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v2.6.34</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v2.6.34'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2009-10-29T13:34:15Z</updated>
<entry>
<title>percpu: make accessors check for percpu pointer in sparse</title>
<updated>2009-10-29T13:34:15Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2009-10-29T13:34:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=545695fb41da117928ab946067a42d9e15fd009d'/>
<id>urn:sha1:545695fb41da117928ab946067a42d9e15fd009d</id>
<content type='text'>
The previous patch made sparse warn about percpu variables being used
directly without going through percpu accessors.  This patch
implements the other half - checking whether a non-percpu variable is
passed into percpu accessors.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>percpu: add __percpu for sparse.</title>
<updated>2009-10-29T13:34:15Z</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2009-10-29T13:34:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e0fdb0e050eae331046385643618f12452aa7e73'/>
<id>urn:sha1:e0fdb0e050eae331046385643618f12452aa7e73</id>
<content type='text'>
We have to make __kernel "__attribute__((address_space(0)))" so we can
cast to it.
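
A minimal sketch of how such annotations might be laid out, not the
kernel's actual definitions. The sketch_* names are hypothetical, the
address space number 3 for percpu is an assumption, and __CHECKER__ is
assumed to be the macro sparse defines. With a normal compiler the
annotations expand to nothing, so the cast is an ordinary pointer cast.

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#ifdef __CHECKER__
/* Only sparse sees these; address space 3 for percpu is an assumption. */
# define sketch_kernel	__attribute__((address_space(0)))
# define sketch_percpu	__attribute__((noderef, address_space(3)))
# define sketch_force	__attribute__((force))
#else
# define sketch_kernel
# define sketch_percpu
# define sketch_force
#endif

/*
 * Hypothetical accessor: takes a percpu-annotated pointer, adds a byte
 * offset and casts the result back to the ordinary kernel address space.
 * The cast is only expressible because sketch_kernel names address
 * space 0 explicitly.
 */
static int *sketch_shift_ptr(int sketch_percpu *p, size_t off)
{
	assert(off % sizeof(int) == 0);	/* keep the result aligned */
	return (int sketch_kernel sketch_force *)
		((char sketch_kernel sketch_force *)p + off);
}
```
]]>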

tj: * put_cpu_var() update.

    * Annotations added to dynamic allocator interface.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>percpu: remove per_cpu__ prefix.</title>
<updated>2009-10-29T13:34:15Z</updated>
<author>
<name>Rusty Russell</name>
<email>rusty@rustcorp.com.au</email>
</author>
<published>2009-10-29T13:34:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dd17c8f72993f9461e9c19250e3f155d6d99df22'/>
<id>urn:sha1:dd17c8f72993f9461e9c19250e3f155d6d99df22</id>
<content type='text'>
Now that the return from alloc_percpu is compatible with the address
of per-cpu vars, it makes sense to hand around the address of per-cpu
variables.  To make this sane, we remove the per_cpu__ prefix we
created to stop people accidentally using these vars directly.

Now we have sparse, we can use that (next patch).

tj: * Updated to convert code that was missed by, or added after, the
      original patch.

    * Kill per_cpu_var() macro.

Signed-off-by: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reviewed-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>this_cpu: Introduce this_cpu_ptr() and generic this_cpu_* operations</title>
<updated>2009-10-03T10:48:22Z</updated>
<author>
<name>Christoph Lameter</name>
<email>cl@linux-foundation.org</email>
</author>
<published>2009-10-03T10:48:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7340a0b15280c9d902c7dd0608b8e751b5a7c403'/>
<id>urn:sha1:7340a0b15280c9d902c7dd0608b8e751b5a7c403</id>
<content type='text'>
This patch introduces two things: First this_cpu_ptr and then per cpu
atomic operations.

this_cpu_ptr
------------

A common operation when dealing with cpu data is to get the instance of the
cpu data associated with the currently executing processor. This can be
optimized by

this_cpu_ptr(xx) = per_cpu_ptr(xx, smp_processor_id()).

The problem with per_cpu_ptr(x, smp_processor_id()) is that it requires
an array lookup to find the offset for the cpu. Processors typically
have the offset for the current cpu area in some kind of (arch dependent)
efficiently accessible register or memory location.

We can use that instead of doing the array lookup to speed up the
determination of the address of the percpu variable. This is particularly
significant because these lookups occur in performance critical paths
of the core kernel. this_cpu_ptr() can avoid memory accesses and

this_cpu_ptr comes in two flavors. The preemption context matters since we
are referring to the currently executing processor. In many cases we must
ensure that the processor does not change while a code segment is executed.

__this_cpu_ptr 	-&gt; Do not check for preemption context
this_cpu_ptr	-&gt; Check preemption context

The parameter to these operations is a per cpu pointer. This can be the
address of a statically defined per cpu variable (&amp;per_cpu_var(xxx)) or
the address of a per cpu variable allocated with the per cpu allocator.
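
The difference between the two lookups can be sketched in user space as
follows. This is only a model: the sketch_* names are hypothetical, a
plain global stands in for the arch register or memory location that
holds the current cpu's offset, and preemption checks are left out.

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

/* One offset per cpu, standing in for the per cpu offset table. */
static size_t sketch_cpu_offset[SKETCH_NR_CPUS];

/* Stand-in for the arch specific location of the current cpu's offset. */
static size_t sketch_this_cpu_off;

/* Slow path: an array lookup indexed by cpu number. */
static void *sketch_per_cpu_ptr(void *base, int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return (char *)base + sketch_cpu_offset[cpu];
}

/* Fast path: no table lookup, just add the cached offset. */
static void *sketch_this_cpu_ptr(void *base)
{
	return (char *)base + sketch_this_cpu_off;
}

/*
 * Hypothetical setup helper: lay out per cpu areas `stride` bytes apart
 * and record which cpu counts as the current one.
 */
static void sketch_setup(size_t stride, int current_cpu)
{
	assert(current_cpu >= 0 && current_cpu < SKETCH_NR_CPUS);
	for (int cpu = 0; cpu < SKETCH_NR_CPUS; cpu++)
		sketch_cpu_offset[cpu] = (size_t)cpu * stride;
	sketch_this_cpu_off = sketch_cpu_offset[current_cpu];
}
```
]]>

Both functions return the same address for the current cpu; only the
way the offset is found differs.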

per cpu atomic operations: this_cpu_*(var, val)
-----------------------------------------------
this_cpu_* operations (like this_cpu_add(struct-&gt;y, value)) operate on
arbitrary scalars that are members of structures allocated with the new
per cpu allocator. They can also operate on static per_cpu variables
if they are passed to per_cpu_var() (See patch to use this_cpu_*
operations for vm statistics).

These operations are guaranteed to be atomic vs preemption when modifying
the scalar. The calculation of the per cpu offset is also guaranteed to
be atomic at the same time. This means that a this_cpu_* operation can be
safely used to modify a per cpu variable in a context where interrupts are
enabled and preemption is allowed. Many architectures can perform such
a per cpu atomic operation with a single instruction.

Note that the atomicity here is different from regular atomic operations.
Atomicity is only guaranteed for data accessed from the currently executing
processor. Modifications from other processors are still possible. There
must be other guarantees that the per cpu data is not modified from another
processor when using these instructions. The per cpu atomicity is created
by the fact that the processor either executes an instruction or not.
Embedded in the instruction is the relocation of the per cpu address to
the area reserved for the current processor and the RMW action. Therefore
interrupts or preemption cannot occur in the midst of this processing.

Generic fallback functions are used if an arch does not define optimized
this_cpu operations. The functions also come in the two flavors used
for this_cpu_ptr().

The first parameter is a scalar that is a member of a structure allocated
through allocpercpu or a per cpu variable (use per_cpu_var(xxx)). The
operations are similar to what percpu_add() and friends do.

this_cpu_read(scalar)
this_cpu_write(scalar, value)
this_cpu_add(scalar, value)
this_cpu_sub(scalar, value)
this_cpu_inc(scalar)
this_cpu_dec(scalar)
this_cpu_and(scalar, value)
this_cpu_or(scalar, value)
this_cpu_xor(scalar, value)

Arch code can override the generic functions and provide optimized atomic
per cpu operations. These atomic operations must provide both the relocation
(x86 does it through a segment override) and the operation on the data in a
single instruction. Otherwise preempt needs to be disabled and there is no
gain from providing arch implementations.
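
The shape of a generic fallback can be sketched as below. This is a
user space model under stated assumptions, not the kernel's code: the
sketch_* names are hypothetical, a counter stands in for real
preemption control, and the relocation to the current cpu's area is
left out so that only the disable/RMW/enable bracket is shown.

<![CDATA[
```c
#include <assert.h>

/* Stand-in for the preemption counter; purely illustrative. */
static int sketch_preempt_count;

static void sketch_preempt_disable(void)
{
	sketch_preempt_count++;
}

static void sketch_preempt_enable(void)
{
	assert(sketch_preempt_count > 0);
	sketch_preempt_count--;
}

/* Fallback shape: pin the task, do the plain RMW, then unpin. */
#define sketch_this_cpu_generic_op(pcp, val, op)	\
do {							\
	sketch_preempt_disable();			\
	(pcp) op (val);					\
	sketch_preempt_enable();			\
} while (0)

#define sketch_this_cpu_add(pcp, val)	sketch_this_cpu_generic_op(pcp, val, +=)
#define sketch_this_cpu_sub(pcp, val)	sketch_this_cpu_generic_op(pcp, val, -=)
#define sketch_this_cpu_and(pcp, val)	sketch_this_cpu_generic_op(pcp, val, &=)
#define sketch_this_cpu_or(pcp, val)	sketch_this_cpu_generic_op(pcp, val, |=)
#define sketch_this_cpu_xor(pcp, val)	sketch_this_cpu_generic_op(pcp, val, ^=)
#define sketch_this_cpu_inc(pcp)	sketch_this_cpu_add(pcp, 1)
#define sketch_this_cpu_dec(pcp)	sketch_this_cpu_sub(pcp, 1)
```
]]>

An arch that can do the relocation and the RMW in one instruction would
replace the whole bracket with that instruction.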

A third variant is provided prefixed by irqsafe_. These variants are safe
against hardware interrupts on the *same* processor (all per cpu atomic
primitives are *always* *only* providing safety for code running on the
*same* processor!). The increment needs to be implemented by the hardware
in such a way that it is a single RMW instruction that is either processed
before or after an interrupt.

cc: David Howells &lt;dhowells@redhat.com&gt;
cc: Ingo Molnar &lt;mingo@elte.hu&gt;
cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
cc: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Signed-off-by: Christoph Lameter &lt;cl@linux-foundation.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>x86/i386: Put aligned stack-canary in percpu shared_aligned section</title>
<updated>2009-09-04T05:10:31Z</updated>
<author>
<name>Jeremy Fitzhardinge</name>
<email>jeremy@goop.org</email>
</author>
<published>2009-09-03T21:31:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53f824520b6d84ca5b4a8fd71addc91dbf64357e'/>
<id>urn:sha1:53f824520b6d84ca5b4a8fd71addc91dbf64357e</id>
<content type='text'>
Pack aligned things together into a special section to minimize
padding holes.

Suggested-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: Jeremy Fitzhardinge &lt;jeremy.fitzhardinge@citrix.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
LKML-Reference: &lt;4AA035C0.9070202@goop.org&gt;
[ queued up in tip:x86/asm because it depends on this commit:
  x86/i386: Make sure stack-protector segment base is cache aligned ]
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>alpha: fix percpu build breakage</title>
<updated>2009-07-01T01:55:59Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2009-06-30T18:41:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b01e8dc34379f4ba2f454390e340a025edbaaa7e'/>
<id>urn:sha1:b01e8dc34379f4ba2f454390e340a025edbaaa7e</id>
<content type='text'>
alpha percpu access requires custom SHIFT_PERCPU_PTR() definition for
modules to work around addressing range limitation.  This is done via
generating inline assembly using C preprocessing which forces the
assembler to generate external reference.  This happens behind the
compiler's back and makes the compiler think that static percpu variables
in modules are unused.

This used to be worked around by using the __used attribute for percpu
variables, which prevents the compiler from omitting the variable; however,
recent declare/definition attribute unification change broke this as
__used can't be used for declaration.  Also, in the process,
PER_CPU_ATTRIBUTES definition in alpha percpu.h got broken.

This patch adds PER_CPU_DEF_ATTRIBUTES which is only used for definitions
and makes alpha use it to add __used for percpu variables in modules.  This
also fixes the PER_CPU_ATTRIBUTES double definition bug.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Tested-by: maximilian attems &lt;max@stro.at&gt;
Acked-by: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>PERCPU: Collect the DECLARE/DEFINE declarations together</title>
<updated>2009-04-22T02:40:00Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2009-04-21T22:00:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5028eaa97dd1dab9cd7c30c4d38f71c708ca64bc'/>
<id>urn:sha1:5028eaa97dd1dab9cd7c30c4d38f71c708ca64bc</id>
<content type='text'>
Collect the DECLARE/DEFINE declarations together in linux/percpu-defs.h so
that they're in one place, and give them descriptive comments, particularly
the SHARED_ALIGNED variant.

It would be nice to collect these in linux/percpu.h, but that's not possible
without sorting out the severe #include recursion between the x86 arch headers
and the general headers (and possibly other arches too).

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>FRV: Fix the section attribute on UP DECLARE_PER_CPU()</title>
<updated>2009-04-22T02:39:59Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2009-04-21T22:00:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9b8de7479d0dbab1ed98b5b015d44232c9d3d08e'/>
<id>urn:sha1:9b8de7479d0dbab1ed98b5b015d44232c9d3d08e</id>
<content type='text'>
In non-SMP mode, the variable section attribute specified by DECLARE_PER_CPU()
does not agree with that specified by DEFINE_PER_CPU().  This means that
architectures that reference a small data section relative to a base
register may throw up linkage errors due to too great a displacement between
where the base register points and the per-CPU variable.

On FRV, the .h declaration says that the variable is in the .sdata section, but
the .c definition says it's actually in the .data section.  The linker throws
up the following errors:

kernel/built-in.o: In function `release_task':
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o
kernel/exit.c:78: relocation truncated to fit: R_FRV_GPREL12 against symbol `per_cpu__process_counts' defined in .data section in kernel/built-in.o

To fix this, DECLARE_PER_CPU() should simply apply the same section attribute
as does DEFINE_PER_CPU().  However, this is made slightly more complex by
virtue of the fact that there are several variants on DEFINE, so these need to
be matched by variants on DECLARE.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>percpu: unbreak alpha percpu</title>
<updated>2009-04-10T19:36:18Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2009-04-10T19:02:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=066123a535927b3f17cac2305258cc71abdb0d92'/>
<id>urn:sha1:066123a535927b3f17cac2305258cc71abdb0d92</id>
<content type='text'>
For the time being, move the generic percpu_*() accessors to
linux/percpu.h.

asm-generic/percpu.h is meant to carry generic low level stuff -
declarations, definitions, pointer offset calculation and so on - but
not the generic interface.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>percpu: add optimized generic percpu accessors</title>
<updated>2009-01-16T13:20:31Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2009-01-15T13:15:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6dbde3530850d4d8bfc1b6bd4006d92786a2787f'/>
<id>urn:sha1:6dbde3530850d4d8bfc1b6bd4006d92786a2787f</id>
<content type='text'>
It is an optimization and a cleanup, and adds the following new
generic percpu methods:

  percpu_read()
  percpu_write()
  percpu_add()
  percpu_sub()
  percpu_and()
  percpu_or()
  percpu_xor()

and implements support for them on x86. (other architectures will fall
back to a default implementation)

The advantage is that for example to read a local percpu variable,
instead of this sequence:

 return __get_cpu_var(var);

 ffffffff8102ca2b:	48 8b 14 fd 80 09 74 	mov    -0x7e8bf680(,%rdi,8),%rdx
 ffffffff8102ca32:	81
 ffffffff8102ca33:	48 c7 c0 d8 59 00 00 	mov    $0x59d8,%rax
 ffffffff8102ca3a:	48 8b 04 10          	mov    (%rax,%rdx,1),%rax

We can get a single instruction by using the optimized variants:

 return percpu_read(var);

 ffffffff8102ca3f:	65 48 8b 05 91 8f fd 	mov    %gs:0x7efd8f91(%rip),%rax

I also cleaned up the x86-specific APIs and made the x86 code use
these new generic percpu primitives.

tj: * fixed generic percpu_sub() definition as Roel Kluin pointed out
    * added percpu_and() for completeness's sake
    * made generic percpu ops atomic against preemption

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
