<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/ipc/util.h, branch v3.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2013-09-24T16:36:53Z</updated>
<entry>
<title>ipc: fix race with LSMs</title>
<updated>2013-09-24T16:36:53Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr@hp.com</email>
</author>
<published>2013-09-24T00:04:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53dad6d3a8e5ac1af8bacc6ac2134ae1a8b085f1'/>
<id>urn:sha1:53dad6d3a8e5ac1af8bacc6ac2134ae1a8b085f1</id>
<content type='text'>
Currently, IPC mechanisms do security and auditing related checks under
RCU.  However, since security modules can free the security structure,
for example, through selinux_[sem,msg_queue,shm]_free_security(), we can
race if the structure is freed before other tasks are done with it,
creating a use-after-free condition.  Manfred illustrates this nicely,
for instance with shared mem and selinux:

 -&gt; do_shmat calls rcu_read_lock()
 -&gt; do_shmat calls shm_object_check().
     Checks that the object is still valid - but doesn't acquire any locks.
     Then it returns.
 -&gt; do_shmat calls security_shm_shmat (e.g. selinux_shm_shmat)
 -&gt; selinux_shm_shmat calls ipc_has_perm()
 -&gt; ipc_has_perm accesses ipc_perms-&gt;security

shm_close()
 -&gt; shm_close acquires rw_mutex &amp; shm_lock
 -&gt; shm_close calls shm_destroy
 -&gt; shm_destroy calls security_shm_free (e.g. selinux_shm_free_security)
 -&gt; selinux_shm_free_security calls ipc_free_security(&amp;shp-&gt;shm_perm)
 -&gt; ipc_free_security calls kfree(ipc_perms-&gt;security)

This patch delays the freeing of the security structures after all RCU
readers are done.  Furthermore it aligns the security life cycle with
that of the rest of IPC - freeing them based on the reference counter.
For situations where we need not free security, the current behavior is
kept.  Linus states:

 "... the old behavior was suspect for another reason too: having the
  security blob go away from under a user sounds like it could cause
  various other problems anyway, so I think the old code was at least
  _prone_ to bugs even if it didn't have catastrophic behavior."

I have tested this patch with IPC testcases from LTP on both my
quad-core laptop and on a 64 core NUMA server.  In both cases selinux is
enabled, and tests pass for both voluntary and forced preemption models.
While the mentioned races are theoretical (at least no one as reported
them), I wanted to make sure that this new logic doesn't break anything
we weren't aware of.

Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Davidlohr Bueso &lt;davidlohr@hp.com&gt;
Acked-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: drop ipc_lock_check</title>
<updated>2013-09-11T22:59:45Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-09-11T21:26:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=20b8875abcf2daa1dda5cf70bd6369df5e85d4c1'/>
<id>urn:sha1:20b8875abcf2daa1dda5cf70bd6369df5e85d4c1</id>
<content type='text'>
No remaining users, we now use ipc_obtain_object_check().

Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Cc: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: drop ipc_lock_by_ptr</title>
<updated>2013-09-11T22:59:44Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-09-11T21:26:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=32a2750010981216fb788c5190fb0e646abfab30'/>
<id>urn:sha1:32a2750010981216fb788c5190fb0e646abfab30</id>
<content type='text'>
After previous cleanups and optimizations, this function is no longer
heavily used and we don't have a good reason to keep it.  Update the few
remaining callers and get rid of it.

Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Cc: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: rename ids-&gt;rw_mutex</title>
<updated>2013-09-11T22:59:42Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-09-11T21:26:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d9a605e40b1376eb02b067d7690580255a0df68f'/>
<id>urn:sha1:d9a605e40b1376eb02b067d7690580255a0df68f</id>
<content type='text'>
Since in some situations the lock can be shared for readers, we shouldn't
be calling it a mutex, rename it to rwsem.

Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: drop ipcctl_pre_down</title>
<updated>2013-09-11T22:59:39Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-09-11T21:26:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3b1c4ad37741e53804ffe0a30dd01e08b2ab6241'/>
<id>urn:sha1:3b1c4ad37741e53804ffe0a30dd01e08b2ab6241</id>
<content type='text'>
Now that sem, msgque and shm, through *_down(), all use the lockless
variant of ipcctl_pre_down(), go ahead and delete it.

[akpm@linux-foundation.org: fix function name in kerneldoc, cleanups]
Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: close open coded spin lock calls</title>
<updated>2013-07-09T17:33:27Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-07-08T23:01:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cf9d5d78d05bca96df7618dfc3a5ee4414dcae58'/>
<id>urn:sha1:cf9d5d78d05bca96df7618dfc3a5ee4414dcae58</id>
<content type='text'>
Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: introduce ipc object locking helpers</title>
<updated>2013-07-09T17:33:27Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-07-08T23:01:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1ca7003ab41152d673d9e359632283d05294f3d6'/>
<id>urn:sha1:1ca7003ab41152d673d9e359632283d05294f3d6</id>
<content type='text'>
Simple helpers around the (kern_ipc_perm *)-&gt;lock spinlock.

Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc,sem: fine grained locking for semtimedop</title>
<updated>2013-05-01T15:12:58Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@surriel.com</email>
</author>
<published>2013-05-01T02:15:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6062a8dc0517bce23e3c2f7d2fea5e22411269a3'/>
<id>urn:sha1:6062a8dc0517bce23e3c2f7d2fea5e22411269a3</id>
<content type='text'>
Introduce finer grained locking for semtimedop, to handle the common case
of a program wanting to manipulate one semaphore from an array with
multiple semaphores.

If the call is a semop manipulating just one semaphore in an array with
multiple semaphores, only take the lock for that semaphore itself.

If the call needs to manipulate multiple semaphores, or another caller is
in a transaction that manipulates multiple semaphores, the sem_array lock
is taken, as well as all the locks for the individual semaphores.

On a 24 CPU system, performance numbers with the semop-multi
test with N threads and N semaphores, look like this:

	vanilla		Davidlohr's	Davidlohr's +	Davidlohr's +
threads			patches		rwlock patches	v3 patches
10	610652		726325		1783589		2142206
20	341570		365699		1520453		1977878
30	288102		307037		1498167		2037995
40	290714		305955		1612665		2256484
50	288620		312890		1733453		2650292
60	289987		306043		1649360		2388008
70	291298		306347		1723167		2717486
80	290948		305662		1729545		2763582
90	290996		306680		1736021		2757524
100	292243		306700		1773700		3059159

[davidlohr.bueso@hp.com: do not call sem_lock when bogus sma]
[davidlohr.bueso@hp.com: make refcounter atomic]
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Cc: Chegu Vinod &lt;chegu_vinod@hp.com&gt;
Cc: Jason Low &lt;jason.low2@hp.com&gt;
Reviewed-by: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Cc: Stanislav Kinsbursky &lt;skinsbursky@parallels.com&gt;
Tested-by: Emmanuel Benisty &lt;benisty.e@gmail.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc,sem: do not hold ipc lock more than necessary</title>
<updated>2013-05-01T15:12:58Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-05-01T02:15:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=16df3674efe39f3ab63e7052f1244dd3d50e7f84'/>
<id>urn:sha1:16df3674efe39f3ab63e7052f1244dd3d50e7f84</id>
<content type='text'>
Instead of holding the ipc lock for permissions and security checks, among
others, only acquire it when necessary.

Some numbers....

1) With Rik's semop-multi.c microbenchmark we can see the following
   results:

Baseline (3.9-rc1):
cpus 4, threads: 256, semaphores: 128, test duration: 30 secs
total operations: 151452270, ops/sec 5048409

+  59.40%            a.out  [kernel.kallsyms]  [k] _raw_spin_lock
+   6.14%            a.out  [kernel.kallsyms]  [k] sys_semtimedop
+   3.84%            a.out  [kernel.kallsyms]  [k] avc_has_perm_flags
+   3.64%            a.out  [kernel.kallsyms]  [k] __audit_syscall_exit
+   2.06%            a.out  [kernel.kallsyms]  [k] copy_user_enhanced_fast_string
+   1.86%            a.out  [kernel.kallsyms]  [k] ipc_lock

With this patchset:
cpus 4, threads: 256, semaphores: 128, test duration: 30 secs
total operations: 273156400, ops/sec 9105213

+  18.54%            a.out  [kernel.kallsyms]  [k] _raw_spin_lock
+  11.72%            a.out  [kernel.kallsyms]  [k] sys_semtimedop
+   7.70%            a.out  [kernel.kallsyms]  [k] ipc_has_perm.isra.21
+   6.58%            a.out  [kernel.kallsyms]  [k] avc_has_perm_flags
+   6.54%            a.out  [kernel.kallsyms]  [k] __audit_syscall_exit
+   4.71%            a.out  [kernel.kallsyms]  [k] ipc_obtain_object_check

2) While on an Oracle swingbench DSS (data mining) workload the
   improvements are not as exciting as with Rik's benchmark, we can see
   some positive numbers.  For an 8 socket machine the following are the
   percentages of %sys time incurred in the ipc lock:

Baseline (3.9-rc1):
100 swingbench users: 8,74%
400 swingbench users: 21,86%
800 swingbench users: 84,35%

With this patchset:
100 swingbench users: 8,11%
400 swingbench users: 19,93%
800 swingbench users: 77,69%

[riel@redhat.com: fix two locking bugs]
[sasha.levin@oracle.com: prevent releasing RCU read lock twice in semctl_main]
[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Reviewed-by: Chegu Vinod &lt;chegu_vinod@hp.com&gt;
Acked-by: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Jason Low &lt;jason.low2@hp.com&gt;
Cc: Emmanuel Benisty &lt;benisty.e@gmail.com&gt;
Cc: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Cc: Stanislav Kinsbursky &lt;skinsbursky@parallels.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: introduce lockless pre_down ipcctl</title>
<updated>2013-05-01T15:12:58Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>davidlohr.bueso@hp.com</email>
</author>
<published>2013-05-01T02:15:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=444d0f621b64716f7868dcbde448e0c66ece4e61'/>
<id>urn:sha1:444d0f621b64716f7868dcbde448e0c66ece4e61</id>
<content type='text'>
Various forms of ipc use ipcctl_pre_down() to retrieve an ipc object and
check permissions, mostly for IPC_RMID and IPC_SET commands.

Introduce ipcctl_pre_down_nolock(), a lockless version of this function.
The locking version is retained, yet modified to call the nolock version
without affecting its semantics, thus transparent to all ipc callers.

Signed-off-by: Davidlohr Bueso &lt;davidlohr.bueso@hp.com&gt;
Signed-off-by: Rik van Riel &lt;riel@redhat.com&gt;
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Chegu Vinod &lt;chegu_vinod@hp.com&gt;
Cc: Emmanuel Benisty &lt;benisty.e@gmail.com&gt;
Cc: Jason Low &lt;jason.low2@hp.com&gt;
Cc: Michel Lespinasse &lt;walken@google.com&gt;
Cc: Peter Hurley &lt;peter@hurleysoftware.com&gt;
Cc: Stanislav Kinsbursky &lt;skinsbursky@parallels.com&gt;
Tested-by: Sedat Dilek &lt;sedat.dilek@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
