linux/ipc/sem.c, branch v3.11

ipc/sem.c: rename try_atomic_semop() to perform_atomic_semop(), docu update

2013-07-09T17:33:28Z

Cleanup: Some minor points that I noticed while writing the previous patches:

1) The name try_atomic_semop() is misleading: The function performs the operation (if it is possible).
2) Some documentation updates.

No real code change, a rename and documentation changes.

Signed-off-by: Manfred Spraul
Cc: Rik van Riel
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

ipc/sem.c: replace shared sem_otime with per-semaphore value

2013-07-09T17:33:28Z

sem_otime contains the time of the last semaphore operation that completed successfully. Every operation updates this value, thus access from multiple cpus can cause thrashing.

Therefore the patch replaces the variable with a per-semaphore variable. The per-array sem_otime is only calculated when required.

No performance improvement on a single-socket i3 - only important for larger systems.

Signed-off-by: Manfred Spraul
Cc: Rik van Riel
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
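
To illustrate "only calculated when required": a minimal userspace sketch, assuming the per-array value is simply the newest of the per-semaphore timestamps. The struct and function names (sem_sketch, sem_array_sketch, sketch_get_semotime) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sem_sketch {
	int    semval;
	time_t sem_otime;	/* last successful operation on this semaphore */
};

struct sem_array_sketch {
	struct sem_sketch *sem_base;
	size_t             sem_nsems;
};

/*
 * Derive the per-array value on demand: the newest per-semaphore timestamp.
 * Writers only touch their own semaphore, so no shared field is updated
 * on every operation.
 */
static time_t sketch_get_semotime(const struct sem_array_sketch *sma)
{
	assert(sma->sem_nsems > 0);

	time_t res = sma->sem_base[0].sem_otime;

	for (size_t i = 1; i < sma->sem_nsems; i++) {
		if (sma->sem_base[i].sem_otime > res)
			res = sma->sem_base[i].sem_otime;
	}
	return res;
}
```

The cost moves from the hot path (every semop) to the rare reader that asks for the array-wide time.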

ipc/sem.c: always use only one queue for alter operations

2013-07-09T17:33:28Z

There are two places that can contain alter operations:
- the global queue: sma->pending_alter
- the per-semaphore queues: sma->sem_base[].pending_alter.

Since one of the queues must be processed first, this causes an odd prioritization of the wakeups: complex operations have priority over simple ops.

The patch restores the behavior of linux <=3.0.9: The longest waiting operation has the highest priority.

This is done by using only one queue:
- if there are complex ops, then sma->pending_alter is used.
- otherwise, the per-semaphore queues are used.

As a side effect, do_smart_update_queue() becomes much simpler: no more goto logic.

Signed-off-by: Manfred Spraul
Cc: Rik van Riel
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
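
The queue selection rule described above can be sketched roughly as follows. This is an illustration only: queue_state, alter_queue and pick_alter_queue() are hypothetical names, and the sketch assumes that an operation touching more than one semaphore counts as complex.

```c
#include <assert.h>
#include <stddef.h>

enum alter_queue { QUEUE_GLOBAL, QUEUE_PER_SEM };

/* Hypothetical reduction of the array state to what the rule needs. */
struct queue_state {
	int complex_count;	/* number of pending complex operations */
};

/*
 * Decide where a sleeping alter operation is queued.
 * Assumption: nsops > 1 means "complex". As long as any complex
 * operation is pending, everything goes to the global queue, so a
 * single queue preserves the arrival order of the waiters.
 */
static enum alter_queue pick_alter_queue(const struct queue_state *st,
					 size_t nsops)
{
	assert(nsops > 0);

	if (nsops > 1 || st->complex_count > 0)
		return QUEUE_GLOBAL;
	return QUEUE_PER_SEM;
}
```

Because only one kind of queue is in use at any time, the wakeup scan never has to choose which queue goes first.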

ipc/sem: separate wait-for-zero and alter tasks into separate queues

2013-07-09T17:33:28Z

Introduce separate queues for operations that do not modify the semaphore values.

Advantages:
- Simpler logic in check_restart().
- Faster update_queue(): Right now, all wait-for-zero operations are always tested, even if the semaphore value is not 0.
- wait-for-zero again gets priority, as in linux <=3.0.9

Signed-off-by: Manfred Spraul
Cc: Rik van Riel
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
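
As a rough illustration of the split: an operation set that contains only sem_op == 0 entries never changes a semaphore value and could go to the wait-for-zero queue, while anything else belongs on an alter queue. The type sop_sketch and the helper sops_alter() below are hypothetical, not taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced form of one semaphore operation. */
struct sop_sketch {
	unsigned short sem_num;	/* index of the semaphore in the array */
	short          sem_op;	/* 0 = wait-for-zero, otherwise add to semval */
};

/*
 * Returns true if at least one entry modifies a semaphore value.
 * A set for which this returns false is a pure wait-for-zero request.
 */
static bool sops_alter(const struct sop_sketch *sops, size_t nsops)
{
	for (size_t i = 0; i < nsops; i++) {
		if (sops[i].sem_op != 0)
			return true;
	}
	return false;
}
```

With the two kinds of waiters kept apart, the wait-for-zero queue only needs to be looked at when a semaphore value actually reaches 0.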

ipc/sem.c: cacheline align the semaphore structures

2013-07-09T17:33:28Z

As now each semaphore has its own spinlock and parallel operations are possible, give each semaphore its own cacheline.

On an i3 laptop, this gives up to 28% better performance:

#semscale 10 | grep "interleave 2"
- before:
Cpus 1, interleave 2 delay 0: 36109234 in 10 secs
Cpus 2, interleave 2 delay 0: 55276317 in 10 secs
Cpus 3, interleave 2 delay 0: 62411025 in 10 secs
Cpus 4, interleave 2 delay 0: 81963928 in 10 secs

- after:
Cpus 1, interleave 2 delay 0: 35527306 in 10 secs
Cpus 2, interleave 2 delay 0: 70922909 in 10 secs <<< + 28%
Cpus 3, interleave 2 delay 0: 80518538 in 10 secs
Cpus 4, interleave 2 delay 0: 89115148 in 10 secs <<< + 8.7%

i3, with 2 cores and with hyperthreading enabled. Interleave 2 in order to use the full cores first. HT partially hides the delay from cacheline thrashing, thus the improvement is "only" 8.7% if 4 threads are running.

Signed-off-by: Manfred Spraul
Cc: Rik van Riel
Cc: Davidlohr Bueso
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
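
The effect of the alignment can be shown with a small userspace sketch using C11 alignas. The 64-byte line size and the struct padded_sem layout are assumptions for illustration; the kernel uses its own alignment annotations and the real line size depends on the CPU.

```c
#include <stdalign.h>
#include <stddef.h>

#define SKETCH_CACHELINE 64	/* assumption: 64-byte cache lines */

/*
 * Hypothetical per-semaphore structure. Aligning the first member
 * raises the alignment of the whole struct, so sizeof() is padded up
 * to a multiple of the line size and neighbouring array elements
 * never share a cache line.
 */
struct padded_sem {
	alignas(SKETCH_CACHELINE) int semval;
	int  lock;		/* placeholder for the per-semaphore spinlock */
	long sem_otime;
};

_Static_assert(sizeof(struct padded_sem) % SKETCH_CACHELINE == 0,
	       "each element must occupy whole cache lines");
```

The trade-off is memory: in this sketch every semaphore occupies a full line even though its payload is much smaller.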

ipc: remove unused functions

2013-07-09T17:33:27Z

We can now drop the msg_lock and msg_lock_check functions along with a bogus comment introduced previously in semctl_down.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

ipc: move locking out of ipcctl_pre_down_nolock

2013-07-09T17:33:27Z

This function currently acquires both the rw_mutex and the rcu lock on successful lookups, leaving the callers to explicitly unlock them, creating another two level locking situation.

Make the callers (including those that still use ipcctl_pre_down()) explicitly lock and unlock the rwsem and rcu lock.

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

ipc: close open coded spin lock calls

2013-07-09T17:33:27Z

Signed-off-by: Davidlohr Bueso
Cc: Andi Kleen
Cc: Rik van Riel
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

ipc/sem.c: Fix missing wakeups in do_smart_update_queue()

2013-05-26T22:14:51Z

do_smart_update_queue() is called when an operation (semop, semctl(SETVAL), semctl(SETALL), ...) modified the array. It must check which of the sleeping tasks can proceed.

do_smart_update_queue() missed a few wakeups:
- if a sleeping complex op was completed, then all per-semaphore queues must be scanned - not only those that were modified by *sops
- if a sleeping simple op proceeded, then the global queue must be scanned again

And:
- the test for "sops == NULL" before scanning the global queue is not required: If the global queue is empty, then it doesn't need to be scanned - regardless of the reason for calling do_smart_update_queue()

The patch is not optimized, i.e. even completing a wait-for-zero operation causes a rescan. This is done to keep the patch as simple as possible.

Signed-off-by: Manfred Spraul
Acked-by: Davidlohr Bueso
Cc: Rik van Riel
Cc: Andrew Morton
Signed-off-by: Linus Torvalds

ipc,sem: fix semctl(..., GETNCNT)

2013-05-09T21:17:47Z

The semctl GETNCNT returns the number of semops waiting for the specified semaphore to become nonzero.

After commit 9f1bc2c9022c ("ipc,sem: have only one list in struct sem_queue"), the semops waiting on just one semaphore are waiting on that semaphore's list.

In order to return the correct count, we have to walk that list too, in addition to the sem_array's list for complex operations.

Signed-off-by: Rik van Riel
Signed-off-by: Linus Torvalds
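
A rough sketch of walking both lists, using plain arrays instead of kernel lists. All names here (sop_sketch, pending_op, count_waiters, sketch_semncnt) are hypothetical, and the sketch assumes that "waiting" means a pending decrement (sem_op < 0) on the given semaphore, with each sleeping operation counted once.

```c
#include <stddef.h>

/* Hypothetical, reduced form of one semaphore operation. */
struct sop_sketch {
	unsigned short sem_num;
	short          sem_op;
};

/* One sleeping semop call: the set of operations it wants to perform. */
struct pending_op {
	const struct sop_sketch *sops;
	size_t                   nsops;
};

/* Count the operations in one queue that wait for semnum to increase. */
static int count_waiters(const struct pending_op *q, size_t n,
			 unsigned short semnum)
{
	int count = 0;

	for (size_t i = 0; i < n; i++) {
		for (size_t j = 0; j < q[i].nsops; j++) {
			if (q[i].sops[j].sem_num == semnum &&
			    q[i].sops[j].sem_op < 0) {
				count++;
				break;	/* count each sleeping op once */
			}
		}
	}
	return count;
}

/* Walk the semaphore's own queue and the array-wide queue for complex ops. */
static int sketch_semncnt(const struct pending_op *per_sem, size_t n_per_sem,
			  const struct pending_op *global, size_t n_global,
			  unsigned short semnum)
{
	return count_waiters(per_sem, n_per_sem, semnum) +
	       count_waiters(global, n_global, semnum);
}
```

Looking only at the array-wide queue would miss the simple operations, which is the undercount the commit describes.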