linux/lib/rhashtable.c, branch v4.0

rhashtable: use cond_resched()

2015-02-27T22:55:14Z

If a hash table has 128 slots and 16384 elems, expand to 256 slots takes more than one second. For larger sets, a soft lockup is detected. Holding cpu for that long, even in a work queue is a show stopper for non preemptable kernels. cond_resched() at strategic points to allow process scheduler to reschedule us. Signed-off-by: Eric Dumazet Acked-by: Daniel Borkmann Signed-off-by: David S. Miller

rhashtable: remove indirection for grow/shrink decision functions

2015-02-27T21:06:02Z

Currently, all real users of rhashtable default their grow and shrink decision functions to rht_grow_above_75() and rht_shrink_below_30(), so that there's currently no need to have this explicitly selectable. It can/should be generic and private inside rhashtable until a real use case pops up. Since we can make this private, we'll save us this additional indirection layer and can improve insertion/deletion time as well. Reference: http://patchwork.ozlabs.org/patch/443040/ Suggested-by: David S. Miller Signed-off-by: Daniel Borkmann Acked-by: Thomas Graf Signed-off-by: David S. Miller

rhashtable: unconditionally grow when max_shift is not specified

2015-02-27T21:06:02Z

While commit c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") rightfully moved part of the decision making of whether we should expand or shrink from the expand/shrink functions themselves into insert/delete functions in order to avoid unnecessary worker wake-ups, it however introduced a regression by doing so. Before that change, if no max_shift was specified (= 0) on rhashtable initialization, rhashtable_expand() would just grow unconditionally and lets the available memory be the limiting factor. After that change, if no max_shift was specified, there would be _no_ expansion step at all. Given that netlink and tipc have a max_shift specified, it was not visible there, but Josh Hunt reported that if nft that starts out with a default element hint of 3 if not otherwise provided, would slow i.e. inserts down trememdously as it cannot grow larger to relax table occupancy. Given that the test case verifies shrinks/expands manually, we also must remove pointer to the helper functions to explicitly avoid parallel resizing on insertions/deletions. test_bucket_stats() and test_rht_lookup() could also be wrapped around rhashtable mutex to explicitly synchronize a walk from resizing, but I think that defeats the actual test case which intended to have explicit test steps, i.e. 1) inserts, 2) expands, 3) shrinks, 4) deletions, with object verification after each stage. Reported-by: Josh Hunt Fixes: c0c09bfdc415 ("rhashtable: avoid unnecessary wakeup for worker queue") Signed-off-by: Daniel Borkmann Cc: Ying Xue Cc: Josh Hunt Acked-by: Thomas Graf Signed-off-by: David S. Miller

rhashtable: initialize all rhashtable walker members

2015-02-23T20:23:19Z

Commit f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*") forgot to initialize the members of struct rhashtable_walker after allocating it, which caused an undefined value for 'resize' which is used later on. Fixes: f2dba9c6ff ("rhashtable: Introduce rhashtable_walk_*") Signed-off-by: Sasha Levin Signed-off-by: David S. Miller

rhashtable: better high order allocation attempts

2015-02-20T22:38:09Z

When trying to allocate future tables via bucket_table_alloc(), it seems overkill on large table shifts that we probe for kzalloc() unconditionally first, as it's likely to fail. Only probe with kzalloc() for more reasonable table sizes and use vzalloc() either as a fallback on failure or directly in case of large table sizes. Fixes: 7e1e77636e36 ("lib: Resizable, Scalable, Concurrent Hash Table") Signed-off-by: Daniel Borkmann Acked-by: Thomas Graf Signed-off-by: David S. Miller

rhashtable: don't test for shrink on insert, expansion on delete

2015-02-20T22:38:09Z

Restore pre 54c5b7d311c8 behaviour and only probe for expansions on inserts and shrinks on deletes. Currently, it will happen that on initial inserts into a sparse hash table, we may i.e. shrink it first simply because it's not fully populated yet, only to later realize that we need to grow again. This however is counter intuitive, e.g. an initial default size of 64 elements is already small enough, and in case an elements size hint is given to the hash table by a user, we should avoid unnecessary expansion steps, so a shrink is clearly unintended here. Fixes: 54c5b7d311c8 ("rhashtable: introduce rhashtable_wakeup_worker helper function") Signed-off-by: Daniel Borkmann Cc: Ying Xue Acked-by: Thomas Graf Signed-off-by: David S. Miller

rhashtable: using ERR_PTR requires linux/err.h

2015-02-09T05:52:24Z

Signed-off-by: Stephen Rothwell Signed-off-by: David S. Miller

rhashtable: Fix remove logic to avoid cross references between buckets

2015-02-06T23:19:17Z

The remove logic properly searched the remaining chain for a matching entry with an identical hash but it did this while searching from both the old and new table. Instead in order to not leave stale references behind we need to: 1. When growing and searching from the new table: Search remaining chain for entry with same hash to avoid having the new table directly point to a entry with a different hash. 2. When shrinking and searching from the old table: Check if the element after the removed would create a cross reference and avoid it if so. These bugs were present from the beginning in nft_hash. Also, both insert functions calculated the hash based on the mask of the new table. This worked while growing. Wwhile shrinking, the mask of the inew table is smaller than the mask of the old table. This lead to a bit not being taken into account when selecting the bucket lock and thus caused the wrong bucket to be locked eventually. Fixes: 7e1e77636e36 ("lib: Resizable, Scalable, Concurrent Hash Table") Fixes: 97defe1ecf86 ("rhashtable: Per bucket locks & deferred expansion/shrinking") Reported-by: Ying Xue Signed-off-by: Thomas Graf Signed-off-by: David S. Miller

rhashtable: Avoid bucket cross reference after removal

2015-02-06T23:18:35Z

During a resize, when two buckets in the larger table map to a single bucket in the smaller table and the new table has already been (partially) linked to the old table. Removal of an element may result the bucket in the larger table to point to entries which all hash to a different value than the bucket index. Thus causing two buckets to point to the same sub chain after unzipping. This is not illegal *during* the resize phase but after it has completed. Keep the old table around until all of the unzipping is done to allow the removal code to only search for matching hashed entries during this special period. Reported-by: Ying Xue Fixes: 97defe1ecf86 ("rhashtable: Per bucket locks & deferred expansion/shrinking") Signed-off-by: Thomas Graf Signed-off-by: David S. Miller

rhashtable: Add more lock verification

2015-02-06T23:18:34Z

Catch hash miscalculations which result in hard to track down race conditions. Signed-off-by: Thomas Graf Signed-off-by: David S. Miller