<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/netfilter, branch v6.17</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.17</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.17'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-09-10T18:30:37Z</updated>
<entry>
<title>netfilter: nf_tables: restart set lookup on base_seq change</title>
<updated>2025-09-10T18:30:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-10T08:02:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b2f742c846cab9afc5953a5d8f17b54922dcc723'/>
<id>urn:sha1:b2f742c846cab9afc5953a5d8f17b54922dcc723</id>
<content type='text'>
The hash, hash_fast, rhash and bitwise sets may indicate no result even
though a matching element exists during a short time window while other
cpu is finalizing the transaction.

This happens when the hash lookup/bitwise lookup function has picked up
the old genbit, right before it was toggled by nf_tables_commit(), but
then the same cpu managed to unlink the matching old element from the
hash table:

cpu0					cpu1
  has added new elements to clone
  has marked elements as being
  inactive in new generation
					perform lookup in the set
  enters commit phase:
					A) observes old genbit
   increments base_seq
I) increments the genbit
II) removes old element from the set
					B) finds matching element
					C) returns no match: found
					element is not valid in old
					generation

					Next lookup observes new genbit and
					finds matching e2.

Consider a packet matching element e1, e2.

cpu0 processes following transaction:
1. remove e1
2. adds e2, which has same key as e1.

P matches both e1 and e2.  Therefore, cpu1 should always find a match
for P. Due to above race, this is not the case:

cpu1 observed the old genbit.  e2 will not be considered once it is found.
The element e1 is not found anymore if cpu0 managed to unlink it from the
hlist before cpu1 found it during list traversal.

The situation only occurs for a brief time period, lookups happening
after I) observe new genbit and return e2.

This problem exists in all set types except nft_set_pipapo, so fix it once
in nft_lookup rather than each set ops individually.

Sample the base sequence counter, which gets incremented right before the
genbit is changed.

Then, if no match is found, retry the lookup if the base sequence was
altered in between.

If the base sequence hasn't changed:
 - No update took place: no-match result is expected.
   This is the common case.  or:
 - nf_tables_commit() hasn't progressed to genbit update yet.
   Old elements were still visible and nomatch result is expected, or:
 - nf_tables_commit updated the genbit:
   We picked up the new base_seq, so the lookup function also picked
   up the new genbit, no-match result is expected.

If the old genbit was observed, then nft_lookup also picked up the old
base_seq: nft_lookup_should_retry() returns true and relookup is performed
in the new generation.

This problem was added when the unconditional synchronize_rcu() call
that followed the current/next generation bit toggle was removed.

Thanks to Pablo Neira Ayuso for reviewing an earlier version of this
patchset, for suggesting re-use of existing base_seq and placement of
the restart loop in nft_set_do_lookup().

Fixes: 0cbc06b3faba ("netfilter: nf_tables: remove synchronize_rcu in commit phase")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: make nft_set_do_lookup available unconditionally</title>
<updated>2025-09-10T18:30:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-10T08:02:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=11fe5a82e53ac3581a80c88e0e35fb8a80e15f48'/>
<id>urn:sha1:11fe5a82e53ac3581a80c88e0e35fb8a80e15f48</id>
<content type='text'>
This function was added for retpoline mitigation and is replaced by a
static inline helper if mitigations are not enabled.

Enable this helper function unconditionally so next patch can add a lookup
restart mechanism to fix possible false negatives while transactions are
in progress.

Adding lookup restarts in nft_lookup_eval doesn't work as nft_objref would
then need the same copypaste loop.

This patch is separate to ease review of the actual bug fix.

Suggested-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: place base_seq in struct net</title>
<updated>2025-09-10T18:30:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-10T08:02:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=64102d9bbc3d41dac5188b8fba75b1344c438970'/>
<id>urn:sha1:64102d9bbc3d41dac5188b8fba75b1344c438970</id>
<content type='text'>
This will soon be read from packet path around same time as the gencursor.

Both gencursor and base_seq get incremented almost at the same time, so
it makes sense to place them in the same structure.

This doesn't increase struct net size on 64bit due to padding.

Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_set_rbtree: continue traversal if element is inactive</title>
<updated>2025-09-10T18:30:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-10T08:02:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a60f7bf4a1524d8896b76ba89623080aebf44272'/>
<id>urn:sha1:a60f7bf4a1524d8896b76ba89623080aebf44272</id>
<content type='text'>
When the rbtree lookup function finds a match in the rbtree, it sets the
range start interval to a potentially inactive element.

Then, after tree lookup, if the matching element is inactive, it returns
NULL and suppresses a matching result.

This is wrong and leads to false negative matches when a transaction has
already entered the commit phase.

cpu0					cpu1
  has added new elements to clone
  has marked elements as being
  inactive in new generation
					perform lookup in the set
  enters commit phase:
I) increments the genbit
					A) observes new genbit
					B) finds matching range
					C) returns no match: found
					range invalid in new generation
II) removes old elements from the tree
					C New nft_lookup happening now
				       	  will find matching element,
					  because it is no longer
					  obscured by old, inactive one.

Consider a packet matching range r1-r2:

cpu0 processes following transaction:
1. remove r1-r2
2. add r1-r3

P is contained in both ranges. Therefore, cpu1 should always find a match
for P.  Due to above race, this is not the case:

cpu1 does find r1-r2, but then ignores it due to the genbit indicating
the range has been removed.  It does NOT test for further matches.

The situation persists for all lookups until after cpu0 hits II) after
which r1-r3 range start node is tested for the first time.

Move the "interval start is valid" check ahead so that tree traversal
continues if the starting interval is not valid in this generation.

Thanks to Stefan Hanreich for providing an initial reproducer for this
bug.

Reported-by: Stefan Hanreich &lt;s.hanreich@proxmox.com&gt;
Fixes: c1eda3c6394f ("netfilter: nft_rbtree: ignore inactive matching element with no descendants")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_set_pipapo: don't check genbit from packetpath lookups</title>
<updated>2025-09-10T18:30:37Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-10T08:02:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c4eaca2e1052adfd67bed0a36a9d4b8e515666e4'/>
<id>urn:sha1:c4eaca2e1052adfd67bed0a36a9d4b8e515666e4</id>
<content type='text'>
The pipapo set type is special in that it has two copies of its
datastructure: one live copy containing only valid elements and one
on-demand clone used during transaction where adds/deletes happen.

This clone is not visible to the datapath.

This is unlike all other set types in nftables, those all link new
elements into their live hlist/tree.

For those sets, the lookup functions must skip the new elements while the
transaction is ongoing to ensure consistency.

As the clone is shallow, removal does have an effect on the packet path:
once the transaction enters the commit phase the 'gencursor' bit that
determines which elements are active and which elements should be ignored
(because they are no longer valid) is flipped.

This causes the datapath lookup to ignore these elements if they are found
during lookup.

This opens up a small race window where pipapo has an inconsistent view of
the dataset from when the transaction-cpu flipped the genbit until the
transaction-cpu calls nft_pipapo_commit() to swap live/clone pointers:

cpu0					cpu1
  has added new elements to clone
  has marked elements as being
  inactive in new generation
					perform lookup in the set
  enters commit phase:

I) increments the genbit
					A) observes new genbit
  removes elements from the clone so
  they won't be found anymore
					B) lookup in datastructure
					   can't see new elements yet,
					   but old elements are ignored
					   -&gt; Only matches elements that
					   were not changed in the
					   transaction
II) calls nft_pipapo_commit(), clone
    and live pointers are swapped.
					C New nft_lookup happening now
				       	  will find matching elements.

Consider a packet matching range r1-r2:

cpu0 processes following transaction:
1. remove r1-r2
2. add r1-r3

P is contained in both ranges. Therefore, cpu1 should always find a match
for P.  Due to above race, this is not the case:

cpu1 does find r1-r2, but then ignores it due to the genbit indicating
the range has been removed.

At the same time, r1-r3 is not visible yet, because it can only be found
in the clone.

The situation persists for all lookups until after cpu0 hits II).

The fix is easy: Don't check the genbit from pipapo lookup functions.
This is possible because unlike the other set types, the new elements are
not reachable from the live copy of the dataset.

The clone/live pointer swap is enough to avoid matching on old elements
while at the same time all new elements are exposed in one go.

After this change, step B above returns a match in r1-r2.
This is fine: r1-r2 only becomes truly invalid the moment they get freed.
This happens after a synchronize_rcu() call and rcu read lock is held
via netfilter hook traversal (nf_hook_slow()).

Cc: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_set_bitmap: fix lockdep splat due to missing annotation</title>
<updated>2025-09-10T18:28:24Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2025-09-09T12:45:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5e13f2c491a4100d208e77e92fe577fe3dbad6c2'/>
<id>urn:sha1:5e13f2c491a4100d208e77e92fe577fe3dbad6c2</id>
<content type='text'>
Running new 'set_flush_add_atomic_bitmap' test case for nftables.git
with CONFIG_PROVE_RCU_LIST=y yields:

net/netfilter/nft_set_bitmap.c:231 RCU-list traversed in non-reader section!!
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by nft/4008:
 #0: ffff888147f79cd8 (&amp;nft_net-&gt;commit_mutex){+.+.}-{4:4}, at: nf_tables_valid_genid+0x2f/0xd0

 lockdep_rcu_suspicious+0x116/0x160
 nft_bitmap_walk+0x22d/0x240
 nf_tables_delsetelem+0x1010/0x1a00
 ..

This is a false positive, the list cannot be altered while the
transaction mutex is held, so pass the relevant argument to the iterator.

Fixes tag intentionally wrong; no point in picking this up if earlier
false-positive-fixups were not applied.

Fixes: 28b7a6b84c0a ("netfilter: nf_tables: avoid false-positive lockdep splats in set walker")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: Introduce NFTA_DEVICE_PREFIX</title>
<updated>2025-09-04T07:19:25Z</updated>
<author>
<name>Phil Sutter</name>
<email>phil@nwl.cc</email>
</author>
<published>2025-08-07T13:49:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4039ce7ef40474d5ba46f414c50cc7020b9cf8ae'/>
<id>urn:sha1:4039ce7ef40474d5ba46f414c50cc7020b9cf8ae</id>
<content type='text'>
This new attribute is supposed to be used instead of NFTA_DEVICE_NAME
for simple wildcard interface specs. It holds a NUL-terminated string
representing an interface name prefix to match on.

While kernel code to distinguish full names from prefixes in
NFTA_DEVICE_NAME is simpler than this solution, reusing the existing
attribute with different semantics leads to confusion between different
versions of kernel and user space though:

* With old kernels, wildcards submitted by user space are accepted yet
  silently treated as regular names.
* With old user space, wildcards submitted by kernel may cause crashes
  since libnftnl expects NUL-termination when there is none.

Using a distinct attribute type sanitizes these situations as the
receiving part detects and rejects the unexpected attribute nested in
*_HOOK_DEVS attributes.

Fixes: 6d07a289504a ("netfilter: nf_tables: Support wildcard netdev hook specs")
Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: conntrack: helper: Replace -EEXIST by -EBUSY</title>
<updated>2025-08-27T09:53:38Z</updated>
<author>
<name>Phil Sutter</name>
<email>phil@nwl.cc</email>
</author>
<published>2025-08-18T11:22:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=54416fd76770bd04fc3c501810e8d673550bab26'/>
<id>urn:sha1:54416fd76770bd04fc3c501810e8d673550bab26</id>
<content type='text'>
The helper registration return value is passed-through by module_init
callbacks which modprobe confuses with the harmless -EEXIST returned
when trying to load an already loaded module.

Make sure modprobe fails so users notice their helper has not been
registered and won't work.

Suggested-by: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Fixes: 12f7a505331e ("netfilter: add user-space connection tracking helper infrastructure")
Signed-off-by: Phil Sutter &lt;phil@nwl.cc&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: reject duplicate device on updates</title>
<updated>2025-08-13T06:34:55Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2025-08-13T00:38:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cf5fb87fcdaaaafec55dcc0dc5a9e15ead343973'/>
<id>urn:sha1:cf5fb87fcdaaaafec55dcc0dc5a9e15ead343973</id>
<content type='text'>
A chain/flowtable update with duplicated devices in the same batch is
possible. Unfortunately, netdev event path only removes the first
device that is found, leaving unregistered the hook of the duplicated
device.

Check if a duplicated device exists in the transaction batch, bail out
with EEXIST in such case.

WARNING is hit when unregistering the hook:

 [49042.221275] WARNING: CPU: 4 PID: 8425 at net/netfilter/core.c:340 nf_hook_entry_head+0xaa/0x150
 [49042.221375] CPU: 4 UID: 0 PID: 8425 Comm: nft Tainted: G S                  6.16.0+ #170 PREEMPT(full)
 [...]
 [49042.221382] RIP: 0010:nf_hook_entry_head+0xaa/0x150

Fixes: 78d9f48f7f44 ("netfilter: nf_tables: add devices to existing flowtable")
Fixes: b9703ed44ffb ("netfilter: nf_tables: support for adding new devices to an existing netdev chain")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
<entry>
<title>ipvs: Fix estimator kthreads preferred affinity</title>
<updated>2025-08-13T06:34:33Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2025-07-29T12:26:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c0a23bbc98e93704a1f4fb5e7e7bb2d7c0fb6eb3'/>
<id>urn:sha1:c0a23bbc98e93704a1f4fb5e7e7bb2d7c0fb6eb3</id>
<content type='text'>
The estimator kthreads' affinity are defined by sysctl overwritten
preferences and applied through a plain call to the scheduler's affinity
API.

However since the introduction of managed kthreads preferred affinity,
such a practice shortcuts the kthreads core code which eventually
overwrites the target to the default unbound affinity.

Fix this with using the appropriate kthread's API.

Fixes: d1a89197589c ("kthread: Default affine kthread to its preferred NUMA node")
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
</content>
</entry>
</feed>
