<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/netfilter/xt_socket.c, branch v4.9</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.9</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.9'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-04-05T02:11:20Z</updated>
<entry>
<title>tcp/dccp: do not touch listener sk_refcnt under synflood</title>
<updated>2016-04-05T02:11:20Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2016-04-01T15:52:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3b24d854cb35383c30642116e5992fd619bdc9bc'/>
<id>urn:sha1:3b24d854cb35383c30642116e5992fd619bdc9bc</id>
<content type='text'>
When a SYNFLOOD targets a non SO_REUSEPORT listener, multiple
cpus contend on sk-&gt;sk_refcnt and sk-&gt;sk_wmem_alloc changes.

By letting listeners use SOCK_RCU_FREE infrastructure,
we can relax TCP_LISTEN lookup rules and avoid touching sk_refcnt

Note that we still use SLAB_DESTROY_BY_RCU rules for other sockets,
only listeners are impacted by this change.

Peak performance under SYNFLOOD is increased by ~33% :

On my test machine, I could process 3.2 Mpps instead of 2.4 Mpps

Most consuming functions are now skb_set_owner_w() and sock_wfree()
contending on sk-&gt;sk_wmem_alloc when cooking SYNACK and freeing them.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: refactor inet[6]_lookup functions to take skb</title>
<updated>2016-02-11T08:54:14Z</updated>
<author>
<name>Craig Gallek</name>
<email>kraig@google.com</email>
</author>
<published>2016-02-10T16:50:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a583636a83ea383fd07517e5a7a2eedbc5d90fb1'/>
<id>urn:sha1:a583636a83ea383fd07517e5a7a2eedbc5d90fb1</id>
<content type='text'>
This is a preliminary step to allow fast socket lookup of SO_REUSEPORT
groups.  Doing so with a BPF filter will require access to the
skb in question.  This change plumbs the skb (and offset to payload
data) through the call stack to the listening socket lookup
implementations where it will be used in a following patch.

Signed-off-by: Craig Gallek &lt;kraig@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: x_tables: Use par-&gt;net instead of computing from the passed net devices</title>
<updated>2015-09-18T19:58:25Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2015-09-18T19:32:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=686c9b50809dc80cba7c2e9f809471ab40bae735'/>
<id>urn:sha1:686c9b50809dc80cba7c2e9f809471ab40bae735</id>
<content type='text'>
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: xt_socket: add XT_SOCKET_RESTORESKMARK flag</title>
<updated>2015-06-18T11:05:09Z</updated>
<author>
<name>Harout Hedeshian</name>
<email>harouth@codeaurora.org</email>
</author>
<published>2015-06-16T00:40:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=01555e74bde51444c6898ef1800fb2bc697d479e'/>
<id>urn:sha1:01555e74bde51444c6898ef1800fb2bc697d479e</id>
<content type='text'>
xt_socket is useful for matching sockets with IP_TRANSPARENT and
taking some action on the matching packets. However, it lacks the
ability to match only a small subset of transparent sockets.

Suppose there are 2 applications, each with its own set of transparent
sockets. The first application wants all matching packets dropped,
while the second application wants them forwarded somewhere else.

Add the ability to retore the skb-&gt;mark from the sk_mark. The mark
is only restored if a matching socket is found and the transparent /
nowildcard conditions are satisfied.

Now the 2 hypothetical applications can differentiate their sockets
based on a mark value set with SO_MARK.

iptables -t mangle -I PREROUTING -m socket --transparent \
                                           --restore-skmark -j action
iptables -t mangle -A action -m mark --mark 10 -j action2
iptables -t mangle -A action -m mark --mark 11 -j action3

Signed-off-by: Harout Hedeshian &lt;harouth@codeaurora.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: x_tables: don't extract flow keys on early demuxed sks in socket match</title>
<updated>2015-04-08T14:47:49Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-04-02T12:28:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d64d80a2cde94f3e89caebd27240be419fec5b81'/>
<id>urn:sha1:d64d80a2cde94f3e89caebd27240be419fec5b81</id>
<content type='text'>
Currently in xt_socket, we take advantage of early demuxed sockets
since commit 00028aa37098 ("netfilter: xt_socket: use IP early demux")
in order to avoid a second socket lookup in the fast path, but we
only make partial use of this:

We still unnecessarily parse headers, extract proto, {s,d}addr and
{s,d}ports from the skb data, accessing possible conntrack information,
etc even though we were not even calling into the socket lookup via
xt_socket_get_sock_{v4,v6}() due to skb-&gt;sk hit, meaning those cycles
can be spared.

After this patch, we only proceed the slower, manual lookup path
when we have a skb-&gt;sk miss, thus time to match verdict for early
demuxed sockets will improve further, which might be i.e. interesting
for use cases such as mentioned in 681f130f39e1 ("netfilter: xt_socket:
add XT_SOCKET_NOWILDCARD flag").

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: xt_socket: prepare for TCP_NEW_SYN_RECV support</title>
<updated>2015-03-17T19:17:59Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-17T04:06:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a9407000038805e5215a49c0a50c9e2b2ff38220'/>
<id>urn:sha1:a9407000038805e5215a49c0a50c9e2b2ff38220</id>
<content type='text'>
TCP request socks soon will be visible in ehash table.

xt_socket will be able to match them, but first we need
to make sure to not consider them as full sockets.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: xt_socket: fix a stack corruption bug</title>
<updated>2015-02-16T16:00:48Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-02-16T03:03:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=78296c97ca1fd3b104f12e1f1fbc06c46635990b'/>
<id>urn:sha1:78296c97ca1fd3b104f12e1f1fbc06c46635990b</id>
<content type='text'>
As soon as extract_icmp6_fields() returns, its local storage (automatic
variables) is deallocated and can be overwritten.

Lets add an additional parameter to make sure storage is valid long
enough.

While we are at it, adds some const qualifiers.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Fixes: b64c9256a9b76 ("tproxy: added IPv6 support to the socket match")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: xt_socket: use sock_gen_put()</title>
<updated>2013-10-17T08:27:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-11T16:03:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1a8bf6eeef9fe417f90e5338a2fd7fba69c6d0e4'/>
<id>urn:sha1:1a8bf6eeef9fe417f90e5338a2fd7fba69c6d0e4</id>
<content type='text'>
TCP listener refactoring, part 7 :

Use sock_gen_put() instead of xt_socket_put_sk() for future
SYN_RECV support.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipv6: make lookups simpler and faster</title>
<updated>2013-10-09T04:01:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2013-10-03T22:42:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=efe4208f47f907b86f528788da711e8ab9dea44d'/>
<id>urn:sha1:efe4208f47f907b86f528788da711e8ab9dea44d</id>
<content type='text'>
TCP listener refactoring, part 4 :

To speed up inet lookups, we moved IPv4 addresses from inet to struct
sock_common

Now is time to do the same for IPv6, because it permits us to have fast
lookups for all kind of sockets, including upcoming SYN_RECV.

Getting IPv6 addresses in TCP lookups currently requires two extra cache
lines, plus a dereference (and memory stall).

inet6_sk(sk) does the dereference of inet_sk(__sk)-&gt;pinet6

This patch is way bigger than its IPv4 counter part, because for IPv4,
we could add aliases (inet_daddr, inet_rcv_saddr), while on IPv6,
it's not doable easily.

inet6_sk(sk)-&gt;daddr becomes sk-&gt;sk_v6_daddr
inet6_sk(sk)-&gt;rcv_saddr becomes sk-&gt;sk_v6_rcv_saddr

And timewait socket also have tw-&gt;tw_v6_daddr &amp; tw-&gt;tw_v6_rcv_saddr
at the same offset.

We get rid of INET6_TW_MATCH() as INET6_MATCH() is now the generic
macro.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next</title>
<updated>2013-08-20T20:30:54Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2013-08-20T20:30:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=89d5e23210f53ab53b7ff64843bce62a106d454f'/>
<id>urn:sha1:89d5e23210f53ab53b7ff64843bce62a106d454f</id>
<content type='text'>
Conflicts:
	net/netfilter/nf_conntrack_proto_tcp.c

The conflict had to do with overlapping changes dealing with
fixing the use of an "s32" to hold the value returned by
NAT_OFFSET().

Pablo Neira Ayuso says:

====================
The following batch contains Netfilter/IPVS updates for your net-next tree.
More specifically, they are:

* Trivial typo fix in xt_addrtype, from Phil Oester.

* Remove net_ratelimit in the conntrack logging for consistency with other
  logging subsystem, from Patrick McHardy.

* Remove unneeded includes from the recently added xt_connlabel support, from
  Florian Westphal.

* Allow to update conntracks via nfqueue, don't need NFQA_CFG_F_CONNTRACK for
  this, from Florian Westphal.

* Remove tproxy core, now that we have socket early demux, from Florian
  Westphal.

* A couple of patches to refactor conntrack event reporting to save a good
  bunch of lines, from Florian Westphal.

* Fix missing locking in NAT sequence adjustment, it did not manifested in
  any known bug so far, from Patrick McHardy.

* Change sequence number adjustment variable to 32 bits, to delay the
  possible early overflow in long standing connections, also from Patrick.

* Comestic cleanups for IPVS, from Dragos Foianu.

* Fix possible null dereference in IPVS in the SH scheduler, from Daniel
  Borkmann.

* Allow to attach conntrack expectations via nfqueue. Before this patch, you
  had to use ctnetlink instead, thus, we save the conntrack lookup.

* Export xt_rpfilter and xt_HMARK header files, from Nicolas Dichtel.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
