<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net/request_sock.h, branch v4.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-05-05T20:02:34Z</updated>
<entry>
<title>tcp: provide SYN headers for passive connections</title>
<updated>2015-05-05T20:02:34Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-05-04T04:34:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cd8ae85299d54155702a56811b2e035e63064d3d'/>
<id>urn:sha1:cd8ae85299d54155702a56811b2e035e63064d3d</id>
<content type='text'>
This patch allows a server application to get the TCP SYN headers for
its passive connections.  This is useful if the server is doing
fingerprinting of clients based on SYN packet contents.

Two socket options are added: TCP_SAVE_SYN and TCP_SAVED_SYN.

The first is used on a socket to enable saving the SYN headers
for child connections. This can be set before or after the listen()
call.

The latter is used to retrieve the SYN headers for passive connections,
if the parent listener has enabled TCP_SAVE_SYN.

TCP_SAVED_SYN is read once, it frees the saved SYN headers.

The data returned in TCP_SAVED_SYN are network (IPv4/IPv6) and TCP
headers.

Original patch was written by Tom Herbert, I changed it to not hold
a full skb (and associated dst and conntracking reference).

We have used such patch for about 3 years at Google.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Tested-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: fix possible panic in reqsk_queue_unlink()</title>
<updated>2015-04-24T15:39:15Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-04-24T01:03:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b357a364c57c940ddb932224542494363df37378'/>
<id>urn:sha1:b357a364c57c940ddb932224542494363df37378</id>
<content type='text'>
[ 3897.923145] BUG: unable to handle kernel NULL pointer dereference at
 0000000000000080
[ 3897.931025] IP: [&lt;ffffffffa9f27686&gt;] reqsk_timer_handler+0x1a6/0x243

There is a race when reqsk_timer_handler() and tcp_check_req() call
inet_csk_reqsk_queue_unlink() on the same req at the same time.

Before commit fa76ce7328b2 ("inet: get rid of central tcp/dccp listener
timer"), listener spinlock was held and race could not happen.

To solve this bug, we change reqsk_queue_unlink() to not assume req
must be found, and we return a status, to conditionally release a
refcount on the request sock.

This also means tcp_check_req() in non fastopen case might or not
consume req refcount, so tcp_v6_hnd_req() &amp; tcp_v4_hnd_req() have
to properly handle this.

(Same remark for dccp_check_req() and its callers)

inet_csk_reqsk_queue_drop() is now too big to be inlined, as it is
called 4 times in tcp and 3 times in dccp.

Fixes: fa76ce7328b2 ("inet: get rid of central tcp/dccp listener timer")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: convert syn_wait_lock to a spinlock</title>
<updated>2015-03-23T20:52:26Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-22T17:22:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b282705336e03fc7b9377a278939594870a40f96'/>
<id>urn:sha1:b282705336e03fc7b9377a278939594870a40f96</id>
<content type='text'>
This is a low hanging fruit, as we'll get rid of syn_wait_lock eventually.

We hold syn_wait_lock for such small sections, that it makes no sense to use
a read/write lock. A spin lock is simply faster.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: remove sk_listener parameter from syn_ack_timeout()</title>
<updated>2015-03-23T20:52:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-22T17:22:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=42cb80a2353f42913ae78074ffa1f1b4a49e5436'/>
<id>urn:sha1:42cb80a2353f42913ae78074ffa1f1b4a49e5436</id>
<content type='text'>
It is not needed, and req-&gt;sk_listener points to the listener anyway.
request_sock argument can be const.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: get rid of central tcp/dccp listener timer</title>
<updated>2015-03-20T16:40:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-20T02:04:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fa76ce7328b289b6edd476e24eb52fd634261720'/>
<id>urn:sha1:fa76ce7328b289b6edd476e24eb52fd634261720</id>
<content type='text'>
One of the major issue for TCP is the SYNACK rtx handling,
done by inet_csk_reqsk_queue_prune(), fired by the keepalive
timer of a TCP_LISTEN socket.

This function runs for awful long times, with socket lock held,
meaning that other cpus needing this lock have to spin for hundred of ms.

SYNACK are sent in huge bursts, likely to cause severe drops anyway.

This model was OK 15 years ago when memory was very tight.

We now can afford to have a timer per request sock.

Timer invocations no longer need to lock the listener,
and can be run from all cpus in parallel.

With following patch increasing somaxconn width to 32 bits,
I tested a listener with more than 4 million active request sockets,
and a steady SYNFLOOD of ~200,000 SYN per second.
Host was sending ~830,000 SYNACK per second.

This is ~100 times more what we could achieve before this patch.

Later, we will get rid of the listener hash and use ehash instead.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: drop prev pointer handling in request sock</title>
<updated>2015-03-20T16:40:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-20T02:04:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=52452c542559ac980b48dbf22a30ee7fa0af507c'/>
<id>urn:sha1:52452c542559ac980b48dbf22a30ee7fa0af507c</id>
<content type='text'>
When request sock are put in ehash table, the whole notion
of having a previous request to update dl_next is pointless.

Also, following patch will get rid of big purge timer,
so we want to delete a request sock without holding listener lock.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: request sock should init IPv6/IPv4 addresses</title>
<updated>2015-03-19T02:00:35Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-18T21:05:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=08d2cc3b26554cae21f279b520ae5c2a3b2be421'/>
<id>urn:sha1:08d2cc3b26554cae21f279b520ae5c2a3b2be421</id>
<content type='text'>
In order to be able to use sk_ehashfn() for request socks,
we need to initialize their IPv6/IPv4 addresses.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: fix request sock refcounting</title>
<updated>2015-03-18T02:02:29Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-18T01:32:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0470c8ca1d57927f2cc3e1d5add1fb2834609447'/>
<id>urn:sha1:0470c8ca1d57927f2cc3e1d5add1fb2834609447</id>
<content type='text'>
While testing last patch series, I found req sock refcounting was wrong.

We must set skc_refcnt to 1 for all request socks added in hashes,
but also on request sockets created by FastOpen or syncookies.

It is tricky because we need to defer this initialization so that
future RCU lookups do not try to take a refcount on a not yet
fully initialized request socket.

Also get rid of ireq_refcnt alias.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Fixes: 13854e5a6046 ("inet: add proper refcounting to request sock")
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: add rsk_listener field to struct request_sock</title>
<updated>2015-03-18T02:01:56Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-18T01:32:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e9a578e5b6bdfa8b7fed7a41f28a86a7cffc85f'/>
<id>urn:sha1:4e9a578e5b6bdfa8b7fed7a41f28a86a7cffc85f</id>
<content type='text'>
Once we'll be able to lookup request sockets in ehash table,
we'll need to get access to listener which created this request.

This avoid doing a lookup to find the listener, which benefits
for a more solid SO_REUSEPORT, and is needed once we no
longer queue request sock into a listener private queue.

Note that 'struct tcp_request_sock'-&gt;listener could be reduced
to a single bit, as TFO listener should match req-&gt;rsk_listener.
TFO will no longer need to hold a reference on the listener.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>inet: add proper refcounting to request sock</title>
<updated>2015-03-16T19:55:29Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-03-16T04:12:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=13854e5a60461daee08ce99842b7f4d37553d911'/>
<id>urn:sha1:13854e5a60461daee08ce99842b7f4d37553d911</id>
<content type='text'>
reqsk_put() is the generic function that should be used
to release a refcount (and automatically call reqsk_free())

reqsk_free() might be called if refcount is known to be 0
or undefined.

refcnt is set to one in inet_csk_reqsk_queue_add()

As request socks are not yet in global ehash table,
I added temporary debugging checks in reqsk_put() and reqsk_free()

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
