<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net/ip.h, branch v3.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-08-10T21:08:57Z</updated>
<entry>
<title>ipv4: fix ip_send_skb()</title>
<updated>2012-08-10T21:08:57Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-08-10T02:22:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b5ec8eeac46a99004c26791f70b15d001e970acf'/>
<id>urn:sha1:b5ec8eeac46a99004c26791f70b15d001e970acf</id>
<content type='text'>
ip_send_skb() can send orphaned skb, so we must pass the net pointer to
avoid possible NULL dereference in error path.

Bug added by commit 3a7c384ffd57 (ipv4: tcp: unicast_sock should not
land outside of TCP stack)

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: tcp: remove per net tcp_sock</title>
<updated>2012-07-19T17:35:30Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-07-19T07:34:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=be9f4a44e7d41cee50ddb5f038fc2391cbbb4046'/>
<id>urn:sha1:be9f4a44e7d41cee50ddb5f038fc2391cbbb4046</id>
<content type='text'>
tcp_v4_send_reset() and tcp_v4_send_ack() use a single socket
per network namespace.

This leads to bad behavior on multiqueue NICS, because many cpus
contend for the socket lock and once socket lock is acquired, extra
false sharing on various socket fields slow down the operations.

To better resist to attacks, we use a percpu socket. Each cpu can
run without contention, using appropriate memory (local node)

Additional features :

1) We also mirror the queue_mapping of the incoming skb, so that
answers use the same queue if possible.

2) Setting SOCK_USE_WRITE_QUEUE socket flag speedup sock_wfree()

3) We now limit the number of in-flight RST/ACK [1] packets
per cpu, instead of per namespace, and we honor the sysctl_wmem_default
limit dynamically. (Prior to this patch, sysctl_wmem_default value was
copied at boot time, so any further change would not affect tcp_sock
limit)

[1] These packets are only generated when no socket was matched for
the incoming packet.

Reported-by: Bill Sommerfeld &lt;wsommerfeld@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Show that ip_send_reply() is purely unicast routine.</title>
<updated>2012-06-28T10:21:41Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-06-28T10:21:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=70e7341673a47fb1525cfc7d6651cc98b5348928'/>
<id>urn:sha1:70e7341673a47fb1525cfc7d6651cc98b5348928</id>
<content type='text'>
Rename it to ip_send_unicast_reply() and add explicit 'saddr'
argument.

This removed one of the few users of rt-&gt;rt_spec_dst.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Add sysctl knob to control early socket demux</title>
<updated>2012-06-23T00:11:13Z</updated>
<author>
<name>Alexander Duyck</name>
<email>alexander.h.duyck@intel.com</email>
</author>
<published>2012-06-21T13:58:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6648bd7e0e62c0c8c03b15e00c9e7015e232feff'/>
<id>urn:sha1:6648bd7e0e62c0c8c03b15e00c9e7015e232feff</id>
<content type='text'>
This change is meant to add a control for disabling early socket demux.
The main motivation behind this patch is to provide an option to disable
the feature as it adds an additional cost to routing that reduces overall
throughput by up to 5%.  For example one of my systems went from 12.1Mpps
to 11.6 after the early socket demux was added.  It looks like the reason
for the regression is that we are now having to perform two lookups, first
the one for an established socket, and then the one for the routing table.

By adding this patch and toggling the value for ip_early_demux to 0 I am
able to get back to the 12.1Mpps I was previously seeing.

[ Move local variables in ip_rcv_finish() down into the basic
  block in which they are actually used.  -DaveM ]

Signed-off-by: Alexander Duyck &lt;alexander.h.duyck@intel.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: delete all instances of special processing for token ring</title>
<updated>2012-05-16T00:14:35Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2012-05-10T21:14:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=211ed865108e24697b44bee5daac502ee6bdd4a4'/>
<id>urn:sha1:211ed865108e24697b44bee5daac502ee6bdd4a4</id>
<content type='text'>
We are going to delete the Token ring support.  This removes any
special processing in the core networking for token ring, (aside
from net/tr.c itself), leaving the drivers and remaining tokenring
support present but inert.

The mass removal of the drivers and net/tr.c will be in a separate
commit, so that the history of these files that we still care
about won't have the giant deletion tied into their history.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>net: Delete all remaining instances of ctl_path</title>
<updated>2012-04-21T01:22:30Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2012-04-19T13:45:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a5347fe36b313c07d59b065d00a8fa56362c5f97'/>
<id>urn:sha1:a5347fe36b313c07d59b065d00a8fa56362c5f97</id>
<content type='text'>
We don't use struct ctl_path anymore so delete the exported constants.

Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Acked-by: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Make ip_call_ra_chain() return bool.</title>
<updated>2012-03-09T22:34:50Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2012-03-08T01:45:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ba57b4db2624793c6eb8f2c051c9f7b8a6e7b6a6'/>
<id>urn:sha1:ba57b4db2624793c6eb8f2c051c9f7b8a6e7b6a6</id>
<content type='text'>
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: use IS_ENABLED(CONFIG_IPV6)</title>
<updated>2011-12-11T23:25:16Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-12-10T09:48:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dfd56b8b38fff3586f36232db58e1e9f7885a605'/>
<id>urn:sha1:dfd56b8b38fff3586f36232db58e1e9f7885a605</id>
<content type='text'>
Instead of testing defined(CONFIG_IPV6) || defined(CONFIG_IPV6_MODULE)

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: PKTINFO doesnt need dst reference</title>
<updated>2011-11-09T21:36:27Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-11-09T07:24:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d826eb14ecef3574b6b3be55e5f4329f4a76fbf3'/>
<id>urn:sha1:d826eb14ecef3574b6b3be55e5f4329f4a76fbf3</id>
<content type='text'>
Le lundi 07 novembre 2011 à 15:33 +0100, Eric Dumazet a écrit :

&gt; At least, in recent kernels we dont change dst-&gt;refcnt in forwarding
&gt; patch (usinf NOREF skb-&gt;dst)
&gt;
&gt; One particular point is the atomic_inc(dst-&gt;refcnt) we have to perform
&gt; when queuing an UDP packet if socket asked PKTINFO stuff (for example a
&gt; typical DNS server has to setup this option)
&gt;
&gt; I have one patch somewhere that stores the information in skb-&gt;cb[] and
&gt; avoid the atomic_{inc|dec}(dst-&gt;refcnt).
&gt;

OK I found it, I did some extra tests and believe its ready.

[PATCH net-next] ipv4: IP_PKTINFO doesnt need dst reference

When a socket uses IP_PKTINFO notifications, we currently force a dst
reference for each received skb. Reader has to access dst to get needed
information (rt_iif &amp; rt_spec_dst) and must release dst reference.

We also forced a dst reference if skb was put in socket backlog, even
without IP_PKTINFO handling. This happens under stress/load.

We can instead store the needed information in skb-&gt;cb[], so that only
softirq handler really access dst, improving cache hit ratios.

This removes two atomic operations per packet, and false sharing as
well.

On a benchmark using a mono threaded receiver (doing only recvmsg()
calls), I can reach 720.000 pps instead of 570.000 pps.

IP_PKTINFO is typically used by DNS servers, and any multihomed aware
UDP application.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: tcp: fix TOS value in ACK messages sent from TIME_WAIT</title>
<updated>2011-10-24T07:06:21Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-10-24T07:06:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=66b13d99d96a1a69f47a6bc3dc47f45955967377'/>
<id>urn:sha1:66b13d99d96a1a69f47a6bc3dc47f45955967377</id>
<content type='text'>
There is a long standing bug in linux tcp stack, about ACK messages sent
on behalf of TIME_WAIT sockets.

In the IP header of the ACK message, we choose to reflect TOS field of
incoming message, and this might break some setups.

Example of things that were broken :
  - Routing using TOS as a selector
  - Firewalls
  - Trafic classification / shaping

We now remember in timewait structure the inet tos field and use it in
ACK generation, and route lookup.

Notes :
 - We still reflect incoming TOS in RST messages.
 - We could extend MuraliRaja Muniraju patch to report TOS value in
netlink messages for TIME_WAIT sockets.
 - A patch is needed for IPv6

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
