<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv4/proc.c, branch v3.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2012-03-19T20:53:08Z</updated>
<entry>
<title>tcp: reduce out_of_order memory use</title>
<updated>2012-03-19T20:53:08Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-03-18T11:07:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c8628155ece363487b57d33441ea0359018c0fa7'/>
<id>urn:sha1:c8628155ece363487b57d33441ea0359018c0fa7</id>
<content type='text'>
With increasing receive window sizes, but speed of light not improved
that much, out of order queue can contain a huge number of skbs, waiting
to be moved to receive_queue when missing packets can fill the holes.

Some devices happen to use fat skbs (truesize of 4096 + sizeof(struct
sk_buff)) to store regular (MTU &lt;= 1500) frames. This makes highly
probable sk_rmem_alloc hits sk_rcvbuf limit, which can be 4Mbytes in
many cases.

When limit is hit, tcp stack calls tcp_collapse_ofo_queue(), a true
latency killer and cpu cache blower.

Doing the coalescing attempt each time we add a frame in ofo queue
permits to keep memory use tight and in many cases avoid the
tcp_collapse() thing later.

Tested on various wireless setups (b43, ath9k, ...) known to use big skb
truesize, this patch removed the "packets collapsed in receive queue due
to low socket buffer" I had before.

This also reduced average memory used by tcp sockets.

With help from Neal Cardwell.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Cc: Neal Cardwell &lt;ncardwell@google.com&gt;
Cc: Yuchung Cheng &lt;ycheng@google.com&gt;
Cc: H.K. Jerry Chu &lt;hkchu@google.com&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Cc: Ilpo Järvinen &lt;ilpo.jarvinen@helsinki.fi&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: add LINUX_MIB_TCPRETRANSFAIL counter</title>
<updated>2012-01-26T18:51:00Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2012-01-25T04:44:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=09e9b813d34d9a09d64a64580a9959d8bae1f4f5'/>
<id>urn:sha1:09e9b813d34d9a09d64a64580a9959d8bae1f4f5</id>
<content type='text'>
It might be useful to get a counter of failed tcp_retransmit_skb()
calls.

Reported-by: Satoru Moriya &lt;satoru.moriya@hds.com&gt;
Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: detect loss above high_seq in recovery</title>
<updated>2012-01-22T20:08:44Z</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2012-01-19T14:42:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=974c12360dfe6ab01201fe9e708e7755c413f8b6'/>
<id>urn:sha1:974c12360dfe6ab01201fe9e708e7755c413f8b6</id>
<content type='text'>
Correctly implement a loss detection heuristic: New sequences (above
high_seq) sent during the fast recovery are deemed lost when higher
sequences are SACKed.

Current code does not catch these losses, because tcp_mark_head_lost()
does not check packets beyond high_seq. The fix is straight-forward by
checking packets until the highest sacked packet. In addition, all the
FLAG_DATA_LOST logic are in-effective and redundant and can be removed.

Update the loss heuristic comments. The algorithm above is documented
as heuristic B, but it is redundant too because heuristic A already
covers B.

Note that this change only marks some forward-retransmitted packets LOST.
It does NOT forbid TCP performing further CWR on new losses. A potential
follow-up patch under preparation is to perform another CWR on "new"
losses such as
1) sequence above high_seq is lost (by resetting high_seq to snd_nxt)
2) retransmission is lost.

Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>foundations of per-cgroup memory pressure controlling.</title>
<updated>2011-12-13T00:04:10Z</updated>
<author>
<name>Glauber Costa</name>
<email>glommer@parallels.com</email>
</author>
<published>2011-12-11T21:47:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=180d8cd942ce336b2c869d324855c40c5db478ad'/>
<id>urn:sha1:180d8cd942ce336b2c869d324855c40c5db478ad</id>
<content type='text'>
This patch replaces all uses of struct sock fields' memory_pressure,
memory_allocated, sockets_allocated, and sysctl_mem to acessor
macros. Those macros can either receive a socket argument, or a mem_cgroup
argument, depending on the context they live in.

Since we're only doing a macro wrapping here, no performance impact at all is
expected in the case where we don't have cgroups disabled.

Signed-off-by: Glauber Costa &lt;glommer@parallels.com&gt;
Reviewed-by: Hiroyouki Kamezawa &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
CC: David S. Miller &lt;davem@davemloft.net&gt;
CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
CC: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: reduce percpu needs for icmpmsg mibs</title>
<updated>2011-11-09T21:04:20Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-11-08T13:04:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=acb32ba3dee66d58704caeeb8c6ff95f60efdc66'/>
<id>urn:sha1:acb32ba3dee66d58704caeeb8c6ff95f60efdc66</id>
<content type='text'>
Reading /proc/net/snmp on a machine with a lot of cpus is very expensive
(can be ~88000 us).

This is because ICMPMSG MIB uses 4096 bytes per cpu, and folding values
for all possible cpus can read 16 Mbytes of memory.

ICMP messages are not considered as fast path on a typical server, and
eventually few cpus handle them anyway. We can afford an atomic
operation instead of using percpu data.

This saves 4096 bytes per cpu and per network namespace.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Add export.h for EXPORT_SYMBOL/THIS_MODULE to non-modules</title>
<updated>2011-10-31T23:30:30Z</updated>
<author>
<name>Paul Gortmaker</name>
<email>paul.gortmaker@windriver.com</email>
</author>
<published>2011-07-15T15:47:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bc3b2d7fb9b014d75ebb79ba371a763dbab5e8cf'/>
<id>urn:sha1:bc3b2d7fb9b014d75ebb79ba371a763dbab5e8cf</id>
<content type='text'>
These files are non modular, but need to export symbols using
the macros now living in export.h -- call out the include so
that things won't break when we remove the implicit presence
of module.h from everywhere.

Signed-off-by: Paul Gortmaker &lt;paul.gortmaker@windriver.com&gt;
</content>
</entry>
<entry>
<title>tcp: Change possible SYN flooding messages</title>
<updated>2011-09-15T18:49:43Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2011-08-30T03:21:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=946cedccbd7387488d2cee5da92cdfeb28d2e670'/>
<id>urn:sha1:946cedccbd7387488d2cee5da92cdfeb28d2e670</id>
<content type='text'>
"Possible SYN flooding on port xxxx " messages can fill logs on servers.

Change logic to log the message only once per listener, and add two new
SNMP counters to track :

TCPReqQFullDoCookies : number of times a SYNCOOKIE was replied to client

TCPReqQFullDrop : number of times a SYN request was dropped because
syncookies were not enabled.

Based on a prior patch from Tom Herbert, and suggestions from David.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
CC: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: Replace time wait bucket msg by counter</title>
<updated>2010-12-08T20:16:33Z</updated>
<author>
<name>Tom Herbert</name>
<email>therbert@google.com</email>
</author>
<published>2010-12-08T20:16:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=67631510a318d5a930055fe927607f483716e100'/>
<id>urn:sha1:67631510a318d5a930055fe927607f483716e100</id>
<content type='text'>
Rather than printing the message to the log, use a mib counter to keep
track of the count of occurences of time wait bucket overflow.  Reduces
spam in logs.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: avoid limits overflow</title>
<updated>2010-11-10T20:12:00Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-11-09T23:24:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8d987e5c75107ca7515fa19e857cfa24aab6ec8f'/>
<id>urn:sha1:8d987e5c75107ca7515fa19e857cfa24aab6ec8f</id>
<content type='text'>
Robin Holt tried to boot a 16TB machine and found some limits were
reached : sysctl_tcp_mem[2], sysctl_udp_mem[2]

We can switch infrastructure to use long "instead" of "int", now
atomic_long_t primitives are available for free.

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Reported-by: Robin Holt &lt;holt@sgi.com&gt;
Reviewed-by: Robin Holt &lt;holt@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>snmp: 64bit ipstats_mib for all arches</title>
<updated>2010-06-30T20:31:19Z</updated>
<author>
<name>Eric Dumazet</name>
<email>eric.dumazet@gmail.com</email>
</author>
<published>2010-06-30T20:31:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4ce3c183fcade7f4b30a33dae90cd774c3d9e094'/>
<id>urn:sha1:4ce3c183fcade7f4b30a33dae90cd774c3d9e094</id>
<content type='text'>
/proc/net/snmp and /proc/net/netstat expose SNMP counters.

Width of these counters is either 32 or 64 bits, depending on the size
of "unsigned long" in kernel.

This means user program parsing these files must already be prepared to
deal with 64bit values, regardless of user program being 32 or 64 bit.

This patch introduces 64bit snmp values for IPSTAT mib, where some
counters can wrap pretty fast if they are 32bit wide.

# netstat -s|egrep "InOctets|OutOctets"
    InOctets: 244068329096
    OutOctets: 244069348848

Signed-off-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
