<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv4, branch v4.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-01-06T21:39:56Z</updated>
<entry>
<title>tcp: fix zero cwnd in tcp_cwnd_reduction</title>
<updated>2016-01-06T21:39:56Z</updated>
<author>
<name>Yuchung Cheng</name>
<email>ycheng@google.com</email>
</author>
<published>2016-01-06T20:42:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8b8a321ff72c785ed5e8b4cf6eda20b35d427390'/>
<id>urn:sha1:8b8a321ff72c785ed5e8b4cf6eda20b35d427390</id>
<content type='text'>
Patch 3759824da87b ("tcp: PRR uses CRB mode by default and SS mode
conditionally") introduced a bug that cwnd may become 0 when both
inflight and sndcnt are 0 (cwnd = inflight + sndcnt). This may lead
to a div-by-zero if the connection starts another cwnd reduction
phase by setting tp-&gt;prior_cwnd to the current cwnd (0) in
tcp_init_cwnd_reduction().

To prevent this we skip PRR operation when nothing is acked or
sacked. Then cwnd must be positive in all cases as long as ssthresh
is positive:

1) The proportional reduction mode
   inflight &gt; ssthresh &gt; 0

2) The reduction bound mode
  a) inflight == ssthresh &gt; 0

  b) inflight &lt; ssthresh
     sndcnt &gt; 0 since newly_acked_sacked &gt; 0 and inflight &lt; ssthresh

Therefore in all cases inflight and sndcnt can not both be 0.
We check invalid tp-&gt;prior_cwnd to avoid potential div0 bugs.

In reality this bug is triggered only with a sequence of less common
events.  For example, the connection is terminating an ECN-triggered
cwnd reduction with an inflight 0, then it receives reordered/old
ACKs or DSACKs from prior transmission (which acks nothing). Or the
connection is in fast recovery stage that marks everything lost,
but fails to retransmit due to local issues, then receives data
packets from other end which acks nothing.

Fixes: 3759824da87b ("tcp: PRR uses CRB mode by default and SS mode conditionally")
Reported-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Signed-off-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Propagate lookup failure in l3mdev_get_saddr to caller</title>
<updated>2016-01-05T03:58:30Z</updated>
<author>
<name>David Ahern</name>
<email>dsa@cumulusnetworks.com</email>
</author>
<published>2016-01-04T17:09:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b5bdacf3bb027ba0af4d61b38ec289bfc8b64372'/>
<id>urn:sha1:b5bdacf3bb027ba0af4d61b38ec289bfc8b64372</id>
<content type='text'>
Commands run in a vrf context are not failing as expected on a route lookup:
    root@kenny:~# ip ro ls table vrf-red
    unreachable default

    root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
    ping: Warning: source address might be selected on device other than vrf-red.
    PING 10.100.1.254 (10.100.1.254) from 0.0.0.0 vrf-red: 56(84) bytes of data.

    --- 10.100.1.254 ping statistics ---
    2 packets transmitted, 0 received, 100% packet loss, time 999ms

Since the vrf table does not have a route for 10.100.1.254 the ping
should have failed. The saddr lookup causes a full VRF table lookup.
Propogating a lookup failure to the user allows the command to fail as
expected:

    root@kenny:~# ping -I vrf-red -c1 -w1 10.100.1.254
    connect: No route to host

Signed-off-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec</title>
<updated>2015-12-22T21:26:31Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-12-22T21:26:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=024f35c55233fb6041e2b31165271f5b941802e6'/>
<id>urn:sha1:024f35c55233fb6041e2b31165271f5b941802e6</id>
<content type='text'>
Steffen Klassert says:

====================
pull request (net): ipsec 2015-12-22

Just one patch to fix dst_entries_init with multiple namespaces.
From Dan Streetman.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipip: ioctl: Remove superfluous IP-TTL handling.</title>
<updated>2015-12-18T21:07:59Z</updated>
<author>
<name>Pravin B Shelar</name>
<email>pshelar@nicira.com</email>
</author>
<published>2015-12-18T00:46:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6d3c348a63685410b12bf961b97063efeef2f901'/>
<id>urn:sha1:6d3c348a63685410b12bf961b97063efeef2f901</id>
<content type='text'>
IP-TTL case is already handled in ip_tunnel_ioctl() API.

Signed-off-by: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: restore fastopen with no data in SYN packet</title>
<updated>2015-12-17T20:37:39Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-12-16T21:53:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=07e100f984975cb0417a7d5e626d0409efbad478'/>
<id>urn:sha1:07e100f984975cb0417a7d5e626d0409efbad478</id>
<content type='text'>
Yuchung tracked a regression caused by commit 57be5bdad759 ("ip: convert
tcp_sendmsg() to iov_iter primitives") for TCP Fast Open.

Some Fast Open users do not actually add any data in the SYN packet.

Fixes: 57be5bdad759 ("ip: convert tcp_sendmsg() to iov_iter primitives")
Reported-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>fou: clean up socket with kfree_rcu</title>
<updated>2015-12-17T00:03:02Z</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2015-12-15T20:01:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3036facbb7be3a169e35be3b271162b0fa564a2d'/>
<id>urn:sha1:3036facbb7be3a169e35be3b271162b0fa564a2d</id>
<content type='text'>
fou-&gt;udp_offloads is managed by RCU. As it is actually included inside
the fou sockets, we cannot let the memory go out of scope before a grace
period. We either can synchronize_rcu or switch over to kfree_rcu to
manage the sockets. kfree_rcu seems appropriate as it is used by vxlan
and geneve.

Fixes: 23461551c00628c ("fou: Support for foo-over-udp RX path")
Cc: Tom Herbert &lt;tom@herbertland.com&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: fix IP early demux races</title>
<updated>2015-12-15T04:52:00Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-12-14T22:08:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5037e9ef9454917b047f9f3a19b4dd179fbf7cd4'/>
<id>urn:sha1:5037e9ef9454917b047f9f3a19b4dd179fbf7cd4</id>
<content type='text'>
David Wilder reported crashes caused by dst reuse.

&lt;quote David&gt;
  I am seeing a crash on a distro V4.2.3 kernel caused by a double
  release of a dst_entry.  In ipv4_dst_destroy() the call to
  list_empty() finds a poisoned next pointer, indicating the dst_entry
  has already been removed from the list and freed. The crash occurs
  18 to 24 hours into a run of a network stress exerciser.
&lt;/quote&gt;

Thanks to his detailed report and analysis, we were able to understand
the core issue.

IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.

When socket cache is not properly set, we want to store into
sk-&gt;sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.

Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.

We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.

This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.

It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb-&gt;dst
can suddenly be cleared.

Can probably be backported back to linux-3.6 kernels

Reported-by: David J. Wilder &lt;dwilder@us.ibm.com&gt;
Tested-by: David J. Wilder &lt;dwilder@us.ibm.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: add validation for the socket syscall protocol argument</title>
<updated>2015-12-14T21:09:30Z</updated>
<author>
<name>Hannes Frederic Sowa</name>
<email>hannes@stressinduktion.org</email>
</author>
<published>2015-12-14T21:03:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=79462ad02e861803b3840cc782248c7359451cd9'/>
<id>urn:sha1:79462ad02e861803b3840cc782248c7359451cd9</id>
<content type='text'>
郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

	int socket_fd;
	struct sockaddr_in addr;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_family = 10;

	socket_fd = socket(10,3,0x40000000);
	connect(socket_fd , &amp;addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel:  [&lt;ffffffff816db90e&gt;] ? inet_autobind+0x2e/0x70
kernel:  [&lt;ffffffff816db9a4&gt;] inet_dgram_connect+0x54/0x80
kernel:  [&lt;ffffffff81645069&gt;] SYSC_connect+0xd9/0x110
kernel:  [&lt;ffffffff810ac51b&gt;] ? ptrace_notify+0x5b/0x80
kernel:  [&lt;ffffffff810236d8&gt;] ? syscall_trace_enter_phase2+0x108/0x200
kernel:  [&lt;ffffffff81645e0e&gt;] SyS_connect+0xe/0x10
kernel:  [&lt;ffffffff81779515&gt;] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang &lt;cwang@twopensource.com&gt;
Reported-by: 郭永刚 &lt;guoyonggang@360.cn&gt;
Signed-off-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2015-12-14T16:09:01Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-12-14T16:09:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9e5be5bd4382a6cf4c128eec798cebf8b635c3dd'/>
<id>urn:sha1:9e5be5bd4382a6cf4c128eec798cebf8b635c3dd</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
netfilter fixes for net

The following patchset contains Netfilter fixes for you net tree,
specifically for nf_tables and nfnetlink_queue, they are:

1) Avoid a compilation warning in nfnetlink_queue that was introduced
   in the previous merge window with the simplification of the conntrack
   integration, from Arnd Bergmann.

2) nfnetlink_queue is leaking the pernet subsystem registration from
   a failure path, patch from Nikolay Borisov.

3) Pass down netns pointer to batch callback in nfnetlink, this is the
   largest patch and it is not a bugfix but it is a dependency to
   resolve a splat in the correct way.

4) Fix a splat due to incorrect socket memory accounting with nfnetlink
   skbuff clones.

5) Add missing conntrack dependencies to NFT_DUP_IPV4 and NFT_DUP_IPV6.

6) Traverse the nftables commit list in reverse order from the commit
   path, otherwise we crash when the user applies an incremental update
   via 'nft -f' that deletes an object that was just introduced in this
   batch, from Xin Long.

Regarding the compilation warning fix, many people have sent us (and
keep sending us) patches to address this, that's why I'm including this
batch even if this is not critical.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Flush local routes when device changes vrf association</title>
<updated>2015-12-14T04:58:44Z</updated>
<author>
<name>David Ahern</name>
<email>dsa@cumulusnetworks.com</email>
</author>
<published>2015-12-10T18:25:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7f49e7a38b77a7538acf48762c22ccbd05d9535c'/>
<id>urn:sha1:7f49e7a38b77a7538acf48762c22ccbd05d9535c</id>
<content type='text'>
The VRF driver cycles netdevs when an interface is enslaved or released:
the down event is used to flush neighbor and route tables and the up
event (if the interface was already up) effectively moves local and
connected routes to the proper table.

As of 4f823defdd5b the local route is left hanging around after a link
down, so when a netdev is moved from one VRF to another (or released
from a VRF altogether) local routes are left in the wrong table.

Fix by handling the NETDEV_CHANGEUPPER event. When the upper dev is
an L3mdev then call fib_disable_ip to flush all routes, local ones
to.

Fixes: 4f823defdd5b ("ipv4: fix to not remove local route on link down")
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
