<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv4, branch v4.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2017-04-28T20:05:22Z</updated>
<entry>
<title>tcp: do not underestimate skb-&gt;truesize in tcp_trim_head()</title>
<updated>2017-04-28T20:05:22Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-04-27T00:15:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7162fb242cb8322beb558828fd26b33c3e9fc805'/>
<id>urn:sha1:7162fb242cb8322beb558828fd26b33c3e9fc805</id>
<content type='text'>
Andrey found a way to trigger the WARN_ON_ONCE(delta &lt; len) in
skb_try_coalesce() using syzkaller and a filter attached to a TCP
socket over loopback interface.

I believe one issue with looped skbs is that tcp_trim_head() can end up
producing skb with under estimated truesize.

It hardly matters for normal conditions, since packets sent over
loopback are never truncated.

Bytes trimmed from skb-&gt;head should not change skb truesize, since
skb-&gt;head is not reallocated.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Tested-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Don't pass IP fragments to upper layer GRO handlers.</title>
<updated>2017-04-28T20:00:38Z</updated>
<author>
<name>Steffen Klassert</name>
<email>steffen.klassert@secunet.com</email>
</author>
<published>2017-04-28T08:54:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9b83e0319840eca758ef586776a427284ff767bf'/>
<id>urn:sha1:9b83e0319840eca758ef586776a427284ff767bf</id>
<content type='text'>
Upper layer GRO handlers can not handle IP fragments, so
exit GRO processing in this case.

This fixes ESP GRO because the packet must be reassembled
before we can decapsulate, otherwise we get authentication
failures.

It also aligns IPv4 to IPv6 where packets with fragmentation
headers are not passed to upper layer GRO handlers.

Fixes: 7785bba299a8 ("esp: Add a software GRO codepath")
Signed-off-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tcp: memset ca_priv data to 0 properly</title>
<updated>2017-04-26T18:58:32Z</updated>
<author>
<name>Wei Wang</name>
<email>weiwan@google.com</email>
</author>
<published>2017-04-26T00:38:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c1201444075009507a6818de6518e2822b9a87c8'/>
<id>urn:sha1:c1201444075009507a6818de6518e2822b9a87c8</id>
<content type='text'>
Always zero out ca_priv data in tcp_assign_congestion_control() so that
ca_priv data is cleared out during socket creation.
Also always zero out ca_priv data in tcp_reinit_congestion_control() so
that when cc algorithm is changed, ca_priv data is cleared out as well.
We should still zero out ca_priv data even in TCP_CLOSE state because
user could call connect() on AF_UNSPEC to disconnect the socket and
leave it in TCP_CLOSE state and later call setsockopt() to switch cc
algorithm on this socket.

Fixes: 2b0a8c9ee ("tcp: add CDG congestion control")
Reported-by: Andrey Konovalov  &lt;andreyknvl@google.com&gt;
Signed-off-by: Wei Wang &lt;weiwan@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Acked-by: Yuchung Cheng &lt;ycheng@google.com&gt;
Acked-by: Neal Cardwell &lt;ncardwell@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>udp: disable inner UDP checksum offloads in IPsec case</title>
<updated>2017-04-24T17:48:54Z</updated>
<author>
<name>Ansis Atteka</name>
<email>aatteka@ovn.org</email>
</author>
<published>2017-04-21T22:23:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b40c5f4fde22fb98eff205b3aece05b471c24eed'/>
<id>urn:sha1:b40c5f4fde22fb98eff205b3aece05b471c24eed</id>
<content type='text'>
Otherwise, UDP checksum offloads could corrupt ESP packets by attempting
to calculate UDP checksum when this inner UDP packet is already protected
by IPsec.

One way to reproduce this bug is to have a VM with virtio_net driver (UFO
set to ON in the guest VM); and then encapsulate all guest's Ethernet
frames in Geneve; and then further encrypt Geneve with IPsec.  In this
case following symptoms are observed:
1. If using ixgbe NIC, then it will complain with following error message:
   ixgbe 0000:01:00.1: partial checksum but l4 proto=32!
2. Receiving IPsec stack will drop all the corrupted ESP packets and
   increase XfrmInStateProtoError counter in /proc/net/xfrm_stat.
3. iperf UDP test from the VM with packet sizes above MTU will not work at
   all.
4. iperf TCP test from the VM will get ridiculously low performance because.

Signed-off-by: Ansis Atteka &lt;aatteka@ovn.org&gt;
Co-authored-by: Steffen Klassert &lt;steffen.klassert@secunet.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Avoid caching l3mdev dst on mismatched local route</title>
<updated>2017-04-24T16:50:29Z</updated>
<author>
<name>Robert Shearman</name>
<email>rshearma@brocade.com</email>
</author>
<published>2017-04-21T20:34:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b7c8487cb3d99509220092fe77a2464dff43f015'/>
<id>urn:sha1:b7c8487cb3d99509220092fe77a2464dff43f015</id>
<content type='text'>
David reported that doing the following:

    ip li add red type vrf table 10
    ip link set dev eth1 vrf red
    ip addr add 127.0.0.1/8 dev red
    ip link set dev eth1 up
    ip li set red up
    ping -c1 -w1 -I red 127.0.0.1
    ip li del red

when either policy routing IP rules are present or the local table
lookup ip rule is before the l3mdev lookup results in a hang with
these messages:

    unregister_netdevice: waiting for red to become free. Usage count = 1

The problem is caused by caching the dst used for sending the packet
out of the specified interface on a local route with a different
nexthop interface. Thus the dst could stay around until the route in
the table the lookup was done is deleted which may be never.

Address the problem by not forcing output device to be the l3mdev in
the flow's output interface if the lookup didn't use the l3mdev. This
then results in the dst using the right device according to the route.

Changes in v2:
 - make the dev_out passed in by __ip_route_output_key_hash correct
   instead of checking the nh dev if FLOWI_FLAG_SKIP_NH_OIF is set as
   suggested by David.

Fixes: 5f02ce24c2696 ("net: l3mdev: Allow the l3mdev to be a loopback")
Reported-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Suggested-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: Robert Shearman &lt;rshearma@brocade.com&gt;
Acked-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Tested-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net-timestamp: avoid use-after-free in ip_recv_error</title>
<updated>2017-04-17T16:59:22Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2017-04-12T23:24:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1862d6208db0aeca9c8ace44915b08d5ab2cd667'/>
<id>urn:sha1:1862d6208db0aeca9c8ace44915b08d5ab2cd667</id>
<content type='text'>
Syzkaller reported a use-after-free in ip_recv_error at line

    info-&gt;ipi_ifindex = skb-&gt;dev-&gt;ifindex;

This function is called on dequeue from the error queue, at which
point the device pointer may no longer be valid.

Save ifindex on enqueue in __skb_complete_tx_timestamp, when the
pointer is valid or NULL. Store it in temporary storage skb-&gt;cb.

It is safe to reference skb-&gt;dev here, as called from device drivers
or dev_queue_xmit. The exception is when called from tcp_ack_tstamp;
in that case it is NULL and ifindex is set to 0 (invalid).

Do not return a pktinfo cmsg if ifindex is 0. This maintains the
current behavior of not returning a cmsg if skb-&gt;dev was NULL.

On dequeue, the ipv4 path will cast from sock_exterr_skb to
in_pktinfo. Both have ifindex as their first element, so no explicit
conversion is needed. This is by design, introduced in commit
0b922b7a829c ("net: original ingress device index in PKTINFO"). For
ipv6 ip6_datagram_support_cmsg converts to in6_pktinfo.

Fixes: 829ae9d61165 ("net-timestamp: allow reading recv cmsg on errqueue with origin tstamp")
Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: fix a deadlock in ip_ra_control</title>
<updated>2017-04-17T16:46:50Z</updated>
<author>
<name>WANG Cong</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2017-04-12T19:32:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1215e51edad1272e669172b26aa12aac94810c7f'/>
<id>urn:sha1:1215e51edad1272e669172b26aa12aac94810c7f</id>
<content type='text'>
Similar to commit 87e9f0315952
("ipv4: fix a potential deadlock in mcast getsockopt() path"),
there is a deadlock scenario for IP_ROUTER_ALERT too:

       CPU0                    CPU1
       ----                    ----
  lock(rtnl_mutex);
                               lock(sk_lock-AF_INET);
                               lock(rtnl_mutex);
  lock(sk_lock-AF_INET);

Fix this by always locking RTNL first on all setsockopt() paths.

Note, after this patch ip_ra_lock is no longer needed either.

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Tested-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2017-04-14T14:47:13Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-04-14T14:47:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f4c13c8ec56e70eeff3e365e0c5fcdad16845b32'/>
<id>urn:sha1:f4c13c8ec56e70eeff3e365e0c5fcdad16845b32</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for your net tree,
they are:

1) Missing TCP header sanity check in TCPMSS target, from Eric Dumazet.

2) Incorrect event message type for related conntracks created via
   ctnetlink, from Liping Zhang.

3) Fix incorrect rcu locking when handling helpers from ctnetlink,
   from Gao feng.

4) Fix missing rcu locking when updating helper, from Liping Zhang.

5) Fix missing read_lock_bh when iterating over list of device addresses
   from TPROXY and redirect, also from Liping.

6) Fix crash when trying to dump expectations from conntrack with no
   helper via ctnetlink, from Liping.

7) Missing RCU protection to expecation list update given ctnetlink
   iterates over the list under rcu read lock side, from Liping too.

8) Don't dump autogenerated seed in nft_hash to userspace, this is
   very confusing to the user, again from Liping.

9) Fix wrong conntrack netns module refcount in ipt_CLUSTERIP,
   from Gao feng.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: ipt_CLUSTERIP: Fix wrong conntrack netns refcnt usage</title>
<updated>2017-04-13T21:21:40Z</updated>
<author>
<name>Gao Feng</name>
<email>fgao@ikuai8.com</email>
</author>
<published>2017-04-06T01:45:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe50543c194e2e1aee2f3eba41fcafd187b3dbde'/>
<id>urn:sha1:fe50543c194e2e1aee2f3eba41fcafd187b3dbde</id>
<content type='text'>
Current codes invoke wrongly nf_ct_netns_get in the destroy routine,
it should use nf_ct_netns_put, not nf_ct_netns_get.
It could cause some modules could not be unloaded.

Fixes: ecb2421b5ddf ("netfilter: add and use nf_ct_netns_get/put")
Signed-off-by: Gao Feng &lt;fgao@ikuai8.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>tcp: clear saved_syn in tcp_disconnect()</title>
<updated>2017-04-10T01:27:28Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-04-08T15:07:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=17c3060b1701fc69daedb4c90be6325d3d9fca8e'/>
<id>urn:sha1:17c3060b1701fc69daedb4c90be6325d3d9fca8e</id>
<content type='text'>
In the (very unlikely) case a passive socket becomes a listener,
we do not want to duplicate its saved SYN headers.

This would lead to double frees, use after free, and please hackers and
various fuzzers

Tested:
    0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
   +0 setsockopt(3, IPPROTO_TCP, TCP_SAVE_SYN, [1], 4) = 0
   +0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0

   +0 bind(3, ..., ...) = 0
   +0 listen(3, 5) = 0

   +0 &lt; S 0:0(0) win 32972 &lt;mss 1460,nop,wscale 7&gt;
   +0 &gt; S. 0:0(0) ack 1 &lt;...&gt;
  +.1 &lt; . 1:1(0) ack 1 win 257
   +0 accept(3, ..., ...) = 4

   +0 connect(4, AF_UNSPEC, ...) = 0
   +0 close(3) = 0
   +0 bind(4, ..., ...) = 0
   +0 listen(4, 5) = 0

   +0 &lt; S 0:0(0) win 32972 &lt;mss 1460,nop,wscale 7&gt;
   +0 &gt; S. 0:0(0) ack 1 &lt;...&gt;
  +.1 &lt; . 1:1(0) ack 1 win 257

Fixes: cd8ae85299d5 ("tcp: provide SYN headers for passive connections")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
