<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv4/ip_input.c, branch v4.20</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.20</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.20'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2018-12-06T00:22:05Z</updated>
<entry>
<title>net: use skb_list_del_init() to remove from RX sublists</title>
<updated>2018-12-06T00:22:05Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-12-04T17:37:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=22f6bbb7bcfcef0b373b0502a7ff390275c575dd'/>
<id>urn:sha1:22f6bbb7bcfcef0b373b0502a7ff390275c575dd</id>
<content type='text'>
list_del() leaves the skb-&gt;next pointer poisoned, which can then lead to
 a crash in e.g. OVS forwarding.  For example, setting up an OVS VXLAN
 forwarding bridge on sfc as per:

========
$ ovs-vsctl show
5dfd9c47-f04b-4aaa-aa96-4fbb0a522a30
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "enp6s0f0"
            Interface "enp6s0f0"
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {key="1", local_ip="10.0.0.5", remote_ip="10.0.0.4"}
    ovs_version: "2.5.0"
========
(where 10.0.0.5 is an address on enp6s0f1)
and sending traffic across it will lead to the following panic:
========
general protection fault: 0000 [#1] SMP PTI
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.20.0-rc3-ehc+ #701
Hardware name: Dell Inc. PowerEdge R710/0M233H, BIOS 6.4.0 07/23/2013
RIP: 0010:dev_hard_start_xmit+0x38/0x200
Code: 53 48 89 fb 48 83 ec 20 48 85 ff 48 89 54 24 08 48 89 4c 24 18 0f 84 ab 01 00 00 48 8d 86 90 00 00 00 48 89 f5 48 89 44 24 10 &lt;4c&gt; 8b 33 48 c7 03 00 00 00 00 48 8b 05 c7 d1 b3 00 4d 85 f6 0f 95
RSP: 0018:ffff888627b437e0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff88862279c000
RDX: ffff888614a342c0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888618a88000 R08: 0000000000000001 R09: 00000000000003e8
R10: 0000000000000000 R11: ffff888614a34140 R12: 0000000000000000
R13: 0000000000000062 R14: dead000000000100 R15: ffff888616430000
FS:  0000000000000000(0000) GS:ffff888627b40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6d2bc6d000 CR3: 000000000200a000 CR4: 00000000000006e0
Call Trace:
 &lt;IRQ&gt;
 __dev_queue_xmit+0x623/0x870
 ? masked_flow_lookup+0xf7/0x220 [openvswitch]
 ? ep_poll_callback+0x101/0x310
 do_execute_actions+0xaba/0xaf0 [openvswitch]
 ? __wake_up_common+0x8a/0x150
 ? __wake_up_common_lock+0x87/0xc0
 ? queue_userspace_packet+0x31c/0x5b0 [openvswitch]
 ovs_execute_actions+0x47/0x120 [openvswitch]
 ovs_dp_process_packet+0x7d/0x110 [openvswitch]
 ovs_vport_receive+0x6e/0xd0 [openvswitch]
 ? dst_alloc+0x64/0x90
 ? rt_dst_alloc+0x50/0xd0
 ? ip_route_input_slow+0x19a/0x9a0
 ? __udp_enqueue_schedule_skb+0x198/0x1b0
 ? __udp4_lib_rcv+0x856/0xa30
 ? __udp4_lib_rcv+0x856/0xa30
 ? cpumask_next_and+0x19/0x20
 ? find_busiest_group+0x12d/0xcd0
 netdev_frame_hook+0xce/0x150 [openvswitch]
 __netif_receive_skb_core+0x205/0xae0
 __netif_receive_skb_list_core+0x11e/0x220
 netif_receive_skb_list+0x203/0x460
 ? __efx_rx_packet+0x335/0x5e0 [sfc]
 efx_poll+0x182/0x320 [sfc]
 net_rx_action+0x294/0x3c0
 __do_softirq+0xca/0x297
 irq_exit+0xa6/0xb0
 do_IRQ+0x54/0xd0
 common_interrupt+0xf/0xf
 &lt;/IRQ&gt;
========
So, in all listified-receive handling, instead pull skbs off the lists with
 skb_list_del_init().

Fixes: 9af86f933894 ("net: core: fix use-after-free in __netif_receive_skb_list_core")
Fixes: 7da517a3bc52 ("net: core: Another step of skb receive list processing")
Fixes: a4ca8b7df73c ("net: ipv4: fix drop handling in ip_list_rcv() and ip_list_rcv_finish()")
Fixes: d8269e2cbf90 ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()")
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Add and use skb_list_del_init().</title>
<updated>2018-09-10T17:06:54Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-07-31T22:27:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=992cba7e276d438ac8b0a8c17b147b37c8c286f7'/>
<id>urn:sha1:992cba7e276d438ac8b0a8c17b147b37c8c286f7</id>
<content type='text'>
It documents what is happening, and eliminates the spurious list
pointer poisoning.

In the long term, in order to get proper list head debugging, we
might want to use the list poison value as the indicator that
an SKB is a singleton and not on a list.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Add and use skb_mark_not_on_list().</title>
<updated>2018-09-10T17:06:54Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-07-30T03:42:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a8305bff685252e80b7c60f4f5e7dd2e63e38218'/>
<id>urn:sha1:a8305bff685252e80b7c60f4f5e7dd2e63e38218</id>
<content type='text'>
An SKB is not on a list if skb-&gt;next is NULL.

Codify this convention into a helper function and use it
where we are dequeueing an SKB and need to mark it as such.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv4: fix listify ip_rcv_finish in case of forwarding</title>
<updated>2018-07-12T23:40:19Z</updated>
<author>
<name>Jesper Dangaard Brouer</name>
<email>brouer@redhat.com</email>
</author>
<published>2018-07-11T15:01:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0761680d521559813648fb430ddeb479c97ab060'/>
<id>urn:sha1:0761680d521559813648fb430ddeb479c97ab060</id>
<content type='text'>
In commit 5fa12739a53d ("net: ipv4: listify ip_rcv_finish") calling
dst_input(skb) was split-out.  The ip_sublist_rcv_finish() just calls
dst_input(skb) in a loop.

The problem is that ip_sublist_rcv_finish() forgot to remove the SKB
from the list before invoking dst_input().  Further more we need to
clear skb-&gt;next as other parts of the network stack use another kind
of SKB lists for xmit_more (see dev_hard_start_xmit).

A crash occurs if e.g. dst_input() invoke ip_forward(), which calls
dst_output()/ip_output() that eventually calls __dev_queue_xmit() +
sch_direct_xmit(), and a crash occurs in validate_xmit_skb_list().

This patch only fixes the crash, but there is a huge potential for
a performance boost if we can pass an SKB-list through to ip_forward.

Fixes: 5fa12739a53d ("net: ipv4: listify ip_rcv_finish")
Signed-off-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Acked-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv4: fix list processing on L3 slave devices</title>
<updated>2018-07-06T02:19:07Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-07-05T14:47:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=efe6aaca67a0229a195f493d142102c864b41dde'/>
<id>urn:sha1:efe6aaca67a0229a195f493d142102c864b41dde</id>
<content type='text'>
If we have an L3 master device, l3mdev_ip_rcv() will steal the skb, but
 we were returning NET_RX_SUCCESS from ip_rcv_finish_core() which meant
 that ip_list_rcv_finish() would keep it on the list.  Instead let's
 move the l3mdev_ip_rcv() call into the caller, so that our response to
 a steal can be different in the single packet path (return
 NET_RX_SUCCESS) and the list path (forget this packet and continue).

Fixes: 5fa12739a53d ("net: ipv4: listify ip_rcv_finish")
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv4: fix drop handling in ip_list_rcv() and ip_list_rcv_finish()</title>
<updated>2018-07-05T02:25:41Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-07-04T18:23:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a4ca8b7df73c6d78b8b5aa8246a7d794b25c25ce'/>
<id>urn:sha1:a4ca8b7df73c6d78b8b5aa8246a7d794b25c25ce</id>
<content type='text'>
Since callees (ip_rcv_core() and ip_rcv_finish_core()) might free or steal
 the skb, we can't use the list_cut_before() method; we can't even do a
 list_del(&amp;skb-&gt;list) in the drop case, because skb might have already been
 freed and reused.
So instead, take each skb off the source list before processing, and add it
 to the sublist afterwards if it wasn't freed or stolen.

Fixes: 5fa12739a53d net: ipv4: listify ip_rcv_finish
Fixes: 17266ee93984 net: ipv4: listified version of ip_rcv
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv4: listify ip_rcv_finish</title>
<updated>2018-07-04T05:06:20Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-07-02T15:14:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5fa12739a53d0780265ed9d44d9ec9ba5f9ad00a'/>
<id>urn:sha1:5fa12739a53d0780265ed9d44d9ec9ba5f9ad00a</id>
<content type='text'>
ip_rcv_finish_core(), if it does not drop, sets skb-&gt;dst by either early
 demux or route lookup.  The last step, calling dst_input(skb), is left to
 the caller; in the listified case, we split to form sublists with a common
 dst, but then ip_sublist_rcv_finish() just calls dst_input(skb) in a loop.
The next step in listification would thus be to add a list_input() method
 to struct dst_entry.

Early demux is an indirect call based on iph-&gt;protocol; this is another
 opportunity for listification which is not taken here (it would require
 slicing up ip_rcv_finish_core() to allow splitting on protocol changes).

Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv4: listified version of ip_rcv</title>
<updated>2018-07-04T05:06:20Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-07-02T15:14:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=17266ee939849cb095ed7dd9edbec4162172226b'/>
<id>urn:sha1:17266ee939849cb095ed7dd9edbec4162172226b</id>
<content type='text'>
Also involved adding a way to run a netfilter hook over a list of packets.
 Rather than attempting to make netfilter know about lists (which would be
 a major project in itself) we just let it call the regular okfn (in this
 case ip_rcv_finish()) for any packets it steals, and have it give us back
 a list of packets it's synchronously accepted (which normally NF_HOOK
 would automatically call okfn() on, but we want to be able to potentially
 pass the list to a listified version of okfn().)
The netfilter hooks themselves are indirect calls that still happen per-
 packet (see nf_hook_entry_hookfn()), but again, changing that can be left
 for future work.

There is potential for out-of-order receives if the netfilter hook ends up
 synchronously stealing packets, as they will be processed before any
 accepts earlier in the list.  However, it was already possible for an
 asynchronous accept to cause out-of-order receives, so presumably this is
 considered OK.

Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Make ip_ra_chain per struct net</title>
<updated>2018-03-22T19:12:56Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-03-22T09:45:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5796ef75ec7b6019eac88f66751d663d537a5cd3'/>
<id>urn:sha1:5796ef75ec7b6019eac88f66751d663d537a5cd3</id>
<content type='text'>
This is optimization, which makes ip_call_ra_chain()
iterate less sockets to find the sockets it's looking for.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>IPv4: early demux can return an error code</title>
<updated>2017-10-01T02:55:47Z</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2017-09-28T13:51:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7487449c86c65202b3b725c4524cb48dd65e4e6f'/>
<id>urn:sha1:7487449c86c65202b3b725c4524cb48dd65e4e6f</id>
<content type='text'>
Currently no error is emitted, but this infrastructure will
used by the next patch to allow source address validation
for mcast sockets.
Since early demux can do a route lookup and an ipv4 route
lookup can return an error code this is consistent with the
current ipv4 route infrastructure.

Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
