<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/ipv6/netfilter, branch v5.10</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.10</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.10'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-12-08T11:57:39Z</updated>
<entry>
<title>netfilter: x_tables: Switch synchronization to RCU</title>
<updated>2020-12-08T11:57:39Z</updated>
<author>
<name>Subash Abhinov Kasiviswanathan</name>
<email>subashab@codeaurora.org</email>
</author>
<published>2020-11-25T18:27:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cc00bcaa589914096edef7fb87ca5cee4a166b5c'/>
<id>urn:sha1:cc00bcaa589914096edef7fb87ca5cee4a166b5c</id>
<content type='text'>
When running concurrent iptables rules replacement with data, the per CPU
sequence count is checked after the assignment of the new information.
The sequence count is used to synchronize with the packet path without the
use of any explicit locking. If there are any packets in the packet path using
the table information, the sequence count is incremented to an odd value and
is incremented to an even after the packet process completion.

The new table value assignment is followed by a write memory barrier so every
CPU should see the latest value. If the packet path has started with the old
table information, the sequence counter will be odd and the iptables
replacement will wait till the sequence count is even prior to freeing the
old table info.

However, this assumes that the new table information assignment and the memory
barrier is actually executed prior to the counter check in the replacement
thread. If CPU decides to execute the assignment later as there is no user of
the table information prior to the sequence check, the packet path in another
CPU may use the old table information. The replacement thread would then free
the table information under it leading to a use after free in the packet
processing context-

Unable to handle kernel NULL pointer dereference at virtual
address 000000000000008e
pc : ip6t_do_table+0x5d0/0x89c
lr : ip6t_do_table+0x5b8/0x89c
ip6t_do_table+0x5d0/0x89c
ip6table_filter_hook+0x24/0x30
nf_hook_slow+0x84/0x120
ip6_input+0x74/0xe0
ip6_rcv_finish+0x7c/0x128
ipv6_rcv+0xac/0xe4
__netif_receive_skb+0x84/0x17c
process_backlog+0x15c/0x1b8
napi_poll+0x88/0x284
net_rx_action+0xbc/0x23c
__do_softirq+0x20c/0x48c

This could be fixed by forcing instruction order after the new table
information assignment or by switching to RCU for the synchronization.

Fixes: 80055dab5de0 ("netfilter: x_tables: make xt_replace_table wait until old rules are not used anymore")
Reported-by: Sean Tranchetti &lt;stranche@codeaurora.org&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Suggested-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Subash Abhinov Kasiviswanathan &lt;subashab@codeaurora.org&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipv6: Remove dependency of ipv6_frag_thdr_truncated on ipv6 module</title>
<updated>2020-11-19T18:49:50Z</updated>
<author>
<name>Georg Kohmann</name>
<email>geokohma@cisco.com</email>
</author>
<published>2020-11-19T09:58:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d8f6481c17db9fa5238b277cdbc392084060b09'/>
<id>urn:sha1:2d8f6481c17db9fa5238b277cdbc392084060b09</id>
<content type='text'>
IPV6=m
NF_DEFRAG_IPV6=y

ld: net/ipv6/netfilter/nf_conntrack_reasm.o: in function
`nf_ct_frag6_gather':
net/ipv6/netfilter/nf_conntrack_reasm.c:462: undefined reference to
`ipv6_frag_thdr_truncated'

Netfilter is depending on ipv6 symbol ipv6_frag_thdr_truncated. This
dependency is forcing IPV6=y.

Remove this dependency by moving ipv6_frag_thdr_truncated out of ipv6. This
is the same solution as used with a similar issues: Referring to
commit 70b095c843266 ("ipv6: remove dependency of nf_defrag_ipv6 on ipv6
module")

Fixes: 9d9e937b1c8b ("ipv6/netfilter: Discard first fragment not including all headers")
Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Signed-off-by: Georg Kohmann &lt;geokohma@cisco.com&gt;
Acked-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt; # build-tested
Link: https://lore.kernel.org/r/20201119095833.8409-1-geokohma@cisco.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv6/netfilter: Discard first fragment not including all headers</title>
<updated>2020-11-16T18:15:11Z</updated>
<author>
<name>Georg Kohmann</name>
<email>geokohma@cisco.com</email>
</author>
<published>2020-11-11T11:50:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9d9e937b1c8be97b424e3e11938e183fcde905c0'/>
<id>urn:sha1:9d9e937b1c8be97b424e3e11938e183fcde905c0</id>
<content type='text'>
Packets are processed even though the first fragment don't include all
headers through the upper layer header. This breaks TAHI IPv6 Core
Conformance Test v6LC.1.3.6.

Referring to RFC8200 SECTION 4.5: "If the first fragment does not include
all headers through an Upper-Layer header, then that fragment should be
discarded and an ICMP Parameter Problem, Code 3, message should be sent to
the source of the fragment, with the Pointer field set to zero."

The fragment needs to be validated the same way it is done in
commit 2efdaaaf883a ("IPv6: reply ICMP error if the first fragment don't
include all headers") for ipv6. Wrap the validation into a common function,
ipv6_frag_thdr_truncated() to check for truncation in the upper layer
header. This validation does not fullfill all aspects of RFC 8200,
section 4.5, but is at the moment sufficient to pass mentioned TAHI test.

In netfilter, utilize the fragment offset returned by find_prev_fhdr() to
let ipv6_frag_thdr_truncated() start it's traverse from the fragment
header.

Return 0 to drop the fragment in the netfilter. This is the same behaviour
as used on other protocol errors in this function, e.g. when
nf_ct_frag6_queue() returns -EPROTO. The Fragment will later be picked up
by ipv6_frag_rcv() in reassembly.c. ipv6_frag_rcv() will then send an
appropriate ICMP Parameter Problem message back to the source.

References commit 2efdaaaf883a ("IPv6: reply ICMP error if the first
fragment don't include all headers")

Signed-off-by: Georg Kohmann &lt;geokohma@cisco.com&gt;
Acked-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Link: https://lore.kernel.org/r/20201111115025.28879-1-geokohma@cisco.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfilter: use actual socket sk rather than skb sk when routing harder</title>
<updated>2020-10-30T11:57:39Z</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2020-10-29T02:56:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=46d6c5ae953cc0be38efd0e469284df7c4328cf8'/>
<id>urn:sha1:46d6c5ae953cc0be38efd0e469284df7c4328cf8</id>
<content type='text'>
If netfilter changes the packet mark when mangling, the packet is
rerouted using the route_me_harder set of functions. Prior to this
commit, there's one big difference between route_me_harder and the
ordinary initial routing functions, described in the comment above
__ip_queue_xmit():

   /* Note: skb-&gt;sk can be different from sk, in case of tunnels */
   int __ip_queue_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl,

That function goes on to correctly make use of sk-&gt;sk_bound_dev_if,
rather than skb-&gt;sk-&gt;sk_bound_dev_if. And indeed the comment is true: a
tunnel will receive a packet in ndo_start_xmit with an initial skb-&gt;sk.
It will make some transformations to that packet, and then it will send
the encapsulated packet out of a *new* socket. That new socket will
basically always have a different sk_bound_dev_if (otherwise there'd be
a routing loop). So for the purposes of routing the encapsulated packet,
the routing information as it pertains to the socket should come from
that socket's sk, rather than the packet's original skb-&gt;sk. For that
reason __ip_queue_xmit() and related functions all do the right thing.

One might argue that all tunnels should just call skb_orphan(skb) before
transmitting the encapsulated packet into the new socket. But tunnels do
*not* do this -- and this is wisely avoided in skb_scrub_packet() too --
because features like TSQ rely on skb-&gt;destructor() being called when
that buffer space is truely available again. Calling skb_orphan(skb) too
early would result in buffers filling up unnecessarily and accounting
info being all wrong. Instead, additional routing must take into account
the new sk, just as __ip_queue_xmit() notes.

So, this commit addresses the problem by fishing the correct sk out of
state-&gt;sk -- it's already set properly in the call to nf_hook() in
__ip_local_out(), which receives the sk as part of its normal
functionality. So we make sure to plumb state-&gt;sk through the various
route_me_harder functions, and then make correct use of it following the
example of __ip_queue_xmit().

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Reviewed-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: Drop fragmented ndisc packets assembled in netfilter</title>
<updated>2020-10-20T11:54:53Z</updated>
<author>
<name>Georg Kohmann</name>
<email>geokohma@cisco.com</email>
</author>
<published>2020-10-13T12:23:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=68f9f9c2c3b6a7259f6a92bc26cdc7bd22e7a982'/>
<id>urn:sha1:68f9f9c2c3b6a7259f6a92bc26cdc7bd22e7a982</id>
<content type='text'>
Fragmented ndisc packets assembled in netfilter not dropped as specified
in RFC 6980, section 5. This behaviour breaks TAHI IPv6 Core Conformance
Tests v6LC.2.1.22/23, V6LC.2.2.26/27 and V6LC.2.3.18.

Setting IP6SKB_FRAGMENTED flag during reassembly.

References: commit b800c3b966bc ("ipv6: drop fragmented ndisc packets by default (RFC 6980)")
Signed-off-by: Georg Kohmann &lt;geokohma@cisco.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2020-10-15T19:43:21Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2020-10-15T19:43:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2295cddf99e3f7c2be2b1160e2f5e53cc35b09be'/>
<id>urn:sha1:2295cddf99e3f7c2be2b1160e2f5e53cc35b09be</id>
<content type='text'>
Minor conflicts in net/mptcp/protocol.h and
tools/testing/selftests/net/Makefile.

In both cases code was added on both sides in the same place
so just keep both.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_log: missing vlan offload tag and proto</title>
<updated>2020-10-13T23:25:14Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2020-10-12T15:06:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0d9826bc18ce356e8909919ad681ad65d0a6061e'/>
<id>urn:sha1:0d9826bc18ce356e8909919ad681ad65d0a6061e</id>
<content type='text'>
Dump vlan tag and proto for the usual vlan offload case if the
NF_LOG_MACDECODE flag is set on. Without this information the logging is
misleading as there is no reference to the VLAN header.

[12716.993704] test: IN=veth0 OUT= MACSRC=86:6c:92:ea:d6:73 MACDST=0e:3b:eb:86:73:76 VPROTO=8100 VID=10 MACPROTO=0800 SRC=192.168.10.2 DST=172.217.168.163 LEN=52 TOS=0x00 PREC=0x00 TTL=64 ID=2548 DF PROTO=TCP SPT=55848 DPT=80 WINDOW=501 RES=0x00 ACK FIN URGP=0
[12721.157643] test: IN=veth0 OUT= MACSRC=86:6c:92:ea:d6:73 MACDST=0e:3b:eb:86:73:76 VPROTO=8100 VID=10 MACPROTO=0806 ARP HTYPE=1 PTYPE=0x0800 OPCODE=2 MACSRC=86:6c:92:ea:d6:73 IPSRC=192.168.10.2 MACDST=0e:3b:eb:86:73:76 IPDST=192.168.10.1

Fixes: 83e96d443b37 ("netfilter: log: split family specific code to nf_log_{ip,ip6,common}.c files")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: ip6t_NPT: rewrite addresses in ICMPv6 original packet</title>
<updated>2020-08-28T17:18:48Z</updated>
<author>
<name>Michael Zhou</name>
<email>mzhou@cse.unsw.edu.au</email>
</author>
<published>2020-08-06T15:15:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d5608a0578a07766a97c00904d85f6a7221ddb37'/>
<id>urn:sha1:d5608a0578a07766a97c00904d85f6a7221ddb37</id>
<content type='text'>
Detect and rewrite a prefix embedded in an ICMPv6 original packet that was
rewritten by a corresponding DNPT/SNPT rule so it will be recognised by
the host that sent the original packet.

Example

Rules in effect on the 1:2:3:4::/64 + 5:6:7:8::/64 side router:
* SNPT src-pfx 1:2:3:4::/64 dst-pfx 5:6:7:8::/64
* DNPT src-pfx 5:6:7:8::/64 dst-pfx 1:2:3:4::/64

No rules on the 9:a:b:c::/64 side.

1. 1:2:3:4::1 sends UDP packet to 9:a:b:c::1
2. Router applies SNPT changing src to 5:6:7:8::ffef::1
3. 9:a:b:c::1 receives packet with (src 5:6:7:8::ffef::1 dst 9:a:b:c::1)
	and replies with ICMPv6 port unreachable to 5:6:7:8::ffef::1,
	including original packet (src 5:6:7:8::ffef::1 dst 9:a:b:c::1)
4. Router forwards ICMPv6 packet with (src 9:a:b:c::1 dst 5:6:7:8::ffef::1)
	including original packet (src 5:6:7:8::ffef::1 dst 9:a:b:c::1)
	and applies DNPT changing dst to 1:2:3:4::1
5. 1:2:3:4::1 receives ICMPv6 packet with (src 9:a:b:c::1 dst 1:2:3:4::1)
	including original packet (src 5:6:7:8::ffef::1 dst 9:a:b:c::1).
	It doesn't recognise the original packet as the src doesn't
	match anything it originally sent

With this change, at step 4, DNPT will also rewrite the original packet
src to 1:2:3:4::1, so at step 5, 1:2:3:4::1 will recognise the ICMPv6
error and provide feedback to the application properly.

Conversely, SNPT will help when ICMPv6 errors are sent from the
translated network.

1. 9:a:b:c::1 sends UDP packet to 5:6:7:8::ffef::1
2. Router applies DNPT changing dst to 1:2:3:4::1
3. 1:2:3:4::1 receives packet with (src 9:a:b:c::1 dst 1:2:3:4::1)
	and replies with ICMPv6 port unreachable to 9:a:b:c::1
	including original packet (src 9:a:b:c::1 dst 1:2:3:4::1)
4. Router forwards ICMPv6 packet with (src 1:2:3:4::1 dst 9:a:b:c::1)
	including original packet (src 9:a:b:c::1 dst 1:2:3:4::1)
	and applies SNPT changing src to 5:6:7:8::ffef::1
5. 9:a:b:c::1 receives ICMPv6 packet with
	(src 5:6:7:8::ffef::1 dst 9:a:b:c::1) including
	original packet (src 9:a:b:c::1 dst 1:2:3:4::1).
	It doesn't recognise the original packet as the dst doesn't
	match anything it already sent

The change to SNPT means the ICMPv6 original packet dst will be
rewritten to 5:6:7:8::ffef::1 in step 4, allowing the error to be
properly recognised in step 5.

Signed-off-by: Michael Zhou &lt;mzhou@cse.unsw.edu.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next</title>
<updated>2020-08-06T03:13:21Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-08-06T03:13:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=47ec5303d73ea344e84f46660fff693c57641386'/>
<id>urn:sha1:47ec5303d73ea344e84f46660fff693c57641386</id>
<content type='text'>
Pull networking updates from David Miller:

 1) Support 6Ghz band in ath11k driver, from Rajkumar Manoharan.

 2) Support UDP segmentation in code TSO code, from Eric Dumazet.

 3) Allow flashing different flash images in cxgb4 driver, from Vishal
    Kulkarni.

 4) Add drop frames counter and flow status to tc flower offloading,
    from Po Liu.

 5) Support n-tuple filters in cxgb4, from Vishal Kulkarni.

 6) Various new indirect call avoidance, from Eric Dumazet and Brian
    Vazquez.

 7) Fix BPF verifier failures on 32-bit pointer arithmetic, from
    Yonghong Song.

 8) Support querying and setting hardware address of a port function via
    devlink, use this in mlx5, from Parav Pandit.

 9) Support hw ipsec offload on bonding slaves, from Jarod Wilson.

10) Switch qca8k driver over to phylink, from Jonathan McDowell.

11) In bpftool, show list of processes holding BPF FD references to
    maps, programs, links, and btf objects. From Andrii Nakryiko.

12) Several conversions over to generic power management, from Vaibhav
    Gupta.

13) Add support for SO_KEEPALIVE et al. to bpf_setsockopt(), from Dmitry
    Yakunin.

14) Various https url conversions, from Alexander A. Klimov.

15) Timestamping and PHC support for mscc PHY driver, from Antoine
    Tenart.

16) Support bpf iterating over tcp and udp sockets, from Yonghong Song.

17) Support 5GBASE-T i40e NICs, from Aleksandr Loktionov.

18) Add kTLS RX HW offload support to mlx5e, from Tariq Toukan.

19) Fix the -&gt;ndo_start_xmit() return type to be netdev_tx_t in several
    drivers. From Luc Van Oostenryck.

20) XDP support for xen-netfront, from Denis Kirjanov.

21) Support receive buffer autotuning in MPTCP, from Florian Westphal.

22) Support EF100 chip in sfc driver, from Edward Cree.

23) Add XDP support to mvpp2 driver, from Matteo Croce.

24) Support MPTCP in sock_diag, from Paolo Abeni.

25) Commonize UDP tunnel offloading code by creating udp_tunnel_nic
    infrastructure, from Jakub Kicinski.

26) Several pci_ --&gt; dma_ API conversions, from Christophe JAILLET.

27) Add FLOW_ACTION_POLICE support to mlxsw, from Ido Schimmel.

28) Add SK_LOOKUP bpf program type, from Jakub Sitnicki.

29) Refactor a lot of networking socket option handling code in order to
    avoid set_fs() calls, from Christoph Hellwig.

30) Add rfc4884 support to icmp code, from Willem de Bruijn.

31) Support TBF offload in dpaa2-eth driver, from Ioana Ciornei.

32) Support XDP_REDIRECT in qede driver, from Alexander Lobakin.

33) Support PCI relaxed ordering in mlx5 driver, from Aya Levin.

34) Support TCP syncookies in MPTCP, from Flowian Westphal.

35) Fix several tricky cases of PMTU handling wrt. briding, from Stefano
    Brivio.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2056 commits)
  net: thunderx: initialize VF's mailbox mutex before first usage
  usb: hso: remove bogus check for EINPROGRESS
  usb: hso: no complaint about kmalloc failure
  hso: fix bailout in error case of probe
  ip_tunnel_core: Fix build for archs without _HAVE_ARCH_IPV6_CSUM
  selftests/net: relax cpu affinity requirement in msg_zerocopy test
  mptcp: be careful on subflow creation
  selftests: rtnetlink: make kci_test_encap() return sub-test result
  selftests: rtnetlink: correct the final return value for the test
  net: dsa: sja1105: use detected device id instead of DT one on mismatch
  tipc: set ub-&gt;ifindex for local ipv6 address
  ipv6: add ipv6_dev_find()
  net: openvswitch: silence suspicious RCU usage warning
  Revert "vxlan: fix tos value before xmit"
  ptp: only allow phase values lower than 1 period
  farsync: switch from 'pci_' to 'dma_' API
  wan: wanxl: switch from 'pci_' to 'dma_' API
  hv_netvsc: do not use VF device if link is down
  dpaa2-eth: Fix passing zero to 'PTR_ERR' warning
  net: macb: Properly handle phylink on at91sam9x
  ...
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next</title>
<updated>2020-08-03T23:03:18Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2020-08-03T23:03:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f2e0b29a9ac7766799360eb6cd72ba83889f616a'/>
<id>urn:sha1:f2e0b29a9ac7766799360eb6cd72ba83889f616a</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

1) UAF in chain binding support from previous batch, from Dan Carpenter.

2) Queue up delayed work to expire connections with no destination,
   from Andrew Sy Kim.

3) Use fallthrough pseudo-keyword, from Gustavo A. R. Silva.

4) Replace HTTP links with HTTPS, from Alexander A. Klimov.

5) Remove superfluous null header checks in ip6tables, from
   Gaurav Singh.

6) Add extended netlink error reporting for expression.

7) Report EEXIST on overlapping chain, set elements and flowtable
   devices.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
