<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/netfilter, branch v4.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-10-17T11:01:24Z</updated>
<entry>
<title>netfilter: ipset: Fix sleeping memory allocation in atomic context</title>
<updated>2015-10-17T11:01:24Z</updated>
<author>
<name>Nikolay Borisov</name>
<email>kernel@kyup.com</email>
</author>
<published>2015-10-16T06:40:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=00db674bedd68ff8b5afae9030ff5e04d45d1b4a'/>
<id>urn:sha1:00db674bedd68ff8b5afae9030ff5e04d45d1b4a</id>
<content type='text'>
Commit 00590fdd5be0 introduced RCU locking in list type and in
doing so introduced a memory allocation in list_set_add, which
is done in an atomic context, due to the fact that ipset rcu
list modifications are serialised with a spin lock. The reason
why we can't use a mutex is that in addition to modifying the
list with ipset commands, it's also being modified when a
particular ipset rule timeout expires aka garbage collection.
This gc is triggered from set_cleanup_entries, which in turn
is invoked from a timer thus requiring the lock to be bh-safe.

Concretely the following call chain can lead to "sleeping function
called in atomic context" splat:
call_ad -&gt; list_set_uadt -&gt; list_set_uadd -&gt; kzalloc(, GFP_KERNEL).
And since GFP_KERNEL allows initiating direct reclaim thus
potentially sleeping in the allocation path.

To fix the issue change the allocation type to GFP_ATOMIC, to
correctly reflect that it is occuring in an atomic context.

Fixes: 00590fdd5be0 ("netfilter: ipset: Introduce RCU locking in list type")
Signed-off-by: Nikolay Borisov &lt;kernel@kyup.com&gt;
Acked-by: Jozsef Kadlecsik &lt;kadlec@blackhole.kfki.hu&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: sync with packet rx also after removing queue entries</title>
<updated>2015-10-13T11:59:56Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2015-10-08T21:38:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=514ed62ed3f6846325d9bfb15cb5c3540547f13b'/>
<id>urn:sha1:514ed62ed3f6846325d9bfb15cb5c3540547f13b</id>
<content type='text'>
We need to sync packet rx again after flushing the queue entries.
Otherwise, the following race could happen:

cpu1: nf_unregister_hook(H) called, H unliked from lists, calls
synchronize_net() to wait for packet rx completion.

Problem is that while no new nf_queue_entry structs that use H can be
allocated, another CPU might receive a verdict from userspace just before
cpu1 calls nf_queue_nf_hook_drop to remove this entry:

cpu2: receive verdict from userspace, lock queue
cpu2: unlink nf_queue_entry struct E, which references H, from queue list
cpu1: calls nf_queue_nf_hook_drop, blocks on queue spinlock
cpu2: unlock queue
cpu1: nf_queue_nf_hook_drop drops affected queue entries
cpu2: call nf_reinject for E
cpu1: kfree(H)
cpu2: potential use-after-free for H

Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Fixes: 085db2c04557 ("netfilter: Per network namespace netfilter hooks.")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_log: wait for rcu grace after logger unregistration</title>
<updated>2015-09-17T11:37:31Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2015-09-17T11:37:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ad5001cc7cdf9aaee5eb213fdee657e4a3c94776'/>
<id>urn:sha1:ad5001cc7cdf9aaee5eb213fdee657e4a3c94776</id>
<content type='text'>
The nf_log_unregister() function needs to call synchronize_rcu() to make sure
that the objects are not dereferenced anymore on module removal.

Fixes: 5962815a6a56 ("netfilter: nf_log: use an array of loggers instead of list")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nft_compat: skip family comparison in case of NFPROTO_UNSPEC</title>
<updated>2015-09-14T16:10:57Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2015-09-14T16:04:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ba378ca9c04a5fc1b2cf0f0274a9d02eb3d1bad9'/>
<id>urn:sha1:ba378ca9c04a5fc1b2cf0f0274a9d02eb3d1bad9</id>
<content type='text'>
Fix lookup of existing match/target structures in the corresponding list
by skipping the family check if NFPROTO_UNSPEC is used.

This is resulting in the allocation and insertion of one match/target
structure for each use of them. So this not only bloats memory
consumption but also severely affects the time to reload the ruleset
from the iptables-compat utility.

After this patch, iptables-compat-restore and iptables-compat take
almost the same time to reload large rulesets.

Fixes: 0ca743a55991 ("netfilter: nf_tables: add compatibility layer for x_tables")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_log: don't zap all loggers on unregister</title>
<updated>2015-09-14T12:40:24Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2015-09-09T00:57:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=205ee117d4dc4a11ac3bd9638bb9b2e839f4de9a'/>
<id>urn:sha1:205ee117d4dc4a11ac3bd9638bb9b2e839f4de9a</id>
<content type='text'>
like nf_log_unset, nf_log_unregister must not reset the list of loggers.
Otherwise, a call to nf_log_unregister() will render loggers of other nf
protocols unusable:

iptables -A INPUT -j LOG
modprobe nf_log_arp ; rmmod nf_log_arp
iptables -A INPUT -j LOG
iptables: No chain/target/match by that name

Fixes: 30e0c6a6be ("netfilter: nf_log: prepare net namespace support for loggers")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netlink, mmap: fix edge-case leakages in nf queue zero-copy</title>
<updated>2015-09-10T04:43:22Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-09-10T00:10:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6bb0fef489f667cf701853054f44579754f00a06'/>
<id>urn:sha1:6bb0fef489f667cf701853054f44579754f00a06</id>
<content type='text'>
When netlink mmap on receive side is the consumer of nf queue data,
it can happen that in some edge cases, we write skb shared info into
the user space mmap buffer:

Assume a possible rx ring frame size of only 4096, and the network skb,
which is being zero-copied into the netlink skb, contains page frags
with an overall skb-&gt;len larger than the linear part of the netlink
skb.

skb_zerocopy(), which is generic and thus not aware of the fact that
shared info cannot be accessed for such skbs then tries to write and
fill frags, thus leaking kernel data/pointers and in some corner cases
possibly writing out of bounds of the mmap area (when filling the
last slot in the ring buffer this way).

I.e. the ring buffer slot is then of status NL_MMAP_STATUS_VALID, has
an advertised length larger than 4096, where the linear part is visible
at the slot beginning, and the leaked sizeof(struct skb_shared_info)
has been written to the beginning of the next slot (also corrupting
the struct nl_mmap_hdr slot header incl. status etc), since skb-&gt;end
points to skb-&gt;data + ring-&gt;frame_size - NL_MMAP_HDRLEN.

The fix adds and lets __netlink_alloc_skb() take the actual needed
linear room for the network skb + meta data into account. It's completely
irrelevant for non-mmaped netlink sockets, but in case mmap sockets
are used, it can be decided whether the available skb_tailroom() is
really large enough for the buffer, or whether it needs to internally
fallback to a normal alloc_skb().

&gt;From nf queue side, the information whether the destination port is
an mmap RX ring is not really available without extra port-to-socket
lookup, thus it can only be determined in lower layers i.e. when
__netlink_alloc_skb() is called that checks internally for this. I
chose to add the extra ldiff parameter as mmap will then still work:
We have data_len and hlen in nfqnl_build_packet_message(), data_len
is the full length (capped at queue-&gt;copy_range) for skb_zerocopy()
and hlen some possible part of data_len that needs to be copied; the
rem_len variable indicates the needed remaining linear mmap space.

The only other workaround in nf queue internally would be after
allocation time by f.e. cap'ing the data_len to the skb_tailroom()
iff we deal with an mmap skb, but that would 1) expose the fact that
we use a mmap skb to upper layers, and 2) trim the skb where we
otherwise could just have moved the full skb into the normal receive
queue.

After the patch, in my test case the ring slot doesn't fit and therefore
shows NL_MMAP_STATUS_COPY, where a full skb carries all the data and
thus needs to be picked up via recv().

Fixes: 3ab1f683bf8b ("nfnetlink: add support for memory mapped netlink")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2015-09-06T04:57:42Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-09-06T00:36:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53cfd053e4d82b46dddc5d3522700263016e3156'/>
<id>urn:sha1:53cfd053e4d82b46dddc5d3522700263016e3156</id>
<content type='text'>
Conflicts:
	include/net/netfilter/nf_conntrack.h

The conflict was an overlap between changing the type of the zone
argument to nf_ct_tmpl_alloc() whilst exporting nf_ct_tmpl_free.

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net, they are:

1) Oneliner to restore maps in nf_tables since we support addressing registers
   at 32 bits level.

2) Restore previous default behaviour in bridge netfilter when CONFIG_IPV6=n,
   oneliner from Bernhard Thaler.

3) Out of bound access in ipset hash:net* set types, reported by Dave Jones'
   KASan utility, patch from Jozsef Kadlecsik.

4) Fix ipset compilation with gcc 4.4.7 related to C99 initialization of
   unnamed unions, patch from Elad Raz.

5) Add a workaround to address inconsistent endianess in the res_id field of
   nfnetlink batch messages, reported by Florian Westphal.

6) Fix error paths of CT/synproxy since the conntrack template was moved to use
   kmalloc, patch from Daniel Borkmann.

All of them look good to me to reach 4.2, I can route this to -stable myself
too, just let me know what you prefer.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_conntrack: make nf_ct_zone_dflt built-in</title>
<updated>2015-09-02T23:32:56Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-09-02T23:26:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=62da98656b62a5ca57f22263705175af8ded5aa1'/>
<id>urn:sha1:62da98656b62a5ca57f22263705175af8ded5aa1</id>
<content type='text'>
Fengguang reported, that some randconfig generated the following linker
issue with nf_ct_zone_dflt object involved:

  [...]
  CC      init/version.o
  LD      init/built-in.o
  net/built-in.o: In function `ipv4_conntrack_defrag':
  nf_defrag_ipv4.c:(.text+0x93e95): undefined reference to `nf_ct_zone_dflt'
  net/built-in.o: In function `ipv6_defrag':
  nf_defrag_ipv6_hooks.c:(.text+0xe3ffe): undefined reference to `nf_ct_zone_dflt'
  make: *** [vmlinux] Error 1

Given that configurations exist where we have a built-in part, which is
accessing nf_ct_zone_dflt such as the two handlers nf_ct_defrag_user()
and nf_ct6_defrag_user(), and a part that configures nf_conntrack as a
module, we must move nf_ct_zone_dflt into a fixed, guaranteed built-in
area when netfilter is configured in general.

Therefore, split the more generic parts into a common header under
include/linux/netfilter/ and move nf_ct_zone_dflt into the built-in
section that already holds parts related to CONFIG_NF_CONNTRACK in the
netfilter core. This fixes the issue on my side.

Fixes: 308ac9143ee2 ("netfilter: nf_conntrack: push zone object into functions")
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: conntrack: use nf_ct_tmpl_free in CT/synproxy error paths</title>
<updated>2015-09-01T10:15:08Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-08-31T17:11:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9cf94eab8b309e8bcc78b41dd1561c75b537dd0b'/>
<id>urn:sha1:9cf94eab8b309e8bcc78b41dd1561c75b537dd0b</id>
<content type='text'>
Commit 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack
templates") migrated templates to the new allocator api, but forgot to
update error paths for them in CT and synproxy to use nf_ct_tmpl_free()
instead of nf_conntrack_free().

Due to that, memory is being freed into the wrong kmemcache, but also
we drop the per net reference count of ct objects causing an imbalance.

In Brad's case, this leads to a wrap-around of net-&gt;ct.count and thus
lets __nf_conntrack_alloc() refuse to create a new ct object:

  [   10.340913] xt_addrtype: ipv6 does not support BROADCAST matching
  [   10.810168] nf_conntrack: table full, dropping packet
  [   11.917416] r8169 0000:07:00.0 eth0: link up
  [   11.917438] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
  [   12.815902] nf_conntrack: table full, dropping packet
  [   15.688561] nf_conntrack: table full, dropping packet
  [   15.689365] nf_conntrack: table full, dropping packet
  [   15.690169] nf_conntrack: table full, dropping packet
  [   15.690967] nf_conntrack: table full, dropping packet
  [...]

With slab debugging, it also reports the wrong kmemcache (kmalloc-512 vs.
nf_conntrack_ffffffff81ce75c0) and reports poison overwrites, etc. Thus,
to fix the problem, export and use nf_ct_tmpl_free() instead.

Fixes: 0838aa7fcfcd ("netfilter: fix netns dependencies with conntrack templates")
Reported-by: Brad Jackson &lt;bjackson0971@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next</title>
<updated>2015-08-28T23:29:59Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-08-28T23:29:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=581a5f2a6111e3d3b22f31a7d93b6bae2627e83a'/>
<id>urn:sha1:581a5f2a6111e3d3b22f31a7d93b6bae2627e83a</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter/IPVS updates for your net-next tree.
In sum, patches to address fallout from the previous round plus updates from
the IPVS folks via Simon Horman, they are:

1) Add a new scheduler to IPVS: The weighted overflow scheduling algorithm
   directs network connections to the server with the highest weight that is
   currently available and overflows to the next when active connections exceed
   the node's weight. From Raducu Deaconu.

2) Fix locking ordering in IPVS, always take rtnl_lock in first place. Patch
   from Julian Anastasov.

3) Allow to indicate the MTU to the IPVS in-kernel state sync daemon. From
   Julian Anastasov.

4) Enhance multicast configuration for the IPVS state sync daemon. Also from
   Julian.

5) Resolve sparse warnings in the nf_dup modules.

6) Fix a linking problem when CONFIG_NF_DUP_IPV6 is not set.

7) Add ICMP codes 5 and 6 to IPv6 REJECT target, they are more informative
   subsets of code 1. From Andreas Herz.

8) Revert the jumpstack size calculation from mark_source_chains due to chain
   depth miscalculations, from Florian Westphal.

9) Calm down more sparse warning around the Netfilter tree, again from Florian
   Westphal.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
