<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/netfilter/ipvs, branch v5.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-02-16T09:41:42Z</updated>
<entry>
<title>ipvs: fix warning on unused variable</title>
<updated>2019-02-16T09:41:42Z</updated>
<author>
<name>Andrea Claudi</name>
<email>aclaudi@redhat.com</email>
</author>
<published>2019-02-15T16:51:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c93a49b9769e435990c82297aa0baa31e1538790'/>
<id>urn:sha1:c93a49b9769e435990c82297aa0baa31e1538790</id>
<content type='text'>
When CONFIG_IP_VS_IPV6 is not defined, build produced this warning:

net/netfilter/ipvs/ip_vs_ctl.c:899:6: warning: unused variable ‘ret’ [-Wunused-variable]
  int ret = 0;
      ^~~

Fix this by moving the declaration of 'ret' in the CONFIG_IP_VS_IPV6
section in the same function.

While at it, drop its unneeded initialisation.

Fixes: 098e13f5b21d ("ipvs: fix dependency on nf_defrag_ipv6")
Reported-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Andrea Claudi &lt;aclaudi@redhat.com&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: fix dependency on nf_defrag_ipv6</title>
<updated>2019-02-12T10:24:01Z</updated>
<author>
<name>Andrea Claudi</name>
<email>aclaudi@redhat.com</email>
</author>
<published>2019-02-11T15:14:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=098e13f5b21d3398065fce8780f07a3ef62f4812'/>
<id>urn:sha1:098e13f5b21d3398065fce8780f07a3ef62f4812</id>
<content type='text'>
ipvs relies on nf_defrag_ipv6 module to manage IPv6 fragmentation,
but lacks proper Kconfig dependencies and does not explicitly
request defrag features.

As a result, if netfilter hooks are not loaded, when IPv6 fragmented
packet are handled by ipvs only the first fragment makes through.

Fix it properly declaring the dependency on Kconfig and registering
netfilter hooks on ip_vs_add_service() and ip_vs_new_dest().

Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Andrea Claudi &lt;aclaudi@redhat.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: Fix signed integer overflow when setsockopt timeout</title>
<updated>2019-01-24T12:38:54Z</updated>
<author>
<name>ZhangXiaoxu</name>
<email>zhangxiaoxu5@huawei.com</email>
</author>
<published>2019-01-10T08:39:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53ab60baa1ac4f20b080a22c13b77b6373922fd7'/>
<id>urn:sha1:53ab60baa1ac4f20b080a22c13b77b6373922fd7</id>
<content type='text'>
There is a UBSAN bug report as below:
UBSAN: Undefined behaviour in net/netfilter/ipvs/ip_vs_ctl.c:2227:21
signed integer overflow:
-2147483647 * 1000 cannot be represented in type 'int'

Reproduce program:
	#include &lt;stdio.h&gt;
	#include &lt;sys/types.h&gt;
	#include &lt;sys/socket.h&gt;

	#define IPPROTO_IP 0
	#define IPPROTO_RAW 255

	#define IP_VS_BASE_CTL		(64+1024+64)
	#define IP_VS_SO_SET_TIMEOUT	(IP_VS_BASE_CTL+10)

	/* The argument to IP_VS_SO_GET_TIMEOUT */
	struct ipvs_timeout_t {
		int tcp_timeout;
		int tcp_fin_timeout;
		int udp_timeout;
	};

	int main() {
		int ret = -1;
		int sockfd = -1;
		struct ipvs_timeout_t to;

		sockfd = socket(AF_INET, SOCK_RAW, IPPROTO_RAW);
		if (sockfd == -1) {
			printf("socket init error\n");
			return -1;
		}

		to.tcp_timeout = -2147483647;
		to.tcp_fin_timeout = -2147483647;
		to.udp_timeout = -2147483647;

		ret = setsockopt(sockfd,
				 IPPROTO_IP,
				 IP_VS_SO_SET_TIMEOUT,
				 (char *)(&amp;to),
				 sizeof(to));

		printf("setsockopt return %d\n", ret);
		return ret;
	}

Return -EINVAL if the timeout value is negative or max than 'INT_MAX / HZ'.

Signed-off-by: ZhangXiaoxu &lt;zhangxiaoxu5@huawei.com&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: call ip_vs_dst_notifier earlier than ipv6_dev_notf</title>
<updated>2018-11-26T09:23:42Z</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2018-11-15T07:14:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2a31e4bd9ad255ee40809b5c798c4b1c2b09703b'/>
<id>urn:sha1:2a31e4bd9ad255ee40809b5c798c4b1c2b09703b</id>
<content type='text'>
ip_vs_dst_event is supposed to clean up all dst used in ipvs'
destinations when a net dev is going down. But it works only
when the dst's dev is the same as the dev from the event.

Now with the same priority but late registration,
ip_vs_dst_notifier is always called later than ipv6_dev_notf
where the dst's dev is set to lo for NETDEV_DOWN event.

As the dst's dev lo is not the same as the dev from the event
in ip_vs_dst_event, ip_vs_dst_notifier doesn't actually work.
Also as these dst have to wait for dest_trash_timer to clean
them up. It would cause some non-permanent kernel warnings:

  unregister_netdevice: waiting for br0 to become free. Usage count = 3

To fix it, call ip_vs_dst_notifier earlier than ipv6_dev_notf
by increasing its priority to ADDRCONF_NOTIFY_PRIORITY + 5.

Note that for ipv4 route fib_netdev_notifier doesn't set dst's
dev to lo in NETDEV_DOWN event, so this fix is only needed when
IP_VS_IPV6 is defined.

Fixes: 7a4f0761fce3 ("IPVS: init and cleanup restructuring")
Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>iov_iter: Separate type from direction and use accessor functions</title>
<updated>2018-10-23T23:41:07Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2018-10-19T23:57:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aa563d7bca6e882ec2bdae24603c8f016401a144'/>
<id>urn:sha1:aa563d7bca6e882ec2bdae24603c8f016401a144</id>
<content type='text'>
In the iov_iter struct, separate the iterator type from the iterator
direction and use accessor functions to access them in most places.

Convert a bunch of places to use switch-statements to access them rather
then chains of bitwise-AND statements.  This makes it easier to add further
iterator types.  Also, this can be more efficient as to implement a switch
of small contiguous integers, the compiler can use ~50% fewer compare
instructions than it has to use bitwise-and instructions.

Further, cease passing the iterator type into the iterator setup function.
The iterator function can set that itself.  Only the direction is required.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>net: Add extack to nlmsg_parse</title>
<updated>2018-10-08T17:39:04Z</updated>
<author>
<name>David Ahern</name>
<email>dsahern@gmail.com</email>
</author>
<published>2018-10-08T03:16:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dac9c9790e542777079999900594fd069ba10489'/>
<id>urn:sha1:dac9c9790e542777079999900594fd069ba10489</id>
<content type='text'>
Make sure extack is passed to nlmsg_parse where easy to do so.
Most of these are dump handlers and leveraging the extack in
the netlink_callback.

Signed-off-by: David Ahern &lt;dsahern@gmail.com&gt;
Acked-by: Christian Brauner &lt;christian@brauner.io&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net-ipv4: remove 2 always zero parameters from ipv4_update_pmtu()</title>
<updated>2018-09-27T03:30:55Z</updated>
<author>
<name>Maciej Żenczykowski</name>
<email>maze@google.com</email>
</author>
<published>2018-09-26T03:56:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d888f39666774c7debfa34e4e20ba33cf61a6d71'/>
<id>urn:sha1:d888f39666774c7debfa34e4e20ba33cf61a6d71</id>
<content type='text'>
(the parameters in question are mark and flow_flags)

Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: Maciej Żenczykowski &lt;maze@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>treewide: convert ISO_8859-1 text comments to utf-8</title>
<updated>2018-08-24T01:48:43Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2018-08-24T00:01:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3723c63247854c97fe044c12a40e29043e9bbc1b'/>
<id>urn:sha1:3723c63247854c97fe044c12a40e29043e9bbc1b</id>
<content type='text'>
Almost all files in the kernel are either plain text or UTF-8 encoded.  A
couple however are ISO_8859-1, usually just a few characters in a C
comments, for historic reasons.

This converts them all to UTF-8 for consistency.

Link: http://lkml.kernel.org/r/20180724111600.4158975-1-arnd@arndb.de
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;			[IPVS portion]
Acked-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;	[IIO]
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;			[powerpc]
Acked-by: Rob Herring &lt;robh@kernel.org&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Samuel Ortiz &lt;sameo@linux.intel.com&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Rob Herring &lt;robh+dt@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipvs: don't show negative times in ip_vs_conn</title>
<updated>2018-08-16T17:36:57Z</updated>
<author>
<name>Matteo Croce</name>
<email>mcroce@redhat.com</email>
</author>
<published>2018-07-31T16:03:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b71ed54dc2a1dbb4328f7d7c98d712747aee92ba'/>
<id>urn:sha1:b71ed54dc2a1dbb4328f7d7c98d712747aee92ba</id>
<content type='text'>
Since commit 500462a9de65 ("timers: Switch to a non-cascading wheel"),
timers duration can last even 12.5% more than the scheduled interval.

IPVS has two handlers, /proc/net/ip_vs_conn and /proc/net/ip_vs_conn_sync,
which shows the remaining time before that a connection expires.
The default expire time for a connection is 60 seconds, and the
expiration timer can fire even 4 seconds later than the scheduled time.
The expiration time is calculated subtracting jiffies to the scheduled
expiration time, and it's shown as a huge number when the timer fires late,
since both values are unsigned.

This can confuse script and tools which relies on it, like ipvsadm:

    root@mcroce-redhat:~# while ipvsadm -lc |grep SYN_RECV; do sleep 1 ; done
    TCP 00:05  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:04  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:03  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:02  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:01  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 00:00  SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:44 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:43 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:42 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:41 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:40 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000
    TCP 68719476:39 SYN_RECV    [fc00:1::1]:55732  [fc00:1::2]:8000   [fc00:2000::1]:8000

Signed-off-by: Matteo Croce &lt;mcroce@redhat.com&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: fix race between ip_vs_conn_new() and ip_vs_del_dest()</title>
<updated>2018-08-16T17:36:51Z</updated>
<author>
<name>Tan Hu</name>
<email>tan.hu@zte.com.cn</email>
</author>
<published>2018-07-25T07:23:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a53b42c11815d2357e31a9403ae3950517525894'/>
<id>urn:sha1:a53b42c11815d2357e31a9403ae3950517525894</id>
<content type='text'>
We came across infinite loop in ipvs when using ipvs in docker
env.

When ipvs receives new packets and cannot find an ipvs connection,
it will create a new connection, then if the dest is unavailable
(i.e. IP_VS_DEST_F_AVAILABLE), the packet will be dropped sliently.

But if the dropped packet is the first packet of this connection,
the connection control timer never has a chance to start and the
ipvs connection cannot be released. This will lead to memory leak, or
infinite loop in cleanup_net() when net namespace is released like
this:

    ip_vs_conn_net_cleanup at ffffffffa0a9f31a [ip_vs]
    __ip_vs_cleanup at ffffffffa0a9f60a [ip_vs]
    ops_exit_list at ffffffff81567a49
    cleanup_net at ffffffff81568b40
    process_one_work at ffffffff810a851b
    worker_thread at ffffffff810a9356
    kthread at ffffffff810b0b6f
    ret_from_fork at ffffffff81697a18

race condition:
    CPU1                           CPU2
    ip_vs_in()
      ip_vs_conn_new()
                                   ip_vs_del_dest()
                                     __ip_vs_unlink_dest()
                                       ~IP_VS_DEST_F_AVAILABLE
      cp-&gt;dest &amp;&amp; !IP_VS_DEST_F_AVAILABLE
      __ip_vs_conn_put
    ...
    cleanup_net  ---&gt; infinite looping

Fix this by checking whether the timer already started.

Signed-off-by: Tan Hu &lt;tan.hu@zte.com.cn&gt;
Reviewed-by: Jiang Biao &lt;jiang.biao2@zte.com.cn&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
</feed>
