<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/core, branch v5.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-05-29T23:56:53Z</updated>
<entry>
<title>neigh: fix ARP retransmit timer guard</title>
<updated>2020-05-29T23:56:53Z</updated>
<author>
<name>Hangbin Liu</name>
<email>liuhangbin@gmail.com</email>
</author>
<published>2020-05-28T07:15:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=96d10d5b192e7f064457fd957e6a3ce8c2dce4b4'/>
<id>urn:sha1:96d10d5b192e7f064457fd957e6a3ce8c2dce4b4</id>
<content type='text'>
In commit 19e16d220f0a ("neigh: support smaller retrans_time settting")
we add more accurate control for ARP and NS. But for ARP I forgot to
update the latest guard in neigh_timer_handler(), then the next
retransmit would be reset to jiffies + HZ/2 if we set the retrans_time
less than 500ms. Fix it by setting the time_before() check to HZ/100.

IPv6 does not have this issue.

Reported-by: Jianwen Ji &lt;jiji@redhat.com&gt;
Fixes: 19e16d220f0a ("neigh: support smaller retrans_time settting")
Signed-off-by: Hangbin Liu &lt;liuhangbin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>flow_dissector: Drop BPF flow dissector prog ref on netns cleanup</title>
<updated>2020-05-22T00:52:45Z</updated>
<author>
<name>Jakub Sitnicki</name>
<email>jakub@cloudflare.com</email>
</author>
<published>2020-05-21T08:34:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5cf65922bb15279402e1e19b5ee8c51d618fa51f'/>
<id>urn:sha1:5cf65922bb15279402e1e19b5ee8c51d618fa51f</id>
<content type='text'>
When attaching a flow dissector program to a network namespace with
bpf(BPF_PROG_ATTACH, ...) we grab a reference to bpf_prog.

If netns gets destroyed while a flow dissector is still attached, and there
are no other references to the prog, we leak the reference and the program
remains loaded.

Leak can be reproduced by running flow dissector tests from selftests/bpf:

  # bpftool prog list
  # ./test_flow_dissector.sh
  ...
  selftests: test_flow_dissector [PASS]
  # bpftool prog list
  4: flow_dissector  name _dissect  tag e314084d332a5338  gpl
          loaded_at 2020-05-20T18:50:53+0200  uid 0
          xlated 552B  jited 355B  memlock 4096B  map_ids 3,4
          btf_id 4
  #

Fix it by detaching the flow dissector program when netns is going away.

Fixes: d58e468b1112 ("flow_dissector: implements flow dissector BPF hook")
Signed-off-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Reviewed-by: Stanislav Fomichev &lt;sdf@google.com&gt;
Link: https://lore.kernel.org/bpf/20200521083435.560256-1-jakub@cloudflare.com
</content>
</entry>
<entry>
<title>__netif_receive_skb_core: pass skb by reference</title>
<updated>2020-05-19T22:38:00Z</updated>
<author>
<name>Boris Sukholitko</name>
<email>boris.sukholitko@broadcom.com</email>
</author>
<published>2020-05-19T07:32:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c0bbbdc32febd4f034ecbf3ea17865785b2c0652'/>
<id>urn:sha1:c0bbbdc32febd4f034ecbf3ea17865785b2c0652</id>
<content type='text'>
__netif_receive_skb_core may change the skb pointer passed into it (e.g.
in rx_handler). The original skb may be freed as a result of this
operation.

The callers of __netif_receive_skb_core may further process original skb
by using pt_prev pointer returned by __netif_receive_skb_core thus
leading to unpleasant effects.

The solution is to pass skb by reference into __netif_receive_skb_core.

v2: Added Fixes tag and comment regarding ppt_prev and skb invariant.

Fixes: 88eb1944e18c ("net: core: propagate SKB lists through packet_type lookup")
Signed-off-by: Boris Sukholitko &lt;boris.sukholitko@broadcom.com&gt;
Acked-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netprio_cgroup: Fix unlimited memory leak of v2 cgroups</title>
<updated>2020-05-10T03:59:21Z</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2020-05-09T03:32:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=090e28b229af92dc5b40786ca673999d59e73056'/>
<id>urn:sha1:090e28b229af92dc5b40786ca673999d59e73056</id>
<content type='text'>
If systemd is configured to use hybrid mode which enables the use of
both cgroup v1 and v2, systemd will create new cgroup on both the default
root (v2) and netprio_cgroup hierarchy (v1) for a new session and attach
task to the two cgroups. If the task does some network thing then the v2
cgroup can never be freed after the session exited.

One of our machines ran into OOM due to this memory leak.

In the scenario described above when sk_alloc() is called
cgroup_sk_alloc() thought it's in v2 mode, so it stores
the cgroup pointer in sk-&gt;sk_cgrp_data and increments
the cgroup refcnt, but then sock_update_netprioidx()
thought it's in v1 mode, so it stores netprioidx value
in sk-&gt;sk_cgrp_data, so the cgroup refcnt will never be freed.

Currently we do the mode switch when someone writes to the ifpriomap
cgroup control file. The easiest fix is to also do the switch when
a task is attached to a new cgroup.

Fixes: bd1060a1d671 ("sock, cgroup: add sock-&gt;sk_cgroup")
Reported-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Tested-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf</title>
<updated>2020-05-09T01:58:39Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2020-05-09T01:58:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=14d8f7486a344ee64c37641c70b2d67013eb9de6'/>
<id>urn:sha1:14d8f7486a344ee64c37641c70b2d67013eb9de6</id>
<content type='text'>
Daniel Borkmann says:

====================
pull-request: bpf 2020-05-09

The following pull-request contains BPF updates for your *net* tree.

We've added 4 non-merge commits during the last 9 day(s) which contain
a total of 4 files changed, 11 insertions(+), 6 deletions(-).

The main changes are:

1) Fix msg_pop_data() helper incorrectly setting an sge length in some
   cases as well as fixing bpf_tcp_ingress() wrongly accounting bytes
   in sg.size, from John Fastabend.

2) Fix to return an -EFAULT error when copy_to_user() of the value
   fails in map_lookup_and_delete_elem(), from Wei Yongjun.

3) Fix sk_psock refcnt leak in tcp_bpf_recvmsg(), from Xiyu Yang.
====================

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: fix a potential recursive NETDEV_FEAT_CHANGE</title>
<updated>2020-05-08T01:18:36Z</updated>
<author>
<name>Cong Wang</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2020-05-07T19:19:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dd912306ff008891c82cd9f63e8181e47a9cb2fb'/>
<id>urn:sha1:dd912306ff008891c82cd9f63e8181e47a9cb2fb</id>
<content type='text'>
syzbot managed to trigger a recursive NETDEV_FEAT_CHANGE event
between bonding master and slave. I managed to find a reproducer
for this:

  ip li set bond0 up
  ifenslave bond0 eth0
  brctl addbr br0
  ethtool -K eth0 lro off
  brctl addif br0 bond0
  ip li set br0 up

When a NETDEV_FEAT_CHANGE event is triggered on a bonding slave,
it captures this and calls bond_compute_features() to fixup its
master's and other slaves' features. However, when syncing with
its lower devices by netdev_sync_lower_features() this event is
triggered again on slaves when the LRO feature fails to change,
so it goes back and forth recursively until the kernel stack is
exhausted.

Commit 17b85d29e82c intentionally lets __netdev_update_features()
return -1 for such a failure case, so we have to just rely on
the existing check inside netdev_sync_lower_features() and skip
NETDEV_FEAT_CHANGE event only for this specific failure case.

Fixes: fd867d51f889 ("net/core: generic support for disabling netdev features down stack")
Reported-by: syzbot+e73ceacfd8560cc8a3ca@syzkaller.appspotmail.com
Reported-by: syzbot+c2fb6f9ddcea95ba49b5@syzkaller.appspotmail.com
Cc: Jarod Wilson &lt;jarod@redhat.com&gt;
Cc: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Reviewed-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Acked-by: Nikolay Aleksandrov &lt;nikolay@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf, sockmap: msg_pop_data can incorrecty set an sge length</title>
<updated>2020-05-05T22:22:15Z</updated>
<author>
<name>John Fastabend</name>
<email>john.fastabend@gmail.com</email>
</author>
<published>2020-05-04T17:21:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3e104c23816220919ea1b3fd93fabe363c67c484'/>
<id>urn:sha1:3e104c23816220919ea1b3fd93fabe363c67c484</id>
<content type='text'>
When sk_msg_pop() is called where the pop operation is working on
the end of a sge element and there is no additional trailing data
and there _is_ data in front of pop, like the following case,

   |____________a_____________|__pop__|

We have out of order operations where we incorrectly set the pop
variable so that instead of zero'ing pop we incorrectly leave it
untouched, effectively. This can cause later logic to shift the
buffers around believing it should pop extra space. The result is
we have 'popped' more data then we expected potentially breaking
program logic.

It took us a while to hit this case because typically we pop headers
which seem to rarely be at the end of a scatterlist elements but
we can't rely on this.

Fixes: 7246d8ed4dcce ("bpf: helper to pop data from messages")
Signed-off-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Reviewed-by: Jakub Sitnicki &lt;jakub@cloudflare.com&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Link: https://lore.kernel.org/bpf/158861288359.14306.7654891716919968144.stgit@john-Precision-5820-Tower
</content>
</entry>
<entry>
<title>neigh: send protocol value in neighbor create notification</title>
<updated>2020-05-05T20:38:59Z</updated>
<author>
<name>Roman Mashak</name>
<email>mrv@mojatatu.com</email>
</author>
<published>2020-05-02T01:34:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=38212bb31fe923d0a2c6299bd2adfbb84cddef2a'/>
<id>urn:sha1:38212bb31fe923d0a2c6299bd2adfbb84cddef2a</id>
<content type='text'>
When a new neighbor entry has been added, event is generated but it does not
include protocol, because its value is assigned after the event notification
routine has run, so move protocol assignment code earlier.

Fixes: df9b0e30d44c ("neighbor: Add protocol attribute")
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: Roman Mashak &lt;mrv@mojatatu.com&gt;
Reviewed-by: David Ahern &lt;dsahern@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>devlink: Fix reporter's recovery condition</title>
<updated>2020-05-04T17:40:39Z</updated>
<author>
<name>Aya Levin</name>
<email>ayal@mellanox.com</email>
</author>
<published>2020-05-04T08:27:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bea0c5c942d3b4e9fb6ed45f6a7de74c6b112437'/>
<id>urn:sha1:bea0c5c942d3b4e9fb6ed45f6a7de74c6b112437</id>
<content type='text'>
Devlink health core conditions the reporter's recovery with the
expiration of the grace period. This is not relevant for the first
recovery. Explicitly demand that the grace period will only apply to
recoveries other than the first.

Fixes: c8e1da0bf923 ("devlink: Add health report functionality")
Signed-off-by: Aya Levin &lt;ayal@mellanox.com&gt;
Reviewed-by: Moshe Shemesh &lt;moshe@mellanox.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>drop_monitor: work around gcc-10 stringop-overflow warning</title>
<updated>2020-05-01T22:45:16Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2020-04-30T21:30:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dc30b4059f6e2abf3712ab537c8718562b21c45d'/>
<id>urn:sha1:dc30b4059f6e2abf3712ab537c8718562b21c45d</id>
<content type='text'>
The current gcc-10 snapshot produces a false-positive warning:

net/core/drop_monitor.c: In function 'trace_drop_common.constprop':
cc1: error: writing 8 bytes into a region of size 0 [-Werror=stringop-overflow=]
In file included from net/core/drop_monitor.c:23:
include/uapi/linux/net_dropmon.h:36:8: note: at offset 0 to object 'entries' with size 4 declared here
   36 |  __u32 entries;
      |        ^~~~~~~

I reported this in the gcc bugzilla, but in case it does not get
fixed in the release, work around it by using a temporary variable.

Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation &amp; Netlink protocol")
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94881
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
