<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/tipc, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-15T02:03:03Z</updated>
<entry>
<title>tipc: add back tipc prefix to log messages</title>
<updated>2019-11-15T02:03:03Z</updated>
<author>
<name>Matt Bennett</name>
<email>matt.bennett@alliedtelesis.co.nz</email>
</author>
<published>2019-11-13T23:20:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d7f9f47d4d1243cf960725a755123cfae13a3e51'/>
<id>urn:sha1:d7f9f47d4d1243cf960725a755123cfae13a3e51</id>
<content type='text'>
The tipc prefix for log messages generated by tipc was
removed in commit 07f6c4bc048a ("tipc: convert tipc reference
table to use generic rhashtable").

This is still a useful prefix so add it back.

Signed-off-by: Matt Bennett &lt;matt.bennett@alliedtelesis.co.nz&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: use skb_queue_empty_lockless() in poll() handlers</title>
<updated>2019-10-28T20:33:41Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-10-24T05:44:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3ef7cf57c72f32f61e97f8fa401bc39ea1f1a5d4'/>
<id>urn:sha1:3ef7cf57c72f32f61e97f8fa401bc39ea1f1a5d4</id>
<content type='text'>
Many poll() handlers are lockless. Using skb_queue_empty_lockless()
instead of skb_queue_empty() is more appropriate.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: silence KCSAN warnings about sk-&gt;sk_backlog.len reads</title>
<updated>2019-10-10T04:43:00Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-10-09T22:41:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=70c2655849a25431f31b505a07fe0c861e5e41fb'/>
<id>urn:sha1:70c2655849a25431f31b505a07fe0c861e5e41fb</id>
<content type='text'>
sk-&gt;sk_backlog.len can be written by BH handlers, and read
from process contexts in a lockless way.

Note the write side should also use WRITE_ONCE() or a variant.
We need some agreement about the best way to do this.

syzbot reported :

BUG: KCSAN: data-race in tcp_add_backlog / tcp_grow_window.isra.0

write to 0xffff88812665f32c of 4 bytes by interrupt on cpu 1:
 sk_add_backlog include/net/sock.h:934 [inline]
 tcp_add_backlog+0x4a0/0xcc0 net/ipv4/tcp_ipv4.c:1737
 tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925
 ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:442 [inline]
 ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
 napi_skb_finish net/core/dev.c:5671 [inline]
 napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
 virtnet_receive drivers/net/virtio_net.c:1323 [inline]
 virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
 napi_poll net/core/dev.c:6352 [inline]
 net_rx_action+0x3ae/0xa50 net/core/dev.c:6418

read to 0xffff88812665f32c of 4 bytes by task 7292 on cpu 0:
 tcp_space include/net/tcp.h:1373 [inline]
 tcp_grow_window.isra.0+0x6b/0x480 net/ipv4/tcp_input.c:413
 tcp_event_data_recv+0x68f/0x990 net/ipv4/tcp_input.c:717
 tcp_rcv_established+0xbfe/0xf50 net/ipv4/tcp_input.c:5618
 tcp_v4_do_rcv+0x381/0x4e0 net/ipv4/tcp_ipv4.c:1542
 sk_backlog_rcv include/net/sock.h:945 [inline]
 __release_sock+0x135/0x1e0 net/core/sock.c:2427
 release_sock+0x61/0x160 net/core/sock.c:2943
 tcp_recvmsg+0x63b/0x1a30 net/ipv4/tcp.c:2181
 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
 sock_recvmsg_nosec net/socket.c:871 [inline]
 sock_recvmsg net/socket.c:889 [inline]
 sock_recvmsg+0x92/0xb0 net/socket.c:885
 sock_read_iter+0x15f/0x1e0 net/socket.c:967
 call_read_iter include/linux/fs.h:1864 [inline]
 new_sync_read+0x389/0x4f0 fs/read_write.c:414
 __vfs_read+0xb1/0xc0 fs/read_write.c:427
 vfs_read fs/read_write.c:461 [inline]
 vfs_read+0x143/0x2c0 fs/read_write.c:446

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7292 Comm: syz-fuzzer Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
</content>
</entry>
<entry>
<title>net: silence KCSAN warnings around sk_add_backlog() calls</title>
<updated>2019-10-10T04:42:59Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-10-09T22:21:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8265792bf8871acc2d00fd03883d830e2249d395'/>
<id>urn:sha1:8265792bf8871acc2d00fd03883d830e2249d395</id>
<content type='text'>
sk_add_backlog() callers usually read sk-&gt;sk_rcvbuf without
owning the socket lock. This means sk_rcvbuf value can
be changed by other cpus, and KCSAN complains.

Add READ_ONCE() annotations to document the lockless nature
of these reads.

Note that writes over sk_rcvbuf should also use WRITE_ONCE(),
but this will be done in separate patches to ease stable
backports (if we decide this is relevant for stable trees).

BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg

write to 0xffff88812ab369f8 of 8 bytes by interrupt on cpu 1:
 __sk_add_backlog include/net/sock.h:902 [inline]
 sk_add_backlog include/net/sock.h:933 [inline]
 tcp_add_backlog+0x45a/0xcc0 net/ipv4/tcp_ipv4.c:1737
 tcp_v4_rcv+0x1aba/0x1bf0 net/ipv4/tcp_ipv4.c:1925
 ip_protocol_deliver_rcu+0x51/0x470 net/ipv4/ip_input.c:204
 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252
 dst_input include/net/dst.h:442 [inline]
 ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413
 NF_HOOK include/linux/netfilter.h:305 [inline]
 NF_HOOK include/linux/netfilter.h:299 [inline]
 ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523
 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:5004
 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5118
 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5208
 napi_skb_finish net/core/dev.c:5671 [inline]
 napi_gro_receive+0x28f/0x330 net/core/dev.c:5704
 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061
 virtnet_receive drivers/net/virtio_net.c:1323 [inline]
 virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428
 napi_poll net/core/dev.c:6352 [inline]
 net_rx_action+0x3ae/0xa50 net/core/dev.c:6418

read to 0xffff88812ab369f8 of 8 bytes by task 7271 on cpu 0:
 tcp_recvmsg+0x470/0x1a30 net/ipv4/tcp.c:2047
 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838
 sock_recvmsg_nosec net/socket.c:871 [inline]
 sock_recvmsg net/socket.c:889 [inline]
 sock_recvmsg+0x92/0xb0 net/socket.c:885
 sock_read_iter+0x15f/0x1e0 net/socket.c:967
 call_read_iter include/linux/fs.h:1864 [inline]
 new_sync_read+0x389/0x4f0 fs/read_write.c:414
 __vfs_read+0xb1/0xc0 fs/read_write.c:427
 vfs_read fs/read_write.c:461 [inline]
 vfs_read+0x143/0x2c0 fs/read_write.c:446
 ksys_read+0xd5/0x1b0 fs/read_write.c:587
 __do_sys_read fs/read_write.c:597 [inline]
 __se_sys_read fs/read_write.c:595 [inline]
 __x64_sys_read+0x4c/0x60 fs/read_write.c:595
 do_syscall_64+0xcf/0x2f0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7271 Comm: syz-fuzzer Not tainted 5.3.0+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
</content>
</entry>
<entry>
<title>tipc: fix unlimited bundling of small messages</title>
<updated>2019-10-02T15:02:05Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2019-10-02T11:49:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e95584a889e1902fdf1ded9712e2c3c3083baf96'/>
<id>urn:sha1:e95584a889e1902fdf1ded9712e2c3c3083baf96</id>
<content type='text'>
We have identified a problem with the "oversubscription" policy in the
link transmission code.

When small messages are transmitted, and the sending link has reached
the transmit window limit, those messages will be bundled and put into
the link backlog queue. However, bundles of data messages are counted
at the 'CRITICAL' level, so that the counter for that level, instead of
the counter for the real, bundled message's level is the one being
increased.
Subsequent, to-be-bundled data messages at non-CRITICAL levels continue
to be tested against the unchanged counter for their own level, while
contributing to an unrestrained increase at the CRITICAL backlog level.

This leaves a gap in congestion control algorithm for small messages
that can result in starvation for other users or a "real" CRITICAL
user. Even that eventually can lead to buffer exhaustion &amp; link reset.

We fix this by keeping a 'target_bskb' buffer pointer at each levels,
then when bundling, we only bundle messages at the same importance
level only. This way, we know exactly how many slots a certain level
have occupied in the queue, so can manage level congestion accurately.

By bundling messages at the same level, we even have more benefits. Let
consider this:
- One socket sends 64-byte messages at the 'CRITICAL' level;
- Another sends 4096-byte messages at the 'LOW' level;

When a 64-byte message comes and is bundled the first time, we put the
overhead of message bundle to it (+ 40-byte header, data copy, etc.)
for later use, but the next message can be a 4096-byte one that cannot
be bundled to the previous one. This means the last bundle carries only
one payload message which is totally inefficient, as for the receiver
also! Later on, another 64-byte message comes, now we make a new bundle
and the same story repeats...

With the new bundling algorithm, this will not happen, the 64-byte
messages will be bundled together even when the 4096-byte message(s)
comes in between. However, if the 4096-byte messages are sent at the
same level i.e. 'CRITICAL', the bundling algorithm will again cause the
same overhead.

Also, the same will happen even with only one socket sending small
messages at a rate close to the link transmit's one, so that, when one
message is bundled, it's transmitted shortly. Then, another message
comes, a new bundle is created and so on...

We will solve this issue radically by another patch.

Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Reported-by: Hoang Le &lt;hoang.h.le@dektech.com.au&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2019-09-15T12:17:27Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-09-15T12:17:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aa2eaa8c272a3211dec07ce9c6c863a7e355c10e'/>
<id>urn:sha1:aa2eaa8c272a3211dec07ce9c6c863a7e355c10e</id>
<content type='text'>
Minor overlapping changes in the btusb and ixgbe drivers.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: add NULL pointer check before calling kfree_rcu</title>
<updated>2019-09-05T07:58:49Z</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2019-09-03T09:53:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=42dec1dbe38239cf91cc1f4df7830c66276ced37'/>
<id>urn:sha1:42dec1dbe38239cf91cc1f4df7830c66276ced37</id>
<content type='text'>
Unlike kfree(p), kfree_rcu(p, rcu) won't do NULL pointer check. When
tipc_nametbl_remove_publ returns NULL, the panic below happens:

   BUG: unable to handle kernel NULL pointer dereference at 0000000000000068
   RIP: 0010:__call_rcu+0x1d/0x290
   Call Trace:
    &lt;IRQ&gt;
    tipc_publ_notify+0xa9/0x170 [tipc]
    tipc_node_write_unlock+0x8d/0x100 [tipc]
    tipc_node_link_down+0xae/0x1d0 [tipc]
    tipc_node_check_dest+0x3ea/0x8f0 [tipc]
    ? tipc_disc_rcv+0x2c7/0x430 [tipc]
    tipc_disc_rcv+0x2c7/0x430 [tipc]
    ? tipc_rcv+0x6bb/0xf20 [tipc]
    tipc_rcv+0x6bb/0xf20 [tipc]
    ? ip_route_input_slow+0x9cf/0xb10
    tipc_udp_recv+0x195/0x1e0 [tipc]
    ? tipc_udp_is_known_peer+0x80/0x80 [tipc]
    udp_queue_rcv_skb+0x180/0x460
    udp_unicast_rcv_skb.isra.56+0x75/0x90
    __udp4_lib_rcv+0x4ce/0xb90
    ip_local_deliver_finish+0x11c/0x210
    ip_local_deliver+0x6b/0xe0
    ? ip_rcv_finish+0xa9/0x410
    ip_rcv+0x273/0x362

Fixes: 97ede29e80ee ("tipc: convert name table read-write lock to RCU")
Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2019-08-19T18:54:03Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-08-19T18:54:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=446bf64b613c4433dac4b15f4eaf326beaad3c8e'/>
<id>urn:sha1:446bf64b613c4433dac4b15f4eaf326beaad3c8e</id>
<content type='text'>
Merge conflict of mlx5 resolved using instructions in merge
commit 9566e650bf7fdf58384bb06df634f7531ca3a97e.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: clean up skb list lock handling on send path</title>
<updated>2019-08-18T21:01:07Z</updated>
<author>
<name>Jon Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2019-08-15T14:42:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e654f9f53b45fde3fcc8051830b212c7a8f36148'/>
<id>urn:sha1:e654f9f53b45fde3fcc8051830b212c7a8f36148</id>
<content type='text'>
The policy for handling the skb list locks on the send and receive paths
is simple.

- On the send path we never need to grab the lock on the 'xmitq' list
  when the destination is an exernal node.

- On the receive path we always need to grab the lock on the 'inputq'
  list, irrespective of source node.

However, when transmitting node local messages those will eventually
end up on the receive path of a local socket, meaning that the argument
'xmitq' in tipc_node_xmit() will become the 'ínputq' argument in  the
function tipc_sk_rcv(). This has been handled by always initializing
the spinlock of the 'xmitq' list at message creation, just in case it
may end up on the receive path later, and despite knowing that the lock
in most cases never will be used.

This approach is inaccurate and confusing, and has also concealed the
fact that the stated 'no lock grabbing' policy for the send path is
violated in some cases.

We now clean up this by never initializing the lock at message creation,
instead doing this at the moment we find that the message actually will
enter the receive path. At the same time we fix the four locations
where we incorrectly access the spinlock on the send/error path.

This patch also reverts commit d12cffe9329f ("tipc: ensure head-&gt;lock
is initialised") which has now become redundant.

CC: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Chris Packham &lt;chris.packham@alliedtelesis.co.nz&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Reviewed-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix false detection of retransmit failures</title>
<updated>2019-08-16T23:27:13Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2019-08-15T03:24:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=712042313b23b5df7451faf4b279beb3025e990c'/>
<id>urn:sha1:712042313b23b5df7451faf4b279beb3025e990c</id>
<content type='text'>
This commit eliminates the use of the link 'stale_limit' &amp; 'prev_from'
(besides the already removed - 'stale_cnt') variables in the detection
of repeated retransmit failures as there is no proper way to initialize
them to avoid a false detection, i.e. it is not really a retransmission
failure but due to a garbage values in the variables.

Instead, a jiffies variable will be added to individual skbs (like the
way we restrict the skb retransmissions) in order to mark the first skb
retransmit time. Later on, at the next retransmissions, the timestamp
will be checked to see if the skb in the link transmq is "too stale",
that is, the link tolerance time has passed, so that a link reset will
be ordered. Note, just checking on the first skb in the queue is fine
enough since it must be the oldest one.
A counter is also added to keep track the actual skb retransmissions'
number for later checking when the failure happens.

The downside of this approach is that the skb-&gt;cb[] buffer is about to
be exhausted, however it is always able to allocate another memory area
and keep a reference to it when needed.

Fixes: 77cf8edbc0e7 ("tipc: simplify stale link failure criteria")
Reported-by: Hoang Le &lt;hoang.h.le@dektech.com.au&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
