<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-19T23:03:02Z</updated>
<entry>
<title>net/tls: enable sk_msg redirect to tls socket egress</title>
<updated>2019-11-19T23:03:02Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2019-11-18T15:40:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d4ffb02dee2fcb20e0c8086a8d1305bf885820bb'/>
<id>urn:sha1:d4ffb02dee2fcb20e0c8086a8d1305bf885820bb</id>
<content type='text'>
Bring back tls_sw_sendpage_locked. sk_msg redirection into a socket
with TLS_TX takes the following path:

  tcp_bpf_sendmsg_redir
    tcp_bpf_push_locked
      tcp_bpf_push
        kernel_sendpage_locked
          sock-&gt;ops-&gt;sendpage_locked

Also update the flags test in tls_sw_sendpage_locked to allow flag
MSG_NO_SHARED_FRAGS. bpf_tcp_sendmsg sets this.

Link: https://lore.kernel.org/netdev/CA+FuTSdaAawmZ2N8nfDDKu3XLpXBbMtcCT0q4FntDD2gn8ASUw@mail.gmail.com/T/#t
Link: https://github.com/wdebruij/kerneltools/commits/icept.2
Fixes: 0608c69c9a80 ("bpf: sk_msg, sock{map|hash} redirect through ULP")
Fixes: f3de19af0f5b ("Revert \"net/tls: remove unused function tls_sw_sendpage_locked\"")
Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>devlink: disallow reload operation during device cleanup</title>
<updated>2019-11-10T03:38:36Z</updated>
<author>
<name>Jiri Pirko</name>
<email>jiri@mellanox.com</email>
</author>
<published>2019-11-09T10:29:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5a508a254bed9a2e36a5fb96c9065532a6bf1e9c'/>
<id>urn:sha1:5a508a254bed9a2e36a5fb96c9065532a6bf1e9c</id>
<content type='text'>
There is a race between driver code that does setup/cleanup of device
and devlink reload operation that in some drivers works with the same
code. Use after free could we easily obtained by running:

while true; do
        echo "0000:00:10.0" &gt;/sys/bus/pci/drivers/mlxsw_spectrum2/bind
        devlink dev reload pci/0000:00:10.0 &amp;
        echo "0000:00:10.0" &gt;/sys/bus/pci/drivers/mlxsw_spectrum2/unbind
done

Fix this by enabling reload only after setup of device is complete and
disabling it at the beginning of the cleanup process.

Reported-by: Ido Schimmel &lt;idosch@mellanox.com&gt;
Fixes: 2d8dc5bbf4e7 ("devlink: Add support for reload")
Signed-off-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: fix data-race in neigh_event_send()</title>
<updated>2019-11-08T21:59:53Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-11-08T04:08:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1b53d64435d56902fc234ff2507142d971a09687'/>
<id>urn:sha1:1b53d64435d56902fc234ff2507142d971a09687</id>
<content type='text'>
KCSAN reported the following data-race [1]

The fix will also prevent the compiler from optimizing out
the condition.

[1]

BUG: KCSAN: data-race in neigh_resolve_output / neigh_resolve_output

write to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 1:
 neigh_event_send include/net/neighbour.h:443 [inline]
 neigh_resolve_output+0x78/0x480 net/core/neighbour.c:1474
 neigh_output include/net/neighbour.h:511 [inline]
 ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
 ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
 ip_queue_xmit+0x45/0x60 include/net/ip.h:237
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
 tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
 __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
 tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
 tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
 tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598
 tcp_write_timer+0xd1/0xf0 net/ipv4/tcp_timer.c:618

read to 0xffff8880a41dba78 of 8 bytes by interrupt on cpu 0:
 neigh_event_send include/net/neighbour.h:442 [inline]
 neigh_resolve_output+0x57/0x480 net/core/neighbour.c:1474
 neigh_output include/net/neighbour.h:511 [inline]
 ip_finish_output2+0x4af/0xe40 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x23a/0x490 net/ipv4/ip_output.c:290
 ip_finish_output+0x41/0x160 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_output+0xdf/0x210 net/ipv4/ip_output.c:432
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0x74/0x90 net/ipv4/ip_output.c:125
 __ip_queue_xmit+0x3a8/0xa40 net/ipv4/ip_output.c:532
 ip_queue_xmit+0x45/0x60 include/net/ip.h:237
 __tcp_transmit_skb+0xe81/0x1d60 net/ipv4/tcp_output.c:1169
 tcp_transmit_skb net/ipv4/tcp_output.c:1185 [inline]
 __tcp_retransmit_skb+0x4bd/0x15f0 net/ipv4/tcp_output.c:2976
 tcp_retransmit_skb+0x36/0x1a0 net/ipv4/tcp_output.c:2999
 tcp_retransmit_timer+0x719/0x16d0 net/ipv4/tcp_timer.c:515
 tcp_write_timer_handler+0x42d/0x510 net/ipv4/tcp_timer.c:598

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/fq_impl: Switch to kvmalloc() for memory allocation</title>
<updated>2019-11-08T08:11:49Z</updated>
<author>
<name>Toke Høiland-Jørgensen</name>
<email>toke@redhat.com</email>
</author>
<published>2019-11-05T15:57:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=71e67c3bd127cfe7863f54e4b087eba1cc8f9a7a'/>
<id>urn:sha1:71e67c3bd127cfe7863f54e4b087eba1cc8f9a7a</id>
<content type='text'>
The FQ implementation used by mac80211 allocates memory using kmalloc(),
which can fail; and Johannes reported that this actually happens in
practice.

To avoid this, switch the allocation to kvmalloc() instead; this also
brings fq_impl in line with all the FQ qdiscs.

Fixes: 557fc4a09803 ("fq: add fair queuing framework")
Reported-by: Johannes Berg &lt;johannes@sipsolutions.net&gt;
Signed-off-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Link: https://lore.kernel.org/r/20191105155750.547379-1-toke@redhat.com
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf</title>
<updated>2019-11-07T05:16:55Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-11-07T05:16:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53ba60afb165a09559f254dfa89fe300882f31d1'/>
<id>urn:sha1:53ba60afb165a09559f254dfa89fe300882f31d1</id>
<content type='text'>
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Missing register size validation in bitwise and cmp offloads.

2) Fix error code in ip_set_sockfn_get() when copy_to_user() fails,
   from Dan Carpenter.

3) Oneliner to copy MAC address in IPv6 hash:ip,mac sets, from
   Stefano Brivio.

4) Missing policy validation in ipset with NL_VALIDATE_STRICT,
   from Jozsef Kadlecsik.

5) Fix unaligned access to private data area of nf_tables instructions,
   from Lukas Wunner.

6) Relax check for object updates, reported as a regression by
   Eric Garver, patch from Fernando Fernandez Mancera.

7) Crash on ebtables dnat extension when used from the output path.
   From Florian Westphal.

8) Fix bogus EOPNOTSUPP when updating basechain flags.

9) Fix bogus EBUSY when updating a basechain that is already offloaded.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/tls: add a TX lock</title>
<updated>2019-11-07T01:33:32Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>jakub.kicinski@netronome.com</email>
</author>
<published>2019-11-05T22:24:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=79ffe6087e9145d2377385cac48d0d6a6b4225a5'/>
<id>urn:sha1:79ffe6087e9145d2377385cac48d0d6a6b4225a5</id>
<content type='text'>
TLS TX needs to release and re-acquire the socket lock if send buffer
fills up.

TLS SW TX path currently depends on only allowing one thread to enter
the function by the abuse of sk_write_pending. If another writer is
already waiting for memory no new ones are allowed in.

This has two problems:
 - writers don't wake other threads up when they leave the kernel;
   meaning that this scheme works for single extra thread (second
   application thread or delayed work) because memory becoming
   available will send a wake up request, but as Mallesham and
   Pooja report with larger number of threads it leads to threads
   being put to sleep indefinitely;
 - the delayed work does not get _scheduled_ but it may _run_ when
   other writers are present leading to crashes as writers don't
   expect state to change under their feet (same records get pushed
   and freed multiple times); it's hard to reliably bail from the
   work, however, because the mere presence of a writer does not
   guarantee that the writer will push pending records before exiting.

Ensuring wakeups always happen will make the code basically open
code a mutex. Just use a mutex.

The TLS HW TX path does not have any locking (not even the
sk_write_pending hack), yet it uses a per-socket sg_tx_data
array to push records.

Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance")
Reported-by: Mallesham  Jatharakonda &lt;mallesh537@gmail.com&gt;
Reported-by: Pooja Trivedi &lt;poojatrivedi@gmail.com&gt;
Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: prevent load/store tearing on sk-&gt;sk_stamp</title>
<updated>2019-11-06T02:22:30Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-11-05T05:38:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f75359f3ac855940c5718af10ba089b8977bf339'/>
<id>urn:sha1:f75359f3ac855940c5718af10ba089b8977bf339</id>
<content type='text'>
Add a couple of READ_ONCE() and WRITE_ONCE() to prevent
load-tearing and store-tearing in sock_read_timestamp()
and sock_write_timestamp()

This might prevent another KCSAN report.

Fixes: 3a0ed3e96197 ("sock: Make sock-&gt;sk_stamp thread-safe")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Acked-by: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: prevent duplicate flower rules from tcf_proto destroy race</title>
<updated>2019-11-06T01:47:26Z</updated>
<author>
<name>John Hurley</name>
<email>john.hurley@netronome.com</email>
</author>
<published>2019-11-02T14:17:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=59eb87cb52c9f7164804bc8639c4d03ba9b0c169'/>
<id>urn:sha1:59eb87cb52c9f7164804bc8639c4d03ba9b0c169</id>
<content type='text'>
When a new filter is added to cls_api, the function
tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
determine if the tcf_proto is duplicated in the chain's hashtable. It then
creates a new entry or continues with an existing one. In cls_flower, this
allows the function fl_ht_insert_unque to determine if a filter is a
duplicate and reject appropriately, meaning that the duplicate will not be
passed to drivers via the offload hooks. However, when a tcf_proto is
destroyed it is removed from its chain before a hardware remove hook is
hit. This can lead to a race whereby the driver has not received the
remove message but duplicate flows can be accepted. This, in turn, can
lead to the offload driver receiving incorrect duplicate flows and out of
order add/delete messages.

Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
hash table per block stores each unique chain/protocol/prio being
destroyed. This entry is only removed when the full destroy (and hardware
offload) has completed. If a new flow is being added with the same
identiers as a tc_proto being detroyed, then the add request is replayed
until the destroy is complete.

Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
Signed-off-by: John Hurley &lt;john.hurley@netronome.com&gt;
Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@netronome.com&gt;
Reported-by: Louis Peens &lt;louis.peens@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bonding: fix state transition issue in link monitoring</title>
<updated>2019-11-06T01:40:16Z</updated>
<author>
<name>Jay Vosburgh</name>
<email>jay.vosburgh@canonical.com</email>
</author>
<published>2019-11-02T04:56:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1899bb325149e481de31a4f32b59ea6f24e176ea'/>
<id>urn:sha1:1899bb325149e481de31a4f32b59ea6f24e176ea</id>
<content type='text'>
Since de77ecd4ef02 ("bonding: improve link-status update in
mii-monitoring"), the bonding driver has utilized two separate variables
to indicate the next link state a particular slave should transition to.
Each is used to communicate to a different portion of the link state
change commit logic; one to the bond_miimon_commit function itself, and
another to the state transition logic.

	Unfortunately, the two variables can become unsynchronized,
resulting in incorrect link state transitions within bonding.  This can
cause slaves to become stuck in an incorrect link state until a
subsequent carrier state transition.

	The issue occurs when a special case in bond_slave_netdev_event
sets slave-&gt;link directly to BOND_LINK_FAIL.  On the next pass through
bond_miimon_inspect after the slave goes carrier up, the BOND_LINK_FAIL
case will set the proposed next state (link_new_state) to BOND_LINK_UP,
but the new_link to BOND_LINK_DOWN.  The setting of the final link state
from new_link comes after that from link_new_state, and so the slave
will end up incorrectly in _DOWN state.

	Resolve this by combining the two variables into one.

Reported-by: Aleksei Zakharov &lt;zakharov.a.g@yandex.ru&gt;
Reported-by: Sha Zhang &lt;zhangsha.zhang@huawei.com&gt;
Cc: Mahesh Bandewar &lt;maheshb@google.com&gt;
Fixes: de77ecd4ef02 ("bonding: improve link-status update in mii-monitoring")
Signed-off-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netfilter: nf_tables: Align nft_expr private data to 64-bit</title>
<updated>2019-11-04T19:58:32Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2019-10-31T10:06:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=250367c59e6ba0d79d702a059712d66edacd4a1a'/>
<id>urn:sha1:250367c59e6ba0d79d702a059712d66edacd4a1a</id>
<content type='text'>
Invoking the following commands on a 32-bit architecture with strict
alignment requirements (such as an ARMv7-based Raspberry Pi) results
in an alignment exception:

 # nft add table ip test-ip4
 # nft add chain ip test-ip4 output { type filter hook output priority 0; }
 # nft add rule  ip test-ip4 output quota 1025 bytes

Alignment trap: not handling instruction e1b26f9f at [&lt;7f4473f8&gt;]
Unhandled fault: alignment exception (0x001) at 0xb832e824
Internal error: : 1 [#1] PREEMPT SMP ARM
Hardware name: BCM2835
[&lt;7f4473fc&gt;] (nft_quota_do_init [nft_quota])
[&lt;7f447448&gt;] (nft_quota_init [nft_quota])
[&lt;7f4260d0&gt;] (nf_tables_newrule [nf_tables])
[&lt;7f4168dc&gt;] (nfnetlink_rcv_batch [nfnetlink])
[&lt;7f416bd0&gt;] (nfnetlink_rcv [nfnetlink])
[&lt;8078b334&gt;] (netlink_unicast)
[&lt;8078b664&gt;] (netlink_sendmsg)
[&lt;8071b47c&gt;] (sock_sendmsg)
[&lt;8071bd18&gt;] (___sys_sendmsg)
[&lt;8071ce3c&gt;] (__sys_sendmsg)
[&lt;8071ce94&gt;] (sys_sendmsg)

The reason is that nft_quota_do_init() calls atomic64_set() on an
atomic64_t which is only aligned to 32-bit, not 64-bit, because it
succeeds struct nft_expr in memory which only contains a 32-bit pointer.
Fix by aligning the nft_expr private data to 64-bit.

Fixes: 96518518cc41 ("netfilter: add nftables")
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Cc: stable@vger.kernel.org # v3.13+
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
</feed>
