<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net/sch_generic.h, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-06T01:47:26Z</updated>
<entry>
<title>net: sched: prevent duplicate flower rules from tcf_proto destroy race</title>
<updated>2019-11-06T01:47:26Z</updated>
<author>
<name>John Hurley</name>
<email>john.hurley@netronome.com</email>
</author>
<published>2019-11-02T14:17:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=59eb87cb52c9f7164804bc8639c4d03ba9b0c169'/>
<id>urn:sha1:59eb87cb52c9f7164804bc8639c4d03ba9b0c169</id>
<content type='text'>
When a new filter is added to cls_api, the function
tcf_chain_tp_insert_unique() looks up the protocol/priority/chain to
determine if the tcf_proto is duplicated in the chain's hashtable. It then
creates a new entry or continues with an existing one. In cls_flower, this
allows the function fl_ht_insert_unque to determine if a filter is a
duplicate and reject appropriately, meaning that the duplicate will not be
passed to drivers via the offload hooks. However, when a tcf_proto is
destroyed it is removed from its chain before a hardware remove hook is
hit. This can lead to a race whereby the driver has not received the
remove message but duplicate flows can be accepted. This, in turn, can
lead to the offload driver receiving incorrect duplicate flows and out of
order add/delete messages.

Prevent duplicates by utilising an approach suggested by Vlad Buslov. A
hash table per block stores each unique chain/protocol/prio being
destroyed. This entry is only removed when the full destroy (and hardware
offload) has completed. If a new flow is being added with the same
identiers as a tc_proto being detroyed, then the add request is replayed
until the destroy is complete.

Fixes: 8b64678e0af8 ("net: sched: refactor tp insert/delete for concurrent execution")
Signed-off-by: John Hurley &lt;john.hurley@netronome.com&gt;
Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@netronome.com&gt;
Reported-by: Louis Peens &lt;louis.peens@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sch_netem: fix rcu splat in netem_enqueue()</title>
<updated>2019-09-27T08:29:11Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-09-24T20:11:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=159d2c7d8106177bd9a986fd005a311fe0d11285'/>
<id>urn:sha1:159d2c7d8106177bd9a986fd005a311fe0d11285</id>
<content type='text'>
qdisc_root() use from netem_enqueue() triggers a lockdep warning.

__dev_queue_xmit() uses rcu_read_lock_bh() which is
not equivalent to rcu_read_lock() + local_bh_disable_bh as far
as lockdep is concerned.

WARNING: suspicious RCU usage
5.3.0-rc7+ #0 Not tainted
-----------------------------
include/net/sch_generic.h:492 suspicious rcu_dereference_check() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by syz-executor427/8855:
 #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: lwtunnel_xmit_redirect include/net/lwtunnel.h:92 [inline]
 #0: 00000000b5525c01 (rcu_read_lock_bh){....}, at: ip_finish_output2+0x2dc/0x2570 net/ipv4/ip_output.c:214
 #1: 00000000b5525c01 (rcu_read_lock_bh){....}, at: __dev_queue_xmit+0x20a/0x3650 net/core/dev.c:3804
 #2: 00000000364bae92 (&amp;(&amp;sch-&gt;q.lock)-&gt;rlock){+.-.}, at: spin_lock include/linux/spinlock.h:338 [inline]
 #2: 00000000364bae92 (&amp;(&amp;sch-&gt;q.lock)-&gt;rlock){+.-.}, at: __dev_xmit_skb net/core/dev.c:3502 [inline]
 #2: 00000000364bae92 (&amp;(&amp;sch-&gt;q.lock)-&gt;rlock){+.-.}, at: __dev_queue_xmit+0x14b8/0x3650 net/core/dev.c:3838

stack backtrace:
CPU: 0 PID: 8855 Comm: syz-executor427 Not tainted 5.3.0-rc7+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x172/0x1f0 lib/dump_stack.c:113
 lockdep_rcu_suspicious+0x153/0x15d kernel/locking/lockdep.c:5357
 qdisc_root include/net/sch_generic.h:492 [inline]
 netem_enqueue+0x1cfb/0x2d80 net/sched/sch_netem.c:479
 __dev_xmit_skb net/core/dev.c:3527 [inline]
 __dev_queue_xmit+0x15d2/0x3650 net/core/dev.c:3838
 dev_queue_xmit+0x18/0x20 net/core/dev.c:3902
 neigh_hh_output include/net/neighbour.h:500 [inline]
 neigh_output include/net/neighbour.h:509 [inline]
 ip_finish_output2+0x1726/0x2570 net/ipv4/ip_output.c:228
 __ip_finish_output net/ipv4/ip_output.c:308 [inline]
 __ip_finish_output+0x5fc/0xb90 net/ipv4/ip_output.c:290
 ip_finish_output+0x38/0x1f0 net/ipv4/ip_output.c:318
 NF_HOOK_COND include/linux/netfilter.h:294 [inline]
 ip_mc_output+0x292/0xf40 net/ipv4/ip_output.c:417
 dst_output include/net/dst.h:436 [inline]
 ip_local_out+0xbb/0x190 net/ipv4/ip_output.c:125
 ip_send_skb+0x42/0xf0 net/ipv4/ip_output.c:1555
 udp_send_skb.isra.0+0x6b2/0x1160 net/ipv4/udp.c:887
 udp_sendmsg+0x1e96/0x2820 net/ipv4/udp.c:1174
 inet_sendmsg+0x9e/0xe0 net/ipv4/af_inet.c:807
 sock_sendmsg_nosec net/socket.c:637 [inline]
 sock_sendmsg+0xd7/0x130 net/socket.c:657
 ___sys_sendmsg+0x3e2/0x920 net/socket.c:2311
 __sys_sendmmsg+0x1bf/0x4d0 net/socket.c:2413
 __do_sys_sendmmsg net/socket.c:2442 [inline]
 __se_sys_sendmmsg net/socket.c:2439 [inline]
 __x64_sys_sendmmsg+0x9d/0x100 net/socket.c:2439
 do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: add API for registering unlocked offload block callbacks</title>
<updated>2019-08-26T21:17:43Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@mellanox.com</email>
</author>
<published>2019-08-26T13:45:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c9f14470d04830de217f9d28fcd0deffd7e8c0b1'/>
<id>urn:sha1:c9f14470d04830de217f9d28fcd0deffd7e8c0b1</id>
<content type='text'>
Extend struct flow_block_offload with "unlocked_driver_cb" flag to allow
registering and unregistering block hardware offload callbacks that do not
require caller to hold rtnl lock. Extend tcf_block with additional
lockeddevcnt counter that is incremented for each non-unlocked driver
callback attached to device. This counter is necessary to conditionally
obtain rtnl lock before calling hardware callbacks in following patches.

Register mlx5 tc block offload callbacks as "unlocked".

Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: notify classifier on successful offload add/delete</title>
<updated>2019-08-26T21:17:43Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@mellanox.com</email>
</author>
<published>2019-08-26T13:45:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a449a3e77a85fc8b31fef7238451dc87af8ff1af'/>
<id>urn:sha1:a449a3e77a85fc8b31fef7238451dc87af8ff1af</id>
<content type='text'>
To remove dependency on rtnl lock, extend classifier ops with new
ops-&gt;hw_add() and ops-&gt;hw_del() callbacks. Call them from cls API while
holding cb_lock every time filter if successfully added to or deleted from
hardware.

Implement the new API in flower classifier. Use it to manage hw_filters
list under cb_lock protection, instead of relying on rtnl lock to
synchronize with concurrent fl_reoffload() call.

Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: refactor block offloads counter usage</title>
<updated>2019-08-26T21:17:43Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@mellanox.com</email>
</author>
<published>2019-08-26T13:44:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=401192113730947572d280ec465555ab9ff5a597'/>
<id>urn:sha1:401192113730947572d280ec465555ab9ff5a597</id>
<content type='text'>
Without rtnl lock protection filters can no longer safely manage block
offloads counter themselves. Refactor cls API to protect block offloadcnt
with tcf_block-&gt;cb_lock that is already used to protect driver callback
list and nooffloaddevcnt counter. The counter can be modified by concurrent
tasks by new functions that execute block callbacks (which is safe with
previous patch that changed its type to atomic_t), however, block
bind/unbind code that checks the counter value takes cb_lock in write mode
to exclude any concurrent modifications. This approach prevents race
conditions between bind/unbind and callback execution code but allows for
concurrency for tc rule update path.

Move block offload counter, filter in hardware counter and filter flags
management from classifiers into cls hardware offloads API. Make functions
tcf_block_offload_{inc|dec}() and tc_cls_offload_cnt_update() to be cls API
private. Implement following new cls API to be used instead:

  tc_setup_cb_add() - non-destructive filter add. If filter that wasn't
  already in hardware is successfully offloaded, increment block offloads
  counter, set filter in hardware counter and flag. On failure, previously
  offloaded filter is considered to be intact and offloads counter is not
  decremented.

  tc_setup_cb_replace() - destructive filter replace. Release existing
  filter block offload counter and reset its in hardware counter and flag.
  Set new filter in hardware counter and flag. On failure, previously
  offloaded filter is considered to be destroyed and offload counter is
  decremented.

  tc_setup_cb_destroy() - filter destroy. Unconditionally decrement block
  offloads counter.

  tc_setup_cb_reoffload() - reoffload filter to single cb. Execute cb() and
  call tc_cls_offload_cnt_update() if cb() didn't return an error.

Refactor all offload-capable classifiers to atomically offload filters to
hardware, change block offload counter, and set filter in hardware counter
and flag by means of the new cls API functions.

Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: change tcf block offload counter type to atomic_t</title>
<updated>2019-08-26T21:17:43Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@mellanox.com</email>
</author>
<published>2019-08-26T13:44:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=97394bef5622cb32fd1e5d152251090da6c238b9'/>
<id>urn:sha1:97394bef5622cb32fd1e5d152251090da6c238b9</id>
<content type='text'>
As a preparation for running proto ops functions without rtnl lock, change
offload counter type to atomic. This is necessary to allow updating the
counter by multiple concurrent users when offloading filters to hardware
from unlocked classifiers.

Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: protect block offload-related fields with rw_semaphore</title>
<updated>2019-08-26T21:17:43Z</updated>
<author>
<name>Vlad Buslov</name>
<email>vladbu@mellanox.com</email>
</author>
<published>2019-08-26T13:44:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4f8116c85057239ff37519debdd5d45b38ad8130'/>
<id>urn:sha1:4f8116c85057239ff37519debdd5d45b38ad8130</id>
<content type='text'>
In order to remove dependency on rtnl lock, extend tcf_block with 'cb_lock'
rwsem and use it to protect flow_block-&gt;cb_list and related counters from
concurrent modification. The lock is taken in read mode for read-only
traversal of cb_list in tc_setup_cb_call() and write mode in all other
cases. This approach ensures that:

- cb_list is not changed concurrently while filters is being offloaded on
  block.

- block-&gt;nooffloaddevcnt is checked while holding the lock in read mode,
  but is only changed by bind/unbind code when holding the cb_lock in write
  mode to prevent concurrent modification.

Signed-off-by: Vlad Buslov &lt;vladbu@mellanox.com&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>flow_offload: move tc indirect block to flow offload</title>
<updated>2019-08-09T01:44:30Z</updated>
<author>
<name>wenxu</name>
<email>wenxu@ucloud.cn</email>
</author>
<published>2019-08-07T01:13:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e481908c51bf02457aecdedc2d80e1be22e0146'/>
<id>urn:sha1:4e481908c51bf02457aecdedc2d80e1be22e0146</id>
<content type='text'>
move tc indirect block to flow_offload and rename
it to flow indirect block.The nf_tables can use the
indr block architecture.

Signed-off-by: wenxu &lt;wenxu@ucloud.cn&gt;
Acked-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: flow_offload: add flow_block structure and use it</title>
<updated>2019-07-20T04:27:45Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2019-07-19T16:20:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=14bfb13f0ed525ed117b5d1f3e77e7c0a6be15de'/>
<id>urn:sha1:14bfb13f0ed525ed117b5d1f3e77e7c0a6be15de</id>
<content type='text'>
This object stores the flow block callbacks that are attached to this
block. Update flow_block_cb_lookup() to take this new object.

This patch restores the block sharing feature.

Fixes: da3eeb904ff4 ("net: flow_offload: add list handling functions")
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: flow_offload: rename tc_setup_cb_t to flow_setup_cb_t</title>
<updated>2019-07-20T04:27:45Z</updated>
<author>
<name>Pablo Neira Ayuso</name>
<email>pablo@netfilter.org</email>
</author>
<published>2019-07-19T16:20:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a7323311515d488b7714bb7504a1d50fabb0bfcf'/>
<id>urn:sha1:a7323311515d488b7714bb7504a1d50fabb0bfcf</id>
<content type='text'>
Rename this type definition and adapt users.

Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Acked-by: Jiri Pirko &lt;jiri@mellanox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
