<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/core/dev.c, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-01T22:16:01Z</updated>
<entry>
<title>net: fix installing orphaned programs</title>
<updated>2019-11-01T22:16:01Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>jakub.kicinski@netronome.com</email>
</author>
<published>2019-11-01T03:07:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aefc3e723a78c2e429a64dadd7815ef2a4aecd44'/>
<id>urn:sha1:aefc3e723a78c2e429a64dadd7815ef2a4aecd44</id>
<content type='text'>
When netdevice with offloaded BPF programs is destroyed
the programs are orphaned and removed from the program
IDA - their IDs get released (the programs may remain
accessible via existing open file descriptors and pinned
files). After IDs are released they are set to 0.

This confuses dev_change_xdp_fd() because it compares
the __dev_xdp_query() result where 0 means no program
with prog-&gt;aux-&gt;id where 0 means orphaned.

dev_change_xdp_fd() would have incorrectly returned success
even though it had not installed the program.

Since drivers already catch this case via bpf_offload_dev_match()
let them handle this case. The error message drivers produce in
this case ("program loaded for a different device") is in fact
correct as the orphaned program must had to be loaded for a
different device.

Fixes: c14a9f633d9e ("net: Don't call XDP_SETUP_PROG when nothing is changed")
Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netns: fix GFP flags in rtnl_net_notifyid()</title>
<updated>2019-10-26T03:14:42Z</updated>
<author>
<name>Guillaume Nault</name>
<email>gnault@redhat.com</email>
</author>
<published>2019-10-23T16:39:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d4e4fdf9e4a27c87edb79b1478955075be141f67'/>
<id>urn:sha1:d4e4fdf9e4a27c87edb79b1478955075be141f67</id>
<content type='text'>
In rtnl_net_notifyid(), we certainly can't pass a null GFP flag to
rtnl_notify(). A GFP_KERNEL flag would be fine in most circumstances,
but there are a few paths calling rtnl_net_notifyid() from atomic
context or from RCU critical sections. The later also precludes the use
of gfp_any() as it wouldn't detect the RCU case. Also, the nlmsg_new()
call is wrong too, as it uses GFP_KERNEL unconditionally.

Therefore, we need to pass the GFP flags as parameter and propagate it
through function calls until the proper flags can be determined.

In most cases, GFP_KERNEL is fine. The exceptions are:
  * openvswitch: ovs_vport_cmd_get() and ovs_vport_cmd_dump()
    indirectly call rtnl_net_notifyid() from RCU critical section,

  * rtnetlink: rtmsg_ifinfo_build_skb() already receives GFP flags as
    parameter.

Also, in ovs_vport_cmd_build_info(), let's change the GFP flags used
by nlmsg_new(). The function is allowed to sleep, so better make the
flags consistent with the ones used in the following
ovs_vport_cmd_fill_info() call.

Found by code inspection.

Fixes: 9a9634545c70 ("netns: notify netns id events")
Signed-off-by: Guillaume Nault &lt;gnault@redhat.com&gt;
Acked-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Acked-by: Pravin B Shelar &lt;pshelar@ovn.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: remove unnecessary variables and callback</title>
<updated>2019-10-24T21:53:49Z</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2019-10-21T18:47:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f3b0a18bb6cb07a9abb75e21b1f08eeaefa78e81'/>
<id>urn:sha1:f3b0a18bb6cb07a9abb75e21b1f08eeaefa78e81</id>
<content type='text'>
This patch removes variables and callback these are related to the nested
device structure.
devices that can be nested have their own nest_level variable that
represents the depth of nested devices.
In the previous patch, new {lower/upper}_level variables are added and
they replace old private nest_level variable.
So, this patch removes all 'nest_level' variables.

In order to avoid lockdep warning, -&gt;ndo_get_lock_subclass() was added
to get lockdep subclass value, which is actually lower nested depth value.
But now, they use the dynamic lockdep key to avoid lockdep warning instead
of the subclass.
So, this patch removes -&gt;ndo_get_lock_subclass() callback.

Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: core: add ignore flag to netdev_adjacent structure</title>
<updated>2019-10-24T21:53:48Z</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2019-10-21T18:47:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=32b6d34fedc2229cdf6a047fdbc0704085441915'/>
<id>urn:sha1:32b6d34fedc2229cdf6a047fdbc0704085441915</id>
<content type='text'>
In order to link an adjacent node, netdev_upper_dev_link() is used
and in order to unlink an adjacent node, netdev_upper_dev_unlink() is used.
unlink operation does not fail, but link operation can fail.

In order to exchange adjacent nodes, we should unlink an old adjacent
node first. then, link a new adjacent node.
If link operation is failed, we should link an old adjacent node again.
But this link operation can fail too.
It eventually breaks the adjacent link relationship.

This patch adds an ignore flag into the netdev_adjacent structure.
If this flag is set, netdev_upper_dev_link() ignores an old adjacent
node for a moment.

This patch also adds new functions for other modules.
netdev_adjacent_change_prepare()
netdev_adjacent_change_commit()
netdev_adjacent_change_abort()

netdev_adjacent_change_prepare() inserts new device into adjacent list
but new device is not allowed to use immediately.
If netdev_adjacent_change_prepare() fails, it internally rollbacks
adjacent list so that we don't need any other action.
netdev_adjacent_change_commit() deletes old device in the adjacent list
and allows new device to use.
netdev_adjacent_change_abort() rollbacks adjacent list.

Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: core: add generic lockdep keys</title>
<updated>2019-10-24T21:53:48Z</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2019-10-21T18:47:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ab92d68fc22f9afab480153bd82a20f6e2533769'/>
<id>urn:sha1:ab92d68fc22f9afab480153bd82a20f6e2533769</id>
<content type='text'>
Some interface types could be nested.
(VLAN, BONDING, TEAM, MACSEC, MACVLAN, IPVLAN, VIRT_WIFI, VXLAN, etc..)
These interface types should set lockdep class because, without lockdep
class key, lockdep always warn about unexisting circular locking.

In the current code, these interfaces have their own lockdep class keys and
these manage itself. So that there are so many duplicate code around the
/driver/net and /net/.
This patch adds new generic lockdep keys and some helper functions for it.

This patch does below changes.
a) Add lockdep class keys in struct net_device
   - qdisc_running, xmit, addr_list, qdisc_busylock
   - these keys are used as dynamic lockdep key.
b) When net_device is being allocated, lockdep keys are registered.
   - alloc_netdev_mqs()
c) When net_device is being free'd llockdep keys are unregistered.
   - free_netdev()
d) Add generic lockdep key helper function
   - netdev_register_lockdep_key()
   - netdev_unregister_lockdep_key()
   - netdev_update_lockdep_key()
e) Remove unnecessary generic lockdep macro and functions
f) Remove unnecessary lockdep code of each interfaces.

After this patch, each interface modules don't need to maintain
their lockdep keys.

Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: core: limit nested device depth</title>
<updated>2019-10-24T21:53:48Z</updated>
<author>
<name>Taehee Yoo</name>
<email>ap420073@gmail.com</email>
</author>
<published>2019-10-21T18:47:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5343da4c17429efaa5fb1594ea96aee1a283e694'/>
<id>urn:sha1:5343da4c17429efaa5fb1594ea96aee1a283e694</id>
<content type='text'>
Current code doesn't limit the number of nested devices.
Nested devices would be handled recursively and this needs huge stack
memory. So, unlimited nested devices could make stack overflow.

This patch adds upper_level and lower_level, they are common variables
and represent maximum lower/upper depth.
When upper/lower device is attached or dettached,
{lower/upper}_level are updated. and if maximum depth is bigger than 8,
attach routine fails and returns -EMLINK.

In addition, this patch converts recursive routine of
netdev_walk_all_{lower/upper} to iterator routine.

Test commands:
    ip link add dummy0 type dummy
    ip link add link dummy0 name vlan1 type vlan id 1
    ip link set vlan1 up

    for i in {2..55}
    do
	    let A=$i-1

	    ip link add vlan$i link vlan$A type vlan id $i
    done
    ip link del dummy0

Splat looks like:
[  155.513226][  T908] BUG: KASAN: use-after-free in __unwind_start+0x71/0x850
[  155.514162][  T908] Write of size 88 at addr ffff8880608a6cc0 by task ip/908
[  155.515048][  T908]
[  155.515333][  T908] CPU: 0 PID: 908 Comm: ip Not tainted 5.4.0-rc3+ #96
[  155.516147][  T908] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  155.517233][  T908] Call Trace:
[  155.517627][  T908]
[  155.517918][  T908] Allocated by task 0:
[  155.518412][  T908] (stack is not available)
[  155.518955][  T908]
[  155.519228][  T908] Freed by task 0:
[  155.519885][  T908] (stack is not available)
[  155.520452][  T908]
[  155.520729][  T908] The buggy address belongs to the object at ffff8880608a6ac0
[  155.520729][  T908]  which belongs to the cache names_cache of size 4096
[  155.522387][  T908] The buggy address is located 512 bytes inside of
[  155.522387][  T908]  4096-byte region [ffff8880608a6ac0, ffff8880608a7ac0)
[  155.523920][  T908] The buggy address belongs to the page:
[  155.524552][  T908] page:ffffea0001822800 refcount:1 mapcount:0 mapping:ffff88806c657cc0 index:0x0 compound_mapcount:0
[  155.525836][  T908] flags: 0x100000000010200(slab|head)
[  155.526445][  T908] raw: 0100000000010200 ffffea0001813808 ffffea0001a26c08 ffff88806c657cc0
[  155.527424][  T908] raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
[  155.528429][  T908] page dumped because: kasan: bad access detected
[  155.529158][  T908]
[  155.529410][  T908] Memory state around the buggy address:
[  155.530060][  T908]  ffff8880608a6b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  155.530971][  T908]  ffff8880608a6c00: fb fb fb fb fb f1 f1 f1 f1 00 f2 f2 f2 f3 f3 f3
[  155.531889][  T908] &gt;ffff8880608a6c80: f3 fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
[  155.532806][  T908]                                            ^
[  155.533509][  T908]  ffff8880608a6d00: fb fb fb fb fb fb fb fb fb f1 f1 f1 f1 00 00 00
[  155.534436][  T908]  ffff8880608a6d80: f2 f3 f3 f3 f3 fb fb fb 00 00 00 00 00 00 00 00
[ ... ]

Signed-off-by: Taehee Yoo &lt;ap420073@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>sk_buff: drop all skb extensions on free and skb scrubbing</title>
<updated>2019-09-27T18:40:19Z</updated>
<author>
<name>Florian Westphal</name>
<email>fw@strlen.de</email>
</author>
<published>2019-09-26T18:37:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=174e23810cd3183dc2ca3f5166ef965a55eaaf54'/>
<id>urn:sha1:174e23810cd3183dc2ca3f5166ef965a55eaaf54</id>
<content type='text'>
Now that we have a 3rd extension, add a new helper that drops the
extension space and use it when we need to scrub an sk_buff.

At this time, scrubbing clears secpath and bridge netfilter data, but
retains the tc skb extension, after this patch all three get cleared.

NAPI reuse/free assumes we can only have a secpath attached to skb, but
it seems better to clear all extensions there as well.

v2: add unlikely hint (Eric Dumazet)

Fixes: 95a7233c452a ("net: openvswitch: Set OvS recirc_id from tc chain index")
Signed-off-by: Florian Westphal &lt;fw@strlen.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge ra.kernel.org:/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2019-09-17T21:51:10Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-09-17T21:51:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1bab8d4c488be22d57f9dd09968c90a0ddc413bf'/>
<id>urn:sha1:1bab8d4c488be22d57f9dd09968c90a0ddc413bf</id>
<content type='text'>
Pull in bug fixes from 'net' tree for the merge window.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/sched: fix race between deactivation and dequeue for NOLOCK qdisc</title>
<updated>2019-09-15T18:35:55Z</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2019-09-12T10:02:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d518d2ed8640c1cbbbb6f63939e3e65471817367'/>
<id>urn:sha1:d518d2ed8640c1cbbbb6f63939e3e65471817367</id>
<content type='text'>
The test implemented by some_qdisc_is_busy() is somewhat loosy for
NOLOCK qdisc, as we may hit the following scenario:

CPU1						CPU2
// in net_tx_action()
clear_bit(__QDISC_STATE_SCHED...);
						// in some_qdisc_is_busy()
						val = (qdisc_is_running(q) ||
						       test_bit(__QDISC_STATE_SCHED,
								&amp;q-&gt;state));
						// here val is 0 but...
qdisc_run(q)
// ... CPU1 is going to run the qdisc next

As a conseguence qdisc_run() in net_tx_action() can race with qdisc_reset()
in dev_qdisc_reset(). Such race is not possible for !NOLOCK qdisc as
both the above bit operations are under the root qdisc lock().

After commit 021a17ed796b ("pfifo_fast: drop unneeded additional lock on dequeue")
the race can cause use after free and/or null ptr dereference, but the root
cause is likely older.

This patch addresses the issue explicitly checking for deactivation under
the seqlock for NOLOCK qdisc, so that the qdisc_run() in the critical
scenario becomes a no-op.

Note that the enqueue() op can still execute concurrently with dev_qdisc_reset(),
but that is safe due to the skb_array() locking, and we can't avoid that
for NOLOCK qdiscs.

Fixes: 021a17ed796b ("pfifo_fast: drop unneeded additional lock on dequeue")
Reported-by: Li Shuang &lt;shuali@redhat.com&gt;
Reported-and-tested-by: Davide Caratti &lt;dcaratti@redhat.com&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2019-09-15T12:17:27Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-09-15T12:17:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aa2eaa8c272a3211dec07ce9c6c863a7e355c10e'/>
<id>urn:sha1:aa2eaa8c272a3211dec07ce9c6c863a7e355c10e</id>
<content type='text'>
Minor overlapping changes in the btusb and ixgbe drivers.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
