<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/lib/kobject_uevent.c, branch v5.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-04-01T05:37:12Z</updated>
<entry>
<title>kobject: Don't trigger kobject_uevent(KOBJ_REMOVE) twice.</title>
<updated>2019-04-01T05:37:12Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@I-love.SAKURA.ne.jp</email>
</author>
<published>2019-03-17T05:02:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c03a0fd0b609e2f5c669c2b7f27c8e1928e9196e'/>
<id>urn:sha1:c03a0fd0b609e2f5c669c2b7f27c8e1928e9196e</id>
<content type='text'>
syzbot is hitting use-after-free bug in uinput module [1]. This is because
kobject_uevent(KOBJ_REMOVE) is called again due to commit 0f4dafc0563c6c49
("Kobject: auto-cleanup on final unref") after memory allocation fault
injection made kobject_uevent(KOBJ_REMOVE) from device_del() from
input_unregister_device() fail, while uinput_destroy_device() is expecting
that kobject_uevent(KOBJ_REMOVE) is not called after device_del() from
input_unregister_device() completed.

That commit intended to catch cases where nobody even attempted to send
"remove" uevents. But there is no guarantee that an event will ultimately
be sent. We are at the point of no return as far as the rest of the kernel
is concerned; there are no repeats or do-overs.

Also, it is not clear whether some subsystem depends on that commit.
If no subsystem depends on that commit, it will be better to remove
the state_{add,remove}_uevent_sent logic. But we don't want to risk
a regression (in a patch which will be backported) by trying to remove
that logic. Therefore, as a first step, let's avoid the use-after-free bug
by making sure that kobject_uevent(KOBJ_REMOVE) won't be triggered twice.

[1] https://syzkaller.appspot.com/bug?id=8b17c134fe938bbddd75a45afaa9e68af43a362d

Reported-by: syzbot &lt;syzbot+f648cfb7e0b52bf7ae32@syzkaller.appspotmail.com&gt;
Analyzed-by: Dmitry Torokhov &lt;dmitry.torokhov@gmail.com&gt;
Fixes: 0f4dafc0563c6c49 ("Kobject: auto-cleanup on final unref")
Cc: Kay Sievers &lt;kay@vrfy.org&gt;
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kobject: drop newline from msg string</title>
<updated>2019-01-22T13:25:26Z</updated>
<author>
<name>Bo YU</name>
<email>tsu.yubo@gmail.com</email>
</author>
<published>2019-01-09T09:17:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=549ad24374c0f796a0d3676a8962596915921c68'/>
<id>urn:sha1:549ad24374c0f796a0d3676a8962596915921c68</id>
<content type='text'>
There is currently a missing terminating newline in non-switch case
match when msg == NULL

Signed-off-by: Bo YU &lt;tsu.yubo@gmail.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kobject: to repalce printk with pr_* style</title>
<updated>2019-01-22T13:25:26Z</updated>
<author>
<name>Bo YU</name>
<email>tsu.yubo@gmail.com</email>
</author>
<published>2019-01-09T09:17:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b3fa29ad83770c764909d8497be3866ad341a3a6'/>
<id>urn:sha1:b3fa29ad83770c764909d8497be3866ad341a3a6</id>
<content type='text'>
Repalce printk with pr_warn in kobject_synth_uevent and replace
printk with pr_err in uevent_net_init to make both consistent with
other code in kobject_uevent.c

Signed-off-by: Bo YU &lt;tsu.yubo@gmail.com&gt;
Reviewed-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kobject: Fix warnings in lib/kobject_uevent.c</title>
<updated>2018-11-11T19:27:43Z</updated>
<author>
<name>Bo YU</name>
<email>tsu.yubo@gmail.com</email>
</author>
<published>2018-10-30T12:01:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6be244dcd59b01245394b62a128901b1cfec468c'/>
<id>urn:sha1:6be244dcd59b01245394b62a128901b1cfec468c</id>
<content type='text'>
Add a blank after declaration.

Signed-off-by: Bo YU &lt;tsu.yubo@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>kobject: drop unnecessary cast "%llu" for u64</title>
<updated>2018-11-11T19:27:43Z</updated>
<author>
<name>Bo YU</name>
<email>tsu.yubo@gmail.com</email>
</author>
<published>2018-10-30T12:01:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e0d70bcb38d7eae5eb4da2db130a29d7007eac74'/>
<id>urn:sha1:e0d70bcb38d7eae5eb4da2db130a29d7007eac74</id>
<content type='text'>
There is no searon for u64 var cast to unsigned long long type.

Signed-off-by: Bo YU &lt;tsu.yubo@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>netns: restrict uevents</title>
<updated>2018-05-01T14:22:41Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2018-04-29T10:44:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a3498436b3a0f8ec289e6847e1de40b4123e1639'/>
<id>urn:sha1:a3498436b3a0f8ec289e6847e1de40b4123e1639</id>
<content type='text'>
commit 07e98962fa77 ("kobject: Send hotplug events in all network namespaces")

enabled sending hotplug events into all network namespaces back in 2010.
Over time the set of uevents that get sent into all network namespaces has
shrunk. We have now reached the point where hotplug events for all devices
that carry a namespace tag are filtered according to that namespace.
Specifically, they are filtered whenever the namespace tag of the kobject
does not match the namespace tag of the netlink socket.
Currently, only network devices carry namespace tags (i.e. network
namespace tags). Hence, uevents for network devices only show up in the
network namespace such devices are created in or moved to.

However, any uevent for a kobject that does not have a namespace tag
associated with it will not be filtered and we will broadcast it into all
network namespaces. This behavior stopped making sense when user namespaces
were introduced.

This patch simplifies and fixes couple of things:
- Split codepath for sending uevents by kobject namespace tags:
  1. Untagged kobjects - uevent_net_broadcast_untagged():
     Untagged kobjects will be broadcast into all uevent sockets recorded
     in uevent_sock_list, i.e. into all network namespacs owned by the
     intial user namespace.
  2. Tagged kobjects - uevent_net_broadcast_tagged():
     Tagged kobjects will only be broadcast into the network namespace they
     were tagged with.
  Handling of tagged kobjects in 2. does not cause any semantic changes.
  This is just splitting out the filtering logic that was handled by
  kobj_bcast_filter() before.
  Handling of untagged kobjects in 1. will cause a semantic change. The
  reasons why this is needed and ok have been discussed in [1]. Here is a
  short summary:
  - Userspace ignores uevents from network namespaces that are not owned by
    the intial user namespace:
    Uevents are filtered by userspace in a user namespace because the
    received uid != 0. Instead the uid associated with the event will be
    65534 == "nobody" because the global root uid is not mapped.
    This means we can safely and without introducing regressions modify the
    kernel to not send uevents into all network namespaces whose owning
    user namespace is not the initial user namespace because we know that
    userspace will ignore the message because of the uid anyway.
    I have a) verified that is is true for every udev implementation out
    there b) that this behavior has been present in all udev
    implementations from the very beginning.
  - Thundering herd:
    Broadcasting uevents into all network namespaces introduces significant
    overhead.
    All processes that listen to uevents running in non-initial user
    namespaces will end up responding to uevents that will be meaningless
    to them. Mainly, because non-initial user namespaces cannot easily
    manage devices unless they have a privileged host-process helping them
    out. This means that there will be a thundering herd of activity when
    there shouldn't be any.
  - Removing needless overhead/Increasing performance:
    Currently, the uevent socket for each network namespace is added to the
    global variable uevent_sock_list. The list itself needs to be protected
    by a mutex. So everytime a uevent is generated the mutex is taken on
    the list. The mutex is held *from the creation of the uevent (memory
    allocation, string creation etc. until all uevent sockets have been
    handled*. This is aggravated by the fact that for each uevent socket
    that has listeners the mc_list must be walked as well which means we're
    talking O(n^2) here. Given that a standard Linux workload usually has
    quite a lot of network namespaces and - in the face of containers - a
    lot of user namespaces this quickly becomes a performance problem (see
    "Thundering herd" above). By just recording uevent sockets of network
    namespaces that are owned by the initial user namespace we
    significantly increase performance in this codepath.
  - Injecting uevents:
    There's a valid argument that containers might be interested in
    receiving device events especially if they are delegated to them by a
    privileged userspace process. One prime example are SR-IOV enabled
    devices that are explicitly designed to be handed of to other users
    such as VMs or containers.
    This use-case can now be correctly handled since
    commit 692ec06d7c92 ("netns: send uevent messages"). This commit
    introduced the ability to send uevents from userspace. As such we can
    let a sufficiently privileged (CAP_SYS_ADMIN in the owning user
    namespace of the network namespace of the netlink socket) userspace
    process make a decision what uevents should be sent. This removes the
    need to blindly broadcast uevents into all user namespaces and provides
    a performant and safe solution to this problem.
  - Filtering logic:
    This patch filters by *owning user namespace of the network namespace a
    given task resides in* and not by user namespace of the task per se.
    This means if the user namespace of a given task is unshared but the
    network namespace is kept and is owned by the initial user namespace a
    listener that is opening the uevent socket in that network namespace
    can still listen to uevents.
- Fix permission for tagged kobjects:
  Network devices that are created or moved into a network namespace that
  is owned by a non-initial user namespace currently are send with
  INVALID_{G,U}ID in their credentials. This means that all current udev
  implementations in userspace will ignore the uevent they receive for
  them. This has lead to weird bugs whereby new devices showing up in such
  network namespaces were not recognized and did not get IPs assigned etc.
  This patch adjusts the permission to the appropriate {g,u}id in the
  respective user namespace. This way udevd is able to correctly handle
  such devices.
- Simplify filtering logic:
  do_one_broadcast() already ensures that only listeners in mc_list receive
  uevents that have the same network namespace as the uevent socket itself.
  So the filtering logic in kobj_bcast_filter is not needed (see [3]). This
  patch therefore removes kobj_bcast_filter() and replaces
  netlink_broadcast_filtered() with the simpler netlink_broadcast()
  everywhere.

[1]: https://lkml.org/lkml/2018/4/4/739
[2]: https://lkml.org/lkml/2018/4/26/767
[3]: https://lkml.org/lkml/2018/4/26/738
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>uevent: add alloc_uevent_skb() helper</title>
<updated>2018-05-01T14:22:40Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2018-04-29T10:44:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=26045a7b14bc7a5455e411d820110f66557d6589'/>
<id>urn:sha1:26045a7b14bc7a5455e411d820110f66557d6589</id>
<content type='text'>
This patch adds alloc_uevent_skb() in preparation for follow up patches.

Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Drop pernet_operations::async</title>
<updated>2018-03-27T17:18:09Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>ktkhai@virtuozzo.com</email>
</author>
<published>2018-03-27T15:02:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2f635ceeb22ba13c307236d69795fbb29cfa3e7c'/>
<id>urn:sha1:2f635ceeb22ba13c307236d69795fbb29cfa3e7c</id>
<content type='text'>
Synchronous pernet_operations are not allowed anymore.
All are asynchronous. So, drop the structure member.

Signed-off-by: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>netns: send uevent messages</title>
<updated>2018-03-22T15:16:43Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2018-03-19T12:17:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=692ec06d7c92af8ca841a6367648b9b3045344fd'/>
<id>urn:sha1:692ec06d7c92af8ca841a6367648b9b3045344fd</id>
<content type='text'>
This patch adds a receive method to NETLINK_KOBJECT_UEVENT netlink sockets
to allow sending uevent messages into the network namespace the socket
belongs to.

Currently non-initial network namespaces are already isolated and don't
receive uevents. There are a number of cases where it is beneficial for a
sufficiently privileged userspace process to send a uevent into a network
namespace.

One such use case would be debugging and fuzzing of a piece of software
which listens and reacts to uevents. By running a copy of that software
inside a network namespace, specific uevents could then be presented to it.
More concretely, this would allow for easy testing of udevd/ueventd.

This will also allow some piece of software to run components inside a
separate network namespace and then effectively filter what that software
can receive. Some examples of software that do directly listen to uevents
and that we have in the past attempted to run inside a network namespace
are rbd (CEPH client) or the X server.

Implementation:
The implementation has been kept as simple as possible from the kernel's
perspective. Specifically, a simple input method uevent_net_rcv() is added
to NETLINK_KOBJECT_UEVENT sockets which completely reuses existing
af_netlink infrastructure and does neither add an additional netlink family
nor requires any user-visible changes.

For example, by using netlink_rcv_skb() we can make use of existing netlink
infrastructure to report back informative error messages to userspace.

Furthermore, this implementation does not introduce any overhead for
existing uevent generating codepaths. The struct netns got a new uevent
socket member that records the uevent socket associated with that network
namespace including its position in the uevent socket list. Since we record
the uevent socket for each network namespace in struct net we don't have to
walk the whole uevent socket list. Instead we can directly retrieve the
relevant uevent socket and send the message. At exit time we can now also
trivially remove the uevent socket from the uevent socket list. This keeps
the codepath very performant without introducing needless overhead and even
makes older codepaths faster.

Uevent sequence numbers are kept global. When a uevent message is sent to
another network namespace the implementation will simply increment the
global uevent sequence number and append it to the received uevent. This
has the advantage that the kernel will never need to parse the received
uevent message to replace any existing uevent sequence numbers. Instead it
is up to the userspace process to remove any existing uevent sequence
numbers in case the uevent message to be sent contains any.

Security:
In order for a caller to send uevent messages to a target network namespace
the caller must have CAP_SYS_ADMIN in the owning user namespace of the
target network namespace. Additionally, any received uevent message is
verified to not exceed size UEVENT_BUFFER_SIZE. This includes the space
needed to append the uevent sequence number.

Testing:
This patch has been tested and verified to work with the following udev
implementations:
1. CentOS 6 with udevd version 147
2. Debian Sid with systemd-udevd version 237
3. Android 7.1.1 with ueventd

Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: add uevent socket member</title>
<updated>2018-03-22T15:16:42Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2018-03-19T12:17:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94e5e3087a67c765be98592b36d8d187566478d5'/>
<id>urn:sha1:94e5e3087a67c765be98592b36d8d187566478d5</id>
<content type='text'>
This commit adds struct uevent_sock to struct net. Since struct uevent_sock
records the position of the uevent socket in the uevent socket list we can
trivially remove it from the uevent socket list during cleanup. This speeds
up the old removal codepath.
Note, list_del() will hit __list_del_entry_valid() in its call chain which
will validate that the element is a member of the list. If it isn't it will
take care that the list is not modified.

Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
