<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/hashtab.c, branch v5.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-08-06T23:39:22Z</updated>
<entry>
<title>bpf: Fix integer overflow involving bucket_size</title>
<updated>2021-08-06T23:39:22Z</updated>
<author>
<name>Tatsuhiko Yasumatsu</name>
<email>th.yasumatsu@gmail.com</email>
</author>
<published>2021-08-06T15:04:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c4eb1f403243fc7bbb7de644db8587c03de36da6'/>
<id>urn:sha1:c4eb1f403243fc7bbb7de644db8587c03de36da6</id>
<content type='text'>
In __htab_map_lookup_and_delete_batch(), hash buckets are iterated
over to count the number of elements in each bucket (bucket_size).
If bucket_size is large enough, the multiplication to calculate
kvmalloc() size could overflow, resulting in out-of-bounds write
as reported by KASAN:

  [...]
  [  104.986052] BUG: KASAN: vmalloc-out-of-bounds in __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.986489] Write of size 4194224 at addr ffffc9010503be70 by task crash/112
  [  104.986889]
  [  104.987193] CPU: 0 PID: 112 Comm: crash Not tainted 5.14.0-rc4 #13
  [  104.987552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
  [  104.988104] Call Trace:
  [  104.988410]  dump_stack_lvl+0x34/0x44
  [  104.988706]  print_address_description.constprop.0+0x21/0x140
  [  104.988991]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.989327]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.989622]  kasan_report.cold+0x7f/0x11b
  [  104.989881]  ? __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.990239]  kasan_check_range+0x17c/0x1e0
  [  104.990467]  memcpy+0x39/0x60
  [  104.990670]  __htab_map_lookup_and_delete_batch+0x5ce/0xb60
  [  104.990982]  ? __wake_up_common+0x4d/0x230
  [  104.991256]  ? htab_of_map_free+0x130/0x130
  [  104.991541]  bpf_map_do_batch+0x1fb/0x220
  [...]

In hashtable, if the elements' keys have the same jhash() value, the
elements will be put into the same bucket. By putting a lot of elements
into a single bucket, the value of bucket_size can be increased to
trigger the integer overflow.

Triggering the overflow is possible for both callers with CAP_SYS_ADMIN
and callers without CAP_SYS_ADMIN.

It will be trivial for a caller with CAP_SYS_ADMIN to intentionally
reach this overflow by enabling BPF_F_ZERO_SEED. As this flag will set
the random seed passed to jhash() to 0, it will be easy for the caller
to prepare keys which will be hashed into the same value, and thus put
all the elements into the same bucket.

If the caller does not have CAP_SYS_ADMIN, BPF_F_ZERO_SEED cannot be
used. However, it will be still technically possible to trigger the
overflow, by guessing the random seed value passed to jhash() (32bit)
and repeating the attempt to trigger the overflow. In this case,
the probability to trigger the overflow will be low and will take
a very long time.

Fix the integer overflow by calling kvmalloc_array() instead of
kvmalloc() to allocate memory.

Fixes: 057996380a42 ("bpf: Add batch ops to all htab bpf map")
Signed-off-by: Tatsuhiko Yasumatsu &lt;th.yasumatsu@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20210806150419.109658-1-th.yasumatsu@gmail.com
</content>
</entry>
<entry>
<title>bpf: Allow RCU-protected lookups to happen from bh context</title>
<updated>2021-06-24T17:41:15Z</updated>
<author>
<name>Toke Høiland-Jørgensen</name>
<email>toke@redhat.com</email>
</author>
<published>2021-06-24T16:05:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=694cea395fded425008e93cd90cfdf7a451674af'/>
<id>urn:sha1:694cea395fded425008e93cd90cfdf7a451674af</id>
<content type='text'>
XDP programs are called from a NAPI poll context, which means the RCU
reference liveness is ensured by local_bh_disable(). Add
rcu_read_lock_bh_held() as a condition to the RCU checks for map lookups so
lockdep understands that the dereferences are safe from inside *either* an
rcu_read_lock() section *or* a local_bh_disable() section. While both
bh_disabled and rcu_read_lock() provide RCU protection, they are
semantically distinct, so we need both conditions to prevent lockdep
complaints.

This change is done in preparation for removing the redundant
rcu_read_lock()s from drivers.

Signed-off-by: Toke Høiland-Jørgensen &lt;toke@redhat.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Link: https://lore.kernel.org/bpf/20210624160609.292325-5-toke@redhat.com
</content>
</entry>
<entry>
<title>bpf: Fix spelling mistakes</title>
<updated>2021-05-25T04:13:05Z</updated>
<author>
<name>Zhen Lei</name>
<email>thunder.leizhen@huawei.com</email>
</author>
<published>2021-05-25T02:56:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8fb33b6055300a23f26868680c22a5726834785e'/>
<id>urn:sha1:8fb33b6055300a23f26868680c22a5726834785e</id>
<content type='text'>
Fix some spelling mistakes in comments:
aother ==&gt; another
Netiher ==&gt; Neither
desribe ==&gt; describe
intializing ==&gt; initializing
funciton ==&gt; function
wont ==&gt; won't and move the word 'the' at the end to the next line
accross ==&gt; across
pathes ==&gt; paths
triggerred ==&gt; triggered
excute ==&gt; execute
ether ==&gt; either
conervative ==&gt; conservative
convetion ==&gt; convention
markes ==&gt; marks
interpeter ==&gt; interpreter

Signed-off-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20210525025659.8898-2-thunder.leizhen@huawei.com
</content>
</entry>
<entry>
<title>bpf: Add lookup_and_delete_elem support to hashtab</title>
<updated>2021-05-24T20:30:26Z</updated>
<author>
<name>Denis Salopek</name>
<email>denis.salopek@sartura.hr</email>
</author>
<published>2021-05-11T21:00:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3e87f192b405960c0fe83e0925bd0dadf4f8cf43'/>
<id>urn:sha1:3e87f192b405960c0fe83e0925bd0dadf4f8cf43</id>
<content type='text'>
Extend the existing bpf_map_lookup_and_delete_elem() functionality to
hashtab map types, in addition to stacks and queues.
Create a new hashtab bpf_map_ops function that does lookup and deletion
of the element under the same bucket lock and add the created map_ops to
bpf.h.

Signed-off-by: Denis Salopek &lt;denis.salopek@sartura.hr&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/bpf/4d18480a3e990ffbf14751ddef0325eed3be2966.1620763117.git.denis.salopek@sartura.hr
</content>
</entry>
<entry>
<title>kernel/bpf/: Fix misspellings using codespell tool</title>
<updated>2021-03-16T19:22:20Z</updated>
<author>
<name>Liu xuzhi</name>
<email>liu.xuzhi@zte.com.cn</email>
</author>
<published>2021-03-11T12:31:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6bd45f2e78f31bde335f7720e570a07331031110'/>
<id>urn:sha1:6bd45f2e78f31bde335f7720e570a07331031110</id>
<content type='text'>
A typo is found out by codespell tool in 34th lines of hashtab.c:

$ codespell ./kernel/bpf/
./hashtab.c:34 : differrent ==&gt; different

Fix a typo found by codespell.

Signed-off-by: Liu xuzhi &lt;liu.xuzhi@zte.com.cn&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20210311123103.323589-1-liu.xuzhi@zte.com.cn
</content>
</entry>
<entry>
<title>bpf: Add hashtab support for bpf_for_each_map_elem() helper</title>
<updated>2021-02-26T21:23:52Z</updated>
<author>
<name>Yonghong Song</name>
<email>yhs@fb.com</email>
</author>
<published>2021-02-26T20:49:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=314ee05e2fc601a7bece14376547d2b7a04bab67'/>
<id>urn:sha1:314ee05e2fc601a7bece14376547d2b7a04bab67</id>
<content type='text'>
This patch added support for hashmap, percpu hashmap,
lru hashmap and percpu lru hashmap.

Signed-off-by: Yonghong Song &lt;yhs@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20210226204927.3885020-1-yhs@fb.com
</content>
</entry>
<entry>
<title>bpf: Allows per-cpu maps and map-in-map in sleepable programs</title>
<updated>2021-02-11T15:19:26Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2021-02-10T03:36:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=638e4b825d523bed7a55e776c153049fb7716466'/>
<id>urn:sha1:638e4b825d523bed7a55e776c153049fb7716466</id>
<content type='text'>
Since sleepable programs are now executing under migrate_disable
the per-cpu maps are safe to use.
The map-in-map were ok to use in sleepable from the time sleepable
progs were introduced.

Note that non-preallocated maps are still not safe, since there is
no rcu_read_lock yet in sleepable programs and dynamically allocated
map elements are relying on rcu protection. The sleepable programs
have rcu_read_lock_trace instead. That limitation will be addresses
in the future.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: KP Singh &lt;kpsingh@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20210210033634.62081-9-alexei.starovoitov@gmail.com
</content>
</entry>
<entry>
<title>bpf: Add schedule point in htab_init_buckets()</title>
<updated>2020-12-21T23:14:31Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2020-12-21T19:25:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e7e518053c267bb6be3799520d9f4a34c7264a2e'/>
<id>urn:sha1:e7e518053c267bb6be3799520d9f4a34c7264a2e</id>
<content type='text'>
We noticed that with a LOCKDEP enabled kernel,
allocating a hash table with 65536 buckets would
use more than 60ms.

htab_init_buckets() runs from process context,
it is safe to schedule to avoid latency spikes.

Fixes: c50eb518e262 ("bpf: Use separate lockdep class for each hashtab")
Reported-by: John Sperbeck &lt;jsperbeck@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Song Liu &lt;songliubraving@fb.com&gt;
Link: https://lore.kernel.org/bpf/20201221192506.707584-1-eric.dumazet@gmail.com
</content>
</entry>
<entry>
<title>bpf: Propagate __user annotations properly</title>
<updated>2020-12-08T03:26:09Z</updated>
<author>
<name>Lukas Bulwahn</name>
<email>lukas.bulwahn@gmail.com</email>
</author>
<published>2020-12-07T12:37:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2f4b03195fe8ed3b1e213f4a6cfe14cfc109d829'/>
<id>urn:sha1:2f4b03195fe8ed3b1e213f4a6cfe14cfc109d829</id>
<content type='text'>
__htab_map_lookup_and_delete_batch() stores a user pointer in the local
variable ubatch and uses that in copy_{from,to}_user(), but ubatch misses a
__user annotation.

So, sparse warns in the various assignments and uses of ubatch:

  kernel/bpf/hashtab.c:1415:24: warning: incorrect type in initializer
    (different address spaces)
  kernel/bpf/hashtab.c:1415:24:    expected void *ubatch
  kernel/bpf/hashtab.c:1415:24:    got void [noderef] __user *

  kernel/bpf/hashtab.c:1444:46: warning: incorrect type in argument 2
    (different address spaces)
  kernel/bpf/hashtab.c:1444:46:    expected void const [noderef] __user *from
  kernel/bpf/hashtab.c:1444:46:    got void *ubatch

  kernel/bpf/hashtab.c:1608:16: warning: incorrect type in assignment
    (different address spaces)
  kernel/bpf/hashtab.c:1608:16:    expected void *ubatch
  kernel/bpf/hashtab.c:1608:16:    got void [noderef] __user *

  kernel/bpf/hashtab.c:1609:26: warning: incorrect type in argument 1
    (different address spaces)
  kernel/bpf/hashtab.c:1609:26:    expected void [noderef] __user *to
  kernel/bpf/hashtab.c:1609:26:    got void *ubatch

Add the __user annotation to repair this chain of propagating __user
annotations in __htab_map_lookup_and_delete_batch().

Signed-off-by: Lukas Bulwahn &lt;lukas.bulwahn@gmail.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/bpf/20201207123720.19111-1-lukas.bulwahn@gmail.com
</content>
</entry>
<entry>
<title>bpf: Avoid overflows involving hash elem_size</title>
<updated>2020-12-07T20:57:25Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2020-12-07T18:28:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e1868b9e36d0ca52e4e7c6c06953f191446e44df'/>
<id>urn:sha1:e1868b9e36d0ca52e4e7c6c06953f191446e44df</id>
<content type='text'>
Use of bpf_map_charge_init() was making sure hash tables would not use more
than 4GB of memory.

Since the implicit check disappeared, we have to be more careful
about overflows, to support big hash tables.

syzbot triggers a panic using :

bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_LRU_HASH, key_size=16384, value_size=8,
                     max_entries=262200, map_flags=0, inner_map_fd=-1, map_name="",
                     map_ifindex=0, btf_fd=-1, btf_key_type_id=0, btf_value_type_id=0,
                     btf_vmlinux_value_type_id=0}, 64) = ...

BUG: KASAN: vmalloc-out-of-bounds in bpf_percpu_lru_populate kernel/bpf/bpf_lru_list.c:594 [inline]
BUG: KASAN: vmalloc-out-of-bounds in bpf_lru_populate+0x4ef/0x5e0 kernel/bpf/bpf_lru_list.c:611
Write of size 2 at addr ffffc90017e4a020 by task syz-executor.5/19786

CPU: 0 PID: 19786 Comm: syz-executor.5 Not tainted 5.10.0-rc3-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:77 [inline]
 dump_stack+0x107/0x163 lib/dump_stack.c:118
 print_address_description.constprop.0.cold+0x5/0x4c8 mm/kasan/report.c:385
 __kasan_report mm/kasan/report.c:545 [inline]
 kasan_report.cold+0x1f/0x37 mm/kasan/report.c:562
 bpf_percpu_lru_populate kernel/bpf/bpf_lru_list.c:594 [inline]
 bpf_lru_populate+0x4ef/0x5e0 kernel/bpf/bpf_lru_list.c:611
 prealloc_init kernel/bpf/hashtab.c:319 [inline]
 htab_map_alloc+0xf6e/0x1230 kernel/bpf/hashtab.c:507
 find_and_alloc_map kernel/bpf/syscall.c:123 [inline]
 map_create kernel/bpf/syscall.c:829 [inline]
 __do_sys_bpf+0xa81/0x5170 kernel/bpf/syscall.c:4336
 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x45deb9
Code: 0d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 &lt;48&gt; 3d 01 f0 ff ff 0f 83 db b3 fb ff c3 66 2e 0f 1f 84 00 00 00 00
RSP: 002b:00007fd93fbc0c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 0000000000001a40 RCX: 000000000045deb9
RDX: 0000000000000040 RSI: 0000000020000280 RDI: 0000000000000000
RBP: 000000000119bf60 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 000000000119bf2c
R13: 00007ffc08a7be8f R14: 00007fd93fbc19c0 R15: 000000000119bf2c

Fixes: 755e5d55367a ("bpf: Eliminate rlimit-based memory accounting for hashtab maps")
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Roman Gushchin &lt;guro@fb.com&gt;
Link: https://lore.kernel.org/bpf/20201207182821.3940306-1-eric.dumazet@gmail.com
</content>
</entry>
</feed>
