<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/hashtab.c, branch v6.17</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.17</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.17'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-04-28T15:40:45Z</updated>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4</title>
<updated>2025-04-28T15:40:45Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2025-04-28T15:40:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=224ee86639f57818cf4e05bd86eb7d9f31baac8d'/>
<id>urn:sha1:224ee86639f57818cf4e05bd86eb7d9f31baac8d</id>
<content type='text'>
Cross-merge bpf and other fixes after downstream PRs.

No conflicts.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: fix possible endless loop in BPF map iteration</title>
<updated>2025-04-25T15:36:59Z</updated>
<author>
<name>Brandon Kammerdiener</name>
<email>brandon.kammerdiener@intel.com</email>
</author>
<published>2025-04-24T15:32:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=75673fda0c557ae26078177dd14d4857afbf128d'/>
<id>urn:sha1:75673fda0c557ae26078177dd14d4857afbf128d</id>
<content type='text'>
The _safe variant used here gets the next element before running the callback,
avoiding the endless loop condition.
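
As an illustration only, the following user-space sketch models a bucket
as a plain singly linked list (not the kernel's hlist_nulls API) and
assumes a callback that moves the visited element to the head of the
list. That scenario and the names struct node, move_to_front(),
walk_unsafe() and walk_safe() are hypothetical and not taken from the
patch. Under that assumption, a walk that reads the next pointer after
the callback keeps bouncing between the first two elements, while a walk
that saves the next pointer first visits each element once.

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked list; not the kernel's hlist_nulls API. */
struct node {
	struct node *next;
	int key;
};

struct list {
	struct node *head;
};

/* Callback that unlinks n and reinserts it at the head of the list. */
static void move_to_front(struct list *l, struct node *n)
{
	struct node **pp = &l->head;

	while (*pp != n)
		pp = &(*pp)->next;
	*pp = n->next;		/* unlink */
	n->next = l->head;	/* reinsert at head */
	l->head = n;
}

/* Reads n->next after the callback; returns visits, capped at limit. */
static int walk_unsafe(struct list *l, int limit)
{
	int visits = 0;
	struct node *n;

	for (n = l->head; n != NULL && visits < limit; n = n->next) {
		visits++;
		move_to_front(l, n);
	}
	return visits;
}

/* Saves the successor before the callback runs. */
static int walk_safe(struct list *l, int limit)
{
	int visits = 0;
	struct node *n, *next;

	for (n = l->head; n != NULL && visits < limit; n = next) {
		next = n->next;
		visits++;
		move_to_front(l, n);
	}
	return visits;
}

/* Builds a three-node list 0 -> 1 -> 2 in caller-provided storage. */
static void build(struct list *l, struct node nodes[3])
{
	for (int i = 0; i < 3; i++) {
		nodes[i].key = i;
		nodes[i].next = (i + 1 < 3) ? &nodes[i + 1] : NULL;
	}
	l->head = &nodes[0];
}
```
]]>

On the three-element list from build(), walk_safe() visits three
elements, whereas walk_unsafe() stops only because of the visit limit.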

Signed-off-by: Brandon Kammerdiener &lt;brandon.kammerdiener@intel.com&gt;
Link: https://lore.kernel.org/r/20250424153246.141677-2-brandon.kammerdiener@intel.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Hou Tao &lt;houtao1@huawei.com&gt;
</content>
</entry>
<entry>
<title>bpf: Don't allocate per-cpu extra_elems for fd htab</title>
<updated>2025-04-10T03:12:54Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2025-04-01T06:22:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6704b1e8cfc5eed264065735fe00a1dd8a0bffef'/>
<id>urn:sha1:6704b1e8cfc5eed264065735fe00a1dd8a0bffef</id>
<content type='text'>
Elements in fd htab are now updated in place, so there is no need to
allocate per-cpu extra_elems. Remove the allocation.

Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20250401062250.543403-6-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Add is_fd_htab() helper</title>
<updated>2025-04-10T03:12:53Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2025-04-01T06:22:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e8a65856c75d518d0bb15f38c90a4fd264ba1d3a'/>
<id>urn:sha1:e8a65856c75d518d0bb15f38c90a4fd264ba1d3a</id>
<content type='text'>
Add is_fd_htab() helper to check whether the map is an htab of maps.

Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20250401062250.543403-5-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Support atomic update for htab of maps</title>
<updated>2025-04-10T03:12:53Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2025-04-01T06:22:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2c304172e03193bd02363ee8969444261f7b7a57'/>
<id>urn:sha1:2c304172e03193bd02363ee8969444261f7b7a57</id>
<content type='text'>
As reported by Cody Haas [1], when a map lookup runs concurrently with a
map update of an existing element in an htab of maps, the map lookup
procedure may return -ENOENT unexpectedly.

The root cause is twofold:

1) The update of an existing element involves two separate list operations.
In htab_map_update_elem(), it first inserts the new element at the head
of the list, then it deletes the old element. Therefore, it is possible
that a lookup operation has already iterated to the middle of the list
when a concurrent update operation begins, and the lookup operation will
fail to find the target element.

2) The immediate reuse of the htab element.
This is more subtle. Even though the lookup operation finds the old
element, it is possible that the target element has been removed by a
concurrent update operation, and the element has been reused immediately
by another update operation that runs on the same CPU as the previous
update operation, and the element is inserted into the same bucket list.
After these steps, when the lookup operation tries to compare the
key in the old element with the expected key, the match will fail
because the key in the old element has been overwritten by the other
update operation.

The two-step update process is relatively straightforward to address.
The more challenging aspect is the immediate reuse. As Alexei pointed
out:

 So since 2022 both prealloc and no_prealloc reuse elements.
 We can consider a new flag for the hash map like F_REUSE_AFTER_RCU_GP
 that will use _rcu() flavor of freeing into bpf_ma,
 but it has to have a strong reason.

Given that htab of maps doesn't support special fields in the value and
directly stores the inner map pointer in htab_element, just do in-place
update for htab of maps instead of attempting to address the immediate
reuse issue.
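
The first root cause can be replayed deterministically in user space.
The sketch below is a simplified model, not the kernel code: struct elem,
lookup_from(), update_two_step(), update_in_place() and
reader_finds_key3() are hypothetical names, there is no RCU or locking,
and the reader's position is fixed by hand. It parks a reader in the
middle of a bucket, updates the last element, and then resumes the
reader.

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bucket element; the kernel's struct htab_elem differs. */
struct elem {
	struct elem *next;
	int key;
	void *value;	/* stands in for the inner map pointer */
};

/* Resumes a lookup from cursor, as a reader already past the head would. */
static struct elem *lookup_from(struct elem *cursor, int key)
{
	for (; cursor != NULL; cursor = cursor->next)
		if (cursor->key == key)
			return cursor;
	return NULL;
}

/* Two-step update: insert fresh at the head, then unlink old. */
static void update_two_step(struct elem **head, struct elem *old,
			    struct elem *fresh)
{
	struct elem **pp;

	fresh->next = *head;
	*head = fresh;
	for (pp = &fresh->next; *pp != old; pp = &(*pp)->next)
		;
	*pp = old->next;
}

/* In-place update: the element stays linked; only its value changes. */
static void update_in_place(struct elem *e, void *value)
{
	e->value = value;
}

/*
 * Builds the bucket 1 -> 2 -> 3, parks a reader at key 2, updates key 3,
 * then resumes the reader. Returns 1 if the reader still finds key 3.
 */
static int reader_finds_key3(int in_place)
{
	int v_old = 0, v_new = 1;
	struct elem c = { NULL, 3, &v_old };
	struct elem b = { &c, 2, NULL };
	struct elem a = { &b, 1, NULL };
	struct elem fresh = { NULL, 3, &v_new };
	struct elem *head = &a;
	struct elem *cursor = &b;	/* reader has walked past the head */

	if (in_place)
		update_in_place(&c, &v_new);
	else
		update_two_step(&head, &c, &fresh);

	return lookup_from(cursor, 3) != NULL;
}
```
]]>

In this model the parked reader misses key 3 after the two-step update,
because the new element sits at the head behind it and the old one has
been unlinked ahead of it. After the in-place update it still finds the
element.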

[1]: https://lore.kernel.org/xdp-newbies/CAH7f-ULFTwKdoH_t2SFc5rWCVYLEg-14d1fBYWH2eekudsnTRg@mail.gmail.com/

Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20250401062250.543403-4-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Rename __htab_percpu_map_update_elem to htab_map_update_elem_in_place</title>
<updated>2025-04-10T03:12:53Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2025-04-01T06:22:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5771e306b6cd8ce5b9935d006765f887f145e6d5'/>
<id>urn:sha1:5771e306b6cd8ce5b9935d006765f887f145e6d5</id>
<content type='text'>
Rename __htab_percpu_map_update_elem to htab_map_update_elem_in_place,
and add a new percpu argument for the helper to support in-place update
for both per-cpu htab and htab of maps.

Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20250401062250.543403-3-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Factor out htab_elem_value helper()</title>
<updated>2025-04-10T03:12:53Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2025-04-01T06:22:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ba2b31b0f39fca12abbd21c53a92838bbc026023'/>
<id>urn:sha1:ba2b31b0f39fca12abbd21c53a92838bbc026023</id>
<content type='text'>
All hash maps store map key and map value together. The relative offset
of the map value compared to the map key is round_up(key_size, 8).
Therefore, factor out a common helper htab_elem_value() to calculate the
address of the map value instead of duplicating the logic.
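
A minimal user-space sketch of the offset calculation follows. It is not
the kernel helper: struct elem is a reduced, hypothetical layout with a
single placeholder header field, and round_up8() and elem_value() are
illustrative names standing in for round_up(key_size, 8) and the value
lookup.

<![CDATA[
```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds n up to the next multiple of 8; stand-in for round_up(n, 8). */
static size_t round_up8(size_t n)
{
	return (n + 7) & ~(size_t)7;
}

/* Reduced, hypothetical element layout: header, then key, pad, value. */
struct elem {
	uint64_t hash;		/* placeholder header field */
	unsigned char key[];	/* key bytes, padding, then the value */
};

/* The value starts round_up8(key_size) bytes after the key. */
static void *elem_value(struct elem *e, size_t key_size)
{
	return e->key + round_up8(key_size);
}
```
]]>

With this layout, a 4-byte key and an 8-byte key both place the value 8
bytes after the start of the key, and a 9-byte key places it 16 bytes
after.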

Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Link: https://lore.kernel.org/r/20250401062250.543403-2-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Convert hashtab.c to rqspinlock</title>
<updated>2025-03-19T15:03:05Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2025-03-16T04:05:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4fa8d68aa53e6d76f66f3ed21e06c52cf8912074'/>
<id>urn:sha1:4fa8d68aa53e6d76f66f3ed21e06c52cf8912074</id>
<content type='text'>
Convert hashtab.c from raw_spinlock to rqspinlock, and drop the hashed
per-cpu counter crud from the code base which is no longer necessary.

Closes: https://lore.kernel.org/bpf/675302fd.050a0220.2477f.0004.GAE@google.com
Closes: https://lore.kernel.org/bpf/000000000000b3e63e061eed3f6b@google.com
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20250316040541.108729-20-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Check map-&gt;record at the beginning of check_and_free_fields()</title>
<updated>2025-03-15T19:06:50Z</updated>
<author>
<name>Hou Tao</name>
<email>houtao1@huawei.com</email>
</author>
<published>2025-03-15T15:09:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bb2243f4328bc2e4aa4d8566a5a0a7f9ce947570'/>
<id>urn:sha1:bb2243f4328bc2e4aa4d8566a5a0a7f9ce947570</id>
<content type='text'>
When there are no special fields in the map value, there is no need to
invoke bpf_obj_free_fields(). Therefore, check the validity of
map-&gt;record in advance.

After the change, the benchmark result of the per-cpu update case in
map_perf_test increased by 40% under a 16-CPU VM.

Signed-off-by: Hou Tao &lt;houtao1@huawei.com&gt;
Acked-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20250315150930.1511727-1-houtao@huaweicloud.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Fix kmemleak warning for percpu hashmap</title>
<updated>2025-02-24T20:11:00Z</updated>
<author>
<name>Yonghong Song</name>
<email>yonghong.song@linux.dev</email>
</author>
<published>2025-02-24T17:55:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=11ba7ce076e5903e7bdc1fd1498979c331b3c286'/>
<id>urn:sha1:11ba7ce076e5903e7bdc1fd1498979c331b3c286</id>
<content type='text'>
Vlad Poenaru reported the following kmemleak issue:

  unreferenced object 0x606fd7c44ac8 (size 32):
    backtrace (crc 0):
      pcpu_alloc_noprof+0x730/0xeb0
      bpf_map_alloc_percpu+0x69/0xc0
      prealloc_init+0x9d/0x1b0
      htab_map_alloc+0x363/0x510
      map_create+0x215/0x3a0
      __sys_bpf+0x16b/0x3e0
      __x64_sys_bpf+0x18/0x20
      do_syscall_64+0x7b/0x150
      entry_SYSCALL_64_after_hwframe+0x4b/0x53

Further investigation shows that the cause is a store of the percpu
pointer that is not 8-byte aligned in htab_elem_set_ptr():
  *(void __percpu **)(l-&gt;key + key_size) = pptr;

Note that the whole htab_elem alignment is 8 (for x86_64). If the key_size
is 4, pptr is stored at a location that is 4-byte aligned but
not 8-byte aligned. In mm/kmemleak.c, scan_block() scans memory with an
8-byte stride, so it does not detect the pptr above and therefore reports
a memory leak.

In htab_map_alloc(), we already have

        htab-&gt;elem_size = sizeof(struct htab_elem) +
                          round_up(htab-&gt;map.key_size, 8);
        if (percpu)
                htab-&gt;elem_size += sizeof(void *);
        else
                htab-&gt;elem_size += round_up(htab-&gt;map.value_size, 8);

So storing pptr with 8-byte alignment won't cause any problem and fixes
the kmemleak warning too.
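
The effect of the stride can be shown with a toy scanner. This is a
simplified model and not the code in mm/kmemleak.c: scan_finds() and
store_and_scan() are hypothetical names, and the scanner only compares
each 8-byte-aligned word of a small buffer against one target value.

<![CDATA[
```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy scanner: checks each 8-byte-aligned word for the target value. */
static int scan_finds(const unsigned char *buf, size_t len, uint64_t target)
{
	size_t off;

	for (off = 0; off + sizeof(uint64_t) <= len; off += sizeof(uint64_t)) {
		uint64_t word;

		memcpy(&word, buf + off, sizeof(word));
		if (word == target)
			return 1;
	}
	return 0;
}

/* Stores target at byte offset off in a zeroed buffer, then scans it. */
static int store_and_scan(size_t off, uint64_t target)
{
	unsigned char buf[32] = { 0 };

	memcpy(buf + off, &target, sizeof(target));
	return scan_finds(buf, sizeof(buf), target);
}
```
]]>

A value stored at offset 4, as with key_size 4 and no rounding, straddles
two scanned words and is not found. The same value stored at offset 8 is
found.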

The issue can be reproduced with bpf selftest as well:
  1. Enable the CONFIG_DEBUG_KMEMLEAK config option.
  2. Add a getchar() before skel destroy in test_hash_map() in prog_tests/for_each.c.
     The purpose is to keep the map alive so that kmemleak can detect the leak.
  3. Run './test_progs -t for_each/hash_map &amp;'; kmemleak should then report the leak.

Reported-by: Vlad Poenaru &lt;thevlad@meta.com&gt;
Signed-off-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Acked-by: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Link: https://lore.kernel.org/r/20250224175514.2207227-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
</feed>
