summaryrefslogtreecommitdiffstats
path: root/Documentation/bpf
AgeCommit message (Collapse)AuthorLines
2026-02-10Merge tag 'bpf-next-7.0' of ↵Linus Torvalds-120/+143
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Support associating BPF program with struct_ops (Amery Hung) - Switch BPF local storage to rqspinlock and remove recursion detection counters which were causing false positives (Amery Hung) - Fix live registers marking for indirect jumps (Anton Protopopov) - Introduce execution context detection BPF helpers (Changwoo Min) - Improve verifier precision for 32bit sign extension pattern (Cupertino Miranda) - Optimize BTF type lookup by sorting vmlinux BTF and doing binary search (Donglin Peng) - Allow states pruning for misc/invalid slots in iterator loops (Eduard Zingerman) - In preparation for ASAN support in BPF arenas teach libbpf to move global BPF variables to the end of the region and enable arena kfuncs while holding locks (Emil Tsalapatis) - Introduce support for implicit arguments in kfuncs and migrate a number of them to new API. This is a prerequisite for cgroup sub-schedulers in sched-ext (Ihor Solodrai) - Fix incorrect copied_seq calculation in sockmap (Jiayuan Chen) - Fix ORC stack unwind from kprobe_multi (Jiri Olsa) - Speed up fentry attach by using single ftrace direct ops in BPF trampolines (Jiri Olsa) - Require frozen map for calculating map hash (KP Singh) - Fix lock entry creation in TAS fallback in rqspinlock (Kumar Kartikeya Dwivedi) - Allow user space to select cpu in lookup/update operations on per-cpu array and hash maps (Leon Hwang) - Make kfuncs return trusted pointers by default (Matt Bobrowski) - Introduce "fsession" support where single BPF program is executed upon entry and exit from traced kernel function (Menglong Dong) - Allow bpf_timer and bpf_wq use in all programs types (Mykyta Yatsenko, Andrii Nakryiko, Kumar Kartikeya Dwivedi, Alexei Starovoitov) - Make KF_TRUSTED_ARGS the default for all kfuncs and clean up their definition across the tree (Puranjay Mohan) - Allow BPF arena calls from non-sleepable context (Puranjay Mohan) - Improve register id comparison logic in the verifier and extend linked registers with negative offsets (Puranjay Mohan) - In preparation for BPF-OOM introduce kfuncs to access memcg events (Roman Gushchin) - Use CFI compatible destructor kfunc type (Sami Tolvanen) - Add bitwise tracking for BPF_END in the verifier (Tianci Cao) - Add range tracking for BPF_DIV and BPF_MOD in the verifier (Yazhou Tang) - Make BPF selftests work with 64k page size (Yonghong Song) * tag 'bpf-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (268 commits) selftests/bpf: Fix outdated test on storage->smap selftests/bpf: Choose another percpu variable in bpf for btf_dump test selftests/bpf: Remove test_task_storage_map_stress_lookup selftests/bpf: Update task_local_storage/task_storage_nodeadlock test selftests/bpf: Update task_local_storage/recursion test selftests/bpf: Update sk_storage_omem_uncharge test bpf: Switch to bpf_selem_unlink_nofail in bpf_local_storage_{map_free, destroy} bpf: Support lockless unlink when freeing map or local storage bpf: Prepare for bpf_selem_unlink_nofail() bpf: Remove unused percpu counter from bpf_local_storage_map_free bpf: Remove cgroup local storage percpu counter bpf: Remove task local storage percpu counter bpf: Change local_storage->lock and b->lock to rqspinlock bpf: Convert bpf_selem_unlink to failable bpf: Convert bpf_selem_link_map to failable bpf: Convert bpf_selem_unlink_map to failable bpf: Select bpf_local_storage_map_bucket based on bpf_local_storage selftests/xsk: fix number of Tx frags in invalid packet selftests/xsk: properly handle batch ending in the middle of a packet bpf: Prevent reentrance into call_rcu_tasks_trace() ...
2026-01-23Documentation: use a source-read extension for the index link boilerplateJani Nikula-7/+0
The root document usually has a special :ref:`genindex` link to the generated index. This is also the case for Documentation/index.rst. The other index.rst files deeper in the directory hierarchy usually don't. For SPHINXDIRS builds, the root document isn't Documentation/index.rst, but some other index.rst in the hierarchy. Currently they have a ".. only::" block to add the index link when doing SPHINXDIRS html builds. This is obviously very tedious and repetitive. The link is also added to all index.rst files in the hierarchy for SPHINXDIRS builds, not just the root document. Put the boilerplate in a sphinx-includes/subproject-index.rst file, and include it at the end of the root document for subproject builds in an ad-hoc source-read extension defined in conf.py. For now, keep having the boilerplate in translations, because this approach currently doesn't cover translated index link headers. Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> [jc: did s/doctree/kern_doc_dir/ ] Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260123143149.2024303-1-jani.nikula@intel.com>
2026-01-20bpf,docs: Document KF_IMPLICIT_ARGS flagIhor Solodrai-17/+32
Add a section explaining KF_IMPLICIT_ARGS kfunc flag. Remove __prog arg annotation, as it is no longer supported. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Link: https://lore.kernel.org/r/20260120223027.3981805-1-ihor.solodrai@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-02bpf: Replace __opt annotation with __nullable for kfuncsPuranjay Mohan-11/+20
The __opt annotation was originally introduced specifically for buffer/size argument pairs in bpf_dynptr_slice() and bpf_dynptr_slice_rdwr(), allowing the buffer pointer to be NULL while still validating the size as a constant. The __nullable annotation serves the same purpose but is more general and is already used throughout the BPF subsystem for raw tracepoints, struct_ops, and other kfuncs. This patch unifies the two annotations by replacing __opt with __nullable. The key change is in the verifier's get_kfunc_ptr_arg_type() function, where mem/size pair detection is now performed before the nullable check. This ensures that buffer/size pairs are correctly classified as KF_ARG_PTR_TO_MEM_SIZE even when the buffer is nullable, while adding an !arg_mem_size condition to the nullable check prevents interference with mem/size pair handling. When processing KF_ARG_PTR_TO_MEM_SIZE arguments, the verifier now uses is_kfunc_arg_nullable() instead of the removed is_kfunc_arg_optional() to determine whether to skip size validation for NULL buffers. This is the first documentation added for the __nullable annotation, which has been in use since it was introduced but was previously undocumented. No functional changes to verifier behavior - nullable buffer/size pairs continue to work exactly as before. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102221513.1961781-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2026-01-02bpf: Make KF_TRUSTED_ARGS the default for all kfuncsPuranjay Mohan-93/+91
Change the verifier to make trusted args the default requirement for all kfuncs by removing is_kfunc_trusted_args() assuming it be to always return true. This works because: 1. Context pointers (xdp_md, __sk_buff, etc.) are handled through their own KF_ARG_PTR_TO_CTX case label and bypass the trusted check 2. Struct_ops callback arguments are already marked as PTR_TRUSTED during initialization and pass is_trusted_reg() 3. KF_RCU kfuncs are handled separately via is_kfunc_rcu() checks at call sites (always checked with || alongside is_kfunc_trusted_args) This simple change makes all kfuncs require trusted args by default while maintaining correct behavior for all existing special cases. Note: This change means kfuncs that previously accepted NULL pointers without KF_TRUSTED_ARGS will now reject NULL at verification time. Several netfilter kfuncs are affected: bpf_xdp_ct_lookup(), bpf_skb_ct_lookup(), bpf_xdp_ct_alloc(), and bpf_skb_ct_alloc() all accept NULL for their bpf_tuple and opts parameters internally (checked in __bpf_nf_ct_lookup), but after this change the verifier rejects NULL before the kfunc is even called. This is acceptable because these kfuncs don't work with NULL parameters in their proper usage. Now they will be rejected rather than returning an error, which shouldn't make a difference to BPF programs that were using these kfuncs properly. Acked-by: Eduard Zingerman <eddyz87@gmail.com> Reviewed-by: Emil Tsalapatis <emil@etsalapatis.com> Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Link: https://lore.kernel.org/r/20260102180038.2708325-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-12-31bpf: Update BPF_PROG_RUN documentationSungRock Jung-1/+2
LWT_SEG6LOCAL no longer supports test_run starting from v6.11 so remove it from the list of program types supported by BPF_PROG_RUN. Add TRACING and NETFILTER program types to reflect the current set of types that implement test_run. Signed-off-by: SungRock Jung <tjdfkr2421@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20251221070041.26592-1-tjdfkr2421@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-25docs: bpf: map_array: Specify BPF_MAP_TYPE_PERCPU_ARRAY value size limitAlex Tran-2/+3
Specify value size limit for BPF_MAP_TYPE_PERCPU_ARRAY which is PCPU_MIN_UNIT_SIZE (32 kb). In percpu allocator (mm: percpu), any request with a size greater than PCPU_MIN_UNIT_SIZE is rejected. Signed-off-by: Alex Tran <alex.t.tran@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251115063531.2302903-1-alex.t.tran@gmail.com
2025-11-04docs/bpf: Add missing BPF k/uprobe program types to docsDonald Hunter-0/+18
Update the table of program types in the libbpf docs with the missing k/uprobe multi and session program types. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251029180932.98038-1-donald.hunter@gmail.com
2025-09-19bpf: disable and remove registers chain based livenessEduard Zingerman-264/+0
Remove register chain based liveness tracking: - struct bpf_reg_state->{parent,live} fields are no longer needed; - REG_LIVE_WRITTEN marks are superseded by bpf_mark_stack_write() calls; - mark_reg_read() calls are superseded by bpf_mark_stack_read(); - log.c:print_liveness() is superseded by logging in liveness.c; - propagate_liveness() is superseded by bpf_update_live_stack(); - no need to establish register chains in is_state_visited() anymore; - fix a bunch of tests expecting "_w" suffixes in verifier log messages. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250918-callchain-sensitive-liveness-v3-9-c3cd27bacc60@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-18bpf: Enforce RCU protection for KF_RCU_PROTECTEDKumar Kartikeya Dwivedi-1/+18
Currently, KF_RCU_PROTECTED only applies to iterator APIs and that too in a convoluted fashion: the presence of this flag on the kfunc is used to set MEM_RCU in iterator type, and the lack of RCU protection results in an error only later, once next() or destroy() methods are invoked on the iterator. While there is no bug, this is certainly a bit unintuitive, and makes the enforcement of the flag iterator specific. In the interest of making this flag useful for other upcoming kfuncs, e.g. scx_bpf_cpu_curr() [0][1], add enforcement for invoking the kfunc in an RCU critical section in general. This would also mean that iterator APIs using KF_RCU_PROTECTED will error out earlier, instead of throwing an error for lack of RCU CS protection when next() or destroy() methods are invoked. In addition to this, if the kfuncs tagged KF_RCU_PROTECTED return a pointer value, ensure that this pointer value is only usable in an RCU critical section. There might be edge cases where the return value is special and doesn't need to imply MEM_RCU semantics, but in general, the assumption should hold for the majority of kfuncs, and we can revisit things if necessary later. [0]: https://lore.kernel.org/all/20250903212311.369697-3-christian.loehle@arm.com [1]: https://lore.kernel.org/all/20250909195709.92669-1-arighi@nvidia.com Tested-by: Andrea Righi <arighi@nvidia.com> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20250917032755.4068726-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-26Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc3Alexei Starovoitov-4/+10
Cross-merge BPF, perf and other fixes after downstream PRs. It restores BPF CI to green after critical fix commit bc4394e5e79c ("perf: Fix the throttle error of some clock events") No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-18bpf: Adjust free target to avoid global starvation of LRU mapWillem de Bruijn-4/+10
BPF_MAP_TYPE_LRU_HASH can recycle most recent elements well before the map is full, due to percpu reservations and force shrink before neighbor stealing. Once a CPU is unable to borrow from the global map, it will once steal one elem from a neighbor and after that each time flush this one element to the global list and immediately recycle it. Batch value LOCAL_FREE_TARGET (128) will exhaust a 10K element map with 79 CPUs. CPU 79 will observe this behavior even while its neighbors hold 78 * 127 + 1 * 15 == 9921 free elements (99%). CPUs need not be active concurrently. The issue can appear with affinity migration, e.g., irqbalance. Each CPU can reserve and then hold onto its 128 elements indefinitely. Avoid global list exhaustion by limiting aggregate percpu caches to half of map size, by adjusting LOCAL_FREE_TARGET based on cpu count. This change has no effect on sufficiently large tables. Similar to LOCAL_NR_SCANS and lru->nr_scans, introduce a map variable lru->free_target. The extra field fits in a hole in struct bpf_lru. The cacheline is already warm where read in the hot path. The field is only accessed with the lru lock held. Tested-by: Anton Protopopov <a.s.protopopov@gmail.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://lore.kernel.org/r/20250618215803.3587312-1-willemdebruijn.kernel@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-12docs/bpf: Default cpu version changed from v1 to v3 in llvm 20Yonghong Song-3/+4
The default cpu version is changed from v1 to v3 in llvm version 20. See [1] for more detailed reasoning. Update bpf_devel_QA.rst so developers can find such information easily. [1] https://github.com/llvm/llvm-project/pull/107008 Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250612043049.2411989-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-11bpf, doc: Improve wording of docsEslam Khafagy-2/+2
The phrase "dividing -1" is one I find confusing. E.g., "INT_MIN dividing -1" sounds like "-1 / INT_MIN" rather than the inverse. "divided by" instead of "dividing" assuming the inverse is meant. Signed-off-by: Eslam Khafagy <eslam.medhat1993@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250607222434.227890-1-eslam.medhat1993@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-06-06Documentation: Fix spelling mistake.Eslam Khafagy-2/+2
Fix typo "desination => destination" in file Documentation/bpf/standardization/instruction-set.rst Signed-off-by: Eslam Khafagy <eslam.medhat1993@gmail.com> Acked-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20250606100511.368450-1-eslam.medhat1993@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-13bpf: Add support for __prog argument suffix to pass in prog->auxKumar Kartikeya Dwivedi-0/+17
Instead of hardcoding the list of kfuncs that need prog->aux passed to them with a combination of fixup_kfunc_call adjustment + __ign suffix, combine both in __prog suffix, which ignores the argument passed in, and fixes it up to the prog->aux. This allows kfuncs to have the prog->aux passed into them without having to touch the verifier. Cc: Tejun Heo <tj@kernel.org> Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20250513142812.1021591-1-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-05-13docs: bpf: Fix bullet point formatting warningKhaled Elnaggar-5/+5
Fix indentation for a bullet list item in bpf_iterators.rst. According to reStructuredText rules, bullet list item bodies must be consistently indented relative to the bullet. The indentation of the first line after the bullet determines the alignment for the rest of the item body. Reported by smatch: /linux/Documentation/bpf/bpf_iterators.rst:55: WARNING: Bullet list ends without a blank line; unexpected unindent. [docutils] Fixes: 7220eabff8cb ("bpf, docs: document open-coded BPF iterators") Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250513015901.475207-1-khaledelnaggarlinux@gmail.com
2025-05-09bpf, docs: document open-coded BPF iteratorsAndrii Nakryiko-3/+110
Extract BPF open-coded iterators documentation spread out across a few original commit messages ([0], [1]) into a dedicated doc section under Documentation/bpf/bpf_iterators.rst. Also make explicit expectation that BPF iterator program type should be accompanied by a corresponding open-coded BPF iterator implementation, going forward. [0] https://lore.kernel.org/all/20230308184121.1165081-3-andrii@kernel.org/ [1] https://lore.kernel.org/all/20230308184121.1165081-4-andrii@kernel.org/ Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/r/20250509180350.2604946-1-andrii@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-28Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc4Alexei Starovoitov-0/+8
Cross-merge bpf and other fixes after downstream PRs. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-04-25bpf: Add namespace to BPF internal symbolsAlexei Starovoitov-0/+8
Add namespace to BPF internal symbols used by light skeleton to prevent abuse and document with the code their allowed usage. Fixes: b1d18a7574d0 ("bpf: Extend sys_bpf commands for bpf_syscall programs.") Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/bpf/20250425014542.62385-1-alexei.starovoitov@gmail.com
2025-04-24bpf, docs: Fix non-standard line breakWangYuli-2/+2
Even though the kernel's coding-style document does not explicitly state this, we generally put a newline after the semicolon of every C language statement to enhance code readability. Adjust the placement of newlines to adhere to this convention. Reported-by: Chen Linxuan <chenlinxuan@uniontech.com> Signed-off-by: WangYuli <wangyuli@uniontech.com> Reviewed-by: Yanteng Si <si.yanteng@linux.dev> Link: https://lore.kernel.org/r/DB66473733449DB0+20250423030632.17626-1-wangyuli@uniontech.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-03-15bpf, docs: Fix broken link to renamed bpf_iter_task_vmas.cT.J. Mercier-1/+1
This file was renamed from bpf_iter_task_vma.c. Fixes: 45b38941c81f ("selftests/bpf: Rename bpf_iter_task_vma.c to bpf_iter_task_vmas.c") Signed-off-by: T.J. Mercier <tjmercier@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250304204520.201115-1-tjmercier@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-26docs/bpf: Document some special sdiv/smod operationsYonghong Song-6/+14
Patch [1] fixed possible kernel crash due to specific sdiv/smod operations in bpf program. The following are related operations and the expected results of those operations: - LLONG_MIN/-1 = LLONG_MIN - INT_MIN/-1 = INT_MIN - LLONG_MIN%-1 = 0 - INT_MIN%-1 = 0 Those operations are replaced with codes which won't cause kernel crash. This patch documents what operations may cause exception and what replacement operations are. [1] https://lore.kernel.org/all/20240913150326.1187788-1-yonghong.song@linux.dev/ Signed-off-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20241107170924.2944681-1-yonghong.song@linux.dev Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-02-05docs/bpf: Document the semantics of BTF tags with kind_flagIhor Solodrai-4/+21
Explain the meaning of kind_flag in BTF type_tags and decl_tags. Update uapi btf.h kind_flag comment to reflect the changes. Signed-off-by: Ihor Solodrai <ihor.solodrai@linux.dev> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20250130201239.1429648-3-ihor.solodrai@linux.dev
2024-11-11bpf: Remove trailing whitespace in verifier.rstAbhinav Saxena-2/+2
Remove trailing whitespace in Documentation/bpf/verifier.rst. Signed-off-by: Abhinav Saxena <xandfury@gmail.com> Link: https://lore.kernel.org/r/20241107063708.106340-2-xandfury@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
2024-10-29docs/bpf: Add description of .BTF.base sectionAlan Maguire-1/+76
Now that .BTF.base sections are generated for out-of-tree kernel modules (provided pahole supports the "distilled_base" BTF feature), document .BTF.base and its role in supporting resilient split BTF and BTF relocation. Changes since v1: - updated formatting, corrected typo, used BTF ID[s] consistently (Andrii) Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20241028091543.2175967-1-alan.maguire@oracle.com
2024-09-12docs/bpf: Add missing BPF program types to docsDonald Hunter-4/+26
Update the table of program types in the libbpf documentation with the recently added program types. Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240912095944.6386-1-donald.hunter@gmail.com
2024-09-11docs/bpf: Add constant values for linkagesWill Hawkins-4/+35
Make the values of the symbolic constants that define the valid linkages for functions and variables explicit. Signed-off-by: Will Hawkins <hawkinsw@obs.cr> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/bpf/20240911055033.2084881-1-hawkinsw@obs.cr
2024-08-29docs/bpf: Fix a typo in verifier.rstYiming Xiang-1/+1
In verifier.rst, there is a typo in section 'Register parentage chains'. Caller saved registers are r0-r5, callee saved registers are r6-r9. Here by context it means callee saved registers rather than caller saved registers. This may confuse users. Signed-off-by: Yiming Xiang <kxiang@umich.edu> Link: https://lore.kernel.org/r/20240829031712.198489-1-kxiang@umich.edu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-23bpf, docs: Address comments from IETF Area DirectorsDave Thaler-35/+45
This patch does the following to address IETF feedback: * Remove mention of "program type" and reference future docs (and mention platform-specific docs exist) for helper functions and BTF. Addresses Roman Danyliw's comments based on GENART review from Ines Robles [0]. * Add reference for endianness as requested by John Scudder [1]. * Added bit numbers to top of 32-bit wide format diagrams as requested by Paul Wouters [2]. * Added more text about why BPF doesn't stand for anything, based on text from ebpf.io [3], as requested by Eric Vyncke and Gunter Van de Velde [4]. * Replaced "htobe16" (and similar) and the direction-specific description with just "be16" (and similar) and a direction-agnostic description, to match the direction-agnostic description in the Byteswap Instructions section. Based on feedback from Eric Vyncke [5]. [0] https://mailarchive.ietf.org/arch/msg/bpf/DvDgDWOiwk05OyNlWlAmELZFPlM/ [1] https://mailarchive.ietf.org/arch/msg/bpf/eKNXpU4jCLjsbZDSw8LjI29M3tM/ [2] https://mailarchive.ietf.org/arch/msg/bpf/hGk8HkYxeZTpdu9qW_MvbGKj7WU/ [3] https://ebpf.io/what-is-ebpf/#what-do-ebpf-and-bpf-stand-for [4] https://mailarchive.ietf.org/arch/msg/bpf/i93lzdN3ewnzzS_JMbinCIYxAIU/ [5] https://mailarchive.ietf.org/arch/msg/bpf/KBWXbMeDcSrq4vsKR_KkBbV6hI4/ Acked-by: David Vernet <void@manifault.com> Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Link: https://lore.kernel.org/r/20240623150453.10613-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-28libbpf: Configure log verbosity with env variableMykyta Yatsenko-0/+8
Configure logging verbosity by setting LIBBPF_LOG_LEVEL environment variable, which is applied only to default logger. Once user set their custom logging callback, it is up to them to handle filtering. Signed-off-by: Mykyta Yatsenko <yatsenko@meta.com> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20240524131840.114289-1-yatsenko@meta.com
2024-05-26bpf, docs: Fix instruction.rst indentationDave Thaler-13/+13
The table captions patch corrected indented most tables to work with the table directive for adding a caption but missed two of them. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20240526061815.22497-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25bpf, docs: Clarify call local offsetDave Thaler-3/+4
In the Jump instructions section it explains that the offset is "relative to the instruction following the jump instruction". But the program-local section confusingly said "referenced by offset from the call instruction, similar to JA". This patch updates that sentence with consistent wording, saying it's relative to the instruction following the call instruction. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20240525153332.21355-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25bpf, docs: Add table captionsDave Thaler-82/+102
As suggested by Ines Robles in his IETF GENART review at https://datatracker.ietf.org/doc/review-ietf-bpf-isa-02-genart-lc-robles-2024-05-16/ Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Link: https://lore.kernel.org/r/20240524164618.18894-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25bpf, docs: clarify sign extension of 64-bit use of 32-bit immDave Thaler-0/+17
imm is defined as a 32-bit signed integer. {MOV, K, ALU64} says it does "dst = src" (where src is 'imm') and it does do dst = (s64)imm, which in that sense does sign extend imm. The MOVSX instruction is explained as sign extending, so added the example of {MOV, K, ALU64} to make this more clear. {JLE, K, JMP} says it does "PC += offset if dst <= src" (where src is 'imm', and the comparison is unsigned). This was apparently ambiguous to some readers as to whether the comparison was "dst <= (u64)(u32)imm" or "dst <= (u64)(s64)imm" so added an example to make this more clear. v1 -> v2: Address comments from Yonghong Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240520215255.10595-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25bpf, docs: Use RFC 2119 language for ISA requirementsDave Thaler-8/+16
Per IETF convention and discussion at LSF/MM/BPF, use MUST etc. keywords as requested by IETF Area Director review. Also as requested, indicate that documenting BTF is out of scope of this document and will be covered by a separate IETF specification. Added paragraph about the terminology that is required IETF boilerplate and must be worded exactly as such. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240517165855.4688-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-25bpf, docs: Move sentence about returning R0 to abi.rstDave Thaler-3/+3
As discussed at LSF/MM/BPF, the sentence about using R0 for returning values from calls is part of the calling convention and belongs in abi.rst. Any further additions or clarifications to this text are left for future patches on abi.rst. The current patch is simply to unblock progression of instruction-set.rst to a standard. In contrast, the restriction of register numbers to the range 0-10 is untouched, left in the instruction-set.rst definition of the src_reg and dst_reg fields. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Link: https://lore.kernel.org/r/20240517153445.3914-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-05-15bpf, docs: Fix the description of 'src' in ALU instructionsPuranjay Mohan-2/+3
An ALU instruction's source operand can be the value in the source register or the 32-bit immediate value encoded in the instruction. This is controlled by the 's' bit of the 'opcode'. The current description explicitly uses the phrase 'value of the source register' when defining the meaning of 'src'. Change the description to use 'source operand' in place of 'value of the source register'. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Dave Thaler <dthaler1968@gmail.com> Link: https://lore.kernel.org/r/20240514130303.113607-1-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-29bpf, docs: Clarify PC use in instruction-set.rstDave Thaler-0/+6
This patch elaborates on the use of PC by expanding the PC acronym, explaining the units, and the relative position to which the offset applies. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240426231126.5130-1-dthaler1968@gmail.com
2024-04-25bpf, docs: Add introduction for use in the ISA Internet DraftDave Thaler-1/+5
The proposed intro paragraph text is derived from the first paragraph of the IETF BPF WG charter at https://datatracker.ietf.org/wg/bpf/about/ Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240422190942.24658-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-21bpf, docs: Fix formatting nit in instruction-set.rstDave Thaler-1/+1
Other places that had pseudocode were prefixed with :: so as to appear in a literal block, but one place was inconsistent. This patch fixes that inconsistency. Signed-off-by: Dave Thaler <dthaler1968@googlemail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240419213826.7301-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-21bpf, docs: Clarify helper ID and pointer terms in instruction-set.rstDave Thaler-24/+24
Per IETF 119 meeting discussion and mailing list discussion at https://mailarchive.ietf.org/arch/msg/bpf/2JwWQwFdOeMGv0VTbD0CKWwAOEA/ the following changes are made. First, say call by "static ID" rather than call by "address" Second, change "pointer" to "address" Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240419203617.6850-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-05bpf, docs: Editorial nits in instruction-set.rstDave Thaler-21/+26
This patch addresses a number of editorial nits including spelling, punctuation, grammar, and wording consistency issues in instruction-set.rst. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240405155245.3618-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-03-04bpf, docs: Rename legacy conformance group to packetDave Thaler-2/+2
There could be other legacy conformance groups in the future, so use a more descriptive name. The status of the conformance group in the IANA registry is what designates it as legacy, not the name of the group. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Link: https://lore.kernel.org/r/20240302012229.16452-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
2024-03-02bpf, docs: Use IETF format for field definitions in instruction-set.rstDave Thaler-241/+290
In preparation for publication as an IETF RFC, the WG chairs asked me to convert the document to use IETF packet format for field layout, so this patch attempts to make it consistent with other IETF documents. Some fields that are not byte aligned were previously inconsistent in how values were defined. Some were defined as the value of the byte containing the field (like 0x20 for a field holding the high four bits of the byte), and others were defined as the value of the field itself (like 0x2). This PR makes them be consistent in using just the values of the field itself, which is IETF convention. As a result, some of the defines that used BPF_* would no longer match the value in the spec, and so this patch also drops the BPF_* prefix to avoid confusion with the defines that are the full-byte equivalent values. For consistency, BPF_* is then dropped from other fields too. BPF_<foo> is thus the Linux implementation-specific define for <foo> as it appears in the BPF ISA specification. The syntax BPF_ADD | BPF_X | BPF_ALU only worked for full-byte values so the convention {ADD, X, ALU} is proposed for referring to field values instead. Also replace the redundant "LSB bits" with "least significant bits". A preview of what the resulting Internet Draft would look like can be seen at: https://htmlpreview.github.io/?https://raw.githubusercontent.com/dthaler/ebp f-docs-1/format/draft-ietf-bpf-isa.html v1->v2: Fix sphinx issue as recommended by David Vernet Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240301222337.15931-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-29bpf: Replace bpf_lpm_trie_key 0-length array with flexible arrayKees Cook-1/+1
Replace deprecated 0-length array in struct bpf_lpm_trie_key with flexible array. Found with GCC 13: ../kernel/bpf/lpm_trie.c:207:51: warning: array subscript i is outside array bounds of 'const __u8[0]' {aka 'const unsigned char[]'} [-Warray-bounds=] 207 | *(__be16 *)&key->data[i]); | ^~~~~~~~~~~~~ ../include/uapi/linux/swab.h:102:54: note: in definition of macro '__swab16' 102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x)) | ^ ../include/linux/byteorder/generic.h:97:21: note: in expansion of macro '__be16_to_cpu' 97 | #define be16_to_cpu __be16_to_cpu | ^~~~~~~~~~~~~ ../kernel/bpf/lpm_trie.c:206:28: note: in expansion of macro 'be16_to_cpu' 206 | u16 diff = be16_to_cpu(*(__be16 *)&node->data[i] ^ | ^~~~~~~~~~~ In file included from ../include/linux/bpf.h:7: ../include/uapi/linux/bpf.h:82:17: note: while referencing 'data' 82 | __u8 data[0]; /* Arbitrary size */ | ^~~~ And found at run-time under CONFIG_FORTIFY_SOURCE: UBSAN: array-index-out-of-bounds in kernel/bpf/lpm_trie.c:218:49 index 0 is out of range for type '__u8 [*]' Changing struct bpf_lpm_trie_key is difficult since has been used by userspace. For example, in Cilium: struct egress_gw_policy_key { struct bpf_lpm_trie_key lpm_key; __u32 saddr; __u32 daddr; }; While direct references to the "data" member haven't been found, there are static initializers what include the final member. For example, the "{}" here: struct egress_gw_policy_key in_key = { .lpm_key = { 32 + 24, {} }, .saddr = CLIENT_IP, .daddr = EXTERNAL_SVC_IP & 0Xffffff, }; To avoid the build time and run time warnings seen with a 0-sized trailing array for struct bpf_lpm_trie_key, introduce a new struct that correctly uses a flexible array for the trailing bytes, struct bpf_lpm_trie_key_u8. As part of this, include the "header" portion (which is just the "prefixlen" member), so it can be used by anything building a bpf_lpr_trie_key that has trailing members that aren't a u8 flexible array (like the self-test[1]), which is named struct bpf_lpm_trie_key_hdr. Unfortunately, C++ refuses to parse the __struct_group() helper, so it is not possible to define struct bpf_lpm_trie_key_hdr directly in struct bpf_lpm_trie_key_u8, so we must open-code the union directly. Adjust the kernel code to use struct bpf_lpm_trie_key_u8 through-out, and for the selftest to use struct bpf_lpm_trie_key_hdr. Add a comment to the UAPI header directing folks to the two new options. Reported-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org> Closes: https://paste.debian.net/hidden/ca500597/ Link: https://lore.kernel.org/all/202206281009.4332AA33@keescook/ [1] Link: https://lore.kernel.org/bpf/20240222155612.it.533-kees@kernel.org
2024-02-22bpf, docs: specify which BPF_ABS and BPF_IND fields were zeroDave Thaler-3/+4
Specifying which fields were unused allows IANA to only list as deprecated instructions that were actually used, leaving the rest as unassigned and possibly available for future use for something else. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240221175419.16843-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-22bpf, docs: Fix typos in instruction-set.rstDave Thaler-37/+37
* "BPF ADD" should be "BPF_ADD". * "src" should be "src_reg" in several places. The latter is the field name in the instruction. The former refers to the value of the register, or the immediate. * Add '' around field names in one sentence, for consistency with the rest of the document. Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240221173535.16601-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-02-13bpf, docs: Update ISA document titleDave Thaler-4/+4
* Use "Instruction Set Architecture (ISA)" instead of "Instruction Set Specification" * Remove version number As previously discussed on the mailing list at https://mailarchive.ietf.org/arch/msg/bpf/SEpn3OL9TabNRn-4rDX9A6XVbjM/ Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/bpf/20240208221449.12274-1-dthaler1968@gmail.com
2024-02-06bpf, docs: Fix typos in instructions-set.rstDave Thaler-3/+4
* "imm32" should just be "imm" * Add blank line to fix formatting error reported by Stephen Rothwell [0] [0]: https://lore.kernel.org/bpf/20240206153301.4ead0bad@canb.auug.org.au/T/#u Signed-off-by: Dave Thaler <dthaler1968@gmail.com> Acked-by: David Vernet <void@manifault.com> Link: https://lore.kernel.org/r/20240206045146.4965-1-dthaler1968@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>