<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/syscall.c, branch v6.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-04-25T16:21:23Z</updated>
<entry>
<title>bpf: Add namespace to BPF internal symbols</title>
<updated>2025-04-25T16:21:23Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2025-04-25T01:45:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f88886de0927a2adf4c1b4c5c1f1d31d2023ef74'/>
<id>urn:sha1:f88886de0927a2adf4c1b4c5c1f1d31d2023ef74</id>
<content type='text'>
Add namespace to BPF internal symbols used by light skeleton
to prevent abuse and document with the code their allowed usage.

Fixes: b1d18a7574d0 ("bpf: Extend sys_bpf commands for bpf_syscall programs.")
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20250425014542.62385-1-alexei.starovoitov@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'bpf_try_alloc_pages' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2025-03-30T20:45:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-03-30T20:45:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aa918db707fba507e85217961643281ee8dfb2ed'/>
<id>urn:sha1:aa918db707fba507e85217961643281ee8dfb2ed</id>
<content type='text'>
Pull bpf try_alloc_pages() support from Alexei Starovoitov:
 "The pull includes work from Sebastian, Vlastimil and myself with a lot
  of help from Michal and Shakeel.

  This is a first step towards making kmalloc reentrant to get rid of
  slab wrappers: bpf_mem_alloc, kretprobe's objpool, etc. These patches
  make page allocator safe from any context.

  Vlastimil kicked off this effort at LSFMM 2024:

    https://lwn.net/Articles/974138/

  and we continued at LSFMM 2025:

    https://lore.kernel.org/all/CAADnVQKfkGxudNUkcPJgwe3nTZ=xohnRshx9kLZBTmR_E1DFEg@mail.gmail.com/

  Why:

  SLAB wrappers bind memory to a particular subsystem making it
  unavailable to the rest of the kernel. Some BPF maps in production
  consume Gbytes of preallocated memory. Top 5 in Meta: 1.5G, 1.2G,
  1.1G, 300M, 200M. Once we have kmalloc that works in any context BPF
  map preallocation won't be necessary.

  How:

  Synchronous kmalloc/page alloc stack has multiple stages going from
  fast to slow: cmpxchg16 -&gt; slab_alloc -&gt; new_slab -&gt; alloc_pages -&gt;
  rmqueue_pcplist -&gt; __rmqueue, where rmqueue_pcplist was already
  relying on trylock.

  This set changes rmqueue_bulk/rmqueue_buddy to attempt a trylock and
  return ENOMEM if alloc_flags &amp; ALLOC_TRYLOCK. It then wraps this
  functionality into try_alloc_pages() helper. We make sure that the
  logic is sane in PREEMPT_RT.

  End result: try_alloc_pages()/free_pages_nolock() are safe to call
  from any context.

  try_kmalloc() for any context with similar trylock approach will
  follow. It will use try_alloc_pages() when slab needs a new page.
  Though such try_kmalloc/page_alloc() is an opportunistic allocator,
  this design ensures that the probability of successful allocation of
  small objects (up to one page in size) is high.

  Even before we have try_kmalloc(), we already use try_alloc_pages() in
  BPF arena implementation and it's going to be used more extensively in
  BPF"

* tag 'bpf_try_alloc_pages' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  mm: Fix the flipped condition in gfpflags_allow_spinning()
  bpf: Use try_alloc_pages() to allocate pages for bpf needs.
  mm, bpf: Use memcg in try_alloc_pages().
  memcg: Use trylock to access memcg stock_lock.
  mm, bpf: Introduce free_pages_nolock()
  mm, bpf: Introduce try_alloc_pages() for opportunistic page allocation
  locking/local_lock: Introduce localtry_lock_t
</content>
</entry>
<entry>
<title>bpf: Implement verifier support for rqspinlock</title>
<updated>2025-03-19T15:03:06Z</updated>
<author>
<name>Kumar Kartikeya Dwivedi</name>
<email>memxor@gmail.com</email>
</author>
<published>2025-03-16T04:05:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0de2046137f976e7302d43ac01d9894d07ac1fff'/>
<id>urn:sha1:0de2046137f976e7302d43ac01d9894d07ac1fff</id>
<content type='text'>
Introduce verifier-side support for rqspinlock kfuncs. The first step is
allowing bpf_res_spin_lock type to be defined in map values and
allocated objects, so BTF-side is updated with a new BPF_RES_SPIN_LOCK
field to recognize and validate.

Any object cannot have both bpf_spin_lock and bpf_res_spin_lock, only
one of them (and at most one of them per-object, like before) must be
present. The bpf_res_spin_lock can also be used to protect objects that
require lock protection for their kfuncs, like BPF rbtree and linked
list.

The verifier plumbing to simulate success and failure cases when calling
the kfuncs is done by pushing a new verifier state to the verifier state
stack which will verify the failure case upon calling the kfunc. The
path where success is indicated creates all lock reference state and IRQ
state (if necessary for irqsave variants). In the case of failure, the
state clears the registers r0-r5, sets the return value, and skips kfunc
processing, proceeding to the next instruction.

When marking the return value for success case, the value is marked as
0, and for the failure case as [-MAX_ERRNO, -1]. Then, in the program,
whenever user checks the return value as 'if (ret)' or 'if (ret &lt; 0)'
the verifier never traverses such branches for success cases, and would
be aware that the lock is not held in such cases.

We push the kfunc state in check_kfunc_call whenever rqspinlock kfuncs
are invoked. We introduce a kfunc_class state to avoid mixing lock
irqrestore kfuncs with IRQ state created by bpf_local_irq_save.

With all this infrastructure, these kfuncs become usable in programs
while satisfying all safety properties required by the kernel.

Acked-by: Eduard Zingerman &lt;eddyz87@gmail.com&gt;
Signed-off-by: Kumar Kartikeya Dwivedi &lt;memxor@gmail.com&gt;
Link: https://lore.kernel.org/r/20250316040541.108729-24-memxor@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Return prog btf_id without capable check</title>
<updated>2025-03-17T20:45:12Z</updated>
<author>
<name>Mykyta Yatsenko</name>
<email>yatsenko@meta.com</email>
</author>
<published>2025-03-17T17:40:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=07651ccda9ff10a8ca427670cdd06ce2c8e4269c'/>
<id>urn:sha1:07651ccda9ff10a8ca427670cdd06ce2c8e4269c</id>
<content type='text'>
Return prog's btf_id from bpf_prog_get_info_by_fd regardless of capable
check. This patch enables scenario, when freplace program, running
from user namespace, requires to query target prog's btf.

Signed-off-by: Mykyta Yatsenko &lt;yatsenko@meta.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/bpf/20250317174039.161275-3-mykyta.yatsenko5@gmail.com
</content>
</entry>
<entry>
<title>bpf: BPF token support for BPF_BTF_GET_FD_BY_ID</title>
<updated>2025-03-17T20:45:11Z</updated>
<author>
<name>Mykyta Yatsenko</name>
<email>yatsenko@meta.com</email>
</author>
<published>2025-03-17T17:40:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0de445d18e36ca5914337217c118016ba5db574d'/>
<id>urn:sha1:0de445d18e36ca5914337217c118016ba5db574d</id>
<content type='text'>
Currently BPF_BTF_GET_FD_BY_ID requires CAP_SYS_ADMIN, which does not
allow running it from user namespace. This creates a problem when
freplace program running from user namespace needs to query target
program BTF.
This patch relaxes capable check from CAP_SYS_ADMIN to CAP_BPF and adds
support for BPF token that can be passed in attributes to syscall.

Signed-off-by: Mykyta Yatsenko &lt;yatsenko@meta.com&gt;
Signed-off-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20250317174039.161275-2-mykyta.yatsenko5@gmail.com
</content>
</entry>
<entry>
<title>security: Propagate caller information in bpf hooks</title>
<updated>2025-03-15T18:48:58Z</updated>
<author>
<name>Blaise Boscaccy</name>
<email>bboscaccy@linux.microsoft.com</email>
</author>
<published>2025-03-10T22:17:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=082f1db02c8034fee787ea9809775ea861c50430'/>
<id>urn:sha1:082f1db02c8034fee787ea9809775ea861c50430</id>
<content type='text'>
Certain bpf syscall subcommands are available for usage from both
userspace and the kernel. LSM modules or eBPF gatekeeper programs may
need to take a different course of action depending on whether or not
a BPF syscall originated from the kernel or userspace.

Additionally, some of the bpf_attr struct fields contain pointers to
arbitrary memory. Currently the functionality to determine whether or
not a pointer refers to kernel memory or userspace memory is exposed
to the bpf verifier, but that information is missing from various LSM
hooks.

Here we augment the LSM hooks to provide this data, by simply passing
a boolean flag indicating whether or not the call originated in the
kernel, in any hook that contains a bpf_attr struct that corresponds
to a subcommand that may be called from the kernel.

Signed-off-by: Blaise Boscaccy &lt;bboscaccy@linux.microsoft.com&gt;
Acked-by: Song Liu &lt;song@kernel.org&gt;
Acked-by: Paul Moore &lt;paul@paul-moore.com&gt;
Link: https://lore.kernel.org/r/20250310221737.821889-2-bboscaccy@linux.microsoft.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: no longer acquire map_idr_lock in bpf_map_inc_not_zero()</title>
<updated>2025-03-15T18:48:26Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-03-01T19:13:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f8ac5a4e1a97354c9a076e9a0c289032618a4f56'/>
<id>urn:sha1:f8ac5a4e1a97354c9a076e9a0c289032618a4f56</id>
<content type='text'>
bpf_sk_storage_clone() is the only caller of bpf_map_inc_not_zero()
and is holding rcu_read_lock().

map_idr_lock does not add any protection, just remove the cost
for passive TCP flows.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Kui-Feng Lee &lt;kuifeng@meta.com&gt;
Cc: Martin KaFai Lau &lt;martin.lau@kernel.org&gt;
Acked-by: Stanislav Fomichev &lt;sdf@fomichev.me&gt;
Link: https://lore.kernel.org/r/20250301191315.1532629-1-edumazet@google.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Allow pre-ordering for bpf cgroup progs</title>
<updated>2025-03-15T18:48:25Z</updated>
<author>
<name>Yonghong Song</name>
<email>yonghong.song@linux.dev</email>
</author>
<published>2025-02-24T23:01:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4b82b181a26cff8bf7adc3a85a88d121d92edeaf'/>
<id>urn:sha1:4b82b181a26cff8bf7adc3a85a88d121d92edeaf</id>
<content type='text'>
Currently for bpf progs in a cgroup hierarchy, the effective prog array
is computed from bottom cgroup to upper cgroups (post-ordering). For
example, the following cgroup hierarchy
    root cgroup: p1, p2
        subcgroup: p3, p4
have BPF_F_ALLOW_MULTI for both cgroup levels.
The effective cgroup array ordering looks like
    p3 p4 p1 p2
and at run time, progs will execute based on that order.

But in some cases, it is desirable to have root prog executes earlier than
children progs (pre-ordering). For example,
  - prog p1 intends to collect original pkt dest addresses.
  - prog p3 will modify original pkt dest addresses to a proxy address for
    security reason.
The end result is that prog p1 gets proxy address which is not what it
wants. Putting p1 to every child cgroup is not desirable either as it
will duplicate itself in many child cgroups. And this is exactly a use case
we are encountering in Meta.

To fix this issue, let us introduce a flag BPF_F_PREORDER. If the flag
is specified at attachment time, the prog has higher priority and the
ordering with that flag will be from top to bottom (pre-ordering).
For example, in the above example,
    root cgroup: p1, p2
        subcgroup: p3, p4
Let us say p2 and p4 are marked with BPF_F_PREORDER. The final
effective array ordering will be
    p2 p4 p3 p1

Suggested-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Signed-off-by: Yonghong Song &lt;yonghong.song@linux.dev&gt;
Link: https://lore.kernel.org/r/20250224230116.283071-1-yonghong.song@linux.dev
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Use try_alloc_pages() to allocate pages for bpf needs.</title>
<updated>2025-02-27T17:39:44Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2025-02-22T02:44:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c9eb8102e21e8c4c6fb74d1921d99e4d43713520'/>
<id>urn:sha1:c9eb8102e21e8c4c6fb74d1921d99e4d43713520</id>
<content type='text'>
Use try_alloc_pages() and free_pages_nolock() for BPF needs
when context doesn't allow using normal alloc_pages.
This is a prerequisite for further work.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/r/20250222024427.30294-7-alexei.starovoitov@gmail.com
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf bpf-6.14-rc4</title>
<updated>2025-02-21T02:13:57Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2025-02-21T02:10:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bd4319b6c2b36aa5c6534aec1dbe16921ec960ef'/>
<id>urn:sha1:bd4319b6c2b36aa5c6534aec1dbe16921ec960ef</id>
<content type='text'>
Cross-merge bpf fixes after downstream PR (bpf-6.14-rc4).

Minor conflict:
  kernel/bpf/btf.c
Adjacent changes:
  kernel/bpf/arena.c
  kernel/bpf/btf.c
  kernel/bpf/syscall.c
  kernel/bpf/verifier.c
  mm/memory.c

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
</feed>
