<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/syscall.c, branch v5.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-03-18T15:48:25Z</updated>
<entry>
<title>bpf: Try harder when allocating memory for large maps</title>
<updated>2019-03-18T15:48:25Z</updated>
<author>
<name>Martynas Pumputis</name>
<email>m@lambda.lt</email>
</author>
<published>2019-03-18T15:10:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f01a7dbe98ae4265023fa5d3af0f076f0b18a647'/>
<id>urn:sha1:f01a7dbe98ae4265023fa5d3af0f076f0b18a647</id>
<content type='text'>
It has been observed that sometimes a higher order memory allocation
for BPF maps fails when there is no obvious memory pressure in a system.
E.g. the map (BPF_MAP_TYPE_LRU_HASH, key=38, value=56, max_elems=524288)
could not be created due to vmalloc unable to allocate 75497472B,
when the system's memory consumption (in MB) was the following:

    Total: 3942 Used: 837 (21.24%) Free: 138 Buffers: 239 Cached: 2727

Later analysis [1] by Michal Hocko showed that the vmalloc was not trying
to reclaim memory from the page cache and was failing prematurely due to
__GFP_NORETRY.

Considering dcda9b0471 ("mm, tree wide: replace __GFP_REPEAT by
__GFP_RETRY_MAYFAIL with more useful semantic") and [1], we can replace
__GFP_NORETRY with __GFP_RETRY_MAYFAIL, as it won't invoke OOM killer
and will try harder to fulfil allocation requests.

Unfortunately, replacing the body of the BPF map memory allocation
function with the kvmalloc_node helper function is not an option at
this point in time, given 1) kmalloc is non-optional for higher order
allocations, and 2) passing __GFP_RETRY_MAYFAIL to the kmalloc would
stress the slab allocator too much for large requests.

The change has been tested with the workloads mentioned above and by
observing oom_kill value from /proc/vmstat.

[1]: https://lore.kernel.org/bpf/20190310071318.GW5232@dhcp22.suse.cz/

Signed-off-by: Martynas Pumputis &lt;m@lambda.lt&gt;
Acked-by: Yonghong Song &lt;yhs@fb.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Link: https://lore.kernel.org/bpf/20190318153940.GL8924@dhcp22.suse.cz/
</content>
</entry>
<entry>
<title>Merge branch 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2019-03-06T15:59:36Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-03-06T15:59:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=203b6609e0ede49eb0b97008b1150c69e9d2ffd3'/>
<id>urn:sha1:203b6609e0ede49eb0b97008b1150c69e9d2ffd3</id>
<content type='text'>
Pull perf updates from Ingo Molnar:
 "Lots of tooling updates - too many to list, here's a few highlights:

   - Various subcommand updates to 'perf trace', 'perf report', 'perf
     record', 'perf annotate', 'perf script', 'perf test', etc.

   - CPU and NUMA topology and affinity handling improvements,

   - HW tracing and HW support updates:
      - Intel PT updates
      - ARM CoreSight updates
      - vendor HW event updates

   - BPF updates

   - Tons of infrastructure updates, both on the build system and the
     library support side

   - Documentation updates.

   - ... and lots of other changes, see the changelog for details.

  Kernel side updates:

   - Tighten up kprobes blacklist handling, reduce the number of places
     where developers can install a kprobe and hang/crash the system.

   - Fix/enhance vma address filter handling.

   - Various PMU driver updates, small fixes and additions.

   - refcount_t conversions

   - BPF updates

   - error code propagation enhancements

   - misc other changes"

* 'perf-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (238 commits)
  perf script python: Add Python3 support to syscall-counts-by-pid.py
  perf script python: Add Python3 support to syscall-counts.py
  perf script python: Add Python3 support to stat-cpi.py
  perf script python: Add Python3 support to stackcollapse.py
  perf script python: Add Python3 support to sctop.py
  perf script python: Add Python3 support to powerpc-hcalls.py
  perf script python: Add Python3 support to net_dropmonitor.py
  perf script python: Add Python3 support to mem-phys-addr.py
  perf script python: Add Python3 support to failed-syscalls-by-pid.py
  perf script python: Add Python3 support to netdev-times.py
  perf tools: Add perf_exe() helper to find perf binary
  perf script: Handle missing fields with -F +..
  perf data: Add perf_data__open_dir_data function
  perf data: Add perf_data__(create_dir|close_dir) functions
  perf data: Fail check_backup in case of error
  perf data: Make check_backup work over directories
  perf tools: Add rm_rf_perf_data function
  perf tools: Add pattern name checking to rm_rf
  perf tools: Add depth checking to rm_rf
  perf data: Add global path holder
  ...
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2019-03-04T18:14:31Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-03-04T18:14:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f7fb7c1a1c8f86005d34f28278524213c521f761'/>
<id>urn:sha1:f7fb7c1a1c8f86005d34f28278524213c521f761</id>
<content type='text'>
Daniel Borkmann says:

====================
pull-request: bpf-next 2019-03-04

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Add AF_XDP support to libbpf. Rationale is to facilitate writing
   AF_XDP applications by offering higher-level APIs that hide many
   of the details of the AF_XDP uapi. Sample programs are converted
   over to this new interface as well, from Magnus.

2) Introduce a new cant_sleep() macro for annotation of functions
   that cannot sleep and use it in BPF_PROG_RUN() to assert that
   BPF programs run under preemption disabled context, from Peter.

3) Introduce per BPF prog stats in order to monitor the usage
   of BPF; this is controlled by kernel.bpf_stats_enabled sysctl
   knob where monitoring tools can make use of this to efficiently
   determine the average cost of programs, from Alexei.

4) Split up BPF selftest's test_progs similarly as we already
   did with test_verifier. This allows to further reduce merge
   conflicts in future and to get more structure into our
   quickly growing BPF selftest suite, from Stanislav.

5) Fix a bug in BTF's dedup algorithm which can cause an infinite
   loop in some circumstances; also various BPF doc fixes and
   improvements, from Andrii.

6) Various BPF sample cleanups and migration to libbpf in order
   to further isolate the old sample loader code (so we can get
   rid of it at some point), from Jakub.

7) Add a new BPF helper for BPF cgroup skb progs that allows
   to set ECN CE code point and a Host Bandwidth Manager (HBM)
   sample program for limiting the bandwidth used by v2 cgroups,
   from Lawrence.

8) Enable write access to skb-&gt;queue_mapping from tc BPF egress
   programs in order to let BPF pick TX queue, from Jesper.

9) Fix a bug in BPF spinlock handling for map-in-map which did
   not propagate spin_lock_off to the meta map, from Yonghong.

10) Fix a bug in the new per-CPU BPF prog counters to properly
    initialize stats for each CPU, from Eric.

11) Add various BPF helper prototypes to selftest's bpf_helpers.h,
    from Willem.

12) Fix various BPF samples bugs in XDP and tracing progs,
    from Toke, Daniel and Yonghong.

13) Silence preemption splat in test_bpf after BPF_PROG_RUN()
    enforces it now everywhere, from Anders.

14) Fix a signedness bug in libbpf's btf_dedup_ref_type() to
    get error handling working, from Dan.

15) Fix bpftool documentation and auto-completion with regards
    to stream_{verdict,parser} attach types, from Alban.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2019-03-02T20:54:35Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-03-02T20:54:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9eb359140cd307f8a14f61c19b155ffca5291057'/>
<id>urn:sha1:9eb359140cd307f8a14f61c19b155ffca5291057</id>
<content type='text'>
</content>
</entry>
<entry>
<title>bpf: drop refcount if bpf_map_new_fd() fails in map_create()</title>
<updated>2019-03-01T15:04:29Z</updated>
<author>
<name>Peng Sun</name>
<email>sironhide0null@gmail.com</email>
</author>
<published>2019-02-27T14:36:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=352d20d611414715353ee65fc206ee57ab1a6984'/>
<id>urn:sha1:352d20d611414715353ee65fc206ee57ab1a6984</id>
<content type='text'>
In bpf/syscall.c, map_create() first set map-&gt;usercnt to 1, a file
descriptor is supposed to return to userspace. When bpf_map_new_fd()
fails, drop the refcount.

Fixes: bd5f5f4ecb78 ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun &lt;sironhide0null@gmail.com&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'linus' into perf/core, to pick up fixes</title>
<updated>2019-02-28T07:27:17Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2019-02-28T07:27:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9ed8f1a6e7670aadd5aef30456a90b456ed1b185'/>
<id>urn:sha1:9ed8f1a6e7670aadd5aef30456a90b456ed1b185</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: expose program stats via bpf_prog_info</title>
<updated>2019-02-27T16:22:50Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2019-02-25T22:28:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5f8f8b93aeb8371c54af08bece2bd04bc2d48707'/>
<id>urn:sha1:5f8f8b93aeb8371c54af08bece2bd04bc2d48707</id>
<content type='text'>
Return bpf program run_time_ns and run_cnt via bpf_prog_info

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andriin@fb.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>bpf: enable program stats</title>
<updated>2019-02-27T16:22:50Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2019-02-25T22:28:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=492ecee892c2a4ba6a14903d5d586ff750b7e805'/>
<id>urn:sha1:492ecee892c2a4ba6a14903d5d586ff750b7e805</id>
<content type='text'>
JITed BPF programs are indistinguishable from kernel functions, but unlike
kernel code BPF code can be changed often.
Typical approach of "perf record" + "perf report" profiling and tuning of
kernel code works just as well for BPF programs, but kernel code doesn't
need to be monitored whereas BPF programs do.
Users load and run large amount of BPF programs.
These BPF stats allow tools monitor the usage of BPF on the server.
The monitoring tools will turn sysctl kernel.bpf_stats_enabled
on and off for few seconds to sample average cost of the programs.
Aggregated data over hours and days will provide an insight into cost of BPF
and alarms can trigger in case given program suddenly gets more expensive.

The cost of two sched_clock() per program invocation adds ~20 nsec.
Fast BPF progs (like selftests/bpf/progs/test_pkt_access.c) will slow down
from ~10 nsec to ~30 nsec.
static_key minimizes the cost of the stats collection.
There is no measurable difference before/after this patch
with kernel.bpf_stats_enabled=0

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>bpf: decrease usercnt if bpf_map_new_fd() fails in bpf_map_get_fd_by_id()</title>
<updated>2019-02-26T18:08:30Z</updated>
<author>
<name>Peng Sun</name>
<email>sironhide0null@gmail.com</email>
</author>
<published>2019-02-26T14:15:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=781e62823cb81b972dc8652c1827205cda2ac9ac'/>
<id>urn:sha1:781e62823cb81b972dc8652c1827205cda2ac9ac</id>
<content type='text'>
In bpf/syscall.c, bpf_map_get_fd_by_id() use bpf_map_inc_not_zero()
to increase the refcount, both map-&gt;refcnt and map-&gt;usercnt. Then, if
bpf_map_new_fd() fails, should handle map-&gt;usercnt too.

Fixes: bd5f5f4ecb78 ("bpf: Add BPF_MAP_GET_FD_BY_ID")
Signed-off-by: Peng Sun &lt;sironhide0null@gmail.com&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2019-02-08T23:00:17Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-02-08T23:00:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a655fe9f194842693258f43b5382855db1c2f654'/>
<id>urn:sha1:a655fe9f194842693258f43b5382855db1c2f654</id>
<content type='text'>
An ipvlan bug fix in 'net' conflicted with the abstraction away
of the IPV6 specific support in 'net-next'.

Similarly, a bug fix for mlx5 in 'net' conflicted with the flow
action conversion in 'net-next'.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
