<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf, branch v4.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-04-28T21:29:45Z</updated>
<entry>
<title>bpf: fix check_map_func_compatibility logic</title>
<updated>2016-04-28T21:29:45Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-04-28T01:56:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6aff67c85c9e5a4bc99e5211c1bac547936626ca'/>
<id>urn:sha1:6aff67c85c9e5a4bc99e5211c1bac547936626ca</id>
<content type='text'>
The commit 35578d798400 ("bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter")
introduced clever way to check bpf_helper&lt;-&gt;map_type compatibility.
Later on commit a43eec304259 ("bpf: introduce bpf_perf_event_output() helper") adjusted
the logic and inadvertently broke it.
Get rid of the clever bool compare and go back to two-way check
from map and from helper perspective.

Fixes: a43eec304259 ("bpf: introduce bpf_perf_event_output() helper")
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix refcnt overflow</title>
<updated>2016-04-28T21:29:45Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-04-28T01:56:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=92117d8443bc5afacc8d5ba82e541946310f106e'/>
<id>urn:sha1:92117d8443bc5afacc8d5ba82e541946310f106e</id>
<content type='text'>
On a system with &gt;32Gbyte of phyiscal memory and infinite RLIMIT_MEMLOCK,
the malicious application may overflow 32-bit bpf program refcnt.
It's also possible to overflow map refcnt on 1Tb system.
Impose 32k hard limit which means that the same bpf program or
map cannot be shared by more than 32k processes.

Fixes: 1be7f75d1668 ("bpf: enable non-root eBPF programs")
Reported-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix double-fdput in replace_map_fd_with_map_ptr()</title>
<updated>2016-04-26T21:37:21Z</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2016-04-26T20:26:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7'/>
<id>urn:sha1:8358b02bf67d3a5d8a825070e1aa73f25fb2e4c7</id>
<content type='text'>
When bpf(BPF_PROG_LOAD, ...) was invoked with a BPF program whose bytecode
references a non-map file descriptor as a map file descriptor, the error
handling code called fdput() twice instead of once (in __bpf_map_get() and
in replace_map_fd_with_map_ptr()). If the file descriptor table of the
current task is shared, this causes f_count to be decremented too much,
allowing the struct file to be freed while it is still in use
(use-after-free). This can be exploited to gain root privileges by an
unprivileged user.

This bug was introduced in
commit 0246e64d9a5f ("bpf: handle pseudo BPF_LD_IMM64 insn"), but is only
exploitable since
commit 1be7f75d1668 ("bpf: enable non-root eBPF programs") because
previously, CAP_SYS_ADMIN was required to reach the vulnerable code.

(posted publicly according to request by maintainer)

Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf/verifier: reject invalid LD_ABS | BPF_DW instruction</title>
<updated>2016-04-14T05:31:50Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-04-12T17:26:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d82bccc69041a51f7b7b9b4a36db0772f4cdba21'/>
<id>urn:sha1:d82bccc69041a51f7b7b9b4a36db0772f4cdba21</id>
<content type='text'>
verifier must check for reserved size bits in instruction opcode and
reject BPF_LD | BPF_ABS | BPF_DW and BPF_LD | BPF_IND | BPF_DW instructions,
otherwise interpreter will WARN_RATELIMIT on them during execution.

Fixes: ddd872bc3098 ("bpf: verifier: add checks for BPF_ABS | BPF_IND instructions")
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: add missing map_flags to bpf_map_show_fdinfo</title>
<updated>2016-03-25T15:36:41Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2016-03-24T23:30:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=322cea2f41adb62c975f46a3242f4e3b43226fa1'/>
<id>urn:sha1:322cea2f41adb62c975f46a3242f4e3b43226fa1</id>
<content type='text'>
Add map_flags attribute to bpf_map_show_fdinfo(), so that tools like
tc can check for them when loading objects from a pinned entry, e.g.
if user intent wrt allocation (BPF_F_NO_PREALLOC) is different to the
pinned object, it can bail out. Follow-up to 6c9059817432 ("bpf:
pre-allocate hash map elements"), so that tc can still support this
with v4.6.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2016-03-21T01:23:21Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-03-21T01:23:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=26660a4046b171a752e72a1dd32153230234fe3a'/>
<id>urn:sha1:26660a4046b171a752e72a1dd32153230234fe3a</id>
<content type='text'>
Pull 'objtool' stack frame validation from Ingo Molnar:
 "This tree adds a new kernel build-time object file validation feature
  (ONFIG_STACK_VALIDATION=y): kernel stack frame correctness validation.
  It was written by and is maintained by Josh Poimboeuf.

  The motivation: there's a category of hard to find kernel bugs, most
  of them in assembly code (but also occasionally in C code), that
  degrades the quality of kernel stack dumps/backtraces.  These bugs are
  hard to detect at the source code level.  Such bugs result in
  incorrect/incomplete backtraces most of time - but can also in some
  rare cases result in crashes or other undefined behavior.

  The build time correctness checking is done via the new 'objtool'
  user-space utility that was written for this purpose and which is
  hosted in the kernel repository in tools/objtool/.  The tool's (very
  simple) UI and source code design is shaped after Git and perf and
  shares quite a bit of infrastructure with tools/perf (which tooling
  infrastructure sharing effort got merged via perf and is already
  upstream).  Objtool follows the well-known kernel coding style.

  Objtool does not try to check .c or .S files, it instead analyzes the
  resulting .o generated machine code from first principles: it decodes
  the instruction stream and interprets it.  (Right now objtool supports
  the x86-64 architecture.)

  From tools/objtool/Documentation/stack-validation.txt:

   "The kernel CONFIG_STACK_VALIDATION option enables a host tool named
    objtool which runs at compile time.  It has a "check" subcommand
    which analyzes every .o file and ensures the validity of its stack
    metadata.  It enforces a set of rules on asm code and C inline
    assembly code so that stack traces can be reliable.

    Currently it only checks frame pointer usage, but there are plans to
    add CFI validation for C files and CFI generation for asm files.

    For each function, it recursively follows all possible code paths
    and validates the correct frame pointer state at each instruction.

    It also follows code paths involving special sections, like
    .altinstructions, __jump_table, and __ex_table, which can add
    alternative execution paths to a given instruction (or set of
    instructions).  Similarly, it knows how to follow switch statements,
    for which gcc sometimes uses jump tables."

  When this new kernel option is enabled (it's disabled by default), the
  tool, if it finds any suspicious assembly code pattern, outputs
  warnings in compiler warning format:

    warning: objtool: rtlwifi_rate_mapping()+0x2e7: frame pointer state mismatch
    warning: objtool: cik_tiling_mode_table_init()+0x6ce: call without frame pointer save/setup
    warning: objtool:__schedule()+0x3c0: duplicate frame pointer save
    warning: objtool:__schedule()+0x3fd: sibling call from callable instruction with changed frame pointer

  ... so that scripts that pick up compiler warnings will notice them.
  All known warnings triggered by the tool are fixed by the tree, most
  of the commits in fact prepare the kernel to be warning-free.  Most of
  them are bugfixes or cleanups that stand on their own, but there are
  also some annotations of 'special' stack frames for justified cases
  such entries to JIT-ed code (BPF) or really special boot time code.

  There are two other long-term motivations behind this tool as well:

   - To improve the quality and reliability of kernel stack frames, so
     that they can be used for optimized live patching.

   - To create independent infrastructure to check the correctness of
     CFI stack frames at build time.  CFI debuginfo is notoriously
     unreliable and we cannot use it in the kernel as-is without extra
     checking done both on the kernel side and on the build side.

  The quality of kernel stack frames matters to debuggability as well,
  so IMO we can merge this without having to consider the live patching
  or CFI debuginfo angle"

* 'core-objtool-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  objtool: Only print one warning per function
  objtool: Add several performance improvements
  tools: Copy hashtable.h into tools directory
  objtool: Fix false positive warnings for functions with multiple switch statements
  objtool: Rename some variables and functions
  objtool: Remove superflous INIT_LIST_HEAD
  objtool: Add helper macros for traversing instructions
  objtool: Fix false positive warnings related to sibling calls
  objtool: Compile with debugging symbols
  objtool: Detect infinite recursion
  objtool: Prevent infinite recursion in noreturn detection
  objtool: Detect and warn if libelf is missing and don't break the build
  tools: Support relative directory path for 'O='
  objtool: Support CROSS_COMPILE
  x86/asm/decoder: Use explicitly signed chars
  objtool: Enable stack metadata validation on 64-bit x86
  objtool: Add CONFIG_STACK_VALIDATION option
  objtool: Add tool to perform compile-time stack metadata validation
  x86/kprobes: Mark kretprobe_trampoline() stack frame as non-standard
  sched: Always inline context_switch()
  ...
</content>
</entry>
<entry>
<title>bpf: avoid copying junk bytes in bpf_get_current_comm()</title>
<updated>2016-03-10T04:27:30Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-03-10T04:02:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cdc4e47da8f4c32eeb6b2061a8a834f4362a12b7'/>
<id>urn:sha1:cdc4e47da8f4c32eeb6b2061a8a834f4362a12b7</id>
<content type='text'>
Lots of places in the kernel use memcpy(buf, comm, TASK_COMM_LEN); but
the result is typically passed to print("%s", buf) and extra bytes
after zero don't cause any harm.
In bpf the result of bpf_get_current_comm() is used as the part of
map key and was causing spurious hash map mismatches.
Use strlcpy() to guarantee zero-terminated string.
bpf verifier checks that output buffer is zero-initialized,
so even for short task names the output buffer don't have junk bytes.
Note it's not a security concern, since kprobe+bpf is root only.

Fixes: ffeedafbf023 ("bpf: introduce current-&gt;pid, tgid, uid, gid, comm accessors")
Reported-by: Tobias Waldekranz &lt;tobias@waldekranz.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: bpf_stackmap_copy depends on CONFIG_PERF_EVENTS</title>
<updated>2016-03-10T04:26:51Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-03-10T02:56:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b8cdc05173f05d212627b7aba7ec47fa334a79f2'/>
<id>urn:sha1:b8cdc05173f05d212627b7aba7ec47fa334a79f2</id>
<content type='text'>
0-day bot reported build error:
kernel/built-in.o: In function `map_lookup_elem':
&gt;&gt; kernel/bpf/.tmp_syscall.o:(.text+0x329b3c): undefined reference to `bpf_stackmap_copy'
when CONFIG_BPF_SYSCALL is set and CONFIG_PERF_EVENTS is not.
Add weak definition to resolve it.
This code path in map_lookup_elem() is never taken
when CONFIG_PERF_EVENTS is not set.

Fixes: 557c0c6e7df8 ("bpf: convert stackmap to pre-allocation")
Reported-by: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: convert stackmap to pre-allocation</title>
<updated>2016-03-08T20:28:31Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-03-08T05:57:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=557c0c6e7df8e14a46bd7560d193fa5bbc00a858'/>
<id>urn:sha1:557c0c6e7df8e14a46bd7560d193fa5bbc00a858</id>
<content type='text'>
It was observed that calling bpf_get_stackid() from a kprobe inside
slub or from spin_unlock causes similar deadlock as with hashmap,
therefore convert stackmap to use pre-allocated memory.

The call_rcu is no longer feasible mechanism, since delayed freeing
causes bpf_get_stackid() to fail unpredictably when number of actual
stacks is significantly less than user requested max_entries.
Since elements are no longer freed into slub, we can push elements into
freelist immediately and let them be recycled.
However the very unlikley race between user space map_lookup() and
program-side recycling is possible:
     cpu0                          cpu1
     ----                          ----
user does lookup(stackidX)
starts copying ips into buffer
                                   delete(stackidX)
                                   calls bpf_get_stackid()
				   which recyles the element and
                                   overwrites with new stack trace

To avoid user space seeing a partial stack trace consisting of two
merged stack traces, do bucket = xchg(, NULL); copy; xchg(,bucket);
to preserve consistent stack trace delivery to user space.
Now we can move memset(,0) of left-over element value from critical
path of bpf_get_stackid() into slow-path of user space lookup.
Also disallow lookup() from bpf program, since it's useless and
program shouldn't be messing with collected stack trace.

Note that similar race between user space lookup and kernel side updates
is also present in hashmap, but it's not a new race. bpf programs were
always allowed to modify hash and array map elements while user space
is copying them.

Fixes: d5a3b1f69186 ("bpf: introduce BPF_MAP_TYPE_STACK_TRACE")
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: check for reserved flag bits in array and stack maps</title>
<updated>2016-03-08T20:28:31Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2016-03-08T05:57:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=823707b68d6e6c4b1be619b039c7045fef1740e6'/>
<id>urn:sha1:823707b68d6e6c4b1be619b039c7045fef1740e6</id>
<content type='text'>
Suggested-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
