<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/verifier.c, branch v4.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-11-25T17:14:09Z</updated>
<entry>
<title>bpf: fix clearing on persistent program array maps</title>
<updated>2015-11-25T17:14:09Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-11-24T20:28:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c9da161c6517ba12154059d3b965c2cbaf16f90f'/>
<id>urn:sha1:c9da161c6517ba12154059d3b965c2cbaf16f90f</id>
<content type='text'>
Currently, when having map file descriptors pointing to program arrays,
there's still the issue that we unconditionally flush program array
contents via bpf_fd_array_map_clear() in bpf_map_release(). This happens
when such a file descriptor is released and is independent of the map's
refcount.

Having this flush independent of the refcount is for a reason: there
can be arbitrary complex dependency chains among tail calls, also circular
ones (direct or indirect, nesting limit determined during runtime), and
we need to make sure that the map drops all references to eBPF programs
it holds, so that the map's refcount can eventually drop to zero and
initiate its freeing. Btw, a walk of the whole dependency graph would
not be possible for various reasons, one being complexity and another
one inconsistency, i.e. new programs can be added to parts of the graph
at any time, so there's no guaranteed consistent state for the time of
such a walk.

Now, the program array pinning itself works, but the issue is that each
derived file descriptor on close would nevertheless call unconditionally
into bpf_fd_array_map_clear(). Instead, keep track of users and postpone
this flush until the last reference to a user is dropped. As this only
concerns a subset of references (f.e. a prog array could hold a program
that itself has reference on the prog array holding it, etc), we need to
track them separately.

Short analysis on the refcounting: on map creation time usercnt will be
one, so there's no change in behaviour for bpf_map_release(), if unpinned.
If we already fail in map_create(), we are immediately freed, and no
file descriptor has been made public yet. In bpf_obj_pin_user(), we need
to probe for a possible map in bpf_fd_probe_obj() already with a usercnt
reference, so before we drop the reference on the fd with fdput().
Therefore, if actual pinning fails, we need to drop that reference again
in bpf_any_put(), otherwise we keep holding it. When last reference
drops on the inode, the bpf_any_put() in bpf_evict_inode() will take
care of dropping the usercnt again. In the bpf_obj_get_user() case, the
bpf_any_get() will grab a reference on the usercnt, still at a time when
we have the reference on the path. Should we later on fail to grab a new
file descriptor, bpf_any_put() will drop it, otherwise we hold it until
bpf_map_release() time.

Joint work with Alexei.

Fixes: b2197755b263 ("bpf: add support for persistent maps/progs")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf, verifier: annotate verbose printer with __printf</title>
<updated>2015-11-03T16:29:56Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-11-03T10:39:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1d056d9c95be87725c07e514930b41c2c7174e75'/>
<id>urn:sha1:1d056d9c95be87725c07e514930b41c2c7174e75</id>
<content type='text'>
The verbose() printer dumps the verifier state to user space, so let gcc
take care to check calls to verbose() for (future) errors. make with W=1
correctly suggests: function might be possible candidate for 'gnu_printf'
format attribute [-Wsuggest-attribute=format].

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: align and clean bpf_{map,prog}_get helpers</title>
<updated>2015-11-03T03:48:39Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-10-29T13:58:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c210129760a010b555372ef74f4e1a46d4eb8a22'/>
<id>urn:sha1:c210129760a010b555372ef74f4e1a46d4eb8a22</id>
<content type='text'>
Add a bpf_map_get() function that we're going to use later on and
align/clean the remaining helpers a bit so that we have them a bit
more consistent:

  - __bpf_map_get() and __bpf_prog_get() that both work on the fd
    struct, check whether the descriptor is eBPF and return the
    pointer to the map/prog stored in the private data.

    Also, we can return f.file-&gt;private_data directly, the function
    signature is enough of a documentation already.

  - bpf_map_get() and bpf_prog_get() that both work on u32 user fd,
    call their respective __bpf_map_get()/__bpf_prog_get() variants,
    and take a reference.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: introduce bpf_perf_event_output() helper</title>
<updated>2015-10-22T13:42:15Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-10-21T03:02:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a43eec304259a6c637f4014a6d4767159b6a3aa3'/>
<id>urn:sha1:a43eec304259a6c637f4014a6d4767159b6a3aa3</id>
<content type='text'>
This helper is used to send raw data from eBPF program into
special PERF_TYPE_SOFTWARE/PERF_COUNT_SW_BPF_OUTPUT perf_event.
User space needs to perf_event_open() it (either for one or all cpus) and
store FD into perf_event_array (similar to bpf_perf_event_read() helper)
before eBPF program can send data into it.

Today the programs triggered by kprobe collect the data and either store
it into the maps or print it via bpf_trace_printk() where latter is the debug
facility and not suitable to stream the data. This new helper replaces
such bpf_trace_printk() usage and allows programs to have dedicated
channel into user space for post-processing of the raw data collected.

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: enable non-root eBPF programs</title>
<updated>2015-10-13T02:13:35Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-10-08T05:23:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1be7f75d1668d6296b80bf35dcf6762393530afc'/>
<id>urn:sha1:1be7f75d1668d6296b80bf35dcf6762393530afc</id>
<content type='text'>
In order to let unprivileged users load and execute eBPF programs
teach verifier to prevent pointer leaks.
Verifier will prevent
- any arithmetic on pointers
  (except R10+Imm which is used to compute stack addresses)
- comparison of pointers
  (except if (map_value_ptr == 0) ... )
- passing pointers to helper functions
- indirectly passing pointers in stack to helper functions
- returning pointer from bpf program
- storing pointers into ctx or maps

Spill/fill of pointers into stack is allowed, but mangling
of pointers stored in the stack or reading them byte by byte is not.

Within bpf programs the pointers do exist, since programs need to
be able to access maps, pass skb pointer to LD_ABS insns, etc
but programs cannot pass such pointer values to the outside
or obfuscate them.

Only allow BPF_PROG_TYPE_SOCKET_FILTER unprivileged programs,
so that socket filters (tcpdump), af_packet (quic acceleration)
and future kcm can use it.
tracing and tc cls/act program types still require root permissions,
since tracing actually needs to be able to see all kernel pointers
and tc is for root only.

For example, the following unprivileged socket filter program is allowed:
int bpf_prog1(struct __sk_buff *skb)
{
  u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
  u64 *value = bpf_map_lookup_elem(&amp;my_map, &amp;index);

  if (value)
	*value += skb-&gt;len;
  return 0;
}

but the following program is not:
int bpf_prog1(struct __sk_buff *skb)
{
  u32 index = load_byte(skb, ETH_HLEN + offsetof(struct iphdr, protocol));
  u64 *value = bpf_map_lookup_elem(&amp;my_map, &amp;index);

  if (value)
	*value += (u64) skb;
  return 0;
}
since it would leak the kernel address into the map.

Unprivileged socket filter bpf programs have access to the
following helper functions:
- map lookup/update/delete (but they cannot store kernel pointers into them)
- get_random (it's already exposed to unprivileged user space)
- get_smp_processor_id
- tail_call into another socket filter program
- ktime_get_ns

The feature is controlled by sysctl kernel.unprivileged_bpf_disabled.
This toggle defaults to off (0), but can be set true (1).  Once true,
bpf programs and maps cannot be accessed from unprivileged process,
and the toggle cannot be set back to false.

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix cb access in socket filter programs</title>
<updated>2015-10-11T11:40:05Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-10-07T17:55:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ff936a04e5f28b7e0455be0e7fa91334f89e4b44'/>
<id>urn:sha1:ff936a04e5f28b7e0455be0e7fa91334f89e4b44</id>
<content type='text'>
eBPF socket filter programs may see junk in 'u32 cb[5]' area,
since it could have been used by protocol layers earlier.

For socket filter programs used in af_packet we need to clean
20 bytes of skb-&gt;cb area if it could be used by the program.
For programs attached to TCP/UDP sockets we need to save/restore
these 20 bytes, since it's used by protocol layers.

Remove SK_RUN_FILTER macro, since it's no longer used.

Long term we may move this bpf cb area to per-cpu scratch, but that
requires addition of new 'per-cpu load/store' instructions,
so not suitable as a short term fix.

Fixes: d691f9e8d440 ("bpf: allow programs to write to certain skb fields")
Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix out of bounds access in verifier log</title>
<updated>2015-09-09T21:11:55Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-09-08T20:40:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=687f07156b0c99205c21aa4e2986564046d342fe'/>
<id>urn:sha1:687f07156b0c99205c21aa4e2986564046d342fe</id>
<content type='text'>
when the verifier log is enabled the print_bpf_insn() is doing
bpf_alu_string[BPF_OP(insn-&gt;code) &gt;&gt; 4]
and
bpf_jmp_string[BPF_OP(insn-&gt;code) &gt;&gt; 4]
where BPF_OP is a 4-bit instruction opcode.
Malformed insns can cause out of bounds access.
Fix it by sizing arrays appropriately.

The bug was found by clang address sanitizer with libfuzzer.

Reported-by: Yonghong Song &lt;yhs@plumgrid.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix bpf_perf_event_read() loop upper bound</title>
<updated>2015-08-12T23:42:50Z</updated>
<author>
<name>Wei-Chun Chao</name>
<email>weichunc@plumgrid.com</email>
</author>
<published>2015-08-12T14:57:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=140d8b335a9beb234fd0ed9a15aa6a47f47fd771'/>
<id>urn:sha1:140d8b335a9beb234fd0ed9a15aa6a47f47fd771</id>
<content type='text'>
Verifier rejects programs incorrectly.

Fixes: 35578d798400 ("bpf: Implement function bpf_perf_event_read()")
Cc: Kaixu Xia &lt;xiakaixu@huawei.com&gt;
Cc: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: Wei-Chun Chao &lt;weichunc@plumgrid.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter</title>
<updated>2015-08-10T05:50:06Z</updated>
<author>
<name>Kaixu Xia</name>
<email>xiakaixu@huawei.com</email>
</author>
<published>2015-08-06T07:02:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35578d7984003097af2b1e34502bc943d40c1804'/>
<id>urn:sha1:35578d7984003097af2b1e34502bc943d40c1804</id>
<content type='text'>
According to the perf_event_map_fd and index, the function
bpf_perf_event_read() can convert the corresponding map
value to the pointer to struct perf_event and return the
Hardware PMU counter value.

Signed-off-by: Kaixu Xia &lt;xiakaixu@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ebpf: Allow dereferences of PTR_TO_STACK registers</title>
<updated>2015-07-27T07:54:10Z</updated>
<author>
<name>Alex Gartrell</name>
<email>agartrell@fb.com</email>
</author>
<published>2015-07-23T21:24:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=24b4d2abd0bd628f396dada3e915d395cbf459eb'/>
<id>urn:sha1:24b4d2abd0bd628f396dada3e915d395cbf459eb</id>
<content type='text'>
mov %rsp, %r1           ; r1 = rsp
        add $-8, %r1            ; r1 = rsp - 8
        store_q $123, -8(%rsp)  ; *(u64*)r1 = 123  &lt;- valid
        store_q $123, (%r1)     ; *(u64*)r1 = 123  &lt;- previously invalid
        mov $0, %r0
        exit                    ; Always need to exit

And we'd get the following error:

	0: (bf) r1 = r10
	1: (07) r1 += -8
	2: (7a) *(u64 *)(r10 -8) = 999
	3: (7a) *(u64 *)(r1 +0) = 999
	R1 invalid mem access 'fp'

	Unable to load program

We already know that a register is a stack address and the appropriate
offset, so we should be able to validate those references as well.

Signed-off-by: Alex Gartrell &lt;agartrell@fb.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
