<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/verifier.c, branch v4.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-09-09T21:11:55Z</updated>
<entry>
<title>bpf: fix out of bounds access in verifier log</title>
<updated>2015-09-09T21:11:55Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-09-08T20:40:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=687f07156b0c99205c21aa4e2986564046d342fe'/>
<id>urn:sha1:687f07156b0c99205c21aa4e2986564046d342fe</id>
<content type='text'>
when the verifier log is enabled the print_bpf_insn() is doing
bpf_alu_string[BPF_OP(insn-&gt;code) &gt;&gt; 4]
and
bpf_jmp_string[BPF_OP(insn-&gt;code) &gt;&gt; 4]
where BPF_OP is a 4-bit instruction opcode.
Malformed insns can cause out of bounds access.
Fix it by sizing arrays appropriately.

The bug was found by clang address sanitizer with libfuzzer.

Reported-by: Yonghong Song &lt;yhs@plumgrid.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix bpf_perf_event_read() loop upper bound</title>
<updated>2015-08-12T23:42:50Z</updated>
<author>
<name>Wei-Chun Chao</name>
<email>weichunc@plumgrid.com</email>
</author>
<published>2015-08-12T14:57:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=140d8b335a9beb234fd0ed9a15aa6a47f47fd771'/>
<id>urn:sha1:140d8b335a9beb234fd0ed9a15aa6a47f47fd771</id>
<content type='text'>
Verifier rejects programs incorrectly.

Fixes: 35578d798400 ("bpf: Implement function bpf_perf_event_read()")
Cc: Kaixu Xia &lt;xiakaixu@huawei.com&gt;
Cc: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: Wei-Chun Chao &lt;weichunc@plumgrid.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: Implement function bpf_perf_event_read() that get the selected hardware PMU conuter</title>
<updated>2015-08-10T05:50:06Z</updated>
<author>
<name>Kaixu Xia</name>
<email>xiakaixu@huawei.com</email>
</author>
<published>2015-08-06T07:02:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35578d7984003097af2b1e34502bc943d40c1804'/>
<id>urn:sha1:35578d7984003097af2b1e34502bc943d40c1804</id>
<content type='text'>
According to the perf_event_map_fd and index, the function
bpf_perf_event_read() can convert the corresponding map
value to the pointer to struct perf_event and return the
Hardware PMU counter value.

Signed-off-by: Kaixu Xia &lt;xiakaixu@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ebpf: Allow dereferences of PTR_TO_STACK registers</title>
<updated>2015-07-27T07:54:10Z</updated>
<author>
<name>Alex Gartrell</name>
<email>agartrell@fb.com</email>
</author>
<published>2015-07-23T21:24:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=24b4d2abd0bd628f396dada3e915d395cbf459eb'/>
<id>urn:sha1:24b4d2abd0bd628f396dada3e915d395cbf459eb</id>
<content type='text'>
mov %rsp, %r1           ; r1 = rsp
        add $-8, %r1            ; r1 = rsp - 8
        store_q $123, -8(%rsp)  ; *(u64*)r1 = 123  &lt;- valid
        store_q $123, (%r1)     ; *(u64*)r1 = 123  &lt;- previously invalid
        mov $0, %r0
        exit                    ; Always need to exit

And we'd get the following error:

	0: (bf) r1 = r10
	1: (07) r1 += -8
	2: (7a) *(u64 *)(r10 -8) = 999
	3: (7a) *(u64 *)(r1 +0) = 999
	R1 invalid mem access 'fp'

	Unable to load program

We already know that a register is a stack address and the appropriate
offset, so we should be able to validate those references as well.

Signed-off-by: Alex Gartrell &lt;agartrell@fb.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: allow programs to write to certain skb fields</title>
<updated>2015-06-07T09:01:33Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-06-04T17:11:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d691f9e8d4405c334aa10d556e73c8bf44cb0e01'/>
<id>urn:sha1:d691f9e8d4405c334aa10d556e73c8bf44cb0e01</id>
<content type='text'>
allow programs read/write skb-&gt;mark, tc_index fields and
((struct qdisc_skb_cb *)cb)-&gt;data.

mark and tc_index are generically useful in TC.
cb[0]-cb[4] are primarily used to pass arguments from one
program to another called via bpf_tail_call() which can
be seen in sockex3_kern.c example.

All fields of 'struct __sk_buff' are readable to socket and tc_cls_act progs.
mark, tc_index are writeable from tc_cls_act only.
cb[0]-cb[4] are writeable by both sockets and tc_cls_act.

Add verifier tests and improve sample code.

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: allow bpf programs to tail-call other bpf programs</title>
<updated>2015-05-21T21:07:59Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-05-19T23:59:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=04fd61ab36ec065e194ab5e74ae34a5240d992bb'/>
<id>urn:sha1:04fd61ab36ec065e194ab5e74ae34a5240d992bb</id>
<content type='text'>
introduce bpf_tail_call(ctx, &amp;jmp_table, index) helper function
which can be used from BPF programs like:
int bpf_prog(struct pt_regs *ctx)
{
  ...
  bpf_tail_call(ctx, &amp;jmp_table, index);
  ...
}
that is roughly equivalent to:
int bpf_prog(struct pt_regs *ctx)
{
  ...
  if (jmp_table[index])
    return (*jmp_table[index])(ctx);
  ...
}
The important detail that it's not a normal call, but a tail call.
The kernel stack is precious, so this helper reuses the current
stack frame and jumps into another BPF program without adding
extra call frame.
It's trivially done in interpreter and a bit trickier in JITs.
In case of x64 JIT the bigger part of generated assembler prologue
is common for all programs, so it is simply skipped while jumping.
Other JITs can do similar prologue-skipping optimization or
do stack unwind before jumping into the next program.

bpf_tail_call() arguments:
ctx - context pointer
jmp_table - one of BPF_MAP_TYPE_PROG_ARRAY maps used as the jump table
index - index in the jump table

Since all BPF programs are idenitified by file descriptor, user space
need to populate the jmp_table with FDs of other BPF programs.
If jmp_table[index] is empty the bpf_tail_call() doesn't jump anywhere
and program execution continues as normal.

New BPF_MAP_TYPE_PROG_ARRAY map type is introduced so that user space can
populate this jmp_table array with FDs of other bpf programs.
Programs can share the same jmp_table array or use multiple jmp_tables.

The chain of tail calls can form unpredictable dynamic loops therefore
tail_call_cnt is used to limit the number of calls and currently is set to 32.

Use cases:
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;

==========
- simplify complex programs by splitting them into a sequence of small programs

- dispatch routine
  For tracing and future seccomp the program may be triggered on all system
  calls, but processing of syscall arguments will be different. It's more
  efficient to implement them as:
  int syscall_entry(struct seccomp_data *ctx)
  {
     bpf_tail_call(ctx, &amp;syscall_jmp_table, ctx-&gt;nr /* syscall number */);
     ... default: process unknown syscall ...
  }
  int sys_write_event(struct seccomp_data *ctx) {...}
  int sys_read_event(struct seccomp_data *ctx) {...}
  syscall_jmp_table[__NR_write] = sys_write_event;
  syscall_jmp_table[__NR_read] = sys_read_event;

  For networking the program may call into different parsers depending on
  packet format, like:
  int packet_parser(struct __sk_buff *skb)
  {
     ... parse L2, L3 here ...
     __u8 ipproto = load_byte(skb, ... offsetof(struct iphdr, protocol));
     bpf_tail_call(skb, &amp;ipproto_jmp_table, ipproto);
     ... default: process unknown protocol ...
  }
  int parse_tcp(struct __sk_buff *skb) {...}
  int parse_udp(struct __sk_buff *skb) {...}
  ipproto_jmp_table[IPPROTO_TCP] = parse_tcp;
  ipproto_jmp_table[IPPROTO_UDP] = parse_udp;

- for TC use case, bpf_tail_call() allows to implement reclassify-like logic

- bpf_map_update_elem/delete calls into BPF_MAP_TYPE_PROG_ARRAY jump table
  are atomic, so user space can build chains of BPF programs on the fly

Implementation details:
=======================
- high performance of bpf_tail_call() is the goal.
  It could have been implemented without JIT changes as a wrapper on top of
  BPF_PROG_RUN() macro, but with two downsides:
  . all programs would have to pay performance penalty for this feature and
    tail call itself would be slower, since mandatory stack unwind, return,
    stack allocate would be done for every tailcall.
  . tailcall would be limited to programs running preempt_disabled, since
    generic 'void *ctx' doesn't have room for 'tail_call_cnt' and it would
    need to be either global per_cpu variable accessed by helper and by wrapper
    or global variable protected by locks.

  In this implementation x64 JIT bypasses stack unwind and jumps into the
  callee program after prologue.

- bpf_prog_array_compatible() ensures that prog_type of callee and caller
  are the same and JITed/non-JITed flag is the same, since calling JITed
  program from non-JITed is invalid, since stack frames are different.
  Similarly calling kprobe type program from socket type program is invalid.

- jump table is implemented as BPF_MAP_TYPE_PROG_ARRAY to reuse 'map'
  abstraction, its user space API and all of verifier logic.
  It's in the existing arraymap.c file, since several functions are
  shared with regular array map.

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix two bugs in verification logic when accessing 'ctx' pointer</title>
<updated>2015-04-16T18:08:49Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-04-15T23:19:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=725f9dcd58dedfea49ef958babf6c0bf6b7594a9'/>
<id>urn:sha1:725f9dcd58dedfea49ef958babf6c0bf6b7594a9</id>
<content type='text'>
1.
first bug is a silly mistake. It broke tracing examples and prevented
simple bpf programs from loading.

In the following code:
if (insn-&gt;imm == 0 &amp;&amp; BPF_SIZE(insn-&gt;code) == BPF_W) {
} else if (...) {
  // this part should have been executed when
  // insn-&gt;code == BPF_W and insn-&gt;imm != 0
}

Obviously it's not doing that. So simple instructions like:
r2 = *(u64 *)(r1 + 8)
will be rejected. Note the comments in the code around these branches
were and still valid and indicate the true intent.

Replace it with:
if (BPF_SIZE(insn-&gt;code) != BPF_W)
  continue;

if (insn-&gt;imm == 0) {
} else if (...) {
  // now this code will be executed when
  // insn-&gt;code == BPF_W and insn-&gt;imm != 0
}

2.
second bug is more subtle.
If malicious code is using the same dest register as source register,
the checks designed to prevent the same instruction to be used with different
pointer types will fail to trigger, since we were assigning src_reg_type
when it was already overwritten by check_mem_access().
The fix is trivial. Just move line:
src_reg_type = regs[insn-&gt;src_reg].type;
before check_mem_access().
Add new 'access skb fields bad4' test to check this case.

Fixes: 9bac3d6d548e ("bpf: allow extended BPF programs access skb fields")
Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix verifier memory corruption</title>
<updated>2015-04-16T16:06:11Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-04-14T22:57:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c3de6317d748e23b9e46ba36e10483728d00d144'/>
<id>urn:sha1:c3de6317d748e23b9e46ba36e10483728d00d144</id>
<content type='text'>
Due to missing bounds check the DAG pass of the BPF verifier can corrupt
the memory which can cause random crashes during program loading:

[8.449451] BUG: unable to handle kernel paging request at ffffffffffffffff
[8.451293] IP: [&lt;ffffffff811de33d&gt;] kmem_cache_alloc_trace+0x8d/0x2f0
[8.452329] Oops: 0000 [#1] SMP
[8.452329] Call Trace:
[8.452329]  [&lt;ffffffff8116cc82&gt;] bpf_check+0x852/0x2000
[8.452329]  [&lt;ffffffff8116b7e4&gt;] bpf_prog_load+0x1e4/0x310
[8.452329]  [&lt;ffffffff811b190f&gt;] ? might_fault+0x5f/0xb0
[8.452329]  [&lt;ffffffff8116c206&gt;] SyS_bpf+0x806/0xa30

Fixes: f1bca824dabb ("bpf: add search pruning optimization to verifier")
Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tc: bpf: generalize pedit action</title>
<updated>2015-03-29T20:26:54Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2015-03-27T02:53:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=608cd71a9c7c9db76e78a792c5a4101e12fea43f'/>
<id>urn:sha1:608cd71a9c7c9db76e78a792c5a4101e12fea43f</id>
<content type='text'>
existing TC action 'pedit' can munge any bits of the packet.
Generalize it for use in bpf programs attached as cls_bpf and act_bpf via
bpf_skb_store_bytes() helper function.

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Reviewed-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ebpf: add sched_act_type and map it to sk_filter's verifier ops</title>
<updated>2015-03-20T23:10:44Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2015-03-20T14:11:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94caee8c312d96522bcdae88791aaa9ebcd5f22c'/>
<id>urn:sha1:94caee8c312d96522bcdae88791aaa9ebcd5f22c</id>
<content type='text'>
In order to prepare eBPF support for tc action, we need to add
sched_act_type, so that the eBPF verifier is aware of what helper
function act_bpf may use, that it can load skb data and read out
currently available skb fields.

This is bascially analogous to 96be4325f443 ("ebpf: add sched_cls_type
and map it to sk_filter's verifier ops").

BPF_PROG_TYPE_SCHED_CLS and BPF_PROG_TYPE_SCHED_ACT need to be
separate since both will have a different set of functionality in
future (classifier vs action), thus we won't run into ABI troubles
when the point in time comes to diverge functionality from the
classifier.

The future plan for act_bpf would be that it will be able to write
into skb-&gt;data and alter selected fields mirrored in struct __sk_buff.

For an initial support, it's sufficient to map it to sk_filter_ops.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Jiri Pirko &lt;jiri@resnulli.us&gt;
Reviewed-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Acked-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
