<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/verifier.c, branch v4.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2017-10-21T23:56:09Z</updated>
<entry>
<title>bpf: fix pattern matches for direct packet access</title>
<updated>2017-10-21T23:56:09Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-10-21T00:34:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0fd4759c5515b7f2297d7fee5c45e5d9dd733001'/>
<id>urn:sha1:0fd4759c5515b7f2297d7fee5c45e5d9dd733001</id>
<content type='text'>
Alexander had a test program with direct packet access, where
the access test was in the form of data + X &gt; data_end. In an
unrelated change to the program LLVM decided to swap the branches
and emitted code for the test in form of data + X &lt;= data_end.
We hadn't seen these being generated previously, thus verifier
would reject the program. Therefore, fix up the verifier to
detect all test cases, so we don't run into such issues in the
future.

Fixes: b4e432f1000a ("bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier")
Reported-by: Alexander Alemayhu &lt;alexander@alemayhu.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix off by one for range markings with L{T, E} patterns</title>
<updated>2017-10-21T23:56:09Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-10-21T00:34:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fb2a311a31d3457fe8c3ee16f5609877e2ead9f7'/>
<id>urn:sha1:fb2a311a31d3457fe8c3ee16f5609877e2ead9f7</id>
<content type='text'>
During review I noticed that the current logic for direct packet
access marking in check_cond_jmp_op() has an off by one for the
upper right range border when marking in find_good_pkt_pointers()
with BPF_JLT and BPF_JLE. It's not really harmful given access
up to pkt_end is always safe, but we should nevertheless correct
the range marking before it becomes ABI. If pkt_data' denotes a
pkt_data derived pointer (pkt_data + X), then for pkt_data' &lt; pkt_end
in the true branch as well as for pkt_end &lt;= pkt_data' in the false
branch we mark the range with X although it should really be X - 1
in these cases. For example, X could be pkt_end - pkt_data, then
when testing for pkt_data' &lt; pkt_end the verifier simulation cannot
deduce that a byte load of pkt_data' - 1 would succeed in this
branch.

Fixes: b4e432f1000a ("bpf: enable BPF_J{LT, LE, SLT, SLE} opcodes in verifier")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: disallow arithmetic operations on context pointer</title>
<updated>2017-10-18T12:21:13Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>jakub.kicinski@netronome.com</email>
</author>
<published>2017-10-16T18:16:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=28e33f9d78eefe98ea86673ab31e988b37a9a738'/>
<id>urn:sha1:28e33f9d78eefe98ea86673ab31e988b37a9a738</id>
<content type='text'>
Commit f1174f77b50c ("bpf/verifier: rework value tracking")
removed the crafty selection of which pointer types are
allowed to be modified.  This is OK for most pointer types
since adjust_ptr_min_max_vals() will catch operations on
immutable pointers.  One exception is PTR_TO_CTX which is
now allowed to be offseted freely.

The intent of aforementioned commit was to allow context
access via modified registers.  The offset passed to
-&gt;is_valid_access() verifier callback has been adjusted
by the value of the variable offset.

What is missing, however, is taking the variable offset
into account when the context register is used.  Or in terms
of the code adding the offset to the value passed to the
-&gt;convert_ctx_access() callback.  This leads to the following
eBPF user code:

     r1 += 68
     r0 = *(u32 *)(r1 + 8)
     exit

being translated to this in kernel space:

   0: (07) r1 += 68
   1: (61) r0 = *(u32 *)(r1 +180)
   2: (95) exit

Offset 8 is corresponding to 180 in the kernel, but offset
76 is valid too.  Verifier will "accept" access to offset
68+8=76 but then "convert" access to offset 8 as 180.
Effective access to offset 248 is beyond the kernel context.
(This is a __sk_buff example on a debug-heavy kernel -
packet mark is 8 -&gt; 180, 76 would be data.)

Dereferencing the modified context pointer is not as easy
as dereferencing other types, because we have to translate
the access to reading a field in kernel structures which is
usually at a different offset and often of a different size.
To allow modifying the pointer we would have to make sure
that given eBPF instruction will always access the same
field or the fields accessed are "compatible" in terms of
offset and size...

Disallow dereferencing modified context pointers and add
to selftests the test case described here.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix liveness marking</title>
<updated>2017-10-07T22:25:17Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@fb.com</email>
</author>
<published>2017-10-05T23:20:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8fe2d6ccd52b086268f2f36e5e2fc0fe3aeffa80'/>
<id>urn:sha1:8fe2d6ccd52b086268f2f36e5e2fc0fe3aeffa80</id>
<content type='text'>
while processing Rx = Ry instruction the verifier does
regs[insn-&gt;dst_reg] = regs[insn-&gt;src_reg]
which often clears write mark (when Ry doesn't have it)
that was just set by check_reg_arg(Rx) prior to the assignment.
That causes mark_reg_read() to keep marking Rx in this block as
REG_LIVE_READ (since the logic incorrectly misses that it's
screened by the write) and in many of its parents (until lucky
write into the same Rx or beginning of the program).
That causes is_state_visited() logic to miss many pruning opportunities.

Furthermore mark_reg_read() logic propagates the read mark
for BPF_REG_FP as well (though it's readonly) which causes
harmless but unnecssary work during is_state_visited().
Note that do_propagate_liveness() skips FP correctly,
so do the same in mark_reg_read() as well.
It saves 0.2 seconds for the test below

program               before  after
bpf_lb-DLB_L3.o       2604    2304
bpf_lb-DLB_L4.o       11159   3723
bpf_lb-DUNKNOWN.o     1116    1110
bpf_lxc-DDROP_ALL.o   34566   28004
bpf_lxc-DUNKNOWN.o    53267   39026
bpf_netdev.o          17843   16943
bpf_overlay.o         8672    7929
time                  ~11 sec  ~4 sec

Fixes: dc503a8ad984 ("bpf/verifier: track liveness for pruning")
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: fix ri-&gt;map_owner pointer on bpf_prog_realloc</title>
<updated>2017-09-19T23:38:53Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-09-19T22:44:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7c30013133964aaa2f45c17d6e9782ac6cfd7f5f'/>
<id>urn:sha1:7c30013133964aaa2f45c17d6e9782ac6cfd7f5f</id>
<content type='text'>
Commit 109980b894e9 ("bpf: don't select potentially stale
ri-&gt;map from buggy xdp progs") passed the pointer to the prog
itself to be loaded into r4 prior on bpf_redirect_map() helper
call, so that we can store the owner into ri-&gt;map_owner out of
the helper.

Issue with that is that the actual address of the prog is still
subject to change when subsequent rewrites occur that require
slow path in bpf_prog_realloc() to alloc more memory, e.g. from
patching inlining helper functions or constant blinding. Thus,
we really need to take prog-&gt;aux as the address we're holding,
which also works with prog clones as they share the same aux
object.

Instead of then fetching aux-&gt;prog during runtime, which could
potentially incur cache misses due to false sharing, we are
going to just use aux for comparison on the map owner. This
will also keep the patchlet of the same size, and later check
in xdp_map_invalid() only accesses read-only aux pointer from
the prog, it's also in the same cacheline already from prior
access when calling bpf_func.

Fixes: 109980b894e9 ("bpf: don't select potentially stale ri-&gt;map from buggy xdp progs")
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf/verifier: reject BPF_ALU64|BPF_END</title>
<updated>2017-09-15T22:01:32Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2017-09-15T13:37:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e67b8a685c7c984e834e3181ef4619cd7025a136'/>
<id>urn:sha1:e67b8a685c7c984e834e3181ef4619cd7025a136</id>
<content type='text'>
Neither ___bpf_prog_run nor the JITs accept it.
Also adds a new test case.

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf: don't select potentially stale ri-&gt;map from buggy xdp progs</title>
<updated>2017-09-09T03:58:09Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-09-07T22:14:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=109980b894e9dae66c37c3d804a415aa68b19c7e'/>
<id>urn:sha1:109980b894e9dae66c37c3d804a415aa68b19c7e</id>
<content type='text'>
We can potentially run into a couple of issues with the XDP
bpf_redirect_map() helper. The ri-&gt;map in the per CPU storage
can become stale in several ways, mostly due to misuse, where
we can then trigger a use after free on the map:

i) prog A is calling bpf_redirect_map(), returning XDP_REDIRECT
and running on a driver not supporting XDP_REDIRECT yet. The
ri-&gt;map on that CPU becomes stale when the XDP program is unloaded
on the driver, and a prog B loaded on a different driver which
supports XDP_REDIRECT return code. prog B would have to omit
calling to bpf_redirect_map() and just return XDP_REDIRECT, which
would then access the freed map in xdp_do_redirect() since not
cleared for that CPU.

ii) prog A is calling bpf_redirect_map(), returning a code other
than XDP_REDIRECT. prog A is then detached, which triggers release
of the map. prog B is attached which, similarly as in i), would
just return XDP_REDIRECT without having called bpf_redirect_map()
and thus be accessing the freed map in xdp_do_redirect() since
not cleared for that CPU.

iii) prog A is attached to generic XDP, calling the bpf_redirect_map()
helper and returning XDP_REDIRECT. xdp_do_generic_redirect() is
currently not handling ri-&gt;map (will be fixed by Jesper), so it's
not being reset. Later loading a e.g. native prog B which would,
say, call bpf_xdp_redirect() and then returns XDP_REDIRECT would
find in xdp_do_redirect() that a map was set and uses that causing
use after free on map access.

Fix thus needs to avoid accessing stale ri-&gt;map pointers, naive
way would be to call a BPF function from drivers that just resets
it to NULL for all XDP return codes but XDP_REDIRECT and including
XDP_REDIRECT for drivers not supporting it yet (and let ri-&gt;map
being handled in xdp_do_generic_redirect()). There is a less
intrusive way w/o letting drivers call a reset for each BPF run.

The verifier knows we're calling into bpf_xdp_redirect_map()
helper, so it can do a small insn rewrite transparent to the prog
itself in the sense that it fills R4 with a pointer to the own
bpf_prog. We have that pointer at verification time anyway and
R4 is allowed to be used as per calling convention we scratch
R0 to R5 anyway, so they become inaccessible and program cannot
read them prior to a write. Then, the helper would store the prog
pointer in the current CPUs struct redirect_info. Later in
xdp_do_*_redirect() we check whether the redirect_info's prog
pointer is the same as passed xdp_prog pointer, and if that's
the case then all good, since the prog holds a ref on the map
anyway, so it is always valid at that point in time and must
have a reference count of at least 1. If in the unlikely case
they are not equal, it means we got a stale pointer, so we clear
and bail out right there. Also do reset map and the owning prog
in bpf_xdp_redirect(), so that bpf_xdp_redirect_map() and
bpf_xdp_redirect() won't get mixed up, only the last call should
take precedence. A tc bpf_redirect() doesn't use map anywhere
yet, so no need to clear it there since never accessed in that
layer.

Note that in case the prog is released, and thus the map as
well we're still under RCU read critical section at that time
and have preemption disabled as well. Once we commit with the
__dev_map_insert_ctx() from xdp_do_redirect_map() and set the
map to ri-&gt;map_to_flush, we still wait for a xdp_do_flush_map()
to finish in devmap dismantle time once flush_needed bit is set,
so that is fine.

Fixes: 97f91a7cf04f ("bpf: add bpf_redirect_map helper routine")
Reported-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf/verifier: document liveness analysis</title>
<updated>2017-08-24T05:38:08Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2017-08-23T14:11:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8e9cd9ce90d48369b2c5ddd79fe3d4a4cb1ccb56'/>
<id>urn:sha1:8e9cd9ce90d48369b2c5ddd79fe3d4a4cb1ccb56</id>
<content type='text'>
The liveness tracking algorithm is quite subtle; add comments to explain it.

Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf/verifier: remove varlen_map_value_access flag</title>
<updated>2017-08-24T05:38:08Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2017-08-23T14:10:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1b688a19a92223cf2d1892c9d05d64dc397b33e3'/>
<id>urn:sha1:1b688a19a92223cf2d1892c9d05d64dc397b33e3</id>
<content type='text'>
The optimisation it does is broken when the 'new' register value has a
 variable offset and the 'old' was constant.  I broke it with my pointer
 types unification (see Fixes tag below), before which the 'new' value
 would have type PTR_TO_MAP_VALUE_ADJ and would thus not compare equal;
 other changes in that patch mean that its original behaviour (ignore
 min/max values) cannot be restored.
Tests on a sample set of cilium programs show no change in count of
 processed instructions.

Fixes: f1174f77b50c ("bpf/verifier: rework value tracking")
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>bpf/verifier: when pruning a branch, ignore its write marks</title>
<updated>2017-08-24T05:38:07Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2017-08-23T14:10:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=63f45f840634ab5fd71bbc07acff915277764068'/>
<id>urn:sha1:63f45f840634ab5fd71bbc07acff915277764068</id>
<content type='text'>
The fact that writes occurred in reaching the continuation state does
 not screen off its reads from us, because we're not really its parent.
So detect 'not really the parent' in do_propagate_liveness, and ignore
 write marks in that case.

Fixes: dc503a8ad984 ("bpf/verifier: track liveness for pruning")
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
