<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/bpf/verifier.c, branch v5.10</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.10</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.10'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-12-10T21:02:53Z</updated>
<entry>
<title>bpf: Fix propagation of 32-bit signed bounds from 64-bit bounds.</title>
<updated>2020-12-10T21:02:53Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2020-12-08T18:01:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b02709587ea3d699a608568ee8157d8db4fd8cae'/>
<id>urn:sha1:b02709587ea3d699a608568ee8157d8db4fd8cae</id>
<content type='text'>
The 64-bit signed bounds should not affect 32-bit signed bounds unless the
verifier knows that upper 32-bits are either all 1s or all 0s. For example the
register with smin_value==1 doesn't mean that s32_min_value is also equal to 1,
since smax_value could be larger than 32-bit subregister can hold.
The verifier refines the smax/s32_max return value from certain helpers in
do_refine_retval_range(). Teach the verifier to recognize that smin/s32_min
value is also bounded. When both smin and smax bounds fit into 32-bit
subregister the verifier can propagate those bounds.
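
A minimal Python sketch of the rule above, using plain integers to model
register values. The names low32_signed() and s32_bounds_from_s64() are
hypothetical and do not exist in verifier.c; this illustrates the idea only
and is not the kernel implementation.

```python
S32_MIN = -(2**31)
S32_MAX = 2**31 - 1


def low32_signed(value):
    """Interpret the low 32 bits of a 64-bit value as a signed 32-bit integer."""
    return (value + 2**31) % 2**32 - 2**31


def s32_bounds_from_s64(smin, smax):
    """Derive signed 32-bit bounds from signed 64-bit bounds (hypothetical helper).

    The 64-bit bounds carry over only when both ends fit in 32 bits;
    otherwise the subregister may hold any value.
    """
    if S32_MAX >= smax >= smin >= S32_MIN:
        return smin, smax
    return S32_MIN, S32_MAX
```

For instance, a register holding 2**32 satisfies smin_value == 1, yet
low32_signed(2**32) is 0, so claiming s32_min_value == 1 would be wrong.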

Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking")
Reported-by: Jean-Philippe Brucker &lt;jean-philippe@linaro.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Relax return code check for subprograms</title>
<updated>2020-11-14T16:17:27Z</updated>
<author>
<name>Dmitrii Banshchikov</name>
<email>me@ubique.spb.ru</email>
</author>
<published>2020-11-13T17:17:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f782e2c300a717233b64697affda3ea7aac00b2b'/>
<id>urn:sha1:f782e2c300a717233b64697affda3ea7aac00b2b</id>
<content type='text'>
Currently the verifier enforces return code checks for subprograms in the
same manner as it does for program entry points. This prevents returning
arbitrary scalar values from subprograms. Scalar type of returned values
is checked by btf_prepare_func_args() and hence it should be safe to
allow only scalars for now. Relax return code checks for subprograms and
allow any correct scalar values.

Fixes: 51c39bb1d5d10 (bpf: Introduce function-by-function verification)
Signed-off-by: Dmitrii Banshchikov &lt;me@ubique.spb.ru&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20201113171756.90594-1-me@ubique.spb.ru
</content>
</entry>
<entry>
<title>bpf: Fix unsigned 'datasec_id' compared with zero in check_pseudo_btf_id</title>
<updated>2020-11-11T09:50:22Z</updated>
<author>
<name>Kaixu Xia</name>
<email>kaixuxia@tencent.com</email>
</author>
<published>2020-11-11T05:03:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f16e631333a8f12ae8128826e695db4b2a528407'/>
<id>urn:sha1:f16e631333a8f12ae8128826e695db4b2a528407</id>
<content type='text'>
The unsigned variable datasec_id is assigned a return value from the call
to find_btf_percpu_datasec(), which may return a negative error code.

This fixes the following coccicheck warning:

  ./kernel/bpf/verifier.c:9616:5-15: WARNING: Unsigned expression compared with zero: datasec_id &gt; 0
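
The effect of the signedness mismatch can be modelled in Python. Here
as_u32() is a hypothetical stand-in for storing a C int in a u32 variable,
and -2 is only an assumed example of a negative error code.

```python
def as_u32(value):
    """Model the wrap-around of storing a C int in a u32 (hypothetical helper)."""
    return value % 2**32


err = -2                    # assumed example of a negative error code
datasec_id = as_u32(err)    # what an unsigned variable would end up holding

# The error is no longer visible: the stored value is a huge positive number,
# so a "datasec_id > 0" check passes even though the lookup failed.
wrongly_accepted = datasec_id > 0
```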

Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Reported-by: Tosk Robot &lt;tencent_os_robot@tencent.com&gt;
Signed-off-by: Kaixu Xia &lt;kaixuxia@tencent.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Cc: Hao Luo &lt;haoluo@google.com&gt;
Link: https://lore.kernel.org/bpf/1605071026-25906-1-git-send-email-kaixuxia@tencent.com
</content>
</entry>
<entry>
<title>bpf: Enforce id generation for all may-be-null register type</title>
<updated>2020-10-19T22:57:42Z</updated>
<author>
<name>Martin KaFai Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2020-10-19T19:42:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=93c230e3f5bd6e1d2b2759d582fdfe9c2731473b'/>
<id>urn:sha1:93c230e3f5bd6e1d2b2759d582fdfe9c2731473b</id>
<content type='text'>
The commit af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper")
introduces RET_PTR_TO_BTF_ID_OR_NULL and
the commit eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
introduces RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL.
Note that for RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, the reg0-&gt;type
could become PTR_TO_MEM_OR_NULL which is not covered by
BPF_PROBE_MEM.

The BPF_REG_0 will then hold a _OR_NULL pointer type. This _OR_NULL
pointer type requires the bpf program to explicitly do a NULL check first.
After NULL check, the verifier will mark all registers having
the same reg-&gt;id as safe to use.  However, the reg-&gt;id
is not set for those new _OR_NULL return types.  One of the ways
that may be wrong is, checking NULL for one btf_id typed pointer will
end up validating all other btf_id typed pointers because
all of them have id == 0.  The later tests will exercise
this path.

To fix it and also avoid similar issues in the future, this patch
moves the id generation logic out of each individual RET type
test in check_helper_call().  Instead, it does one
reg_type_may_be_null() test and then does the id generation
if needed.

This patch also adds a WARN_ON_ONCE in mark_ptr_or_null_reg()
to catch future breakage.
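
A small Python model of the problem and of the fix. The Reg class and
mark_ptr_or_null_regs() are hypothetical simplifications, not the kernel
data structures; they only show why a shared id == 0 is dangerous.

```python
import itertools


class Reg:
    """A tiny stand-in for the verifier's register state (hypothetical)."""

    def __init__(self, may_be_null, reg_id=0):
        self.may_be_null = may_be_null
        self.id = reg_id


def mark_ptr_or_null_regs(regs, checked_id):
    """After a NULL check, clear may_be_null on every register sharing the id."""
    for reg in regs:
        if reg.may_be_null and reg.id == checked_id:
            reg.may_be_null = False


# Without id generation both helper results keep id == 0, so a NULL check
# on the first pointer also "validates" the unrelated second one.
a, b = Reg(True), Reg(True)
mark_ptr_or_null_regs([a, b], a.id)
buggy_b_validated = not b.may_be_null

# With a fresh id per may-be-null return value only the checked pointer
# is marked safe.
next_id = itertools.count(1)
c, d = Reg(True, next(next_id)), Reg(True, next(next_id))
mark_ptr_or_null_regs([c, d], c.id)
```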

The _OR_NULL pointer usage in the bpf_iter_reg.ctx_arg_info is
fine because it just happens that the existing id generation after
check_ctx_access() has covered it.  It is also using the
reg_type_may_be_null() to decide if id generation is needed or not.

Fixes: af7ec1383361 ("bpf: Add bpf_skc_to_tcp6_sock() helper")
Fixes: eaa6bcb71ef6 ("bpf: Introduce bpf_per_cpu_ptr()")
Signed-off-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20201019194212.1050855-1-kafai@fb.com
</content>
</entry>
<entry>
<title>bpf: Fix register equivalence tracking.</title>
<updated>2020-10-15T14:05:31Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2020-10-14T17:56:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e688c3db7ca69bea1872c5706aec6a7fdf89df17'/>
<id>urn:sha1:e688c3db7ca69bea1872c5706aec6a7fdf89df17</id>
<content type='text'>
The 64-bit JEQ/JNE handling in reg_set_min_max() was clearing reg-&gt;id in either
true or false branch. In the case 'if (reg-&gt;id)' check was done on the other
branch the counterpart register would have reg-&gt;id == 0 when called into
find_equal_scalars(). In such case the helper would incorrectly identify other
registers with id == 0 as equivalent and propagate the state incorrectly.
Fix it by preserving ID across reg_set_min_max().

In other words any kind of comparison operator on the scalar register
should preserve its ID to recognize:

r1 = r2
if (r1 == 20) {
  #1 here both r1 and r2 == 20
} else if (r2 &lt; 20) {
  #2 here both r1 and r2 &lt; 20
}

The patch is addressing #1 case. The #2 was working correctly already.
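
The #1 case can be sketched in Python as follows. Scalar, branch_eq() and
the preserve_id switch are hypothetical names that only model the old and
the new behaviour; they are not taken from the kernel sources.

```python
class Scalar:
    """Simplified scalar register: an equivalence id and a known constant."""

    def __init__(self, reg_id=0):
        self.id = reg_id
        self.const = None


def branch_eq(regs, name, value, preserve_id):
    """Model the true branch of 'if (reg == value)' (hypothetical helper).

    preserve_id=False mimics the old behaviour that cleared the id before
    the state was propagated to registers with a matching id.
    """
    reg = regs[name]
    reg.const = value
    if not preserve_id:
        reg.id = 0
    for other in regs.values():
        if other.id == reg.id:
            other.const = value


def fresh_state():
    # r1 = r2 gave both registers id 1; r3 is unrelated and has id 0.
    return {"r1": Scalar(1), "r2": Scalar(1), "r3": Scalar(0)}


old = fresh_state()
branch_eq(old, "r1", 20, preserve_id=False)

new = fresh_state()
branch_eq(new, "r1", 20, preserve_id=True)
```

In the old model r2 learns nothing while the unrelated r3 is wrongly treated
as 20; with the id preserved r2 becomes 20 and r3 stays unknown.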

Fixes: 75748837b7e5 ("bpf: Propagate scalar ranges through register assignments.")
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Tested-by: Yonghong Song &lt;yhs@fb.com&gt;
Link: https://lore.kernel.org/bpf/20201014175608.1416-1-alexei.starovoitov@gmail.com
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2020-10-12T23:16:50Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2020-10-12T23:16:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ccdf7fae3afaeaf0e5dd03311b86ffa56adf85ae'/>
<id>urn:sha1:ccdf7fae3afaeaf0e5dd03311b86ffa56adf85ae</id>
<content type='text'>
Alexei Starovoitov says:

====================
pull-request: bpf-next 2020-10-12

The main changes are:

1) The BPF verifier improvements to track register allocation pattern, from Alexei and Yonghong.

2) libbpf relocation support for different size load/store, from Andrii.

3) bpf_redirect_peer() helper and support for inner map array with different max_entries, from Daniel.

4) BPF support for per-cpu variables, from Hao.

5) sockmap improvements, from John.
====================

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>bpf: Allow for map-in-map with dynamic inner array map entries</title>
<updated>2020-10-11T17:21:04Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2020-10-10T23:40:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4a8f87e60f6db40e640f1db555d063b2c4dea5f1'/>
<id>urn:sha1:4a8f87e60f6db40e640f1db555d063b2c4dea5f1</id>
<content type='text'>
Recent work in f4d05259213f ("bpf: Add map_meta_equal map ops") and 134fede4eecf
("bpf: Relax max_entries check for most of the inner map types") added support
for dynamic inner max elements for most map-in-map types. Exceptions were maps
like array or prog array where the map_gen_lookup() callback uses the maps'
max_entries field as a constant when emitting instructions.

We recently implemented Maglev consistent hashing into Cilium's load balancer
which uses map-in-map with an outer map being hash and inner being array holding
the Maglev backend table for each service. This has been designed this way in
order to reduce overall memory consumption given the outer hash map allows us to
avoid preallocating a large, flat memory area for all services. Also, the
number of service mappings is not always known a-priori.

The use case for dynamic inner array map entries is to further reduce memory
overhead, for example, some services might just have a small number of back
ends while others could have a large number. Right now the Maglev backend table
for small and large number of backends would need to have the same inner array
map entries which adds a lot of unneeded overhead.

Dynamic inner array map entries can be realized by avoiding the inlined code
generation for their lookup. The lookup will still be efficient since it will
be calling into array_map_lookup_elem() directly and thus avoiding retpoline.
The patch adds a BPF_F_INNER_MAP flag to map creation which therefore skips
inline code generation and relaxes array_map_meta_equal() check to ignore both
maps' max_entries. This also still allows faster lookups for map-in-map
when BPF_F_INNER_MAP is not specified and hence dynamic max_entries not needed.
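
A rough Python sketch of the relaxed comparison described above. MapMeta and
array_meta_equal() are hypothetical; the boolean inner_map_flag stands in for
BPF_F_INNER_MAP, and the set of compared fields is an assumption made for
illustration rather than a copy of the kernel code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MapMeta:
    """Subset of map attributes relevant for the comparison (hypothetical)."""
    map_type: str
    key_size: int
    value_size: int
    max_entries: int
    inner_map_flag: bool = False   # stands in for BPF_F_INNER_MAP


def array_meta_equal(template, candidate):
    """Decide whether candidate may replace template as an inner array map."""
    same_layout = (
        template.map_type == candidate.map_type
        and template.key_size == candidate.key_size
        and template.value_size == candidate.value_size
        and template.inner_map_flag == candidate.inner_map_flag
    )
    if not same_layout:
        return False
    # Inlined lookup code bakes max_entries in as a constant, so the sizes
    # must match unless the flag disabled inlining.
    return template.inner_map_flag or template.max_entries == candidate.max_entries
```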

Example code generation where inner map is dynamic sized array:

  # bpftool p d x i 125
  int handle__sys_enter(void * ctx):
  ; int handle__sys_enter(void *ctx)
     0: (b4) w1 = 0
  ; int key = 0;
     1: (63) *(u32 *)(r10 -4) = r1
     2: (bf) r2 = r10
  ;
     3: (07) r2 += -4
  ; inner_map = bpf_map_lookup_elem(&amp;outer_arr_dyn, &amp;key);
     4: (18) r1 = map[id:468]
     6: (07) r1 += 272
     7: (61) r0 = *(u32 *)(r2 +0)
     8: (35) if r0 &gt;= 0x3 goto pc+5
     9: (67) r0 &lt;&lt;= 3
    10: (0f) r0 += r1
    11: (79) r0 = *(u64 *)(r0 +0)
    12: (15) if r0 == 0x0 goto pc+1
    13: (05) goto pc+1
    14: (b7) r0 = 0
    15: (b4) w6 = -1
  ; if (!inner_map)
    16: (15) if r0 == 0x0 goto pc+6
    17: (bf) r2 = r10
  ;
    18: (07) r2 += -4
  ; val = bpf_map_lookup_elem(inner_map, &amp;key);
    19: (bf) r1 = r0                               | No inlining but instead
    20: (85) call array_map_lookup_elem#149280     | call to array_map_lookup_elem()
  ; return val ? *val : -1;                        | for inner array lookup.
    21: (15) if r0 == 0x0 goto pc+1
  ; return val ? *val : -1;
    22: (61) r6 = *(u32 *)(r0 +0)
  ; }
    23: (bc) w0 = w6
    24: (95) exit

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Link: https://lore.kernel.org/bpf/20201010234006.7075-4-daniel@iogearbox.net
</content>
</entry>
<entry>
<title>bpf: Track spill/fill of bounded scalars.</title>
<updated>2020-10-09T20:03:06Z</updated>
<author>
<name>Yonghong Song</name>
<email>yhs@fb.com</email>
</author>
<published>2020-10-09T01:12:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5689d49b71ad25409e41c6052dd9549fbeaf9bde'/>
<id>urn:sha1:5689d49b71ad25409e41c6052dd9549fbeaf9bde</id>
<content type='text'>
Under register pressure the llvm may spill registers with bounds into the stack.
The verifier has to track them through spill/fill otherwise many kinds of bound
errors will be seen. The spill/fill of induction variables was already
happening. This patch extends this logic from tracking spill/fill of a constant
into any bounded register. There is no need to track spill/fill of unbounded,
since no new information will be retrieved from the stack during register fill.
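
As a toy illustration in Python, assuming only unsigned bounds are tracked.
The Stack class is hypothetical and far simpler than the verifier's stack
state; it only shows that bounded scalars survive a spill/fill round trip.

```python
U64_MAX = 2**64 - 1


class Stack:
    """Toy model of verifier stack slots (hypothetical, bounds only)."""

    def __init__(self):
        self.slots = {}

    def spill(self, offset, umin, umax):
        # Only bounded scalars are worth remembering; an unbounded one
        # carries no information to restore later.
        if (umin, umax) != (0, U64_MAX):
            self.slots[offset] = (umin, umax)
        else:
            self.slots.pop(offset, None)

    def fill(self, offset):
        # A slot without tracked bounds yields an unknown scalar.
        return self.slots.get(offset, (0, U64_MAX))
```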

Though extra stack difference could cause state pruning to be less effective, no
adverse effects were seen from this patch on selftests and on cilium programs.

Signed-off-by: Yonghong Song &lt;yhs@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20201009011240.48506-3-alexei.starovoitov@gmail.com
</content>
</entry>
<entry>
<title>bpf: Propagate scalar ranges through register assignments.</title>
<updated>2020-10-09T20:03:06Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@kernel.org</email>
</author>
<published>2020-10-09T01:12:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=75748837b7e56919679e02163f45d5818c644d03'/>
<id>urn:sha1:75748837b7e56919679e02163f45d5818c644d03</id>
<content type='text'>
The llvm register allocator may use two different registers representing the
same virtual register. In such case the following pattern can be observed:
1047: (bf) r9 = r6
1048: (a5) if r6 &lt; 0x1000 goto pc+1
1050: ...
1051: (a5) if r9 &lt; 0x2 goto pc+66
1052: ...
1053: (bf) r2 = r9 /* r2 needs to have upper and lower bounds */

This is normal behavior of the greedy register allocator.
The slides 137+ explain why regalloc introduces such register copy:
http://llvm.org/devmtg/2018-04/slides/Yatsina-LLVM%20Greedy%20Register%20Allocator.pdf
There is no way to tell llvm 'not to do this'.
Hence the verifier has to recognize such patterns.

In order to track this information without backtracking allocate ID
for scalars in a similar way as it's done for find_good_pkt_pointers().

When the verifier encounters r9 = r6 assignment it will assign the same ID
to both registers. Later if either register range is narrowed via conditional
jump propagate the register state into the other register.

Clear register ID in adjust_reg_min_max_vals() for any alu instruction. The
register ID is ignored for scalars in regsafe() and doesn't affect state
pruning. mark_reg_unknown() clears the ID. It's used to process call, endian
and other instructions. Hence ID is explicitly cleared only in
adjust_reg_min_max_vals() and in 32-bit mov.
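
The ID scheme can be sketched in Python on the instruction pattern shown
earlier. State, mov(), narrow() and alu() are hypothetical names, and only
unsigned bounds are modelled; the real verifier tracks considerably more
state.

```python
U64_MAX = 2**64 - 1


class Scalar:
    """Simplified scalar register with unsigned bounds and an equivalence id."""

    def __init__(self):
        self.umin, self.umax, self.id = 0, U64_MAX, 0


class State:
    def __init__(self, names):
        self.regs = {name: Scalar() for name in names}
        self.next_id = 1

    def mov(self, dst, src):
        """dst = src: give both registers the same non-zero id."""
        s = self.regs[src]
        if s.id == 0:
            s.id = self.next_id
            self.next_id += 1
        d = self.regs[dst]
        d.umin, d.umax, d.id = s.umin, s.umax, s.id

    def narrow(self, name, umin=0, umax=U64_MAX):
        """Apply a bound learned from a conditional jump and propagate it."""
        reg = self.regs[name]
        reg.umin, reg.umax = max(reg.umin, umin), min(reg.umax, umax)
        if reg.id == 0:
            return
        for other in self.regs.values():
            if other.id == reg.id:
                other.umin, other.umax = reg.umin, reg.umax

    def alu(self, name):
        """Any ALU operation breaks the equivalence, so the id is cleared."""
        self.regs[name].id = 0


st = State(["r2", "r6", "r9"])
st.mov("r9", "r6")              # 1047: r9 = r6
st.narrow("r6", umax=0xfff)     # 1048: branch where r6 is below 0x1000
st.narrow("r9", umin=2)         # 1051: fall-through where r9 is at least 2
st.mov("r2", "r9")              # 1053: r2 = r9
```

After these steps r2 carries both the lower bound learned through r9 and the
upper bound learned through r6.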

Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Acked-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Link: https://lore.kernel.org/bpf/20201009011240.48506-2-alexei.starovoitov@gmail.com
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2020-10-08T22:44:50Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2020-10-08T22:44:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9d49aea13f1e35869158abe7e314e16dc8f50ff1'/>
<id>urn:sha1:9d49aea13f1e35869158abe7e314e16dc8f50ff1</id>
<content type='text'>
Small conflict around locking in rxrpc_process_event() -
channel_lock moved to bundle in next, while state lock
needs _bh() from net.

Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
