author     Alexei Starovoitov <ast@kernel.org>	2025-06-09 20:11:10 -0700
committer  Alexei Starovoitov <ast@kernel.org>	2025-06-09 22:17:39 -0700
commit     5fcf896efe28ca11212fdb6594cd709abb7c1735
tree       0c6a53af50759602c897860dc8a6fa992d317264
parent     bpf: Add cookie in fdinfo for raw_tp
parent     selftests/bpf: Add test for Spectre v1 mitigation
Merge branch 'bpf-mitigate-spectre-v1-using-barriers'
Luis Gerhorst says:
====================
This improves the expressiveness of unprivileged BPF by inserting
speculation barriers instead of rejecting the programs.
The approach was previously presented at LPC'24 [1] and RAID'24 [2].
To mitigate the Spectre v1 (PHT) vulnerability, the kernel rejects
potentially-dangerous unprivileged BPF programs as of
commit 9183671af6db ("bpf: Fix leakage under speculation on mispredicted
branches"). In [2], we have analyzed 364 object files from open source
projects (Linux Samples and Selftests, BCC, Loxilb, Cilium, libbpf
Examples, Parca, and Prevail) and found that this affects 31% to 54% of
programs.
To resolve this in the majority of cases, this patchset adds a fall-back
for mitigating Spectre v1 using speculation barriers. The kernel still
optimistically attempts to verify all speculative paths but uses
speculation barriers against v1 when unsafe behavior is detected. This
allows more programs to be accepted without disabling the BPF Spectre
mitigations altogether (e.g., via mitigations=off as checked by
cpu_mitigations_off()).
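For illustration (condensed from the verifier_and.c selftest updated
below), a mispredictable branch guarding an unsafe access no longer
causes rejection for unprivileged users; the verifier instead marks the
offending instruction and the instruction patcher prepends a barrier,
so the xlated program reads roughly:

  if w0 < 0x1 goto pc+2
  nospec        /* inserted to prevent `R1 !read_ok` */
  goto pc-1     /* `r1 = *(u32*)(r1 + 512)`, sanitized dead code */
  r0 = 0

The speculative path that would have performed the out-of-bounds read
stalls at the nospec until the branch resolves, so the access can never
execute, not even transiently.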
This fallback relies on the fact that speculation barriers generally
prevent all later instructions (not only loads) from executing if the
speculation was not correct. See patch 7 ("bpf: Fall back to nospec for
Spectre v1") for a detailed description and references to the relevant
vendor documentation (AMD and Intel x86-64, ARM64, and PowerPC).
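Summarized from the JIT changes in this series, the sequences emitted
for a BPF_NOSPEC are:

  x86-64:             lfence (barrier_nospec() in the interpreter)
  arm64:              sb, or dsb nsh; isb without FEAT_SB (v1; v4 is
                      handled by the firmware mitigation)
  powerpc (E500):     isync; sync for v1
  powerpc (BOOK3S_64): ori 31,31,0 for v1, plus the existing
                      stf-barrier sequence for v4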
In [1] we measured the overhead of this approach relative to a
mitigations-off baseline, with the upstream Spectre v4 mitigations
included. For
event tracing and stack-sampling profilers, we found that mitigations
increase BPF program execution time by 0% to 62%. For the Loxilb network
load balancer, we have measured a 14% slowdown in SCTP performance but
no significant slowdown for TCP. This overhead only applies to programs
that were previously rejected.
I reran the expressiveness evaluation on v6.14 and confirmed that the
main results still match those from [1] and [2] (which used v6.5).
Main design decisions are:
* Do not use separate bytecode insns for v1 and v4 barriers (inspired by
Daniel Borkmann's question at LPC). This simplifies the verifier
significantly; the only downside is that performance on PowerPC is not
as high as it could be.
* Allow archs to still disable v1/v4 mitigations separately by
implementing bpf_jit_bypass_spec_v1/v4(). This allows archs to gain
improved BPF expressiveness / performance if they are not vulnerable
(e.g., ARM64 for v4 in the kernel); a sketch of this hook pattern
follows the list.
* Do not remove the empty BPF_NOSPEC implementation for backends for
which it is unknown whether they are vulnerable to Spectre v1.
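For reference, the arch-bypass hooks from the second item above follow
this pattern (abridged from kernel/bpf/core.c,
arch/arm64/net/bpf_jit_comp.c, and include/linux/bpf.h in this series):

  /* Default: keep the verifier's v1/v4 mitigations enabled. */
  bool __weak bpf_jit_bypass_spec_v1(void) { return false; }
  bool __weak bpf_jit_bypass_spec_v4(void) { return false; }

  /* arm64 override: firmware already mitigates Speculative Store
   * Bypass for all kernel code, so skip v4 nospec insns.
   */
  bool bpf_jit_bypass_spec_v4(void) { return true; }

  /* The verifier combines the JIT's answer with the existing
   * bypass conditions:
   */
  static inline bool bpf_bypass_spec_v1(const struct bpf_token *token)
  {
  	return bpf_jit_bypass_spec_v1() ||
  	       cpu_mitigations_off() ||
  	       bpf_token_capable(token, CAP_PERFMON);
  }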
[1] https://lpc.events/event/18/contributions/1954/ ("Mitigating
Spectre-PHT using Speculation Barriers in Linux eBPF")
[2] https://arxiv.org/pdf/2405.00078 ("VeriFence: Lightweight and
Precise Spectre Defenses for Untrusted Linux Kernel Extensions")
Changes:
* v3 -> v4:
- Remove insn parameter from do_check_insn() and extract
process_bpf_exit_full as a function as requested by Eduard
- Investigate the apparent sanitize_check_bounds() bug reported by
  Kartikeya (it turns out not to be a bug, only confusing code); a
  separate patch documents it and adds an assert
- Remove already-merged commit 1 ("selftests/bpf: Fix caps for
__xlated/jited_unpriv")
- Drop former commit 10 ("bpf: Allow nospec-protected var-offset stack
access") as it did not include a test and there are other places
where var-off is rejected. Also, none of the tested real-world
programs used var-off in the paper. Therefore keep the old behavior
for now and potentially prepare a patch that converts all cases
later if required.
- Add link to AMD lfence and PowerPC speculation barrier (ori 31,31,0)
documentation
- Move detailed barrier documentation to commit 7 ("bpf: Fall back to
nospec for Spectre v1")
- Link to v3: https://lore.kernel.org/all/20250501073603.1402960-1-luis.gerhorst@fau.de/
* v2 -> v3:
- Fix
https://lore.kernel.org/oe-kbuild-all/202504212030.IF1SLhz6-lkp@intel.com/
and similar by moving the bpf_jit_bypass_spec_v1/v4() prototypes out
of the #ifdef CONFIG_BPF_SYSCALL. Decided not to move them to
filter.h (where similar bpf_jit_*() prototypes live) as they would
still have to be duplicated in bpf.h to be usable to
bpf_bypass_spec_v1/v4() (unless including filter.h in bpf.h is an
option).
- Fix
https://lore.kernel.org/oe-kbuild-all/202504220035.SoGveGpj-lkp@intel.com/
by moving the variable declarations out of the switch-case.
- Build touched C files with W=2 and bpf config on x86 to check that
there are no other warnings introduced.
- Found 3 more checkpatch warnings that can be fixed without degrading
readability.
- Rebase to bpf-next 2025-05-01
- Link to v2: https://lore.kernel.org/bpf/20250421091802.3234859-1-luis.gerhorst@fau.de/
* v1 -> v2:
- Drop former commits 9 ("bpf: Return PTR_ERR from push_stack()") and 11
("bpf: Fall back to nospec for spec path verification") as suggested
by Alexei. This series therefore no longer changes push_stack() to
return PTR_ERR.
- Add detailed explanation of how lfence works internally and how it
affects the algorithm.
- Add tests checking that nospec instructions are inserted in expected
locations using __xlated_unpriv as suggested by Eduard (also,
include a fix for __xlated_unpriv)
- Add a test for the mitigations from the description of
commit 9183671af6db ("bpf: Fix leakage under speculation on
mispredicted branches")
- Remove unused variables from do_check[_insn]() as suggested by
Eduard.
- Remove INSN_IDX_MODIFIED to improve readability as suggested by
Eduard. This also causes the nospec_result-check to run (and fail)
for jumping-ops. Add a warning to assert that this check must never
succeed in that case.
- Add details on the safety of patch 10 ("bpf: Allow nospec-protected
var-offset stack access") based on the feedback on v1.
- Rebase to bpf-next-250420
- Link to v1: https://lore.kernel.org/all/20250313172127.1098195-1-luis.gerhorst@fau.de/
* RFC -> v1:
- rebase to bpf-next-250313
- tests: mark expected successes/new errors
- add bpf_jit_bypass_spec_v1/v4() to avoid #ifdef in
bpf_bypass_spec_v1/v4()
- ensure that nospec with v1-support is implemented for archs for
which GCC supports speculation barriers, except for MIPS
- arm64: emit speculation barrier
- powerpc: change nospec to include v1 barrier
- discuss potential security (archs that do not impl. BPF nospec) and
performance (only PowerPC) regressions
- Link to RFC: https://lore.kernel.org/bpf/20250224203619.594724-1-luis.gerhorst@fau.de/
====================
Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://patch.msgid.link/20250603205800.334980-1-luis.gerhorst@fau.de
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
 arch/arm64/net/bpf_jit.h                                      |   5
 arch/arm64/net/bpf_jit_comp.c                                 |  28
 arch/powerpc/net/bpf_jit_comp64.c                             |  80
 include/linux/bpf.h                                           |  11
 include/linux/bpf_verifier.h                                  |   3
 include/linux/filter.h                                        |   2
 kernel/bpf/core.c                                             |  32
 kernel/bpf/verifier.c                                         | 544
 tools/testing/selftests/bpf/progs/bpf_misc.h                  |   4
 tools/testing/selftests/bpf/progs/verifier_and.c              |   8
 tools/testing/selftests/bpf/progs/verifier_bounds.c           |  61
 tools/testing/selftests/bpf/progs/verifier_movsx.c            |  16
 tools/testing/selftests/bpf/progs/verifier_unpriv.c           |  65
 tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c  |  16
 tools/testing/selftests/bpf/verifier/dead_code.c              |   3
 tools/testing/selftests/bpf/verifier/jmp32.c                  |  33
 tools/testing/selftests/bpf/verifier/jset.c                   |  10
 17 files changed, 601 insertions(+), 320 deletions(-)
diff --git a/arch/arm64/net/bpf_jit.h b/arch/arm64/net/bpf_jit.h
index a3b0e693a125..bbea4f36f9f2 100644
--- a/arch/arm64/net/bpf_jit.h
+++ b/arch/arm64/net/bpf_jit.h
@@ -325,4 +325,9 @@
 #define A64_MRS_SP_EL0(Rt) \
 	aarch64_insn_gen_mrs(Rt, AARCH64_INSN_SYSREG_SP_EL0)
 
+/* Barriers */
+#define A64_SB		aarch64_insn_get_sb_value()
+#define A64_DSB_NSH	(aarch64_insn_get_dsb_base_value() | 0x7 << 8)
+#define A64_ISB		aarch64_insn_get_isb_value()
+
 #endif /* _BPF_JIT_H */
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index da8b89dd2910..b6c42b5c9668 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1630,17 +1630,14 @@ emit_cond_jmp:
 			return ret;
 		break;
 
-	/* speculation barrier */
+	/* speculation barrier against v1 and v4 */
 	case BPF_ST | BPF_NOSPEC:
-		/*
-		 * Nothing required here.
-		 *
-		 * In case of arm64, we rely on the firmware mitigation of
-		 * Speculative Store Bypass as controlled via the ssbd kernel
-		 * parameter. Whenever the mitigation is enabled, it works
-		 * for all of the kernel code with no need to provide any
-		 * additional instructions.
-		 */
+		if (alternative_has_cap_likely(ARM64_HAS_SB)) {
+			emit(A64_SB, ctx);
+		} else {
+			emit(A64_DSB_NSH, ctx);
+			emit(A64_ISB, ctx);
+		}
 		break;
 
 	/* ST: *(size *)(dst + off) = imm */
@@ -2911,6 +2908,17 @@ bool bpf_jit_supports_percpu_insn(void)
 	return true;
 }
 
+bool bpf_jit_bypass_spec_v4(void)
+{
+	/* In case of arm64, we rely on the firmware mitigation of Speculative
+	 * Store Bypass as controlled via the ssbd kernel parameter. Whenever
+	 * the mitigation is enabled, it works for all of the kernel code with
+	 * no need to provide any additional instructions. Therefore, skip
+	 * inserting nospec insns against Spectre v4.
+	 */
+	return true;
+}
+
 bool bpf_jit_inlines_helper_call(s32 imm)
 {
 	switch (imm) {
diff --git a/arch/powerpc/net/bpf_jit_comp64.c b/arch/powerpc/net/bpf_jit_comp64.c
index 5daa77aee7f7..3665ff8bb4bc 100644
--- a/arch/powerpc/net/bpf_jit_comp64.c
+++ b/arch/powerpc/net/bpf_jit_comp64.c
@@ -370,6 +370,23 @@ static int bpf_jit_emit_tail_call(u32 *image, struct codegen_context *ctx, u32 o
 	return 0;
 }
 
+bool bpf_jit_bypass_spec_v1(void)
+{
+#if defined(CONFIG_PPC_E500) || defined(CONFIG_PPC_BOOK3S_64)
+	return !(security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+		 security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR));
+#else
+	return true;
+#endif
+}
+
+bool bpf_jit_bypass_spec_v4(void)
+{
+	return !(security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
+		 security_ftr_enabled(SEC_FTR_STF_BARRIER) &&
+		 stf_barrier_type_get() != STF_BARRIER_NONE);
+}
+
 /*
  * We spill into the redzone always, even if the bpf program has its own stackframe.
  * Offsets hardcoded based on BPF_PPC_STACK_SAVE -- see bpf_jit_stack_local()
@@ -397,6 +414,7 @@ int bpf_jit_build_body(struct bpf_prog *fp, u32 *image, u32 *fimage, struct code
 	       u32 *addrs, int pass, bool extra_pass)
 {
 	enum stf_barrier_type stf_barrier = stf_barrier_type_get();
+	bool sync_emitted, ori31_emitted;
 	const struct bpf_insn *insn = fp->insnsi;
 	int flen = fp->len;
 	int i, ret;
@@ -789,30 +807,52 @@ emit_clear:
 
 		/*
 		 * BPF_ST NOSPEC (speculation barrier)
+		 *
+		 * The following must act as a barrier against both Spectre v1
+		 * and v4 if we requested both mitigations. Therefore, also emit
+		 * 'isync; sync' on E500 or 'ori31' on BOOK3S_64 in addition to
+		 * the insns needed for a Spectre v4 barrier.
+		 *
+		 * If we requested only !bypass_spec_v1 OR only !bypass_spec_v4,
+		 * we can skip the respective other barrier type as an
+		 * optimization.
		 */
		case BPF_ST | BPF_NOSPEC:
-			if (!security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) ||
-			    !security_ftr_enabled(SEC_FTR_STF_BARRIER))
-				break;
-
-			switch (stf_barrier) {
-			case STF_BARRIER_EIEIO:
-				EMIT(PPC_RAW_EIEIO() | 0x02000000);
-				break;
-			case STF_BARRIER_SYNC_ORI:
+			sync_emitted = false;
+			ori31_emitted = false;
+#ifdef CONFIG_PPC_E500
+			if (!bpf_jit_bypass_spec_v1()) {
+				EMIT(PPC_RAW_ISYNC());
 				EMIT(PPC_RAW_SYNC());
-				EMIT(PPC_RAW_LD(tmp1_reg, _R13, 0));
-				EMIT(PPC_RAW_ORI(_R31, _R31, 0));
-				break;
-			case STF_BARRIER_FALLBACK:
-				ctx->seen |= SEEN_FUNC;
-				PPC_LI64(_R12, dereference_kernel_function_descriptor(bpf_stf_barrier));
-				EMIT(PPC_RAW_MTCTR(_R12));
-				EMIT(PPC_RAW_BCTRL());
-				break;
-			case STF_BARRIER_NONE:
-				break;
+				sync_emitted = true;
+			}
+#endif
+			if (!bpf_jit_bypass_spec_v4()) {
+				switch (stf_barrier) {
+				case STF_BARRIER_EIEIO:
+					EMIT(PPC_RAW_EIEIO() | 0x02000000);
+					break;
+				case STF_BARRIER_SYNC_ORI:
+					if (!sync_emitted)
+						EMIT(PPC_RAW_SYNC());
+					EMIT(PPC_RAW_LD(tmp1_reg, _R13, 0));
+					EMIT(PPC_RAW_ORI(_R31, _R31, 0));
+					ori31_emitted = true;
+					break;
+				case STF_BARRIER_FALLBACK:
+					ctx->seen |= SEEN_FUNC;
+					PPC_LI64(_R12, dereference_kernel_function_descriptor(bpf_stf_barrier));
+					EMIT(PPC_RAW_MTCTR(_R12));
+					EMIT(PPC_RAW_BCTRL());
+					break;
+				case STF_BARRIER_NONE:
+					break;
+				}
 			}
+#ifdef CONFIG_PPC_BOOK3S_64
+			if (!bpf_jit_bypass_spec_v1() && !ori31_emitted)
+				EMIT(PPC_RAW_ORI(_R31, _R31, 0));
+#endif
 			break;
 
 		/*
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 5b25d278409b..5dd556e89cce 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -2288,6 +2288,9 @@ bpf_prog_run_array_uprobe(const struct bpf_prog_array *array,
 	return ret;
 }
 
+bool bpf_jit_bypass_spec_v1(void);
+bool bpf_jit_bypass_spec_v4(void);
+
 #ifdef CONFIG_BPF_SYSCALL
 DECLARE_PER_CPU(int, bpf_prog_active);
 extern struct mutex bpf_stats_enabled_mutex;
@@ -2475,12 +2478,16 @@ static inline bool bpf_allow_uninit_stack(const struct bpf_token *token)
 
 static inline bool bpf_bypass_spec_v1(const struct bpf_token *token)
 {
-	return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON);
+	return bpf_jit_bypass_spec_v1() ||
+	       cpu_mitigations_off() ||
+	       bpf_token_capable(token, CAP_PERFMON);
 }
 
 static inline bool bpf_bypass_spec_v4(const struct bpf_token *token)
 {
-	return cpu_mitigations_off() || bpf_token_capable(token, CAP_PERFMON);
+	return bpf_jit_bypass_spec_v4() ||
+	       cpu_mitigations_off() ||
+	       bpf_token_capable(token, CAP_PERFMON);
 }
 
 int bpf_map_new_fd(struct bpf_map *map, int flags);
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 256274acb1d8..e6c26393c029 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -580,7 +580,8 @@ struct bpf_insn_aux_data {
 	u64 map_key_state; /* constant (32 bit) key tracking for maps */
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
 	u32 seen; /* this insn was processed by the verifier at env->pass_cnt */
-	bool sanitize_stack_spill; /* subject to Spectre v4 sanitation */
+	bool nospec; /* do not execute this instruction speculatively */
+	bool nospec_result; /* result is unsafe under speculation, nospec must follow */
 	bool zext_dst; /* this insn zero extends dst reg */
 	bool needs_zext; /* alu op needs to clear upper bits */
 	bool storage_get_func_atomic; /* bpf_*_storage_get() with atomic memory alloc */
diff --git a/include/linux/filter.h b/include/linux/filter.h
index f5cf4d35d83e..eca229752cbe 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -82,7 +82,7 @@ struct ctl_table_header;
 #define BPF_CALL_ARGS	0xe0
 
 /* unused opcode to mark speculation barrier for mitigating
- * Speculative Store Bypass
+ * Spectre v1 and v4
  */
 #define BPF_NOSPEC	0xc0
 
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index c20babbf998f..e536a34a32c8 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2102,14 +2102,15 @@ out:
 #undef COND_JMP
 	/* ST, STX and LDX*/
 	ST_NOSPEC:
-		/* Speculation barrier for mitigating Speculative Store Bypass.
-		 * In case of arm64, we rely on the firmware mitigation as
-		 * controlled via the ssbd kernel parameter. Whenever the
-		 * mitigation is enabled, it works for all of the kernel code
-		 * with no need to provide any additional instructions here.
-		 * In case of x86, we use 'lfence' insn for mitigation. We
-		 * reuse preexisting logic from Spectre v1 mitigation that
-		 * happens to produce the required code on x86 for v4 as well.
+		/* Speculation barrier for mitigating Speculative Store Bypass,
+		 * Bounds-Check Bypass and Type Confusion. In case of arm64, we
+		 * rely on the firmware mitigation as controlled via the ssbd
+		 * kernel parameter. Whenever the mitigation is enabled, it
+		 * works for all of the kernel code with no need to provide any
+		 * additional instructions here. In case of x86, we use 'lfence'
+		 * insn for mitigation. We reuse preexisting logic from Spectre
+		 * v1 mitigation that happens to produce the required code on
+		 * x86 for v4 as well.
 		 */
 		barrier_nospec();
 		CONT;
@@ -3034,6 +3035,21 @@ bool __weak bpf_jit_needs_zext(void)
 	return false;
 }
 
+/* By default, enable the verifier's mitigations against Spectre v1 and v4 for
+ * all archs. The value returned must not change at runtime as there is
+ * currently no support for reloading programs that were loaded without
+ * mitigations.
+ */
+bool __weak bpf_jit_bypass_spec_v1(void)
+{
+	return false;
+}
+
+bool __weak bpf_jit_bypass_spec_v4(void)
+{
+	return false;
+}
+
 /* Return true if the JIT inlines the call to the helper corresponding to
  * the imm.
  *
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 48a3241cda0f..b1f797616f20 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2013,6 +2013,18 @@ static int pop_stack(struct bpf_verifier_env *env, int *prev_insn_idx,
 	return 0;
 }
 
+static bool error_recoverable_with_nospec(int err)
+{
+	/* Should only return true for non-fatal errors that are allowed to
+	 * occur during speculative verification. For these we can insert a
+	 * nospec and the program might still be accepted. Do not include
+	 * something like ENOMEM because it is likely to re-occur for the next
+	 * architectural path once it has been recovered-from in all speculative
+	 * paths.
+	 */
+	return err == -EPERM || err == -EACCES || err == -EINVAL;
+}
+
 static struct bpf_verifier_state *push_stack(struct bpf_verifier_env *env,
 					     int insn_idx, int prev_insn_idx,
 					     bool speculative)
@@ -5027,7 +5039,7 @@ static int check_stack_write_fixed_off(struct bpf_verifier_env *env,
 		}
 
 		if (sanitize)
-			env->insn_aux_data[insn_idx].sanitize_stack_spill = true;
+			env->insn_aux_data[insn_idx].nospec_result = true;
 	}
 
 	err = destroy_if_dynptr_stack_slot(env, state, spi);
@@ -8955,7 +8967,7 @@ static int resolve_map_arg_type(struct bpf_verifier_env *env,
 	if (!meta->map_ptr) {
 		/* kernel subsystem misconfigured verifier */
 		verbose(env, "invalid map_ptr to access map->type\n");
-		return -EACCES;
+		return -EFAULT;
 	}
 
 	switch (meta->map_ptr->map_type) {
@@ -9643,7 +9655,7 @@ skip_type_check:
 			 * that kernel subsystem misconfigured verifier
 			 */
 			verbose(env, "invalid map_ptr to access map->key\n");
-			return -EACCES;
+			return -EFAULT;
 		}
 		key_size = meta->map_ptr->key_size;
 		err = check_helper_mem_access(env, regno, key_size, BPF_READ, false, NULL);
@@ -9670,7 +9682,7 @@ skip_type_check:
 		if (!meta->map_ptr) {
 			/* kernel subsystem misconfigured verifier */
 			verbose(env, "invalid map_ptr to access map->value\n");
-			return -EACCES;
+			return -EFAULT;
 		}
 		meta->raw_mode = arg_type & MEM_UNINIT;
 		err = check_helper_mem_access(env, regno, meta->map_ptr->value_size,
@@ -10966,7 +10978,7 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
 
 	if (map == NULL) {
 		verbose(env, "kernel subsystem misconfigured verifier\n");
-		return -EINVAL;
+		return -EFAULT;
 	}
 
 	/* In case of read-only, some additional restrictions
@@ -11005,7 +11017,7 @@ record_func_key(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
 		return 0;
 	if (!map || map->map_type != BPF_MAP_TYPE_PROG_ARRAY) {
 		verbose(env, "kernel subsystem misconfigured verifier\n");
-		return -EINVAL;
+		return -EFAULT;
 	}
 
 	reg = &regs[BPF_REG_3];
@@ -11147,7 +11159,7 @@ static int check_get_func_ip(struct bpf_verifier_env *env)
 	return -ENOTSUPP;
 }
 
-static struct bpf_insn_aux_data *cur_aux(struct bpf_verifier_env *env)
+static struct bpf_insn_aux_data *cur_aux(const struct bpf_verifier_env *env)
 {
 	return &env->insn_aux_data[env->insn_idx];
 }
@@ -11259,7 +11271,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 	if (changes_data && fn->arg1_type != ARG_PTR_TO_CTX) {
 		verbose(env, "kernel subsystem misconfigured func %s#%d: r1 != ctx\n",
 			func_id_name(func_id), func_id);
-		return -EINVAL;
+		return -EFAULT;
 	}
 
 	memset(&meta, 0, sizeof(meta));
@@ -11561,7 +11573,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 
 		if (meta.map_ptr == NULL) {
 			verbose(env, "kernel subsystem misconfigured verifier\n");
-			return -EINVAL;
+			return -EFAULT;
 		}
 
 		if (func_id == BPF_FUNC_map_lookup_elem &&
@@ -11654,7 +11666,7 @@ static int check_helper_call(struct bpf_verifier_env *env, struct bpf_insn *insn
 			verbose(env, "verifier internal error:");
 			verbose(env, "func %s has non-overwritten BPF_PTR_POISON return type\n",
 				func_id_name(func_id));
-			return -EINVAL;
+			return -EFAULT;
 		}
 		ret_btf = btf_vmlinux;
 		ret_btf_id = *fn->ret_btf_id;
@@ -14015,7 +14027,9 @@ static int retrieve_ptr_limit(const struct bpf_reg_state *ptr_reg,
 static bool can_skip_alu_sanitation(const struct bpf_verifier_env *env,
 				    const struct bpf_insn *insn)
 {
-	return env->bypass_spec_v1 || BPF_SRC(insn->code) == BPF_K;
+	return env->bypass_spec_v1 ||
+	       BPF_SRC(insn->code) == BPF_K ||
+	       cur_aux(env)->nospec;
 }
 
 static int update_alu_sanitation_state(struct bpf_insn_aux_data *aux,
@@ -15287,12 +15301,12 @@ static int adjust_reg_min_max_vals(struct bpf_verifier_env *env,
 	if (WARN_ON_ONCE(ptr_reg)) {
 		print_verifier_state(env, vstate, vstate->curframe, true);
 		verbose(env, "verifier internal error: unexpected ptr_reg\n");
-		return -EINVAL;
+		return -EFAULT;
 	}
 	if (WARN_ON(!src_reg)) {
 		print_verifier_state(env, vstate, vstate->curframe, true);
 		verbose(env, "verifier internal error: no src_reg\n");
-		return -EINVAL;
+		return -EFAULT;
 	}
 	err = adjust_scalar_min_max_vals(env, insn, dst_reg, *src_reg);
 	if (err)
@@ -16738,7 +16752,7 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		dst_reg->type = CONST_PTR_TO_MAP;
 	} else {
 		verbose(env, "bpf verifier is misconfigured\n");
-		return -EINVAL;
+		return -EFAULT;
 	}
 
 	return 0;
@@ -16785,7 +16799,7 @@ static int check_ld_abs(struct bpf_verifier_env *env, struct bpf_insn *insn)
 
 	if (!env->ops->gen_ld_abs) {
 		verbose(env, "bpf verifier is misconfigured\n");
-		return -EINVAL;
+		return -EFAULT;
 	}
 
 	if (insn->dst_reg != BPF_REG_0 || insn->off != 0 ||
@@ -19430,20 +19444,223 @@ static int save_aux_ptr_type(struct bpf_verifier_env *env, enum bpf_reg_type typ
 	return 0;
 }
 
+enum {
+	PROCESS_BPF_EXIT = 1
+};
+
+static int process_bpf_exit_full(struct bpf_verifier_env *env,
+				 bool *do_print_state,
+				 bool exception_exit)
+{
+	/* We must do check_reference_leak here before
+	 * prepare_func_exit to handle the case when
+	 * state->curframe > 0, it may be a callback function,
+	 * for which reference_state must match caller reference
+	 * state when it exits.
+	 */
+	int err = check_resource_leak(env, exception_exit,
+				      !env->cur_state->curframe,
+				      "BPF_EXIT instruction in main prog");
+	if (err)
+		return err;
+
+	/* The side effect of the prepare_func_exit which is
+	 * being skipped is that it frees bpf_func_state.
+	 * Typically, process_bpf_exit will only be hit with
+	 * outermost exit. copy_verifier_state in pop_stack will
+	 * handle freeing of any extra bpf_func_state left over
+	 * from not processing all nested function exits. We
+	 * also skip return code checks as they are not needed
+	 * for exceptional exits.
+	 */
+	if (exception_exit)
+		return PROCESS_BPF_EXIT;
+
+	if (env->cur_state->curframe) {
+		/* exit from nested function */
+		err = prepare_func_exit(env, &env->insn_idx);
+		if (err)
+			return err;
+		*do_print_state = true;
+		return 0;
+	}
+
+	err = check_return_code(env, BPF_REG_0, "R0");
+	if (err)
+		return err;
+	return PROCESS_BPF_EXIT;
+}
+
+static int do_check_insn(struct bpf_verifier_env *env, bool *do_print_state)
+{
+	int err;
+	struct bpf_insn *insn = &env->prog->insnsi[env->insn_idx];
+	u8 class = BPF_CLASS(insn->code);
+
+	if (class == BPF_ALU || class == BPF_ALU64) {
+		err = check_alu_op(env, insn);
+		if (err)
+			return err;
+
+	} else if (class == BPF_LDX) {
+		bool is_ldsx = BPF_MODE(insn->code) == BPF_MEMSX;
+
+		/* Check for reserved fields is already done in
+		 * resolve_pseudo_ldimm64().
+		 */
+		err = check_load_mem(env, insn, false, is_ldsx, true, "ldx");
+		if (err)
+			return err;
+	} else if (class == BPF_STX) {
+		if (BPF_MODE(insn->code) == BPF_ATOMIC) {
+			err = check_atomic(env, insn);
+			if (err)
+				return err;
+			env->insn_idx++;
+			return 0;
+		}
+
+		if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) {
+			verbose(env, "BPF_STX uses reserved fields\n");
+			return -EINVAL;
+		}
+
+		err = check_store_reg(env, insn, false);
+		if (err)
+			return err;
+	} else if (class == BPF_ST) {
+		enum bpf_reg_type dst_reg_type;
+
+		if (BPF_MODE(insn->code) != BPF_MEM ||
+		    insn->src_reg != BPF_REG_0) {
+			verbose(env, "BPF_ST uses reserved fields\n");
+			return -EINVAL;
+		}
+		/* check src operand */
+		err = check_reg_arg(env, insn->dst_reg, SRC_OP);
+		if (err)
+			return err;
+
+		dst_reg_type = cur_regs(env)[insn->dst_reg].type;
+
+		/* check that memory (dst_reg + off) is writeable */
+		err = check_mem_access(env, env->insn_idx, insn->dst_reg,
+				       insn->off, BPF_SIZE(insn->code),
+				       BPF_WRITE, -1, false, false);
+		if (err)
+			return err;
+
+		err = save_aux_ptr_type(env, dst_reg_type, false);
+		if (err)
+			return err;
+	} else if (class == BPF_JMP || class == BPF_JMP32) {
+		u8 opcode = BPF_OP(insn->code);
+
+		env->jmps_processed++;
+		if (opcode == BPF_CALL) {
+			if (BPF_SRC(insn->code) != BPF_K ||
+			    (insn->src_reg != BPF_PSEUDO_KFUNC_CALL &&
+			     insn->off != 0) ||
+			    (insn->src_reg != BPF_REG_0 &&
+			     insn->src_reg != BPF_PSEUDO_CALL &&
+			     insn->src_reg != BPF_PSEUDO_KFUNC_CALL) ||
+			    insn->dst_reg != BPF_REG_0 || class == BPF_JMP32) {
+				verbose(env, "BPF_CALL uses reserved fields\n");
+				return -EINVAL;
+			}
+
+			if (env->cur_state->active_locks) {
+				if ((insn->src_reg == BPF_REG_0 &&
+				     insn->imm != BPF_FUNC_spin_unlock) ||
+				    (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
+				     (insn->off != 0 || !kfunc_spin_allowed(insn->imm)))) {
+					verbose(env,
+						"function calls are not allowed while holding a lock\n");
+					return -EINVAL;
+				}
+			}
+			if (insn->src_reg == BPF_PSEUDO_CALL) {
+				err = check_func_call(env, insn, &env->insn_idx);
+			} else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
+				err = check_kfunc_call(env, insn, &env->insn_idx);
+				if (!err && is_bpf_throw_kfunc(insn))
+					return process_bpf_exit_full(env, do_print_state, true);
+			} else {
+				err = check_helper_call(env, insn, &env->insn_idx);
+			}
+			if (err)
+				return err;
+
+			mark_reg_scratched(env, BPF_REG_0);
+		} else if (opcode == BPF_JA) {
+			if (BPF_SRC(insn->code) != BPF_K ||
+			    insn->src_reg != BPF_REG_0 ||
+			    insn->dst_reg != BPF_REG_0 ||
+			    (class == BPF_JMP && insn->imm != 0) ||
+			    (class == BPF_JMP32 && insn->off != 0)) {
+				verbose(env, "BPF_JA uses reserved fields\n");
+				return -EINVAL;
+			}
+
+			if (class == BPF_JMP)
+				env->insn_idx += insn->off + 1;
+			else
+				env->insn_idx += insn->imm + 1;
+			return 0;
+		} else if (opcode == BPF_EXIT) {
+			if (BPF_SRC(insn->code) != BPF_K ||
+			    insn->imm != 0 ||
+			    insn->src_reg != BPF_REG_0 ||
+			    insn->dst_reg != BPF_REG_0 ||
+			    class == BPF_JMP32) {
+				verbose(env, "BPF_EXIT uses reserved fields\n");
+				return -EINVAL;
+			}
+			return process_bpf_exit_full(env, do_print_state, false);
+		} else {
+			err = check_cond_jmp_op(env, insn, &env->insn_idx);
+			if (err)
+				return err;
+		}
+	} else if (class == BPF_LD) {
+		u8 mode = BPF_MODE(insn->code);
+
+		if (mode == BPF_ABS || mode == BPF_IND) {
+			err = check_ld_abs(env, insn);
+			if (err)
+				return err;
+
+		} else if (mode == BPF_IMM) {
+			err = check_ld_imm(env, insn);
+			if (err)
+				return err;
+
+			env->insn_idx++;
+			sanitize_mark_insn_seen(env);
+		} else {
+			verbose(env, "invalid BPF_LD mode\n");
+			return -EINVAL;
+		}
+	} else {
+		verbose(env, "unknown insn class %d\n", class);
+		return -EINVAL;
+	}
+
+	env->insn_idx++;
+	return 0;
+}
+
 static int do_check(struct bpf_verifier_env *env)
 {
 	bool pop_log = !(env->log.level & BPF_LOG_LEVEL2);
 	struct bpf_verifier_state *state = env->cur_state;
 	struct bpf_insn *insns = env->prog->insnsi;
-	struct bpf_reg_state *regs;
 	int insn_cnt = env->prog->len;
 	bool do_print_state = false;
 	int prev_insn_idx = -1;
 
 	for (;;) {
-		bool exception_exit = false;
 		struct bpf_insn *insn;
-		u8 class;
 		int err;
 
 		/* reset current history entry on each new instruction */
@@ -19457,7 +19674,6 @@ static int do_check(struct bpf_verifier_env *env)
 		}
 
 		insn = &insns[env->insn_idx];
-		class = BPF_CLASS(insn->code);
 
 		if (++env->insn_processed > BPF_COMPLEXITY_LIMIT_INSNS) {
 			verbose(env,
@@ -19527,215 +19743,61 @@ static int do_check(struct bpf_verifier_env *env)
 				return err;
 		}
 
-		regs = cur_regs(env);
 		sanitize_mark_insn_seen(env);
 		prev_insn_idx = env->insn_idx;
 
-		if (class == BPF_ALU || class == BPF_ALU64) {
-			err = check_alu_op(env, insn);
-			if (err)
-				return err;
-
-		} else if (class == BPF_LDX) {
-			bool is_ldsx = BPF_MODE(insn->code) == BPF_MEMSX;
+		/* Reduce verification complexity by stopping speculative path
+		 * verification when a nospec is encountered.
+		 */
+		if (state->speculative && cur_aux(env)->nospec)
+			goto process_bpf_exit;
 
-			/* Check for reserved fields is already done in
-			 * resolve_pseudo_ldimm64().
+		err = do_check_insn(env, &do_print_state);
+		if (state->speculative && error_recoverable_with_nospec(err)) {
+			/* Prevent this speculative path from ever reaching the
+			 * insn that would have been unsafe to execute.
 			 */
-			err = check_load_mem(env, insn, false, is_ldsx, true,
-					     "ldx");
-			if (err)
-				return err;
-		} else if (class == BPF_STX) {
-			if (BPF_MODE(insn->code) == BPF_ATOMIC) {
-				err = check_atomic(env, insn);
-				if (err)
-					return err;
-				env->insn_idx++;
-				continue;
-			}
-
-			if (BPF_MODE(insn->code) != BPF_MEM || insn->imm != 0) {
-				verbose(env, "BPF_STX uses reserved fields\n");
-				return -EINVAL;
-			}
-
-			err = check_store_reg(env, insn, false);
-			if (err)
-				return err;
-		} else if (class == BPF_ST) {
-			enum bpf_reg_type dst_reg_type;
-
-			if (BPF_MODE(insn->code) != BPF_MEM ||
-			    insn->src_reg != BPF_REG_0) {
-				verbose(env, "BPF_ST uses reserved fields\n");
-				return -EINVAL;
-			}
-			/* check src operand */
-			err = check_reg_arg(env, insn->dst_reg, SRC_OP);
-			if (err)
-				return err;
-
-			dst_reg_type = regs[insn->dst_reg].type;
-
-			/* check that memory (dst_reg + off) is writeable */
-			err = check_mem_access(env, env->insn_idx, insn->dst_reg,
-					       insn->off, BPF_SIZE(insn->code),
-					       BPF_WRITE, -1, false, false);
-			if (err)
-				return err;
-
-			err = save_aux_ptr_type(env, dst_reg_type, false);
-			if (err)
-				return err;
-		} else if (class == BPF_JMP || class == BPF_JMP32) {
-			u8 opcode = BPF_OP(insn->code);
-
-			env->jmps_processed++;
-			if (opcode == BPF_CALL) {
-				if (BPF_SRC(insn->code) != BPF_K ||
-				    (insn->src_reg != BPF_PSEUDO_KFUNC_CALL
-				     && insn->off != 0) ||
-				    (insn->src_reg != BPF_REG_0 &&
-				     insn->src_reg != BPF_PSEUDO_CALL &&
-				     insn->src_reg != BPF_PSEUDO_KFUNC_CALL) ||
-				    insn->dst_reg != BPF_REG_0 ||
-				    class == BPF_JMP32) {
-					verbose(env, "BPF_CALL uses reserved fields\n");
-					return -EINVAL;
-				}
-
-				if (env->cur_state->active_locks) {
-					if ((insn->src_reg == BPF_REG_0 && insn->imm != BPF_FUNC_spin_unlock) ||
-					    (insn->src_reg == BPF_PSEUDO_KFUNC_CALL &&
-					     (insn->off != 0 || !kfunc_spin_allowed(insn->imm)))) {
-						verbose(env, "function calls are not allowed while holding a lock\n");
-						return -EINVAL;
-					}
-				}
-				if (insn->src_reg == BPF_PSEUDO_CALL) {
-					err = check_func_call(env, insn, &env->insn_idx);
-				} else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) {
-					err = check_kfunc_call(env, insn, &env->insn_idx);
-					if (!err && is_bpf_throw_kfunc(insn)) {
-						exception_exit = true;
-						goto process_bpf_exit_full;
-					}
-				} else {
-					err = check_helper_call(env, insn, &env->insn_idx);
-				}
-				if (err)
-					return err;
-
-				mark_reg_scratched(env, BPF_REG_0);
-			} else if (opcode == BPF_JA) {
-				if (BPF_SRC(insn->code) != BPF_K ||
-				    insn->src_reg != BPF_REG_0 ||
-				    insn->dst_reg != BPF_REG_0 ||
-				    (class == BPF_JMP && insn->imm != 0) ||
-				    (class == BPF_JMP32 && insn->off != 0)) {
-					verbose(env, "BPF_JA uses reserved fields\n");
-					return -EINVAL;
-				}
-
-				if (class == BPF_JMP)
-					env->insn_idx += insn->off + 1;
-				else
-					env->insn_idx += insn->imm + 1;
-				continue;
-
-			} else if (opcode == BPF_EXIT) {
-				if (BPF_SRC(insn->code) != BPF_K ||
-				    insn->imm != 0 ||
-				    insn->src_reg != BPF_REG_0 ||
-				    insn->dst_reg != BPF_REG_0 ||
-				    class == BPF_JMP32) {
-					verbose(env, "BPF_EXIT uses reserved fields\n");
-					return -EINVAL;
-				}
-process_bpf_exit_full:
-				/* We must do check_reference_leak here before
-				 * prepare_func_exit to handle the case when
-				 * state->curframe > 0, it may be a callback
-				 * function, for which reference_state must
-				 * match caller reference state when it exits.
-				 */
-				err = check_resource_leak(env, exception_exit, !env->cur_state->curframe,
-							  "BPF_EXIT instruction in main prog");
-				if (err)
-					return err;
-
-				/* The side effect of the prepare_func_exit
-				 * which is being skipped is that it frees
-				 * bpf_func_state. Typically, process_bpf_exit
-				 * will only be hit with outermost exit.
-				 * copy_verifier_state in pop_stack will handle
-				 * freeing of any extra bpf_func_state left over
-				 * from not processing all nested function
-				 * exits. We also skip return code checks as
-				 * they are not needed for exceptional exits.
-				 */
-				if (exception_exit)
-					goto process_bpf_exit;
-
-				if (state->curframe) {
-					/* exit from nested function */
-					err = prepare_func_exit(env, &env->insn_idx);
-					if (err)
-						return err;
-					do_print_state = true;
-					continue;
-				}
-
-				err = check_return_code(env, BPF_REG_0, "R0");
-				if (err)
-					return err;
+			cur_aux(env)->nospec = true;
+			/* If it was an ADD/SUB insn, potentially remove any
+			 * markings for alu sanitization.
+			 */
+			cur_aux(env)->alu_state = 0;
+			goto process_bpf_exit;
+		} else if (err < 0) {
+			return err;
+		} else if (err == PROCESS_BPF_EXIT) {
+			goto process_bpf_exit;
+		}
+		WARN_ON_ONCE(err);
+
+		if (state->speculative && cur_aux(env)->nospec_result) {
+			/* If we are on a path that performed a jump-op, this
+			 * may skip a nospec patched-in after the jump. This can
+			 * currently never happen because nospec_result is only
+			 * used for the write-ops
+			 * `*(size*)(dst_reg+off)=src_reg|imm32` which must
+			 * never skip the following insn. Still, add a warning
+			 * to document this in case nospec_result is used
+			 * elsewhere in the future.
+			 */
+			WARN_ON_ONCE(env->insn_idx != prev_insn_idx + 1);
 process_bpf_exit:
-				mark_verifier_state_scratched(env);
-				update_branch_counts(env, env->cur_state);
-				err = pop_stack(env, &prev_insn_idx,
-						&env->insn_idx, pop_log);
-				if (err < 0) {
-					if (err != -ENOENT)
-						return err;
-					break;
-				} else {
-					if (verifier_bug_if(env->cur_state->loop_entry, env,
-							    "broken loop detection"))
-						return -EFAULT;
-					do_print_state = true;
-					continue;
-				}
-			} else {
-				err = check_cond_jmp_op(env, insn, &env->insn_idx);
-				if (err)
-					return err;
-			}
-		} else if (class == BPF_LD) {
-			u8 mode = BPF_MODE(insn->code);
-
-			if (mode == BPF_ABS || mode == BPF_IND) {
-				err = check_ld_abs(env, insn);
-				if (err)
-					return err;
-
-			} else if (mode == BPF_IMM) {
-				err = check_ld_imm(env, insn);
-				if (err)
+			mark_verifier_state_scratched(env);
+			update_branch_counts(env, env->cur_state);
+			err = pop_stack(env, &prev_insn_idx, &env->insn_idx,
+					pop_log);
+			if (err < 0) {
+				if (err != -ENOENT)
 					return err;
-
-				env->insn_idx++;
-				sanitize_mark_insn_seen(env);
+				break;
 			} else {
-				verbose(env, "invalid BPF_LD mode\n");
-				return -EINVAL;
+				if (verifier_bug_if(env->cur_state->loop_entry, env,
+						    "broken loop detection"))
+					return -EFAULT;
+				do_print_state = true;
+				continue;
 			}
-		} else {
-			verbose(env, "unknown insn class %d\n", class);
-			return -EINVAL;
 		}
-
-		env->insn_idx++;
 	}
 
 	return 0;
@@ -20807,7 +20869,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 					    -(subprogs[0].stack_depth + 8));
 	if (epilogue_cnt >= INSN_BUF_SIZE) {
 		verbose(env, "bpf verifier is misconfigured\n");
-		return -EINVAL;
+		return -EFAULT;
 	} else if (epilogue_cnt) {
 		/* Save the ARG_PTR_TO_CTX for the epilogue to use */
 		cnt = 0;
@@ -20830,13 +20892,13 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 	if (ops->gen_prologue || env->seen_direct_write) {
 		if (!ops->gen_prologue) {
 			verbose(env, "bpf verifier is misconfigured\n");
-			return -EINVAL;
+			return -EFAULT;
 		}
 		cnt = ops->gen_prologue(insn_buf, env->seen_direct_write,
 					env->prog);
 		if (cnt >= INSN_BUF_SIZE) {
 			verbose(env, "bpf verifier is misconfigured\n");
-			return -EINVAL;
+			return -EFAULT;
 		} else if (cnt) {
 			new_prog = bpf_patch_insn_data(env, 0, insn_buf, cnt);
 			if (!new_prog)
@@ -20863,6 +20925,29 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		bpf_convert_ctx_access_t convert_ctx_access;
 		u8 mode;
 
+		if (env->insn_aux_data[i + delta].nospec) {
+			WARN_ON_ONCE(env->insn_aux_data[i + delta].alu_state);
+			struct bpf_insn patch[] = {
+				BPF_ST_NOSPEC(),
+				*insn,
+			};
+
+			cnt = ARRAY_SIZE(patch);
+			new_prog = bpf_patch_insn_data(env, i + delta, patch, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta += cnt - 1;
+			env->prog = new_prog;
+			insn = new_prog->insnsi + i + delta;
+			/* This can not be easily merged with the
+			 * nospec_result-case, because an insn may require a
+			 * nospec before and after itself. Therefore also do not
+			 * 'continue' here but potentially apply further
+			 * patching to insn. *insn should equal patch[1] now.
+			 */
+		}
+
 		if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
 		    insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
 		    insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
@@ -20912,7 +20997,10 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		}
 
 		if (type == BPF_WRITE &&
-		    env->insn_aux_data[i + delta].sanitize_stack_spill) {
+		    env->insn_aux_data[i + delta].nospec_result) {
+			/* nospec_result is only used to mitigate Spectre v4 and
+			 * to limit verification-time for Spectre v1.
+			 */
 			struct bpf_insn patch[] = {
 				*insn,
 				BPF_ST_NOSPEC(),
@@ -20993,7 +21081,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 
 			if (type == BPF_WRITE) {
 				verbose(env, "bpf verifier narrow ctx access misconfigured\n");
-				return -EINVAL;
+				return -EFAULT;
 			}
 
 			size_code = BPF_H;
@@ -21012,7 +21100,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 		if (cnt == 0 || cnt >= INSN_BUF_SIZE ||
 		    (ctx_field_size && !target_size)) {
 			verbose(env, "bpf verifier is misconfigured\n");
-			return -EINVAL;
+			return -EFAULT;
 		}
 
 		if (is_narrower_load && size < target_size) {
@@ -21020,7 +21108,7 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
 							off, size, size_default) * 8;
 			if (shift && cnt + 1 >= INSN_BUF_SIZE) {
 				verbose(env, "bpf verifier narrow ctx load misconfigured\n");
-				return -EINVAL;
+				return -EFAULT;
 			}
 			if (ctx_field_size <= 4) {
 				if (shift)
@@ -21785,7 +21873,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 			cnt = env->ops->gen_ld_abs(insn, insn_buf);
 			if (cnt == 0 || cnt >= INSN_BUF_SIZE) {
 				verbose(env, "bpf verifier is misconfigured\n");
-				return -EINVAL;
+				return -EFAULT;
 			}
 
 			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
@@ -22121,7 +22209,7 @@ static int do_misc_fixups(struct bpf_verifier_env *env)
 				goto patch_map_ops_generic;
 			if (cnt <= 0 || cnt >= INSN_BUF_SIZE) {
 				verbose(env, "bpf verifier is misconfigured\n");
-				return -EINVAL;
+				return -EFAULT;
 			}
 
 			new_prog = bpf_patch_insn_data(env, i + delta,
@@ -22481,7 +22569,7 @@ next_insn:
 			    !map_ptr->ops->map_poke_untrack ||
 			    !map_ptr->ops->map_poke_run) {
 				verbose(env, "bpf verifier is misconfigured\n");
-				return -EINVAL;
+				return -EFAULT;
 			}
 
 			ret = map_ptr->ops->map_poke_track(map_ptr, prog->aux);
diff --git a/tools/testing/selftests/bpf/progs/bpf_misc.h b/tools/testing/selftests/bpf/progs/bpf_misc.h
index 6e208e24ba3b..a678463e972c 100644
--- a/tools/testing/selftests/bpf/progs/bpf_misc.h
+++ b/tools/testing/selftests/bpf/progs/bpf_misc.h
@@ -231,4 +231,8 @@
 #define CAN_USE_LOAD_ACQ_STORE_REL
 #endif
 
+#if defined(__TARGET_ARCH_arm64) || defined(__TARGET_ARCH_x86)
+#define SPEC_V1
+#endif
+
 #endif
diff --git a/tools/testing/selftests/bpf/progs/verifier_and.c b/tools/testing/selftests/bpf/progs/verifier_and.c
index e97e518516b6..2b4fdca162be 100644
--- a/tools/testing/selftests/bpf/progs/verifier_and.c
+++ b/tools/testing/selftests/bpf/progs/verifier_and.c
@@ -85,8 +85,14 @@ l0_%=:	r0 = r0;					\
 
 SEC("socket")
 __description("check known subreg with unknown reg")
-__success __failure_unpriv __msg_unpriv("R1 !read_ok")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if w0 < 0x1 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R1 !read_ok'` */
+__xlated_unpriv("goto pc-1") /* `r1 = *(u32*)(r1 + 512)`, sanitized dead code */
+__xlated_unpriv("r0 = 0")
+#endif
 __naked void known_subreg_with_unknown_reg(void)
 {
 	asm volatile ("					\
diff --git a/tools/testing/selftests/bpf/progs/verifier_bounds.c b/tools/testing/selftests/bpf/progs/verifier_bounds.c
index 0eb33bb801b5..30e16153fdf1 100644
--- a/tools/testing/selftests/bpf/progs/verifier_bounds.c
+++ b/tools/testing/selftests/bpf/progs/verifier_bounds.c
@@ -620,8 +620,14 @@ l1_%=:	exit;						\
 
 SEC("socket")
 __description("bounds check mixed 32bit and 64bit arithmetic. test1")
-__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 invalid mem access 'scalar'` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("exit")
+#endif
 __naked void _32bit_and_64bit_arithmetic_test1(void)
 {
 	asm volatile ("					\
@@ -643,8 +649,14 @@ l1_%=:	exit;						\
 
 SEC("socket")
 __description("bounds check mixed 32bit and 64bit arithmetic. test2")
-__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 invalid mem access 'scalar'` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("exit")
+#endif
 __naked void _32bit_and_64bit_arithmetic_test2(void)
 {
 	asm volatile ("					\
@@ -691,9 +703,14 @@ l0_%=:	r0 = 0;						\
 
 SEC("socket")
 __description("bounds check for reg = 0, reg xor 1")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if r1 != 0x0 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("r0 = 0")
+#endif
 __naked void reg_0_reg_xor_1(void)
 {
 	asm volatile ("					\
@@ -719,9 +736,14 @@ l1_%=:	r0 = 0;						\
 
 SEC("socket")
 __description("bounds check for reg32 = 0, reg32 xor 1")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if w1 != 0x0 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("r0 = 0")
+#endif
 __naked void reg32_0_reg32_xor_1(void)
 {
 	asm volatile ("					\
@@ -747,9 +769,14 @@ l1_%=:	r0 = 0;						\
 
 SEC("socket")
 __description("bounds check for reg = 2, reg xor 3")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if r1 > 0x0 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("r0 = 0")
+#endif
 __naked void reg_2_reg_xor_3(void)
 {
 	asm volatile ("					\
@@ -829,9 +856,14 @@ l1_%=:	r0 = 0;						\
 
 SEC("socket")
 __description("bounds check for reg > 0, reg xor 3")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if r1 >= 0x0 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("r0 = 0")
+#endif
 __naked void reg_0_reg_xor_3(void)
 {
 	asm volatile ("					\
@@ -858,9 +890,14 @@ l1_%=:	r0 = 0;						\
 
 SEC("socket")
 __description("bounds check for reg32 > 0, reg32 xor 3")
-__success __failure_unpriv
-__msg_unpriv("R0 min value is outside of the allowed memory range")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if w1 >= 0x0 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 min value is outside of the allowed memory range` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("r0 = 0")
+#endif
 __naked void reg32_0_reg32_xor_3(void)
 {
 	asm volatile ("					\
diff --git a/tools/testing/selftests/bpf/progs/verifier_movsx.c b/tools/testing/selftests/bpf/progs/verifier_movsx.c
index 994bbc346d25..a4d8814eb5ed 100644
--- a/tools/testing/selftests/bpf/progs/verifier_movsx.c
+++ b/tools/testing/selftests/bpf/progs/verifier_movsx.c
@@ -245,7 +245,13 @@ l0_%=:							\
 SEC("socket")
 __description("MOV32SX, S8, var_off not u32_max, positive after s8 extension")
 __success __retval(0)
-__failure_unpriv __msg_unpriv("frame pointer is read only")
+__success_unpriv
+#ifdef SPEC_V1
+__xlated_unpriv("w0 = 0")
+__xlated_unpriv("exit")
+__xlated_unpriv("nospec") /* inserted to prevent `frame pointer is read only` */
+__xlated_unpriv("goto pc-1")
+#endif
 __naked void mov64sx_s32_varoff_2(void)
 {
 	asm volatile ("					\
@@ -267,7 +273,13 @@ l0_%=:							\
 SEC("socket")
 __description("MOV32SX, S8, var_off not u32_max, negative after s8 extension")
 __success __retval(0)
-__failure_unpriv __msg_unpriv("frame pointer is read only")
+__success_unpriv
+#ifdef SPEC_V1
+__xlated_unpriv("w0 = 0")
+__xlated_unpriv("exit")
+__xlated_unpriv("nospec") /* inserted to prevent `frame pointer is read only` */
+__xlated_unpriv("goto pc-1")
+#endif
 __naked void mov64sx_s32_varoff_3(void)
 {
 	asm volatile ("					\
diff --git a/tools/testing/selftests/bpf/progs/verifier_unpriv.c b/tools/testing/selftests/bpf/progs/verifier_unpriv.c
index db52ba66e880..4470541b5e71 100644
--- a/tools/testing/selftests/bpf/progs/verifier_unpriv.c
+++ b/tools/testing/selftests/bpf/progs/verifier_unpriv.c
@@ -572,8 +572,14 @@ l0_%=:	exit;						\
 
 SEC("socket")
 __description("alu32: mov u32 const")
-__success __failure_unpriv __msg_unpriv("R7 invalid mem access 'scalar'")
+__success __success_unpriv
 __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if r0 == 0x0 goto pc+2")
+__xlated_unpriv("nospec") /* inserted to prevent `R7 invalid mem access 'scalar'` */
+__xlated_unpriv("goto pc-1") /* sanitized dead code */
+__xlated_unpriv("exit")
+#endif
 __naked void alu32_mov_u32_const(void)
 {
 	asm volatile ("					\
@@ -738,4 +744,61 @@ l0_%=:	r0 = 0;						\
 "	::: __clobber_all);
 }
 
+SEC("socket")
+__description("unpriv: Spectre v1 path-based type confusion of scalar as stack-ptr")
+__success __success_unpriv __retval(0)
+#ifdef SPEC_V1
+__xlated_unpriv("if r0 != 0x1 goto pc+2")
+/* This nospec prevents the exploit because it forces the mispredicted (not
+ * taken) `if r0 != 0x0 goto l0_%=` to resolve before using r6 as a pointer.
+ * This causes the CPU to realize that `r6 = r9` should have never executed. It
+ * ensures that r6 always contains a readable stack slot ptr when the insn after
+ * the nospec executes.
+ */
+__xlated_unpriv("nospec")
+__xlated_unpriv("r9 = *(u8 *)(r6 +0)")
+#endif
+__naked void unpriv_spec_v1_type_confusion(void)
+{
+	asm volatile ("					\
+	r1 = 0;						\
+	*(u64*)(r10 - 8) = r1;				\
+	r2 = r10;					\
+	r2 += -8;					\
+	r1 = %[map_hash_8b] ll;				\
+	call %[bpf_map_lookup_elem];			\
+	if r0 == 0 goto l2_%=;				\
+	/* r0: pointer to a map array entry */		\
+	r2 = r10;					\
+	r2 += -8;					\
+	r1 = %[map_hash_8b] ll;				\
+	/* r1, r2: prepared call args */		\
+	r6 = r10;					\
+	r6 += -8;					\
+	/* r6: pointer to readable stack slot */	\
+	r9 = 0xffffc900;				\
+	r9 <<= 32;					\
+	/* r9: scalar controlled by attacker */		\
+	r0 = *(u64 *)(r0 + 0); /* cache miss */		\
+	if r0 != 0x0 goto l0_%=;			\
+	r6 = r9;					\
+l0_%=:	if r0 != 0x1 goto l1_%=;			\
+	r9 = *(u8 *)(r6 + 0);				\
+l1_%=:	/* leak r9 */					\
+	r9 &= 1;					\
+	r9 <<= 9;					\
+	*(u64*)(r10 - 8) = r9;				\
+	call %[bpf_map_lookup_elem];			\
+	if r0 == 0 goto l2_%=;				\
+	/* leak secret into is_cached(map[0|512]): */	\
+	r0 = *(u64 *)(r0 + 0);				\
+l2_%=:							\
+	r0 = 0;						\
+	exit;						\
+"	:
+	: __imm(bpf_map_lookup_elem),
+	  __imm_addr(map_hash_8b)
+	: __clobber_all);
+}
+
 char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c
index 5ba6e53571c8..fcea9819e359 100644
--- a/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c
+++ b/tools/testing/selftests/bpf/progs/verifier_value_ptr_arith.c
@@ -398,7 +398,8 @@ l2_%=:	r0 = 1;						\
 
 SEC("socket")
 __description("map access: mixing value pointer and scalar, 1")
-__success __failure_unpriv __msg_unpriv("R2 pointer comparison prohibited")
+__success __failure_unpriv
+__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root")
 __retval(0)
 __naked void value_pointer_and_scalar_1(void)
 {
@@ -433,6 +434,7 @@ l2_%=:	/* common instruction */			\
 l3_%=:	/* branch B */					\
 	r0 = 0x13371337;				\
 	/* verifier follows fall-through */		\
+	/* unpriv: nospec (inserted to prevent `R2 pointer comparison prohibited`) */\
 	if r2 != 0x100000 goto l4_%=;			\
 	r0 = 0;						\
 	exit;						\
@@ -450,7 +452,8 @@ l4_%=:	/* fake-dead code; targeted from branch A to	\
 
 SEC("socket")
 __description("map access: mixing value pointer and scalar, 2")
-__success __failure_unpriv __msg_unpriv("R0 invalid mem access 'scalar'")
+__success __failure_unpriv
+__msg_unpriv("R2 tried to add from different maps, paths or scalars, pointer arithmetic with it prohibited for !root")
 __retval(0)
 __naked void value_pointer_and_scalar_2(void)
 {
@@ -492,6 +495,7 @@ l4_%=:	/* fake-dead code; targeted from branch A to	\
 	 * prevent dead code sanitization, rejected	\
 	 * via branch B however				\
 	 */						\
+	/* unpriv: nospec (inserted to prevent `R0 invalid mem access 'scalar'`) */\
 	r0 = *(u8*)(r0 + 0);				\
 	r0 = 0;						\
 	exit;						\
@@ -1296,9 +1300,13 @@ l0_%=:	r0 = 1;						\
 
 SEC("socket")
 __description("map access: value_ptr -= unknown scalar, 2")
-__success __failure_unpriv
-__msg_unpriv("R0 pointer arithmetic of map value goes out of range")
+__success __success_unpriv
 __retval(1)
+#ifdef SPEC_V1
+__xlated_unpriv("r1 &= 7")
+__xlated_unpriv("nospec") /* inserted to prevent `R0 pointer arithmetic of map value goes out of range` */
+__xlated_unpriv("r0 -= r1")
+#endif
 __naked void value_ptr_unknown_scalar_2_2(void)
 {
 	asm volatile ("					\
diff --git a/tools/testing/selftests/bpf/verifier/dead_code.c b/tools/testing/selftests/bpf/verifier/dead_code.c
index ee454327e5c6..77207b498c6f 100644
--- a/tools/testing/selftests/bpf/verifier/dead_code.c
+++ b/tools/testing/selftests/bpf/verifier/dead_code.c
@@ -2,14 +2,13 @@
 	"dead code: start",
 	.insns = {
 	BPF_JMP_IMM(BPF_JA, 0, 0, 2),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_JMP_IMM(BPF_JA, 0, 0, 2),
 	BPF_MOV64_IMM(BPF_REG_0, 7),
 	BPF_JMP_IMM(BPF_JGE, BPF_REG_0, 10, -4),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 7,
 },
diff --git a/tools/testing/selftests/bpf/verifier/jmp32.c b/tools/testing/selftests/bpf/verifier/jmp32.c
index 43776f6f92f4..91d83e9cb148 100644
--- a/tools/testing/selftests/bpf/verifier/jmp32.c
+++ b/tools/testing/selftests/bpf/verifier/jmp32.c
@@ -84,11 +84,10 @@
 	BPF_JMP32_IMM(BPF_JSET, BPF_REG_7, 0x10, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP32_IMM(BPF_JGE, BPF_REG_7, 0x10, 1),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 },
 {
@@ -149,11 +148,10 @@
 	BPF_JMP32_IMM(BPF_JEQ, BPF_REG_7, 0x10, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP32_IMM(BPF_JSGE, BPF_REG_7, 0xf, 1),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 },
 {
@@ -214,11 +212,10 @@
 	BPF_JMP32_IMM(BPF_JNE, BPF_REG_7, 0x10, 1),
 	BPF_JMP_IMM(BPF_JNE, BPF_REG_7, 0x10, 1),
 	BPF_EXIT_INSN(),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 },
 {
@@ -283,11 +280,10 @@
 	BPF_JMP32_REG(BPF_JGE, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP32_IMM(BPF_JGE, BPF_REG_7, 0x7ffffff0, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -354,11 +350,10 @@
 	BPF_JMP32_REG(BPF_JGT, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JGT, BPF_REG_7, 0x7ffffff0, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -425,11 +420,10 @@
 	BPF_JMP32_REG(BPF_JLE, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP32_IMM(BPF_JLE, BPF_REG_7, 0x7ffffff0, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -496,11 +490,10 @@
 	BPF_JMP32_REG(BPF_JLT, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JSLT, BPF_REG_7, 0x7ffffff0, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -567,11 +560,10 @@
 	BPF_JMP32_REG(BPF_JSGE, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0x7ffffff0, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -638,11 +630,10 @@
 	BPF_JMP32_REG(BPF_JSGT, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JSGT, BPF_REG_7, -2, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -709,11 +700,10 @@
 	BPF_JMP32_REG(BPF_JSLE, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JSLE, BPF_REG_7, 0x7ffffff0, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
@@ -780,11 +770,10 @@
 	BPF_JMP32_REG(BPF_JSLT, BPF_REG_7, BPF_REG_8, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP32_IMM(BPF_JSLT, BPF_REG_7, -1, 1),
+	/* unpriv: nospec (inserted to prevent "R0 invalid mem access 'scalar'") */
 	BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
-	.errstr_unpriv = "R0 invalid mem access 'scalar'",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 	.retval = 2,
 	.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
diff --git a/tools/testing/selftests/bpf/verifier/jset.c b/tools/testing/selftests/bpf/verifier/jset.c
index 11fc68da735e..e901eefd774a 100644
--- a/tools/testing/selftests/bpf/verifier/jset.c
+++ b/tools/testing/selftests/bpf/verifier/jset.c
@@ -78,12 +78,11 @@
 	.insns = {
 	BPF_MOV64_IMM(BPF_REG_0, 1),
 	BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 1, 1),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.retval = 1,
 	.result = ACCEPT,
 },
@@ -136,13 +135,12 @@
 	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_get_prandom_u32),
 	BPF_ALU64_IMM(BPF_OR, BPF_REG_0, 2),
 	BPF_JMP_IMM(BPF_JSET, BPF_REG_0, 3, 1),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_MOV64_IMM(BPF_REG_0, 0),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 },
 {
@@ -154,16 +152,16 @@
 	BPF_ALU64_IMM(BPF_AND, BPF_REG_1, 0xff),
 	BPF_JMP_IMM(BPF_JSET, BPF_REG_1, 0xf0, 3),
 	BPF_JMP_IMM(BPF_JLT, BPF_REG_1, 0x10, 1),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JSET, BPF_REG_1, 0x10, 1),
 	BPF_EXIT_INSN(),
 	BPF_JMP_IMM(BPF_JGE, BPF_REG_1, 0x10, 1),
+	/* unpriv: nospec (inserted to prevent "R9 !read_ok") */
 	BPF_LDX_MEM(BPF_B, BPF_REG_8, BPF_REG_9, 0),
 	BPF_EXIT_INSN(),
 	},
 	.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
-	.errstr_unpriv = "R9 !read_ok",
-	.result_unpriv = REJECT,
 	.result = ACCEPT,
 },
