|
The kexec code will call set_pages_state() after tearing down all the GHCBs,
which will therefore result in a call to early_set_pages_state().
This means the __init annotation is wrong, and must be dropped.
Fixes: c5c30a373693 ("x86/boot: Move startup code out of __head section")
Reported-by: Srikanth Aithal <Srikanth.Aithal@amd.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Srikanth Aithal <Srikanth.Aithal@amd.com>
|
|
Document the CPUID reads that the different SEV guest types perform - the
SNP one, which relies on the presence of a CPUID table, and the SEV-ES
one, which reads the CPUID values supplied by the hypervisor.
The intent is to clarify the two back-to-back, similar CPUID
invocations.
No functional changes.
[ bp: Turn into a proper patch. ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/fbb24767-0e06-d1d6-36e0-1757d98aca66@amd.com
|
|
It is a silly oneliner anyway. Replace it with its equivalent.
No functional changes.
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
|
|
Move startup code out of the __head section, now that this no longer has
a special significance. Move everything into .text or .init.text as
appropriate, so that startup code is not kept around unnecessarily.
[ bp: Fold in hunk to fix 32-bit CPU hotplug:
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202509022207.56fd97f4-lkp@intel.com ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-45-ardb+git@google.com
|
|
In order to be able to have tight control over which code may execute
from the early 1:1 mapping of memory, but still link vmlinux as a single
executable, prefix all symbol references in startup code with __pi_, and
invoke it from outside using the __pi_ prefix.
Use objtool to check that no absolute symbol references are present in
the startup code, as these cannot be used from code running from the 1:1
mapping.
Note that this also requires disabling the latent-entropy GCC plugin, as
the global symbol references that it injects would require explicit
exports, and given that the startup code rarely executes more than once,
it is not a useful source of entropy anyway.
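For illustration, a minimal sketch of the aliasing scheme (symbol names
hypothetical): the routine is defined once in the confined startup area,
and an assembler-level alias makes it reachable under the __pi_ prefix:

  /* Hypothetical startup routine, confined to the startup area. */
  void startup_fn(void)
  {
  }

  /* Expose it to the rest of vmlinux under the __pi_ prefix. */
  asm(".globl __pi_startup_fn\n\t"
      ".set   __pi_startup_fn, startup_fn");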
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-43-ardb+git@google.com
|
|
Invoke objtool on each startup code object individually to check for the
absence of absolute relocations. This is needed because this code will
be invoked from the 1:1 mapping of memory before those absolute virtual
addresses (which are derived from the kernel virtual base address
provided to the linker and possibly shifted at boot) are mapped.
Only objects built under arch/x86/boot/startup/ have this restriction,
and once they have been incorporated into vmlinux.o, this distinction is
difficult to make. So force the invocation of objtool for each object
file individually, even if objtool is deferred to vmlinux.o for the rest
of the build. In the latter case, only pass --noabs and nothing else;
otherwise, append it to the existing objtool command line.
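To illustrate the kind of reference --noabs complains about, a sketch
(rip_rel_ptr() is the kernel's helper for taking an address
RIP-relatively):

  #include <asm/asm.h>          /* rip_rel_ptr() */

  static unsigned long startup_state;

  /* May emit an absolute relocation - wrong when run from the 1:1 map. */
  unsigned long *state_absolute(void)
  {
          return &startup_state;
  }

  /* Forces a RIP-relative LEA - valid regardless of the active mapping. */
  unsigned long *state_relative(void)
  {
          return rip_rel_ptr(&startup_state);
  }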
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-40-ardb+git@google.com
|
|
Create aliases that expose routines that are part of the startup code to
other code in the core kernel, so that they can be called later as well.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-38-ardb+git@google.com
|
|
Rename sev-nmi.c to noinstr.c, and move the get/put GHCB routines into it
too: they are also annotated as 'noinstr' and suffer from the same problem
as the NMI code, i.e., that GCC may ignore the __no_sanitize_address__
function attribute implied by 'noinstr' and insert KASAN instrumentation
anyway.
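A minimal sketch of the constraint, with a trivial body for illustration:

  #include <linux/compiler_types.h>     /* noinstr */

  /* No KASAN/KCSAN/tracing code may be emitted into this function,
   * which only compiling the whole file without KASAN guarantees. */
  noinstr unsigned long ghcb_read_field(volatile unsigned long *field)
  {
          return *field;
  }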
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-37-ardb+git@google.com
|
|
Provide PIC aliases for data objects that are shared between the SEV startup
code and the SEV code that executes later. This is needed so that the confined
startup code is permitted to access them.
This requires some of these variables to be moved into a source file that is
not part of the startup code, as the PIC alias is already implied, and
exporting variables in the opposite direction is not supported.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-36-ardb+git@google.com
|
|
To avoid having to reason about whether or not to use the per-CPU SVSM calling
area when running startup and init code on the boot CPU, reuse the boot SVSM
calling area as the per-CPU area for the BSP.
Thus, remove the need to make the per-CPU variables and associated state in
sev_cfg accessible to the startup code once confined.
[ bp: Massage commit message. ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-33-ardb+git@google.com
|
|
The early page state change API is mostly only used very early, when only the
boot time SVSM calling area is in use. However, this API is also called by the
kexec finishing code, which runs very late, and potentially from a different
CPU (which uses a different calling area).
To avoid pulling the per-CPU SVSM calling area pointers and related SEV state
into the startup code, refactor the page state change API so the SVSM calling
area virtual and physical addresses can be provided by the caller.
No functional change intended.
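A sketch of the resulting interface shape - the parameter list is
hypothetical, but conveys that the calling area's virtual and physical
addresses now come from the caller:

  #include <linux/types.h>

  struct svsm_ca;

  /* No per-CPU lookup inside: the caller knows which calling area
   * applies and passes its VA and PA explicitly. */
  void early_set_pages_state(unsigned long vaddr, unsigned long paddr,
                             unsigned long npages, int op,
                             struct svsm_ca *caa, u64 caa_pa);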
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-32-ardb+git@google.com
|
|
Both the decompressor and the SEV startup code implement the exact same
sequence for invoking the MSR based communication protocol to effectuate
a page state change.
Before tweaking the internal APIs used in both versions, merge them and
share them so those tweaks are only needed in a single place.
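The shared sequence boils down to something like the following sketch
(simplified; sev_es_wr_ghcb_msr()/sev_es_rd_ghcb_msr() are the SEV code's
wrappers around the GHCB MSR):

  #include <asm/sev-common.h>   /* GHCB_MSR_PSC_*, enum psc_op */
  #include <asm/svm.h>          /* VMGEXIT() */

  /* Request a page state change for one 4K page via the MSR protocol. */
  static bool msr_protocol_psc(u64 paddr, enum psc_op op)
  {
          u64 val;

          sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op));
          VMGEXIT();
          val = sev_es_rd_ghcb_msr();

          return GHCB_RESP_CODE(val) == GHCB_MSR_PSC_RESP &&
                 !GHCB_MSR_PSC_RESP_VAL(val);
  }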
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-31-ardb+git@google.com
|
|
The boottime SVSM calling area is used both by the startup code running
from the 1:1 mapping, and potentially later on by code running from the
ordinary kernel virtual mapping.
This SVSM calling area is statically allocated, and so its physical address
doesn't change. However, its virtual address depends on the calling context
(1:1 mapping or kernel virtual mapping), and even though the variable that
holds the virtual address of this calling area gets updated from 1:1 address
to kernel address during the boot, it is hard to reason about why this is
guaranteed to be safe.
So instead, take the RIP-relative address of the boottime SVSM calling area
whenever its virtual address is required, and only use a global variable for
the physical address.
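A simplified sketch of the resulting pattern, using the kernel's
rip_rel_ptr() helper:

  #include <asm/asm.h>  /* rip_rel_ptr() */
  #include <asm/sev.h>  /* struct svsm_ca */

  static struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE);

  /* Only the physical address lives in a global - it never changes. */
  u64 boot_svsm_caa_pa;

  /* The virtual address is derived RIP-relatively at each use, so it
   * is correct from the 1:1 mapping and the kernel mapping alike. */
  static struct svsm_ca *svsm_get_caa(void)
  {
          return rip_rel_ptr(&boot_svsm_ca_page);
  }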
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/20250828102202.1849035-30-ardb+git@google.com
|
|
Both the decompressor and the core kernel implement an early #VC handler,
which only deals with CPUID instructions, and a full-featured one, which
can handle any #VC exception.
The former communicates with the hypervisor using the MSR based protocol,
whereas the latter uses a shared GHCB page, which is configured a bit later
during the boot, when the kernel runs from its ordinary virtual mapping,
rather than the 1:1 mapping that the startup code uses.
Accessing this shared GHCB page from the core kernel's startup code is
problematic, because it involves converting the GHCB address provided by the
caller to a physical address. In the startup code, virtual to physical address
translations are problematic, given that the virtual address might be a 1:1
mapped address, and such translations should therefore be avoided.
This means that exposing startup code dealing with the GHCB to callers that
execute from the ordinary kernel virtual mapping should be avoided too. So
move all GHCB page based communication out of the startup code, now that all
communication occurring before the kernel virtual mapping is up relies on the
MSR protocol only.
As an exception, add a flag representing the need to apply the coherency
fix, in order to avoid exporting CPUID* helpers to code that runs too
early for the *cpu_has*() infrastructure.
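A sketch of such a flag, with hypothetical names - the CPUID leaf is read
directly because this code runs long before *cpu_has*() works:

  #include <linux/bits.h>
  #include <asm/processor.h>    /* native_cpuid() */

  /* Hypothetical flag name: latched once, very early during boot. */
  bool snp_coherency_fix_needed;

  static void snp_check_coherency(void)
  {
          unsigned int eax = 0x8000001f, ebx = 0, ecx = 0, edx = 0;

          native_cpuid(&eax, &ebx, &ecx, &edx);

          /* Fn8000001F_EBX[31] == COHERENCY_SFW_NO: no fix needed if set. */
          snp_coherency_fix_needed = !(ebx & BIT(31));
  }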
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/20250828102202.1849035-29-ardb+git@google.com
|
|
Determining the VMPL at which the kernel runs involves performing a RMPADJUST
operation on an arbitrary page of memory, and observing whether it succeeds.
The use of boot_ghcb_page in the core kernel in this case is completely
arbitrary, but results in the need to provide a PIC alias for it. So use
boot_svsm_ca_page instead, which already needs this alias for other reasons.
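For reference, a simplified sketch of the probe itself:

  #include <asm/sev.h>  /* rmpadjust(), RMP_PG_SIZE_4K */

  /* RMPADJUST on an arbitrary page succeeds only when running at
   * VMPL0, so its result tells us whether an SVSM sits underneath. */
  static bool running_at_vmpl0(void *va)
  {
          /* attrs == 1: target VMPL1 with no permission bits set */
          return rmpadjust((unsigned long)va, RMP_PG_SIZE_4K, 1) == 0;
  }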
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/20250828102202.1849035-28-ardb+git@google.com
|
|
The early page state change API performs an SVSM call to PVALIDATE each page
when running under an SVSM, and this involves either a GHCB page based call or
a call based on the MSR protocol.
The GHCB page based variant involves VA to PA translation of the GHCB address,
and this is best avoided in the startup code, where virtual addresses are
ambiguous (1:1 or kernel virtual).
As this is the last remaining occurrence of svsm_perform_call_protocol() in
the startup code, switch to the MSR protocol exclusively in this particular
case, so that the GHCB based plumbing can be moved out of the startup code
entirely in a subsequent patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/20250828102202.1849035-27-ardb+git@google.com
|
|
As the preceding code comment already indicates, remapping the SVSM
calling area occurs long before the GHCB page is configured, and so
calling svsm_perform_call_protocol() is guaranteed to result in a call
to svsm_perform_msr_protocol().
So just call the latter directly. This allows most of the GHCB based API
infrastructure to be moved out of the startup code in a subsequent
patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/20250828102202.1849035-26-ardb+git@google.com
|
|
There are two distinct callers of snp_cpuid(): the MSR protocol and the GHCB
page based interface.
The snp_cpuid() logic does not care about the distinction, which only matters
at a lower level. But the fact that it supports both interfaces means that the
GHCB page based logic is pulled into the early startup code where PA to VA
conversions are problematic, given that it runs from the 1:1 mapping of memory.
So keep snp_cpuid() itself in the startup code, but factor out the hypervisor
calls via a callback, so that the GHCB page handling can be moved out.
Code refactoring only - no functional change intended.
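A sketch of the factoring, with a hypothetical callback signature:
snp_cpuid() keeps the CPUID-table logic, and the transport (MSR protocol
or GHCB page) is supplied by the caller:

  #include <asm/sev.h>  /* struct cpuid_leaf */

  /* Hypothetical shape of the refactored entry point. */
  int snp_cpuid(void (*cpuid_hv)(void *ctx, struct cpuid_leaf *leaf),
                void *ctx, struct cpuid_leaf *leaf);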
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/20250828102202.1849035-25-ardb+git@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- Remove a transitional asm/cpuid.h header which was added only as a
fallback during cpuid helpers reorg
- Initialize reserved fields in the SVSM page validation calls
structure to zero in order to allow for future structure extensions
- Have the sev-guest driver's buffers used in encryption operations be
in linear mapping space as the encryption operation can be offloaded
to an accelerator
- Have writes to read-only MSRs in an AMD SNP guest trap to the
hypervisor, as is usually done. This makes the guest user experience
better by simply raising a #GP instead of terminating said guest
- Do not output AVX512 elapsed time for kernel threads because the data
is wrong, and fix a NULL pointer dereference in the process
- Adjust the SRSO mitigation selection to the new attack vectors
* tag 'x86_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpuid: Remove transitional <asm/cpuid.h> header
x86/sev: Ensure SVSM reserved fields in a page validation entry are initialized to zero
virt: sev-guest: Satisfy linear mapping requirement in get_derived_key()
x86/sev: Improve handling of writes to intercepted TSC MSRs
x86/fpu: Fix NULL dereference in avx512_status()
x86/bugs: Select best SRSO mitigation
|
|
In order to support future versions of the SVSM_CORE_PVALIDATE call, all
reserved fields within a PVALIDATE entry must be set to zero: an SVSM is
expected to verify that reserved fields are zero, so that reserved areas
can be put to use by future versions of the protocol.
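A sketch of the fix, assuming the entry layout from the SVSM spec: a
memset() of the whole entry guarantees the reserved bits stay zero no
matter how the structure grows in later protocol versions:

  #include <linux/string.h>
  #include <asm/sev.h>  /* struct svsm_pvalidate_entry, RMP_PG_SIZE_4K */

  static void pvalidate_entry_init(struct svsm_pvalidate_entry *e,
                                   u64 pfn, bool validate)
  {
          /* Zero everything first: reserved fields must be 0. */
          memset(e, 0, sizeof(*e));

          e->page_size = RMP_PG_SIZE_4K;
          e->action    = validate;
          e->pfn       = pfn;
  }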
Fixes: fcd042e86422 ("x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0")
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Joerg Roedel <joerg.roedel@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/7cde412f8b057ea13a646fb166b1ca023f6a5031.1755098819.git.thomas.lendacky@amd.com
|
|
An SNP cache coherency vulnerability requires a cache line eviction
mitigation when validating memory after a page state change to private.
The specific mitigation is to touch the first and last byte of each 4K
page that is being validated. There is no need to perform the mitigation
when performing a page state change to shared and rescinding validation.
CPUID Fn8000001F_EBX[31] defines the COHERENCY_SFW_NO bit which, when
set, indicates that the software mitigation for this vulnerability is
not needed.
Implement the mitigation and invoke it when validating memory (making it
private) and the COHERENCY_SFW_NO bit is not set, indicating the SNP
guest is vulnerable.
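The mitigation itself is tiny - a sketch close in spirit to the helper
the kernel gained for this (details may differ):

  #include <linux/compiler.h>   /* __always_unused */
  #include <linux/types.h>
  #include <asm/page.h>         /* PAGE_SIZE */

  /* Touch the first and last byte of each 4K page being validated. */
  static inline void evict_cache_lines(void *va, int npages)
  {
          volatile u8 val __always_unused;
          u8 *bytes = va;
          int i;

          for (i = 0; i < npages; i++, bytes += PAGE_SIZE) {
                  val = bytes[0];
                  val = bytes[PAGE_SIZE - 1];
          }
  }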
Co-developed-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
|
|
Both Intel and AMD CPUs support 5-level paging, which is expected to
become more widely adopted in the future. All major x86 Linux
distributions have the feature enabled.
Remove CONFIG_X86_5LEVEL and the related #ifdeffery to make the code more
readable.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250516123306.3812286-4-kirill.shutemov@linux.intel.com
|
|
The main CPUID header <asm/cpuid.h> was originally a storefront for the
headers:
  <asm/cpuid/api.h>
  <asm/cpuid/leaf_0x2_api.h>
Now that the latter CPUID(0x2) header has been merged into the former,
there is no practical difference between <asm/cpuid.h> and
<asm/cpuid/api.h>.
Migrate all users to the <asm/cpuid/api.h> header, in preparation of
the removal of <asm/cpuid.h>.
Don't remove <asm/cpuid.h> just yet, in case some new code in -next
started using it.
Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: x86-cpuid@lists.linux.dev
Link: https://lore.kernel.org/r/20250508150240.172915-3-darwi@linutronix.de
|
|
The global pseudo-constants 'page_offset_base', 'vmalloc_base' and
'vmemmap_base' are not used extremely early during the boot, and cannot be
used safely until after the KASLR memory randomization code in
kernel_randomize_memory() executes, which may update their values.
So there is no point in setting these variables extremely early, and it
can wait until after the kernel itself is mapped and running from its
permanent virtual mapping.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250513111157.717727-9-ardb+git@google.com
|
|
Conflicts:
  arch/x86/boot/startup/sme.c
  arch/x86/coco/sev/core.c
  arch/x86/kernel/fpu/core.c
  arch/x86/kernel/fpu/xstate.c
Semantic conflict:
  arch/x86/include/asm/sev-internal.h
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
Most of the SEV support code used to reside in a single C source file
that was included in two places: the core kernel, and the decompressor.
The code that is actually shared with the decompressor was moved into a
separate, shared source file under startup/, on the basis that the
decompressor also executes from the early 1:1 mapping of memory.
However, while the elaborate #VC handling and instruction decoding that
it involves is also performed by the decompressor, it does not actually
occur in the core kernel at early boot, and therefore, does not need to
be part of the confined early startup code.
So split off the #VC handling code and move it back into arch/x86/coco
where it came from, into another C source file that is included from
both the decompressor and the core kernel.
Code movement only - no functional change intended.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250504095230.2932860-31-ardb+git@google.com
|
|
The SME encryption startup code populates page tables using the ordinary
set_pXX() helpers, and in a PTI build, these will call out to
__pti_set_user_pgtbl() to manipulate the shadow copy of the page tables
for user space.
This is unneeded for the startup code, which only manipulates the
swapper page tables, and so this call could be avoided in this
particular case. So instead of exposing the ordinary
__pti_set_user_pgtbl() to the startup code after it gets confined into
its own symbol space, provide an alternative which just returns pgd,
which is always correct in the startup context.
Annotate it as __weak for now, this will be dropped in a subsequent
patch.
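The alternative amounts to the following sketch, matching the
description above (the real prototype lives in the x86 pgtable headers):

  #include <linux/compiler.h>   /* __weak */
  #include <asm/pgtable.h>

  /* Startup-local stand-in: only the swapper page tables are touched
   * here, so there is no user-space shadow to maintain - returning
   * pgd unchanged is always correct in this context. */
  pgd_t __weak __pti_set_user_pgtbl(pgd_t *pgdp, pgd_t pgd)
  {
          return pgd;
  }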
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250504095230.2932860-40-ardb+git@google.com
|
|
__supported_pte_mask is statically initialized to U64_MAX and never
assigned until long after the startup code executes that creates the
initial page tables. So applying the mask is unnecessary, and can be
avoided.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250504095230.2932860-27-ardb+git@google.com
|
|
Move early_setup_gdt() out of the startup code that is callable from the
1:1 mapping - it is not needed there, and it is better to directly expose
the helper that does reside in __head.
This reduces the amount of code that needs special checks for 1:1
execution suitability. In particular, it avoids dealing with the GHCB
page (and its physical address) in startup code, which runs from the
1:1 mapping, making physical to virtual translations ambiguous.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250504095230.2932860-26-ardb+git@google.com
|
|
This commit breaks SNP guests:
  234cf67fc3bd ("x86/sev: Split off startup code from core code")
The SNP guest boots, but no longer has access to the VMPCK keys needed
to communicate with the ASP, which is used, for example, to obtain an
attestation report.
The secrets_pa value is defined as static in both startup.c and
core.c. It is set by a function in startup.c and so when used in
core.c its value will be 0.
Share it again and add the sev_ prefix to put it into the global
SEV symbols namespace.
[ mingo: Renamed to sev_secrets_pa ]
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Link: https://lore.kernel.org/r/cf878810-81ed-3017-52c6-ce6aa41b5f01@amd.com
|
|
objtool already struggles to identify jump tables correctly in non-PIC
code, where the idiom is something like

  jmpq *table(,%idx,8)

and the table is a list of absolute addresses of jump targets.
When using -fPIC, both the table reference as well as the jump targets
are emitted in a RIP-relative manner, resulting in something like

  leaq   table(%rip), %tbl
  movslq (%tbl,%idx,4), %offset
  addq   %offset, %tbl
  jmpq   *%tbl

and the table is a list of offsets of the jump targets relative to the
start of the entire table.
Considering that this sequence of instructions can be interleaved with
other instructions that have nothing to do with the jump table in
question, it is extremely difficult to infer the control flow by
deriving the jump targets from the indirect jump, the location of the
table and the relative offsets it contains.
So let's not bother and disable jump tables for code built with -fPIC
under arch/x86/boot/startup.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250422210510.600354-2-ardb+git@google.com
|
|
Now that the early SEV code is built with -fPIC, RIP_REL_REF() has no
effect and can be dropped.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20250418141253.2601348-13-ardb+git@google.com
|
|
Move the SEV startup code into arch/x86/boot/startup/, where it will
reside along with other code that executes extremely early, and
therefore needs to be built in a special manner.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20250418141253.2601348-12-ardb+git@google.com
|
|
When building with CONFIG_LTO_CLANG, there is an error in the x86 boot
startup code because it builds with a different code model than the rest
of the kernel:
  ld.lld: error: Function Import: link error: linking module flags 'Code Model': IDs have conflicting values: 'i32 2' from vmlinux.a(head64.o at 1302448), and 'i32 1' from vmlinux.a(map_kernel.o at 1314208)
  ld.lld: error: Function Import: link error: linking module flags 'Code Model': IDs have conflicting values: 'i32 2' from vmlinux.a(common.o at 1306108), and 'i32 1' from vmlinux.a(gdt_idt.o at 1314148)
As this directory is for code that only runs during early system
initialization, LTO is not very important, so filter out the LTO flags
from KBUILD_CFLAGS for arch/x86/boot/startup to resolve the build error.
Fixes: 4cecebf200ef ("x86/boot: Move the early GDT/IDT setup code into startup/")
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20250414-x86-boot-startup-lto-error-v1-1-7c8bed7c131c@kernel.org
Closes: https://lore.kernel.org/CA+G9fYvnun+bhYgtt425LWxzOmj+8Jf3ruKeYxQSx-F6U7aisg@mail.gmail.com/
|
|
RIP_REL_REF() has no effect on code residing in arch/x86/boot/startup,
as it is built with -fPIC. So remove any occurrences from the SME
startup code.
Note that the SME code is the only caller of cc_set_mask() that requires this, so
drop it from there as well.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250410134117.3713574-19-ardb+git@google.com
|
|
Move the SME initialization code, which runs from the 1:1 mapping of
memory as it operates on the kernel virtual mapping, into the new
sub-directory arch/x86/boot/startup/, where all startup code that needs
to tolerate executing from the 1:1 mapping will reside.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250410134117.3713574-18-ardb+git@google.com
|
|
Now that __startup_64() is built using -fPIC, RIP_REL_REF() has become a
NOP and can be removed. Only some occurrences of rip_rel_ptr() will
remain, to explicitly take the address of certain global structures in
the 1:1 mapping of memory.
While at it, update the code comment to describe why this is needed.
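For context, a sketch of what rip_rel_ptr() does (see <asm/asm.h> for
the real definition): it forces the address to be taken via a
RIP-relative LEA, which yields the correct pointer from the 1:1 mapping
as well:

  #include <linux/compiler.h>   /* __always_inline */

  static __always_inline void *rip_rel_ptr(void *p)
  {
          /* %c1 emits the bare symbol address; (%rip) makes it relative. */
          asm("leaq %c1(%%rip), %0" : "=r" (p) : "i" (p));

          return p;
  }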
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250410134117.3713574-17-ardb+git@google.com
|
|
The startup code that constructs the kernel virtual mapping runs from
the 1:1 mapping of memory itself, and therefore, cannot use absolute
symbol references. Before making changes in subsequent patches, move
this code into a separate source file under arch/x86/boot/startup/ where
all such code will be kept from now on.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250410134117.3713574-16-ardb+git@google.com
|
|
Move the early GDT/IDT setup code that runs long before the kernel
virtual mapping is up into arch/x86/boot/startup/, and build it in a way
that ensures that the code tolerates being called from the 1:1 mapping
of memory. The code itself is left unchanged by this patch.
Also tweak the sed symbol matching pattern in the decompressor to match
on lower case 't' or 'b', as these will be emitted by Clang for symbols
with hidden linkage.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Dionna Amalie Glaze <dionnaglaze@google.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Kevin Loughlin <kevinloughlin@google.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Tom Lendacky <thomas.lendacky@amd.com>
Cc: linux-efi@vger.kernel.org
Link: https://lore.kernel.org/r/20250410134117.3713574-15-ardb+git@google.com
|
|
The library code built under arch/x86/boot/startup is not intended to be
linked into vmlinux but only into the decompressor and/or the EFI stub.
This means objtool validation is not needed here, and may result in
false positive errors for things like missing retpolines.
So disable it for all objects added to lib-y.
Tested-by: Chaitanya Kumar Borah <chaitanya.kumar.borah@intel.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Len Brown <len.brown@intel.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20250408085254.836788-10-ardb+git@google.com
|
|
Linus expressed a strong preference for arch-specific asm code (i.e.,
virtually all of it) to reside under arch/ rather than anywhere else.
So move the EFI mixed mode startup code back, and put it under
arch/x86/boot/startup/ where all shared x86 startup code is going to
live.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20250401133416.1436741-11-ardb+git@google.com
|
|
The 5-level paging trampoline is used by both the EFI stub and the
traditional decompressor. Move it out of the decompressor sources into
the newly minted arch/x86/boot/startup/ sub-directory which will hold
startup code that may be shared between the decompressor, the EFI stub
and the kernel proper, and needs to tolerate being called during early
boot, before the kernel virtual mapping has been created.
This will allow the 5-level paging trampoline to be used by EFI boot
images such as zboot that omit the traditional decompressor entirely.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: David Woodhouse <dwmw@amazon.co.uk>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250401133416.1436741-10-ardb+git@google.com
|