summaryrefslogtreecommitdiffstats
path: root/arch/x86/coco
AgeCommit message (Collapse)AuthorLines
13 daysx86/sev: Allow IBPB-on-Entry feature for SNP guestsKim Phillips-0/+1
The SEV-SNP IBPB-on-Entry feature does not require a guest-side implementation. It was added in Zen5 h/w, after the first SNP Zen implementation, and thus was not accounted for when the initial set of SNP features were added to the kernel. In its abundant precaution, commit 8c29f0165405 ("x86/sev: Add SEV-SNP guest feature negotiation support") included SEV_STATUS' IBPB-on-Entry bit as a reserved bit, thereby masking guests from using the feature. Allow guests to make use of IBPB-on-Entry when supported by the hypervisor, as the bit is now architecturally defined and safe to expose. Fixes: 8c29f0165405 ("x86/sev: Add SEV-SNP guest feature negotiation support") Signed-off-by: Kim Phillips <kim.phillips@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikunj A Dadhania <nikunj@amd.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Cc: stable@kernel.org Link: https://patch.msgid.link/20260203222405.4065706-2-kim.phillips@amd.com
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds-3/+3
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook-3/+3
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-10Merge tag 'x86_sev_for_v7.0_rc1' of ↵Linus Torvalds-387/+493
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Make the SEV internal header really internal and carve out the SVSM-specific code into a separate compilation unit, along with other cleanups and fixups [ TLA translation service: 'SEV' is AMD's 'Secure Encrypted Virtualization' and SVSM is an ETLA ('Enhanced TLA') for 'Secure VM Service Module'. Some of us have trouble keeping track of this all and need all the help we can get ] * tag 'x86_sev_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Don't emit BSS_DECRYPTED section unless it is in use x86/sev: Use kfree_sensitive() when freeing a SNP message descriptor x86/sev: Rename sev_es_ghcb_handle_msr() to __vc_handle_msr() x86/sev: Carve out the SVSM code into a separate compilation unit x86/sev: Add internal header guards x86/sev: Move the internal header
2026-01-20x86/sev: Use kfree_sensitive() when freeing a SNP message descriptorBorislav Petkov (AMD)-2/+1
Use the proper helper instead of an open-coded variant. Closes: https://lore.kernel.org/r/202512202235.WHPQkLZu-lkp@intel.com Reported-by: kernel test robot <lkp@intel.com> Reported-by: Julia Lawall <julia.lawall@inria.fr> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/20260112114147.GBaWTd-8HSy_Xp4S3X@fat_crate.local
2026-01-19x86/sev: Rename sev_es_ghcb_handle_msr() to __vc_handle_msr()Borislav Petkov (AMD)-5/+5
Forgot to do that during the Secure AVIC review. :-\ No functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/20260119141310.29605-1-bp@kernel.org
2026-01-05x86/sev: Disable GCOV on noinstr objectBrendan Jackman-0/+2
With Debian clang version 19.1.7 (3+build5) there are calls to kasan_check_write() from __sev_es_nmi_complete(), which violates noinstr. Fix it by disabling GCOV for the noinstr object, as has been done for previous such instrumentation issues. Note that this file already disables __SANITIZE_ADDRESS__ and __SANITIZE_THREAD__, thus calls like kasan_check_write() ought to be nops regardless of GCOV. This has been fixed in other patches. However, to avoid any other accidental instrumentation showing up, (and since, in principle GCOV is instrumentation and hence should be disabled for noinstr code anyway), disable GCOV overall as well. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Marco Elver <elver@google.com> Link: https://patch.msgid.link/20251216-gcov-inline-noinstr-v3-3-10244d154451@google.com
2025-12-31x86/sev: Carve out the SVSM code into a separate compilation unitBorislav Petkov (AMD)-378/+392
Move the SVSM-related machinery into a separate compilation unit in order to keep sev/core.c slim and "on-topic". No functional changes. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251204124809.31783-4-bp@kernel.org
2025-12-31x86/sev: Add internal header guardsBorislav Petkov (AMD)-0/+3
All headers need guards ifdeffery. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251204124809.31783-3-bp@kernel.org
2025-12-31x86/sev: Move the internal headerBorislav Petkov (AMD)-3/+93
Move the internal header out of the usual include/asm/ include path because having an "internal" header there doesn't really make it internal - quite the opposite - that's the normal arch include path. So move where it belongs and make it really internal. No functional changes. Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20251204145716.GDaTGhTEHNOtSdTkEe@fat_crate.local
2025-11-11x86/coco/sev: Convert has_cpuflag() to use cpu_feature_enabled()Borislav Petkov (AMD)-2/+1
Drop one redundant definition, while at it. There should be no functional changes. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251031122122.GKaQSpwhLvkinKKbjG@fat_crate.local
2025-10-30x86/sev: Include XSS value in GHCB CPUID requestJohn Allen-0/+11
When a guest issues a CPUID instruction for Fn0000000D_x01, the hypervisor may be intercepting the CPUID instruction and need to access the guest XSS value. For SEV-ES, the XSS value is encrypted and needs to be included in the GHCB to be visible to the hypervisor. Signed-off-by: John Allen <john.allen@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://patch.msgid.link/all/20250924200852.4452-3-john.allen@amd.com/
2025-09-05Merge branch 'x86/apic' into x86/sev, to resolve conflictIngo Molnar-5/+121
Conflicts: arch/x86/include/asm/sev-internal.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-09-04x86/apic/savic: Do not use snp_abort()Borislav Petkov (AMD)-2/+2
This function is going away so replace the callsites with the equivalent functionality. Add a new SAVIC-specific termination reason. If more granularity is needed there, it will be revisited in the future. Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-03x86/boot: Create a confined code area for startup codeArd Biesheuvel-1/+1
In order to be able to have tight control over which code may execute from the early 1:1 mapping of memory, but still link vmlinux as a single executable, prefix all symbol references in startup code with __pi_, and invoke it from outside using the __pi_ prefix. Use objtool to check that no absolute symbol references are present in the startup code, as these cannot be used from code running from the 1:1 mapping. Note that this also requires disabling the latent-entropy GCC plugin, as the global symbol references that it injects would require explicit exports, and given that the startup code rarely executes more than once, it is not a useful source of entropy anyway. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-43-ardb+git@google.com
2025-09-03x86/sev: Move __sev_[get|put]_ghcb() into separate noinstr objectArd Biesheuvel-4/+78
Rename sev-nmi.c to noinstr.c, and move the get/put GHCB routines into it too, which are also annotated as 'noinstr' and suffer from the same problem as the NMI code, i.e., that GCC may ignore the __no_sanitize_address__ function attribute implied by 'noinstr' and insert KASAN instrumentation anyway. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-37-ardb+git@google.com
2025-09-03x86/sev: Provide PIC aliases for SEV related data objectsArd Biesheuvel-0/+34
Provide PIC aliases for data objects that are shared between the SEV startup code and the SEV code that executes later. This is needed so that the confined startup code is permitted to access them. This requires some of these variables to be moved into a source file that is not part of the startup code, as the PIC alias is already implied, and exporting variables in the opposite direction is not supported. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-36-ardb+git@google.com
2025-09-03x86/sev: Use boot SVSM CA for all startup and init codeArd Biesheuvel-25/+22
To avoid having to reason about whether or not to use the per-CPU SVSM calling area when running startup and init code on the boot CPU, reuse the boot SVSM calling area as the per-CPU area for the BSP. Thus, remove the need to make the per-CPU variables and associated state in sev_cfg accessible to the startup code once confined. [ bp: Massage commit message. ] Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-33-ardb+git@google.com
2025-09-03x86/sev: Pass SVSM calling area down to early page state change APIArd Biesheuvel-2/+5
The early page state change API is mostly only used very early, when only the boot time SVSM calling area is in use. However, this API is also called by the kexec finishing code, which runs very late, and potentially from a different CPU (which uses a different calling area). To avoid pulling the per-CPU SVSM calling area pointers and related SEV state into the startup code, refactor the page state change API so the SVSM calling area virtual and physical addresses can be provided by the caller. No functional change intended. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-32-ardb+git@google.com
2025-09-03x86/sev: Avoid global variable to store virtual address of SVSM areaArd Biesheuvel-9/+0
The boottime SVSM calling area is used both by the startup code running from a 1:1 mapping, and potentially later on running from the ordinary kernel mapping. This SVSM calling area is statically allocated, and so its physical address doesn't change. However, its virtual address depends on the calling context (1:1 mapping or kernel virtual mapping), and even though the variable that holds the virtual address of this calling area gets updated from 1:1 address to kernel address during the boot, it is hard to reason about why this is guaranteed to be safe. So instead, take the RIP-relative address of the boottime SVSM calling area whenever its virtual address is required, and only use a global variable for the physical address. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/20250828102202.1849035-30-ardb+git@google.com
2025-09-03x86/sev: Move GHCB page based HV communication out of startup codeArd Biesheuvel-0/+172
Both the decompressor and the core kernel implement an early #VC handler, which only deals with CPUID instructions, and full featured one, which can handle any #VC exception. The former communicates with the hypervisor using the MSR based protocol, whereas the latter uses a shared GHCB page, which is configured a bit later during the boot, when the kernel runs from its ordinary virtual mapping, rather than the 1:1 mapping that the startup code uses. Accessing this shared GHCB page from the core kernel's startup code is problematic, because it involves converting the GHCB address provided by the caller to a physical address. In the startup code, virtual to physical address translations are problematic, given that the virtual address might be a 1:1 mapped address, and such translations should therefore be avoided. This means that exposing startup code dealing with the GHCB to callers that execute from the ordinary kernel virtual mapping should be avoided too. So move all GHCB page based communication out of the startup code, now that all communication occurring before the kernel virtual mapping is up relies on the MSR protocol only. As an exception, add a flag representing the need to apply the coherency fix in order to avoid exporting CPUID* helpers because of the code running too early for the *cpu_has* infrastructure. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828102202.1849035-29-ardb+git@google.com
2025-09-01x86/sev: Prevent SECURE_AVIC_CONTROL MSR interception for Secure AVIC guestsNeeraj Upadhyay-0/+9
The SECURE_AVIC_CONTROL MSR holds the GPA of the guest APIC backing page and bitfields to control enablement of Secure AVIC and whether the guest allows NMIs to be injected by the hypervisor. This MSR is populated by the guest and can be read by the guest to get the GPA of the APIC backing page. The MSR can only be accessed in Secure AVIC mode. Any attempt to access it when not in Secure AVIC mode results in #GP. So, the hypervisor should not intercept it. A #VC exception will be generated otherwise. If this occurs and Secure AVIC is enabled, terminate the guest execution. Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828113119.209135-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add kexec support for Secure AVICNeeraj Upadhyay-0/+23
Add a apic->teardown() callback to disable Secure AVIC before rebooting into the new kernel. This ensures that the new kernel does not access the old APIC backing page which was allocated by the previous kernel. Such accesses can happen if there are any APIC accesses done during the guest boot before Secure AVIC driver probe is done by the new kernel (as Secure AVIC would have remained enabled in the Secure AVIC control MSR). Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828112008.209013-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/sev: Enable NMI support for Secure AVICKishon Vijay Abraham I-1/+1
Now that support to send NMI IPI and support to inject NMI from the hypervisor has been added, set V_NMI_ENABLE in the VINTR_CTRL field of the VMSA to enable NMI for Secure AVIC guests. [ bp: Zap useless brackets. ] Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828111315.208959-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/sev: Initialize VGIF for secondary vCPUs for Secure AVICKishon Vijay Abraham I-0/+3
Virtual GIF (VGIF) provides masking capability for when virtual interrupts (virtual maskable interrupts, virtual NMIs) can be taken by the guest vCPU. The Secure AVIC hardware reads VGIF state from the vCPU's VMSA. So, set VGIF for secondary CPUs (the configuration for the boot CPU is done by the hypervisor), to unmask delivery of virtual interrupts to the vCPU. Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828111141.208920-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Support LAPIC timer for Secure AVICNeeraj Upadhyay-0/+26
Secure AVIC requires the LAPIC timer to be emulated by the hypervisor. KVM already supports emulating the LAPIC timer using hrtimers. In order to emulate it, APIC_LVTT, APIC_TMICT and APIC_TDCR register values need to be propagated to the hypervisor for arming the timer. APIC_TMCCT register value has to be read from the hypervisor, which is required for calibrating the APIC timer. So, read/write all APIC timer registers from/to the hypervisor. Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110926.208866-1-Neeraj.Upadhyay@amd.com
2025-09-01x86/apic: Add support to send IPI for Secure AVICNeeraj Upadhyay-5/+34
Secure AVIC hardware accelerates only Self-IPI, i.e. on WRMSR to APIC_SELF_IPI and APIC_ICR (with destination shorthand equal to Self) registers, hardware takes care of updating the APIC_IRR in the APIC backing page of the vCPU. For other IPI types (cross-vCPU, broadcast IPIs), software needs to take care of updating the APIC_IRR state of the target vCPUs and to ensure that the target vCPUs notice the new pending interrupt. Add new callbacks in the Secure AVIC driver for sending IPI requests. These callbacks update the IRR in the target guest vCPU's APIC backing page. To ensure that the remote vCPU notices the new pending interrupt, reuse the GHCB MSR handling code in vc_handle_msr() to issue APIC_ICR MSR-write GHCB protocol event to the hypervisor. For Secure AVIC guests, on APIC_ICR write MSR exits, the hypervisor notifies the target vCPU by either sending an AVIC doorbell (if target vCPU is running) or by waking up the non-running target vCPU. Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828110824.208851-1-Neeraj.Upadhyay@amd.com
2025-08-31x86/apic: Initialize Secure AVIC APIC backing pageNeeraj Upadhyay-0/+22
With Secure AVIC, the APIC backing page is owned and managed by the guest. Allocate and initialize APIC backing page for all guest CPUs. The NPT entry for a vCPU's APIC backing page must always be present when the vCPU is running in order for Secure AVIC to function. A VMEXIT_BUSY is returned on VMRUN and the vCPU cannot be resumed otherwise. To handle this, notify GPA of the vCPU's APIC backing page to the hypervisor by using the SVM_VMGEXIT_SECURE_AVIC GHCB protocol event. Before executing VMRUN, the hypervisor makes use of this information to make sure the APIC backing page is mapped in the NPT. [ bp: Massage commit message. ] Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828070334.208401-3-Neeraj.Upadhyay@amd.com
2025-08-28x86/apic: Add new driver for Secure AVICNeeraj Upadhyay-0/+4
The Secure AVIC feature provides SEV-SNP guests hardware acceleration for performance sensitive APIC accesses while securely managing the guest-owned APIC state through the use of a private APIC backing page. This helps prevent the hypervisor from generating unexpected interrupts for a vCPU or otherwise violate architectural assumptions around the APIC behavior. Add a new x2APIC driver that will serve as the base of the Secure AVIC support. It is initially the same as the x2APIC physical driver (without IPI callbacks), but will be modified as features are implemented. As the new driver does not implement Secure AVIC features yet, if the hypervisor sets the Secure AVIC bit in SEV_STATUS, maintain the existing behavior to enforce the guest termination. [ bp: Massage commit message. ] Co-developed-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Kishon Vijay Abraham I <kvijayab@amd.com> Signed-off-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/20250828070334.208401-2-Neeraj.Upadhyay@amd.com
2025-08-28x86/sev: Separate MSR and GHCB based snp_cpuid() via a callbackArd Biesheuvel-1/+48
There are two distinct callers of snp_cpuid(): the MSR protocol and the GHCB page based interface. The snp_cpuid() logic does not care about the distinction, which only matters at a lower level. But the fact that it supports both interfaces means that the GHCB page based logic is pulled into the early startup code where PA to VA conversions are problematic, given that it runs from the 1:1 mapping of memory. So keep snp_cpuid() itself in the startup code, but factor out the hypervisor calls via a callback, so that the GHCB page handling can be moved out. Code refactoring only - no functional change intended. Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/20250828102202.1849035-25-ardb+git@google.com
2025-08-17Merge tag 'x86_urgent_for_v6.17_rc2' of ↵Linus Torvalds-15/+18
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Borislav Petkov: - Remove a transitional asm/cpuid.h header which was added only as a fallback during cpuid helpers reorg - Initialize reserved fields in the SVSM page validation calls structure to zero in order to allow for future structure extensions - Have the sev-guest driver's buffers used in encryption operations be in linear mapping space as the encryption operation can be offloaded to an accelerator - Have a read-only MSR write when in an AMD SNP guest trap to the hypervisor as it is usually done. This makes the guest user experience better by simply raising a #GP instead of terminating said guest - Do not output AVX512 elapsed time for kernel threads because the data is wrong and fix a NULL pointer dereferencing in the process - Adjust the SRSO mitigation selection to the new attack vectors * tag 'x86_urgent_for_v6.17_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpuid: Remove transitional <asm/cpuid.h> header x86/sev: Ensure SVSM reserved fields in a page validation entry are initialized to zero virt: sev-guest: Satisfy linear mapping requirement in get_derived_key() x86/sev: Improve handling of writes to intercepted TSC MSRs x86/fpu: Fix NULL dereference in avx512_status() x86/bugs: Select best SRSO mitigation
2025-08-15x86/sev: Ensure SVSM reserved fields in a page validation entry are ↵Tom Lendacky-0/+2
initialized to zero In order to support future versions of the SVSM_CORE_PVALIDATE call, all reserved fields within a PVALIDATE entry must be set to zero as an SVSM should be ensuring all reserved fields are zero in order to support future usage of reserved areas based on the protocol version. Fixes: fcd042e86422 ("x86/sev: Perform PVALIDATE using the SVSM when not at VMPL0") Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Joerg Roedel <joerg.roedel@amd.com> Cc: <stable@kernel.org> Link: https://lore.kernel.org/7cde412f8b057ea13a646fb166b1ca023f6a5031.1755098819.git.thomas.lendacky@amd.com
2025-08-12x86/sev: Improve handling of writes to intercepted TSC MSRsNikunj A Dadhania-15/+16
Currently, when a Secure TSC enabled SNP guest attempts to write to the intercepted GUEST_TSC_FREQ MSR (a read-only MSR), the guest kernel response incorrectly implies a VMM configuration error, when in fact it is the usual VMM configuration to intercept writes to read-only MSRs, unless explicitly documented. Modify the intercepted TSC MSR #VC handling: * Write to GUEST_TSC_FREQ will generate a #GP instead of terminating the guest * Write to MSR_IA32_TSC will generate a #GP instead of silently ignoring it However, continue to terminate the guest when reading from intercepted GUEST_TSC_FREQ MSR with Secure TSC enabled, as intercepted reads indicate an improper VMM configuration for Secure TSC enabled SNP guests. [ bp: simplify comment. ] Fixes: 38cc6495cdec ("x86/sev: Prevent GUEST_TSC_FREQ MSR interception for Secure TSC enabled guests") Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Link: https://lore.kernel.org/20250722074853.22253-1-nikunj@amd.com
2025-08-06x86/sev: Evict cache lines during SNP memory validationTom Lendacky-0/+21
An SNP cache coherency vulnerability requires a cache line eviction mitigation when validating memory after a page state change to private. The specific mitigation is to touch the first and last byte of each 4K page that is being validated. There is no need to perform the mitigation when performing a page state change to shared and rescinding validation. CPUID bit Fn8000001F_EBX[31] defines the COHERENCY_SFW_NO CPUID bit that, when set, indicates that the software mitigation for this vulnerability is not needed. Implement the mitigation and invoke it when validating memory (making it private) and the COHERENCY_SFW_NO bit is not set, indicating the SNP guest is vulnerable. Co-developed-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Michael Roth <michael.roth@amd.com> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Thomas Gleixner <tglx@linutronix.de>
2025-07-29Merge tag 'x86_sev_for_v6.17_rc1' of ↵Linus Torvalds-42/+56
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 SEV updates from Borislav Petkov: - Map the SNP calling area pages too so that OVMF EFI fw can issue SVSM calls properly with the goal of implementing EFI variable store in the SVSM - a component which is trusted by the guest, vs in the firmware, which is not - Allow the kernel to handle #VC exceptions from EFI runtime services properly when running as a SNP guest - Rework and cleanup the SNP guest request issue glue code a bit * tag 'x86_sev_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Let sev_es_efi_map_ghcbs() map the CA pages too x86/sev/vc: Fix EFI runtime instruction emulation x86/sev: Drop unnecessary parameter in snp_issue_guest_request() x86/sev: Document requirement for linear mapping of guest request buffers x86/sev: Allocate request in TSC_INFO_REQ on stack virt: sev-guest: Contain snp_guest_request_ioctl in sev-guest
2025-07-15x86/sev: Work around broken noinstr on GCCArd Biesheuvel-1/+2
Forcibly disable KCSAN for the sev-nmi.c source file, which only contains functions annotated as 'noinstr' but is emitted with calls to KCSAN instrumentation nonetheless. E.g., vmlinux.o: error: objtool: __sev_es_nmi_complete+0x58: call to __kcsan_check_access() leaves .noinstr.text section make[2]: *** [/usr/local/google/home/ardb/linux/scripts/Makefile.vmlinux_o:72: vmlinux.o] Error 1 Fixes: a3cbbb4717e1 ("x86/boot: Move SEV startup code into startup/") Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Link: https://lore.kernel.org/20250714073402.4107091-2-ardb+git@google.com
2025-07-01x86/sev: Use TSC_FACTOR for Secure TSC frequency calculationNikunj A Dadhania-3/+19
When using Secure TSC, the GUEST_TSC_FREQ MSR reports a frequency based on the nominal P0 frequency, which deviates slightly (typically ~0.2%) from the actual mean TSC frequency due to clocking parameters. Over extended VM uptime, this discrepancy accumulates, causing clock skew between the hypervisor and a SEV-SNP VM, leading to early timer interrupts as perceived by the guest. The guest kernel relies on the reported nominal frequency for TSC-based timekeeping, while the actual frequency set during SNP_LAUNCH_START may differ. This mismatch results in inaccurate time calculations, causing the guest to perceive hrtimers as firing earlier than expected. Utilize the TSC_FACTOR from the SEV firmware's secrets page (see "Secrets Page Format" in the SNP Firmware ABI Specification) to calculate the mean TSC frequency, ensuring accurate timekeeping and mitigating clock skew in SEV-SNP VMs. Use early_ioremap_encrypted() to map the secrets page as ioremap_encrypted() uses kmalloc() which is not available during early TSC initialization and causes a panic. [ bp: Drop the silly dummy var: https://lore.kernel.org/r/20250630192726.GBaGLlHl84xIopx4Pt@fat_crate.local ] Fixes: 73bbf3b0fbba ("x86/tsc: Init the TSC for Secure TSC guests") Signed-off-by: Nikunj A Dadhania <nikunj@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250630081858.485187-1-nikunj@amd.com
2025-06-27x86/sev: Let sev_es_efi_map_ghcbs() map the CA pages tooGerd Hoffmann-2/+15
OVMF EFI firmware needs access to the CA page to do SVSM protocol calls. For example, when the SVSM implements an EFI variable store, such calls will be necessary. So add that to sev_es_efi_map_ghcbs() and also rename the function to reflect the additional job it is doing now. [ bp: Massage. ] Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250626114014.373748-4-kraxel@redhat.com
2025-06-27x86/sev/vc: Fix EFI runtime instruction emulationGerd Hoffmann-1/+8
In case efi_mm is active go use the userspace instruction decoder which supports fetching instructions from active_mm. This is needed to make instruction emulation work for EFI runtime code, so it can use CPUID and RDMSR. EFI runtime code uses the CPUID instruction to gather information about the environment it is running in, such as SEV being enabled or not, and choose (if needed) the SEV code path for ioport access. EFI runtime code uses the RDMSR instruction to get the location of the CAA page (see SVSM spec, section 4.2 - "Post Boot"). The big picture behind this is that the kernel needs to be able to properly handle #VC exceptions that come from EFI runtime services. Since EFI runtime services have a special page table mapping for the EFI virtual address space, the efi_mm context must be used when decoding instructions during #VC handling. [ bp: Massage. ] Signed-off-by: Gerd Hoffmann <kraxel@redhat.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com> Link: https://lore.kernel.org/20250626114014.373748-2-kraxel@redhat.com
2025-06-18x86/sev: Drop unnecessary parameter in snp_issue_guest_request()Alexey Kardashevskiy-2/+3
Commit 3e385c0d6ce8 ("virt: sev-guest: Move SNP Guest Request data pages handling under snp_cmd_mutex") moved @input from snp_msg_desc to snp_guest_req which is passed to snp_issue_guest_request(). Drop the extra parameter. No functional change intended. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Dionna Glaze <dionnaglaze@google.com> Link: https://lore.kernel.org/20250611040842.2667262-5-aik@amd.com
2025-06-18x86/sev: Document requirement for linear mapping of guest request buffersAlexey Kardashevskiy-0/+9
The Guest Request supports 3 types of messages now, the largest is the extended variant of MSG_REPORT_REQ: sizeof(snp_ext_report_req)==112. These used to be allocated on stack and then moved to the SNP guest platform device (snp_guest_dev) for the reason explained in db10cb9b5746 ("virt: sevguest: Fix passing a stack buffer as a scatterlist target"): aesgcm_encrypt() and aesgcm_decrypt() are used for guest messages and might potentially use a crypto accelerator which requires DMA buffers to be in the linear mapping. Add a comment, warn and return an error when the buffers are not in linear mapping. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Dionna Glaze <dionnaglaze@google.com> Link: https://lore.kernel.org/20250611040842.2667262-4-aik@amd.com
2025-06-18x86/sev: Allocate request in TSC_INFO_REQ on stackAlexey Kardashevskiy-18/+12
Allocate a 88 byte request structure on stack and skip needless kzalloc/kfree. While at this, correct indent. No functional change intended. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Dionna Glaze <dionnaglaze@google.com> Link: https://lore.kernel.org/20250611040842.2667262-3-aik@amd.com
2025-06-18virt: sev-guest: Contain snp_guest_request_ioctl in sev-guestAlexey Kardashevskiy-23/+13
SNP Guest Request uses only exitinfo2 which is a return value from GHCB, has meaning beyond ioctl and therefore belongs to struct snp_guest_req. Move exitinfo2 there and remove snp_guest_request_ioctl from the SEV platform code. No functional change intended. Signed-off-by: Alexey Kardashevskiy <aik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Dionna Glaze <dionnaglaze@google.com> Link: https://lore.kernel.org/20250611040842.2667262-2-aik@amd.com
2025-06-03Merge tag 'hyperv-next-signed-20250602' of ↵Linus Torvalds-11/+2
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel) - Fixes for Hyper-V UIO driver (Long Li) - Fixes for Hyper-V PCI driver (Michael Kelley) - Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley) - Documentation updates for Hyper-V VMBus (Michael Kelley) - Enhance logging for hv_kvp_daemon (Shradha Gupta) * tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits) Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir Documentation: hyperv: Update VMBus doc with new features and info PCI: hv: Remove unnecessary flex array in struct pci_packet Drivers: hv: Remove hv_alloc/free_* helpers Drivers: hv: Use kzalloc for panic page allocation uio_hv_generic: Align ring size to system page uio_hv_generic: Use correct size for interrupt and monitor pages Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary arch/x86: Provide the CPU number in the wakeup AP callback x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap() PCI: hv: Get vPCI MSI IRQ domain from DeviceTree ACPI: irq: Introduce acpi_get_gsi_dispatcher() Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device() Drivers: hv: vmbus: Get the IRQ number from DeviceTree dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties arm64, x86: hyperv: Report the VTL the system boots in arm64: hyperv: Initialize the Virtual Trust Level field Drivers: hv: Provide arch-neutral implementation of get_vtl() Drivers: hv: Enable VTL mode for arm64 ...
2025-05-29Merge tag 'tsm-for-6.16' of ↵Linus Torvalds-5/+45
git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm Pull trusted security manager (TSM) updates from Dan Williams: - Add a general sysfs scheme for publishing "Measurement" values provided by the architecture's TEE Security Manager. Use it to publish TDX "Runtime Measurement Registers" ("RTMRs") that either maintain a hash of stored values (similar to a TPM PCR) or provide statically provisioned data. These measurements are validated by a relying party. - Reorganize the drivers/virt/coco/ directory for "host" and "guest" shared infrastructure. - Fix a configfs-tsm-report unregister bug - With CONFIG_TSM_MEASUREMENTS joining CONFIG_TSM_REPORTS and in anticipation of more shared "TSM" infrastructure arriving, rename the maintainer entry to "TRUSTED SECURITY MODULE (TSM) INFRASTRUCTURE". * tag 'tsm-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/devsec/tsm: tsm-mr: Fix init breakage after bin_attrs constification by scoping non-const pointers to init phase sample/tsm-mr: Fix missing static for sample_report virt: tdx-guest: Transition to scoped_cond_guard for mutex operations virt: tdx-guest: Refactor and streamline TDREPORT generation virt: tdx-guest: Expose TDX MRs as sysfs attributes x86/tdx: tdx_mcall_get_report0: Return -EBUSY on TDCALL_OPERAND_BUSY error x86/tdx: Add tdx_mcall_extend_rtmr() interface tsm-mr: Add tsm-mr sample code tsm-mr: Add TVM Measurement Register support configfs-tsm-report: Fix NULL dereference of tsm_ops coco/guest: Move shared guest CC infrastructure to drivers/virt/coco/guest/ configfs-tsm: Namespace TSM report symbols
2025-05-27Merge tag 'x86_sev_for_v6.16_rc1' of ↵Linus Torvalds-1/+68
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull AMD SEV update from Borislav Petkov: "Add a virtual TPM driver glue which allows a guest kernel to talk to a TPM device emulated by a Secure VM Service Module (SVSM) - a helper module of sorts which runs at a different privilege level in the SEV-SNP VM stack. The intent being that a TPM device is emulated by a trusted entity and not by the untrusted host which is the default assumption in the confidential computing scenarios" * tag 'x86_sev_for_v6.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/sev: Register tpm-svsm platform device tpm: Add SNP SVSM vTPM driver svsm: Add header with SVSM_VTPM_CMD helpers x86/sev: Add SVSM vTPM probe/send_command functions
2025-05-23arch/x86: Provide the CPU number in the wakeup AP callbackRoman Kisel-11/+2
When starting APs, confidential guests and paravisor guests need to know the CPU number, and the pattern of using the linear search has emerged in several places. With N processors that leads to the O(N^2) time complexity. Provide the CPU number in the AP wake up callback so that one can get the CPU number in constant time. Suggested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20250507182227.7421-3-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250507182227.7421-3-romank@linux.microsoft.com>
2025-05-21Merge tag 'v6.15-rc7' into x86/core, to pick up fixesIngo Molnar-90/+165
Pick up build fixes from upstream to make this tree more testable. Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-15x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID headerAhmed S. Darwish-2/+2
The main CPUID header <asm/cpuid.h> was originally a storefront for the headers: <asm/cpuid/api.h> <asm/cpuid/leaf_0x2_api.h> Now that the latter CPUID(0x2) header has been merged into the former, there is no practical difference between <asm/cpuid.h> and <asm/cpuid/api.h>. Migrate all users to the <asm/cpuid/api.h> header, in preparation of the removal of <asm/cpuid.h>. Don't remove <asm/cpuid.h> just yet, in case some new code in -next started using it. Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ahmed S. Darwish <darwi@linutronix.de> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Andrew Cooper <andrew.cooper3@citrix.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: John Ogness <john.ogness@linutronix.de> Cc: x86-cpuid@lists.linux.dev Link: https://lore.kernel.org/r/20250508150240.172915-3-darwi@linutronix.de
2025-05-13x86/sev: Make sure pages are not skipped during kdumpAshish Kalra-4/+7
When shared pages are being converted to private during kdump, additional checks are performed. They include handling the case of a GHCB page being contained within a huge page. Currently, this check incorrectly skips a page just below the GHCB page from being transitioned back to private during kdump preparation. This skipped page causes a 0x404 #VC exception when it is accessed later while dumping guest memory for vmcore generation. Correct the range to be checked for GHCB contained in a huge page. Also, ensure that the skipped huge page containing the GHCB page is transitioned back to private by applying the correct address mask later when changing GHCBs to private at end of kdump preparation. [ bp: Massage commit message. ] Fixes: 3074152e56c9 ("x86/sev: Convert shared memory back to private on kexec") Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Tested-by: Srikanth Aithal <sraithal@amd.com> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250506183529.289549-1-Ashish.Kalra@amd.com