summaryrefslogtreecommitdiffstats
path: root/arch/x86/hyperv
AgeCommit message (Collapse)AuthorLines
2026-02-21Convert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds-2/+1
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21Convert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds-2/+2
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21treewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook-6/+5
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
2026-02-20Merge tag 'hyperv-next-signed-20260218' of ↵Linus Torvalds-17/+25
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull Hyper-V updates from Wei Liu: - Debugfs support for MSHV statistics (Nuno Das Neves) - Support for the integrated scheduler (Stanislav Kinsburskii) - Various fixes for MSHV memory management and hypervisor status handling (Stanislav Kinsburskii) - Expose more capabilities and flags for MSHV partition management (Anatol Belski, Muminul Islam, Magnus Kulke) - Miscellaneous fixes to improve code quality and stability (Carlos López, Ethan Nelson-Moore, Li RongQing, Michael Kelley, Mukesh Rathor, Purna Pavan Chandra Aekkaladevi, Stanislav Kinsburskii, Uros Bizjak) - PREEMPT_RT fixes for vmbus interrupts (Jan Kiszka) * tag 'hyperv-next-signed-20260218' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (34 commits) mshv: Handle insufficient root memory hypervisor statuses mshv: Handle insufficient contiguous memory hypervisor status mshv: Introduce hv_deposit_memory helper functions mshv: Introduce hv_result_needs_memory() helper function mshv: Add SMT_ENABLED_GUEST partition creation flag mshv: Add nested virtualization creation flag Drivers: hv: vmbus: Simplify allocation of vmbus_evt mshv: expose the scrub partition hypercall mshv: Add support for integrated scheduler mshv: Use try_cmpxchg() instead of cmpxchg() x86/hyperv: Fix error pointer dereference x86/hyperv: Reserve 3 interrupt vectors used exclusively by MSHV Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn x86/hyperv: Use savesegment() instead of inline asm() to save segment registers mshv: fix SRCU protection in irqfd resampler ack handler mshv: make field names descriptive in a header struct x86/hyperv: Update comment in hyperv_cleanup() mshv: clear eventfd counter on irqfd shutdown x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap() ...
2026-02-18x86/hyperv: Fix error pointer dereferenceEthan Tidmore-3/+5
The function idle_thread_get() can return an error pointer and is not checked for it. Add check for error pointer. Detected by Smatch: arch/x86/hyperv/hv_vtl.c:126 hv_vtl_bringup_vcpu() error: 'idle' dereferencing possible ERR_PTR() Fixes: 2b4b90e053a29 ("x86/hyperv: Use per cpu initial stack for vtl context") Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-18x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insnUros Bizjak-1/+1
Unlike CALL instruction, VMMCALL does not push to the stack, so it's OK to allow the compiler to insert it before the frame pointer gets set up by the containing function. ASM_CALL_CONSTRAINT is for CALLs that must be inserted after the frame pointer is set up, so it is over-constraining here and can be removed. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-18x86/hyperv: Use savesegment() instead of inline asm() to save segment registersUros Bizjak-4/+5
Use standard savesegment() utility macro to save segment registers. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Tested-by: Michael Kelley <mhklinux@outlook.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-10Merge tag 'x86_paravirt_for_v7.0_rc1' of ↵Linus Torvalds-6/+5
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 paravirt updates from Borislav Petkov: - A nice cleanup to the paravirt code containing a unification of the paravirt clock interface, taming the include hell by splitting the pv_ops structure and removing of a bunch of obsolete code (Juergen Gross) * tag 'x86_paravirt_for_v7.0_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (23 commits) x86/paravirt: Use XOR r32,r32 to clear register in pv_vcpu_is_preempted() x86/paravirt: Remove trailing semicolons from alternative asm templates x86/pvlocks: Move paravirt spinlock functions into own header x86/paravirt: Specify pv_ops array in paravirt macros x86/paravirt: Allow pv-calls outside paravirt.h objtool: Allow multiple pv_ops arrays x86/xen: Drop xen_mmu_ops x86/xen: Drop xen_cpu_ops x86/xen: Drop xen_irq_ops x86/paravirt: Move pv_native_*() prototypes to paravirt.c x86/paravirt: Introduce new paravirt-base.h header x86/paravirt: Move paravirt_sched_clock() related code into tsc.c x86/paravirt: Use common code for paravirt_steal_clock() riscv/paravirt: Use common code for paravirt_steal_clock() loongarch/paravirt: Use common code for paravirt_steal_clock() arm64/paravirt: Use common code for paravirt_steal_clock() arm/paravirt: Use common code for paravirt_steal_clock() sched: Move clock related paravirt code to kernel/sched paravirt: Remove asm/paravirt_api_clock.h x86/paravirt: Move thunk macros to paravirt_types.h ...
2026-02-04x86/hyperv: Update comment in hyperv_cleanup()Michael Kelley-3/+7
The comment in hyperv_cleanup() became out-of-date as a result of commit c8ed0812646e ("x86/hyperv: Use direct call to hypercall-page"). Update the comment. No code or functional change. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-04x86/hyperv: Use memremap()/memunmap() instead of ioremap_cache()/iounmap()Michael Kelley-3/+3
When running with a paravisor and SEV-SNP, the GHCB page is provided by the paravisor instead of being allocated by Linux. The provided page is normal memory, but is outside of the physical address space seen by Linux. As such it cannot be accessed via the kernel's direct map, and must be explicitly mapped to a kernel virtual address. Current code uses ioremap_cache() and iounmap() to map and unmap the page. These functions are for use on I/O address space that may not behave as normal memory, so they generate or expect addresses with the __iomem attribute. For normal memory, the preferred functions are memremap() and memunmap(), which operate similarly but without __iomem. At the time of the original work on CoCo VMs on Hyper-V, memremap() did not support creating a decrypted mapping, so ioremap_cache() was used instead, since I/O address space is always mapped decrypted. memremap() has since been enhanced to allow decrypted mappings, so replace ioremap_cache() with memremap() when mapping the GHCB page. Similarly, replace iounmap() with memunmap(). As a side benefit, the replacement cleans up 'sparse' warnings about __iomem mismatches. The replacement is done to use the correct functions as long-term goodness and to clean up the sparse warnings. No runtime bugs are fixed. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202311111925.iPGGJik4-lkp@intel.com/ Signed-off-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-04x86/hyperv: Move hv crash init after hypercall pg setupMukesh R-1/+3
hv_root_crash_init() is not setting up the hypervisor crash collection for baremetal cases because when it's called, hypervisor page is not setup. Fix is simple, just move the crash init call after the hypercall page setup. Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-02-04x86/hyperv: fix a compiler warning in hv_crash.cMukesh R-2/+1
Fix a compiler warning that status is defined by not used. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512301641.FC6OAbGM-lkp@intel.com/ Signed-off-by: Mukesh R <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2026-01-28x86/hyperv: Fix smp_ops build failure on UP kernelsIngo Molnar-0/+2
CI testing found this build failure: arch/x86/hyperv/hv_crash.c:631:9: error: ‘smp_ops’ undeclared (first use in this function) And I bisected it back to the initial commit that enabled this feature: 77c860d2dbb72d1f3c6a2e882a07d19eca399db5 is the first bad commit commit 77c860d2dbb72d1f3c6a2e882a07d19eca399db5 (HEAD) Author: Mukesh Rathor <mrathor@linux.microsoft.com> Date: Mon Oct 6 15:42:08 2025 -0700 x86/hyperv: Enable build of hypervisor crashdump collection files Hyperv should probably be limited to SMP kernels, as nobody appears to be testing it on UP kernels. Until then, fix the smp_ops assumption. Build tested only. Fixes: 77c860d2dbb72 ("x86/hyperv: Enable build of hypervisor crashdump collection files") Cc: Mukesh Rathor <mrathor@linux.microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2026-01-13x86/pvlocks: Move paravirt spinlock functions into own headerJuergen Gross-5/+5
Instead of having the pv spinlock function definitions in paravirt.h, move them into the new header paravirt-spinlock.h. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20260105110520.21356-22-jgross@suse.com
2026-01-12x86/paravirt: Remove not needed includes of paravirt.hJuergen Gross-1/+0
In some places asm/paravirt.h is included without really being needed. Remove the related #include statements. Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://patch.msgid.link/20260105110520.21356-2-jgross@suse.com
2025-12-13x86/hv: Add gitignore entry for generated header fileLinus Torvalds-0/+1
Commit 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") added a new generated header file for the offsets into the mshv_vtl_cpu_context structure to be used by the low-level assembly code. But it didn't add the .gitignore file to go with it, so 'git status' and friends will mention it. Let's add the gitignore file before somebody thinks that generated header should be committed. Fixes: 7bfe3b8ea6e3 ("Drivers: hv: Introduce mshv_vtl driver") Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-12-09Merge tag 'hyperv-next-signed-20251207' of ↵Linus Torvalds-1/+958
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Enhancements to Linux as the root partition for Microsoft Hypervisor: - Support a new mode called L1VH, which allows Linux to drive the hypervisor running the Azure Host directly - Support for MSHV crash dump collection - Allow Linux's memory management subsystem to better manage guest memory regions - Fix issues that prevented a clean shutdown of the whole system on bare metal and nested configurations - ARM64 support for the MSHV driver - Various other bug fixes and cleanups - Add support for Confidential VMBus for Linux guest on Hyper-V - Secure AVIC support for Linux guests on Hyper-V - Add the mshv_vtl driver to allow Linux to run as the secure kernel in a higher virtual trust level for Hyper-V * tag 'hyperv-next-signed-20251207' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (58 commits) mshv: Cleanly shutdown root partition with MSHV mshv: Use reboot notifier to configure sleep state mshv: Add definitions for MSHV sleep state configuration mshv: Add support for movable memory regions mshv: Add refcount and locking to mem regions mshv: Fix huge page handling in memory region traversal mshv: Move region management to mshv_regions.c mshv: Centralize guest memory region destruction mshv: Refactor and rename memory region handling functions mshv: adjust interrupt control structure for ARM64 Drivers: hv: use kmalloc_array() instead of kmalloc() mshv: Add ioctl for self targeted passthrough hvcalls Drivers: hv: Introduce mshv_vtl driver Drivers: hv: Export some symbols for mshv_vtl static_call: allow using STATIC_CALL_TRAMP_STR() from assembly mshv: Extend create partition ioctl to support cpu features mshv: Allow mappings that overlap in uaddr mshv: Fix create memory region overlap check mshv: add WQ_PERCPU to alloc_workqueue users Drivers: hv: Use kmalloc_array() instead of kmalloc() ...
2025-12-05mshv: Use reboot notifier to configure sleep statePraveen K Paladugu-0/+1
Configure sleep state information from ACPI in MSHV hypervisor using a reboot notifier. This data allows the hypervisor to correctly power off the host during shutdown. Signed-off-by: Praveen K Paladugu <prapal@linux.microsoft.com> Co-developed-by: Anatol Belski <anbelski@linux.microsoft.com> Signed-off-by: Anatol Belski <anbelski@linux.microsoft.com> Reviewed-by: Easwar Hariharan <easwar.hariharan@linux.microsoft.com> Reviewed-by: Stansialv Kinsburskii <skinsburskii@linux.miscrosoft.com> Acked-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-12-05Drivers: hv: Introduce mshv_vtl driverNaman Jain-1/+192
Provide an interface for Virtual Machine Monitor like OpenVMM and its use as OpenHCL paravisor to control VTL0 (Virtual trust Level). Expose devices and support IOCTLs for features like VTL creation, VTL0 memory management, context switch, making hypercalls, mapping VTL0 address space to VTL2 userspace, getting new VMBus messages and channel events in VTL2 etc. Co-developed-by: Roman Kisel <romank@linux.microsoft.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15x86/hyperv: Enable build of hypervisor crashdump collection filesMukesh Rathor-0/+7
Enable build of the new files introduced in the earlier commits and add call to do the setup during boot. Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> [ wei: fix build ] Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15x86/hyperv: Implement hypervisor RAM collection into vmcoreMukesh Rathor-0/+642
Introduce a new file to implement collection of hypervisor RAM into the vmcore collected by linux. By default, the hypervisor RAM is locked, ie, protected via hw page table. Hyper-V implements a disable hypercall which essentially devirtualizes the system on the fly. This mechanism makes the hypervisor RAM accessible to linux. Because the hypervisor RAM is already mapped into linux address space (as reserved RAM), it is automatically collected into the vmcore without extra work. More details of the implementation are available in the file prologue. Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15x86/hyperv: Add trampoline asm code to transition from hypervisorMukesh Rathor-0/+101
Introduce a small asm stub to transition from the hypervisor to Linux after devirtualization. Devirtualization means disabling hypervisor on the fly, so after it is done, the code is running on physical processor instead of virtual, and hypervisor is gone. This can be done by a root vm only. At a high level, during panic of either the hypervisor or the root, the NMI handler asks hypervisor to devirtualize. As part of that, the arguments include an entry point to return back to Linux. This asm stub implements that entry point. The stub is entered in protected mode, uses temporary gdt and page table to enable long mode and get to kernel entry point which then restores full kernel context to resume execution to kexec. Signed-off-by: Mukesh Rathor <mrathor@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15x86/hyperv: Allow Hyper-V to inject STIMER0 interruptsTianyu Lan-0/+7
When Secure AVIC is enabled, call Secure AVIC function to allow Hyper-V to inject STIMER0 interrupt. Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15drivers: hv: Allow vmbus message synic interrupt injected from Hyper-VTianyu Lan-0/+5
When Secure AVIC is enabled, VMBus driver should call x2apic Secure AVIC interface to allow Hyper-V to inject VMBus message interrupt. Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-15x86/hyperv: Don't use hv apic driver when Secure AVIC is availableTianyu Lan-0/+3
When Secure AVIC is available, the AMD x2apic Secure AVIC driver will be selected. In that case, have hv_apic_init() return immediately without doing anything. Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-11-14syscore: Pass context data to callbacksThierry Reding-4/+8
Several drivers can benefit from registering per-instance data along with the syscore operations. To achieve this, move the modifiable fields out of the syscore_ops structure and into a separate struct syscore that can be registered with the framework. Add a void * driver data field for drivers to store contextual data that will be passed to the syscore ops. Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-10-11Merge tag 'x86_core_for_v6.18_rc1' of ↵Linus Torvalds-25/+59
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull more x86 updates from Borislav Petkov: - Remove a bunch of asm implementing condition flags testing in KVM's emulator in favor of int3_emulate_jcc() which is written in C - Replace KVM fastops with C-based stubs which avoids problems with the fastop infra related to latter not adhering to the C ABI due to their special calling convention and, more importantly, bypassing compiler control-flow integrity checking because they're written in asm - Remove wrongly used static branches and other ugliness accumulated over time in hyperv's hypercall implementation with a proper static function call to the correct hypervisor call variant - Add some fixes and modifications to allow running FRED-enabled kernels in KVM even on non-FRED hardware - Add kCFI improvements like validating indirect calls and prepare for enabling kCFI with GCC. Add cmdline params documentation and other code cleanups - Use the single-byte 0xd6 insn as the official #UD single-byte undefined opcode instruction as agreed upon by both x86 vendors - Other smaller cleanups and touchups all over the place * tag 'x86_core_for_v6.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits) x86,retpoline: Optimize patch_retpoline() x86,ibt: Use UDB instead of 0xEA x86/cfi: Remove __noinitretpoline and __noretpoline x86/cfi: Add "debug" option to "cfi=" bootparam x86/cfi: Standardize on common "CFI:" prefix for CFI reports x86/cfi: Document the "cfi=" bootparam options x86/traps: Clarify KCFI instruction layout compiler_types.h: Move __nocfi out of compiler-specific header objtool: Validate kCFI calls x86/fred: KVM: VMX: Always use FRED for IRQs when CONFIG_X86_FRED=y x86/fred: Play nice with invoking asm_fred_entry_from_kvm() on non-FRED hardware x86/fred: Install system vector handlers even if FRED isn't fully enabled x86/hyperv: Use direct call to hypercall-page x86/hyperv: Clean up hv_do_hypercall() KVM: x86: Remove fastops KVM: x86: Convert em_salc() to C KVM: x86: Introduce EM_ASM_3WCL KVM: x86: Introduce EM_ASM_1SRC2 KVM: x86: Introduce EM_ASM_2CL KVM: x86: Introduce EM_ASM_2W ...
2025-09-30x86/hyperv: Switch to msi_create_parent_irq_domain()Nam Cao-35/+76
Move away from the legacy MSI domain setup, switch to use msi_create_parent_irq_domain(). While doing the conversion, I noticed that hv_irq_compose_msi_msg() is doing more than it is supposed to (composing message content). The interrupt allocation bits should be moved into hv_msi_domain_alloc(). However, I have no hardware to test this change, therefore I leave a TODO note. Signed-off-by: Nam Cao <namcao@linutronix.de> Acked-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-09-30x86/hyperv: Add kexec/kdump support on Azure CVMsVitaly Kuznetsov-1/+210
Azure CVM instance types featuring a paravisor hang upon kdump. The investigation shows that makedumpfile causes a hang when it steps on a page which was previously share with the host (HVCALL_MODIFY_SPARSE_GPA_PAGE_HOST_VISIBILITY). The new kernel has no knowledge of these 'special' regions (which are Vmbus connection pages, GPADL buffers, ...). There are several ways to approach the issue: - Convey the knowledge about these regions to the new kernel somehow. - Unshare these regions before accessing in the new kernel (it is unclear if there's a way to query the status for a given GPA range). - Unshare these regions before jumping to the new kernel (which this patch implements). To make the procedure as robust as possible, store PFN ranges of shared regions in a linked list instead of storing GVAs and re-using hv_vtom_set_host_visibility(). This also allows to avoid memory allocation on the kdump/kexec path. Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
2025-08-18x86/hyperv: Use direct call to hypercall-pagePeter Zijlstra-31/+30
Instead of using an indirect call to the hypercall page, use a direct call instead. This avoids all CFI problems, including the one where the hypercall page doesn't have IBT on. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lkml.kernel.org/r/20250714103441.011387946@infradead.org
2025-08-18x86/hyperv: Clean up hv_do_hypercall()Peter Zijlstra-0/+35
What used to be a simple few instructions has turned into a giant mess (for x86_64). Not only does it use static_branch wrong, it mixes it with dynamic branches for no apparent reason. Notably it uses static_branch through an out-of-line function call, which completely defeats the purpose, since instead of a simple JMP/NOP site, you get a CALL+RET+TEST+Jcc sequence in return, which is absolutely idiotic. Add to that a dynamic test of hyperv_paravisor_present, something which is set once and never changed. Replace all this idiocy with a single direct function call to the right hypercall variant. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Sean Christopherson <seanjc@google.com> Link: https://lkml.kernel.org/r/20250714103440.897136093@infradead.org
2025-07-15x86/hyperv: Expose hv_map_msi_interrupt()Stanislav Kinsburskii-11/+29
Move some of the logic of hv_irq_compose_irq_message() into hv_map_msi_interrupt(). Make hv_map_msi_interrupt() a globally-available helper function, which will be used to map PCI interrupts when running in the root partition. Signed-off-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1752261532-7225-3-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1752261532-7225-3-git-send-email-nunodasneves@linux.microsoft.com>
2025-07-09x86/hyperv: Clean up hv_map/unmap_interrupt() return valuesNuno Das Neves-18/+14
Fix the return values of these hypercall helpers so they return a negated errno either directly or via hv_result_to_errno(). Update the callers to check for errno instead of using hv_status_success(), and remove redundant error printing. While at it, rearrange some variable declarations to adhere to style guidelines i.e. "reverse fir tree order". Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1751582677-30930-5-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1751582677-30930-5-git-send-email-nunodasneves@linux.microsoft.com>
2025-07-09x86/hyperv: Fix usage of cpu_online_mask to get valid cpuNuno Das Neves-3/+1
Accessing cpu_online_mask here is problematic because the cpus read lock is not held in this context. However, cpu_online_mask isn't needed here since the effective affinity mask is guaranteed to be valid in this callback. So, just use cpumask_first() to get the cpu instead of ANDing it with cpus_online_mask unnecessarily. Fixes: e39397d1fd68 ("x86/hyperv: implement an MSI domain for root partition") Reported-by: Michael Kelley <mhklinux@outlook.com> Closes: https://lore.kernel.org/linux-hyperv/SN6PR02MB4157639630F8AD2D8FD8F52FD475A@SN6PR02MB4157.namprd02.prod.outlook.com/ Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1751582677-30930-4-git-send-email-nunodasneves@linux.microsoft.com>
2025-07-09x86/hyperv: Fix warnings for missing export.h header inclusionNaman Jain-0/+4
Fix below warning in Hyper-V drivers that comes when kernel is compiled with W=1 option. Include export.h in driver files to fix it. * warning: EXPORT_SYMBOL() is used, but #include <linux/export.h> is missing Signed-off-by: Naman Jain <namjain@linux.microsoft.com> Reviewed-by: Saurabh Sengar <ssengar@linux.microsoft.com> Link: https://lore.kernel.org/r/20250611100459.92900-3-namjain@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250611100459.92900-3-namjain@linux.microsoft.com>
2025-06-03Merge tag 'hyperv-next-signed-20250602' of ↵Linus Torvalds-84/+55
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Support for Virtual Trust Level (VTL) on arm64 (Roman Kisel) - Fixes for Hyper-V UIO driver (Long Li) - Fixes for Hyper-V PCI driver (Michael Kelley) - Select CONFIG_SYSFB for Hyper-V guests (Michael Kelley) - Documentation updates for Hyper-V VMBus (Michael Kelley) - Enhance logging for hv_kvp_daemon (Shradha Gupta) * tag 'hyperv-next-signed-20250602' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (23 commits) Drivers: hv: Always select CONFIG_SYSFB for Hyper-V guests Drivers: hv: vmbus: Add comments about races with "channels" sysfs dir Documentation: hyperv: Update VMBus doc with new features and info PCI: hv: Remove unnecessary flex array in struct pci_packet Drivers: hv: Remove hv_alloc/free_* helpers Drivers: hv: Use kzalloc for panic page allocation uio_hv_generic: Align ring size to system page uio_hv_generic: Use correct size for interrupt and monitor pages Drivers: hv: Allocate interrupt and monitor pages aligned to system page boundary arch/x86: Provide the CPU number in the wakeup AP callback x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap() PCI: hv: Get vPCI MSI IRQ domain from DeviceTree ACPI: irq: Introduce acpi_get_gsi_dispatcher() Drivers: hv: vmbus: Introduce hv_get_vmbus_root_device() Drivers: hv: vmbus: Get the IRQ number from DeviceTree dt-bindings: microsoft,vmbus: Add interrupt and DMA coherence properties arm64, x86: hyperv: Report the VTL the system boots in arm64: hyperv: Initialize the Virtual Trust Level field Drivers: hv: Provide arch-neutral implementation of get_vtl() Drivers: hv: Enable VTL mode for arm64 ...
2025-05-23arch/x86: Provide the CPU number in the wakeup AP callbackRoman Kisel-23/+4
When starting APs, confidential guests and paravisor guests need to know the CPU number, and the pattern of using the linear search has emerged in several places. With N processors that leads to the O(N^2) time complexity. Provide the CPU number in the AP wake up callback so that one can get the CPU number in constant time. Suggested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://lore.kernel.org/r/20250507182227.7421-3-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250507182227.7421-3-romank@linux.microsoft.com>
2025-05-23x86/hyperv: Fix APIC ID and VP index confusion in hv_snp_boot_ap()Roman Kisel-40/+59
To start an application processor in SNP-isolated guest, a hypercall is used that takes a virtual processor index. The hv_snp_boot_ap() function uses that START_VP hypercall but passes as VP index to it what it receives as a wakeup_secondary_cpu_64 callback: the APIC ID. As those two aren't generally interchangeable, that may lead to hung APs if the VP index and the APIC ID don't match up. Update the parameter names to avoid confusion as to what the parameter is. Use the APIC ID to the VP index conversion to provide the correct input to the hypercall. Cc: stable@vger.kernel.org Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest") Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250507182227.7421-2-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250507182227.7421-2-romank@linux.microsoft.com>
2025-05-23arm64, x86: hyperv: Report the VTL the system boots inRoman Kisel-1/+6
The hyperv guest code might run in various Virtual Trust Levels. Report the level when the kernel boots in the non-default (0) one. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250428210742.435282-7-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250428210742.435282-7-romank@linux.microsoft.com>
2025-05-23Drivers: hv: Provide arch-neutral implementation of get_vtl()Roman Kisel-34/+0
To run in the VTL mode, Hyper-V drivers have to know what VTL the system boots in, and the arm64/hyperv code does not have the means to compute that. Refactor the code to hoist the function that detects VTL, make it arch-neutral to be able to employ it to get the VTL on arm64. Signed-off-by: Roman Kisel <romank@linux.microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Reviewed-by: Tianyu Lan <tiala@microsoft.com> Link: https://lore.kernel.org/r/20250428210742.435282-5-romank@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250428210742.435282-5-romank@linux.microsoft.com>
2025-05-02x86/msr: Replace wrmsr(msr, low, 0) with wrmsrq(msr, low)Xin Li (Intel)-3/+3
The third argument in wrmsr(msr, low, 0) is unnecessary. Instead, use wrmsrq(msr, low), which automatically sets the higher 32 bits of the MSR value to 0. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20250427092027.1598740-15-xin@zytor.com
2025-05-02x86/msr: Convert __rdmsr() uses to native_rdmsrq() usesXin Li (Intel)-3/+3
__rdmsr() is the lowest level MSR write API, with native_rdmsr() and native_rdmsrq() serving as higher-level wrappers around it. #define native_rdmsr(msr, val1, val2) \ do { \ u64 __val = __rdmsr((msr)); \ (void)((val1) = (u32)__val); \ (void)((val2) = (u32)(__val >> 32)); \ } while (0) static __always_inline u64 native_rdmsrq(u32 msr) { return __rdmsr(msr); } However, __rdmsr() continues to be utilized in various locations. MSR APIs are designed for different scenarios, such as native or pvops, with or without trace, and safe or non-safe. Unfortunately, the current MSR API names do not adequately reflect these factors, making it challenging to select the most appropriate API for various situations. To pave the way for improving MSR API names, convert __rdmsr() uses to native_rdmsrq() to ensure consistent usage. Later, these APIs can be renamed to better reflect their implications, such as native or pvops, with or without trace, and safe or non-safe. No functional change intended. Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: David Woodhouse <dwmw2@infradead.org> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kees Cook <keescook@chromium.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Uros Bizjak <ubizjak@gmail.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Link: https://lore.kernel.org/r/20250427092027.1598740-10-xin@zytor.com
2025-05-02x86/msr: Add explicit includes of <asm/msr.h>Xin Li (Intel)-0/+5
For historic reasons there are some TSC-related functions in the <asm/msr.h> header, even though there's an <asm/tsc.h> header. To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h> to <asm/tsc.h> and to eventually eliminate the inclusion of <asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency to the source files that reference definitions from <asm/msr.h>. [ mingo: Clarified the changelog. ] Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
2025-04-10x86/msr: Rename 'native_wrmsrl()' to 'native_wrmsrq()'Ingo Molnar-1/+1
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-10x86/msr: Rename 'wrmsrl()' to 'wrmsrq()'Ingo Molnar-21/+21
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-04-10x86/msr: Rename 'rdmsrl()' to 'rdmsrq()'Ingo Molnar-17/+17
Suggested-by: "H. Peter Anvin" <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Juergen Gross <jgross@suse.com> Cc: Dave Hansen <dave.hansen@intel.com> Cc: Xin Li <xin@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org>
2025-03-25Merge tag 'hyperv-next-signed-20250324' of ↵Linus Torvalds-232/+54
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Add support for running as the root partition in Hyper-V (Microsoft Hypervisor) by exposing /dev/mshv (Nuno and various people) - Add support for CPU offlining in Hyper-V (Hamza Mahfooz) - Misc fixes and cleanups (Roman Kisel, Tianyu Lan, Wei Liu, Michael Kelley, Thorsten Blum) * tag 'hyperv-next-signed-20250324' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (24 commits) x86/hyperv: fix an indentation issue in mshyperv.h x86/hyperv: Add comments about hv_vpset and var size hypercall input args Drivers: hv: Introduce mshv_root module to expose /dev/mshv to VMMs hyperv: Add definitions for root partition driver to hv headers x86: hyperv: Add mshv_handler() irq handler and setup function Drivers: hv: Introduce per-cpu event ring tail Drivers: hv: Export some functions for use by root partition module acpi: numa: Export node_to_pxm() hyperv: Introduce hv_recommend_using_aeoi() arm64/hyperv: Add some missing functions to arm64 x86/mshyperv: Add support for extended Hyper-V features hyperv: Log hypercall status codes as strings x86/hyperv: Fix check of return value from snp_set_vmsa() x86/hyperv: Add VTL mode callback for restarting the system x86/hyperv: Add VTL mode emergency restart callback hyperv: Remove unused union and structs hyperv: Add CONFIG_MSHV_ROOT to gate root partition support hyperv: Change hv_root_partition into a function hyperv: Convert hypercall statuses to linux error codes drivers/hv: add CPU offlining support ...
2025-03-21x86/hyperv: Add comments about hv_vpset and var size hypercall input argsMichael Kelley-0/+9
Current code varies in how the size of the variable size input header for hypercalls is calculated when the input contains struct hv_vpset. Surprisingly, this variation is correct, as different hypercalls make different choices for what portion of struct hv_vpset is treated as part of the variable size input header. The Hyper-V TLFS is silent on these details, but the behavior has been confirmed with Hyper-V developers. To avoid future confusion about these differences, add comments to struct hv_vpset, and to hypercall call sites with input that contains a struct hv_vpset. The comments describe the overall situation and the calculation that should be used at each particular call site. No functional change as only comments are updated. Signed-off-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250318214919.958953-1-mhklinux@outlook.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250318214919.958953-1-mhklinux@outlook.com>
2025-03-20hyperv: Log hypercall status codes as stringsNuno Das Neves-3/+3
Introduce hv_status_printk() macros as a convenience to log hypercall errors, formatting them with the status code (HV_STATUS_*) as a raw hex value and also as a string, which saves some time while debugging. Create a table of HV_STATUS_ codes with strings and mapped errnos, and use it for hv_result_to_string() and hv_result_to_errno(). Use the new hv_status_printk()s in hv_proc.c, hyperv-iommu.c, and irqdomain.c hypercalls to aid debugging in the root partition. Signed-off-by: Nuno Das Neves <nunodasneves@linux.microsoft.com> Reviewed-by: Stanislav Kinsburskii <skinsburskii@linux.microsoft.com> Link: https://lore.kernel.org/r/1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <1741980536-3865-2-git-send-email-nunodasneves@linux.microsoft.com>
2025-03-20x86/hyperv: Fix check of return value from snp_set_vmsa()Tianyu Lan-1/+1
snp_set_vmsa() returns 0 as success result and so fix it. Cc: stable@vger.kernel.org Fixes: 44676bb9d566 ("x86/hyperv: Add smp support for SEV-SNP guest") Signed-off-by: Tianyu Lan <tiala@microsoft.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Link: https://lore.kernel.org/r/20250313085217.45483-1-ltykernel@gmail.com Signed-off-by: Wei Liu <wei.liu@kernel.org> Message-ID: <20250313085217.45483-1-ltykernel@gmail.com>