path: root/arch
Age Commit message Author Files Lines
2025-09-15Merge tag 'samsung-dt-6.18' of ↵Arnd Bergmann3-4/+50
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/dt Samsung DTS ARM changes for v6.18 1. Drop S3C2416 SoC from bindings, because it was removed from kernel in 2023. 2. Add Ethernet attached via SROM controller (memory bus) on SMDK5250. This wasn't tested, but code should work just like it is working on Exynos5410-based boards. * tag 'samsung-dt-6.18' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: ARM: dts: samsung: smdk5250: add sromc node ARM: dts: samsung: exynos5250: describe sromc bank memory map ARM: dts: samsung: exynos5410: use multiple tuples for sromc ranges dt-bindings: arm: samsung: Drop S3C2416 Link: https://lore.kernel.org/r/20250909184559.105777-2-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'dt64-cleanup-6.18' of ↵Arnd Bergmann6-0/+6
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-dt into soc/dt Minor improvements in ARM64 DTS for v6.18 Add default address cells for interrupt controllers to fix dtc W=1 warnings on Amazon, APM, Socionext and Toshiba boards. * tag 'dt64-cleanup-6.18' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-dt: arm64: dts: toshiba: tmpv7708: Add default GIC address cells arm64: dts: amazon: alpine-v3: Add default GIC address cells arm64: dts: amazon: alpine-v2: Add default GIC address cells arm64: dts: apm: storm: Add default GIC address cells arm64: dts: socionext: uniphier-pxs3: Add default PCI interrup controller address cells arm64: dts: socionext: uniphier-ld20: Add default PCI interrup controller address cells Link: https://lore.kernel.org/r/20250909182256.102840-2-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'i2c-gpio-fixes-for-6.18' of ↵Arnd Bergmann2-4/+4
git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux into soc/dt i2c-gpio-fixes-for-6.18 We have dedicated bindings for scl/sda nowadays. Switch away from the deprecated plain 'gpios' property. * tag 'i2c-gpio-fixes-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux: ARM: dts: stm32: use recent scl/sda gpio bindings ARM: dts: cirrus: ep7211: use recent scl/sda gpio bindings Link: https://lore.kernel.org/r/aLlgGdrFEjh26knK@shikoro Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15ARM: mach-hpe: Rework support and directory structureAndrew Davis6-42/+25
Having a platform need a mach-* directory should be seen as a negative: it means the platform needs special non-standard handling. ARM64 support does not allow mach-* directories at all. While we may not get to that given all the non-standard architectures we support, we should still try to get as close as we can and reduce the number of mach directories. The mach-hpe/ directory and its files provide just one "feature": having the kernel print the machine name if the DTB does not also contain a "model" string (which they always do). To reduce the number of mach-* directories let's do without that feature and remove this directory. Note, we drop the l2c_aux_mask = ~0 line, but this is safe as the fallback GENERIC_DT machine has that as the default. Signed-off-by: Andrew Davis <afd@ti.com> Link: https://lore.kernel.org/r/20250813170308.290349-1-afd@ti.com Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'samsung-dt64-6.18' of ↵Arnd Bergmann18-38/+1990
https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux into soc/dt Samsung DTS ARM64 changes for v6.18 1. Exynos850 e850 board: Enable Ethernet. 2. Exynos990: Enable watchdog and USB, add more clock controllers. 3. Exynos2200: Switch to 32-bit address space for blocks, because all peripherals fit there. Add remaining serial engine (USI) nodes (serial, I2C). 4. New Artpec ARTPEC-8 SoC with board. That's a design from Samsung, sharing all basic blocks with other Samsung SoCs (busses, clock controllers, pin controllers, PCIe, USB) and having media/video related blocks from Axis. Only basic support is added here: few clock controllers, pin controller and UART. 5. Several cleanups. * tag 'samsung-dt64-6.18' of https://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux: arm64: dts: exynos990: Enable PERIC0 and PERIC1 clock controllers arm64: dts: axis: Add ARTPEC-8 Grizzly dts support arm64: dts: exynos: axis: Add initial ARTPEC-8 SoC support dt-bindings: arm: axis: Add ARTPEC-8 grizzly board arm64: dts: exynos8895: Minor whitespace cleanup dt-bindings: arm: Convert Axis board/soc bindings to json-schema arm64: dts: exynos2200: Add default GIC address cells arm64: dts: fsd: Add default GIC address cells arm64: dts: google: gs101: Add default GIC address cells arm64: dts: exynos5433: Add default GIC address cells arm64: dts: exynos2200: define all usi nodes arm64: dts: exynos2200: increase the size of all syscons arm64: dts: exynos2200: use 32-bit address space for /soc arm64: dts: exynos2200: fix typo in hsi2c23 bus pins label arm64: dts: exynos990-r8s: Enable USB arm64: dts: exynos990-c1s: Enable USB arm64: dts: exynos990-x1s-common: Enable USB arm64: dts: exynos990: Add USB nodes arm64: dts: exynos990: Enable watchdog timer arm64: dts: exynos: Add Ethernet node for E850-96 board Link: https://lore.kernel.org/r/20250909180127.99783-4-krzysztof.kozlowski@linaro.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'socfpga_dts_updates_for_v6.18' of ↵Arnd Bergmann2-0/+356
git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux into soc/dt SoCFPGA DTS updates for v6.18 - Add and enable gmac for Agilex5 * tag 'socfpga_dts_updates_for_v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/dinguyen/linux: arm64: dts: socfpga: agilex5: enable gmac2 on the Agilex5 dev kit arm64: dts: Agilex5 Add gmac nodes to DTSI for Agilex5 Link: https://lore.kernel.org/r/20250908040718.187857-1-dinguyen@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'v6.18-rockchip-dts32-1' of ↵Arnd Bergmann1-0/+22
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt HDMI-CEC and -audio on RK3288-Miqi * tag 'v6.18-rockchip-dts32-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: ARM: dts: rockchip: add HDMI audio to rk3288-miqi ARM: dts: rockchip: add CEC pinctrl to rk3288-miqi Link: https://lore.kernel.org/r/12138356.VV5PYv0bhD@phil Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'v6.18-rockchip-dts64-1' of ↵Arnd Bergmann32-20/+3172
git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip into soc/dt New boards: FriendlyElec NanoPi Zero2, ArmSoM Sige1, Radxa ROCK 2A/2F, HINLINK H66K / H68K. Interesting new peripherals: I guess the most interesting one is likely the NPU on RK3588. The rocket driver has been merged into both the DRM tree and mainline Mesa. Other still interesting ones are DW-Displayport on RK3588, DSI on RK3576 (missing soc pwm-support to be useful on most boards), thermal support and watchdog on RK3576. The rest are peripheral additions on a number of boards (Beelink A1, Pine{phone,book}, rk3576-evb1-v10, Rock 5*, ...) * tag 'v6.18-rockchip-dts64-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mmind/linux-rockchip: (46 commits) arm64: dts: rockchip: Enable DP2HDMI for ROCK 5 ITX arm64: dts: rockchip: Enable DisplayPort for rk3588s Cool Pi 4B arm64: dts: rockchip: Add DP1 for rk3588 arm64: dts: rockchip: Add DP0 for rk3588 arm64: dts: rockchip: Add FriendlyElec NanoPi Zero2 dt-bindings: arm: rockchip: Add FriendlyElec NanoPi Zero2 arm64: dts: rockchip: Add ArmSoM Sige1 dt-bindings: arm: rockchip: Add ArmSoM Sige1 arm64: dts: rockchip: Add Radxa ROCK 2A/2F dt-bindings: arm: rockchip: Add Radxa ROCK 2A/2F dt-bindings: soc: rockchip: add missing clock reference for rk3576-dcphy syscon arm64: dts: rockchip: add USB3 on Beelink A1 arm64: dts: rockchip: add SPDIF audio to Beelink A1 arm64: dts: rockchip: add IR receiver to rk3328-roc arm64: dts: rockchip: Further describe the WiFi for the Pinephone Pro arm64: dts: rockchip: Further describe the WiFi for the Pinebook Pro arm64: dts: rockchip: Enable the NPU on NanoPi R6C/R6S arm64: dts: rockchip: enable NPU on OPI5/5B arm64: dts: rockchip: Add Bluetooth on rk3576-evb1-v10 arm64: dts: rockchip: Add WiFi on rk3576-evb1-v10 ... Link: https://lore.kernel.org/r/5241735.C4sosBPzcN@phil Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15Merge tag 'thead-dt-for-v6.18' of ↵Arnd Bergmann1-0/+21
git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux into soc/dt T-HEAD Devicetrees for v6.18 Add a device tree node for the IMG BXM-4-64 GPU present in the T-HEAD TH1520 SoC used by the Lichee Pi 4A board. This node enables support for the GPU using the drm/imagination driver. By adding this node, the kernel can recognize and initialize the GPU, providing graphics acceleration capabilities on the Lichee Pi 4A and other boards based on the TH1520 SoC. The display controller and HDMI output are still a work in progress. Also included is a MAINTAINERS patch that adds an entry for the T-Head SoC patchwork. Signed-off-by: Drew Fustini <fustini@kernel.org> * tag 'thead-dt-for-v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/fustini/linux: MAINTAINERS: Add RISC-V T-HEAD SoC patchwork riscv: dts: thead: th1520: Add IMG BXM-4-64 GPU node Link: https://lore.kernel.org/r/aLyIXR1G9DUzwGWc@x1 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2025-09-15m68k: defconfig: Update defconfigs for v6.17-rc1Geert Uytterhoeven12-72/+36
- Enable Netfilter legacy tables support, - Drop CONFIG_IP_NF_FILTER=m, CONFIG_IP_NF_MANGLE=m, CONFIG_IP6_NF_FILTER=m, and CONFIG_IP6_NF_MANGLE=m (auto-modular since commit 9fce66583f06c212 ("netfilter: Exclude LEGACY TABLES on PREEMPT_RT.")), - Enable legacy EBTABLES support (no longer auto-selected since commit 9fce66583f06c212 ("netfilter: Exclude LEGACY TABLES on PREEMPT_RT.")), - Drop CONFIG_CDROM_PKTCDVD=m (removed in commit 1cea5180f2f812c4 ("block: remove pktcdvd driver")), - Move CONFIG_CRC_BENCHMARK=y (moved in commit 89a51591405e09a8 ("lib/crc: Move files into lib/crc/")). Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Link: https://patch.msgid.link/012bc96a01eef989b39eedbe84591bd50c022e57.1754904412.git.geert@linux-m68k.org
2025-09-15m68k: bitops: Fix find_*_bit() signaturesGeert Uytterhoeven1-11/+14
The function signatures of the m68k-optimized implementations of the find_{first,next}_{,zero_}bit() helpers do not match the generic variants. Fix this by changing all non-pointer inputs and outputs to "unsigned long", and updating a few local variables. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202509092305.ncd9mzaZ-lkp@intel.com/ Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Acked-by: "Yury Norov (NVIDIA)" <yury.norov@gmail.com> Link: https://patch.msgid.link/de6919554fbb4cd1427155c6bafbac8a9df822c8.1757517135.git.geert@linux-m68k.org
2025-09-15x86/cpu: Detect FreeBSD Bhyve hypervisorDavid Woodhouse5-0/+81
Detect the Bhyve hypervisor and enable 15-bit MSI support if available. Detecting Bhyve used to be a purely cosmetic issue of the kernel printing 'Hypervisor detected: Bhyve' at boot time. But FreeBSD 15.0 will support¹ the 15-bit MSI enlightenment to support more than 255 vCPUs (http://david.woodhou.se/ExtDestId.pdf) which means there's now actually some functional reason to do so. ¹ https://github.com/freebsd/freebsd-src/commit/313a68ea20b4 [ bp: Massage, move tail comment ontop. ] Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Ahmed S. Darwish <darwi@linutronix.de> Link: https://lore.kernel.org/03802f6f7f5b5cf8c5e8adfe123c397ca8e21093.camel@infradead.org
2025-09-15KVM: arm64: Map hyp text as RO and dump instr on panicMostafa Saleh2-5/+11
Map the hyp text section as RO. There are no secrets there, and this allows the kernel to extract info for debugging: in case of a panic we can now dump the faulting instructions, similar to the kernel. Signed-off-by: Mostafa Saleh <smostafa@google.com> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Dump instruction on hyp panicMostafa Saleh3-6/+15
Similar to the kernel panic, where the instruction code is printed, we can do the same for hypervisor panics. This patch does that only in case of “CONFIG_NVHE_EL2_DEBUG” or nvhe. The next patch adds support for pKVM. Also, remove the hardcoded argument to dump_kernel_instr(). Signed-off-by: Mostafa Saleh <smostafa@google.com> Tested-by: Kunwu Chan <chentao@kylinos.cn> Reviewed-by: Kunwu Chan <chentao@kylinos.cn> Acked-by: Will Deacon <will@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15uprobes/x86: Return error from uprobe syscall when not called from trampolineJiri Olsa1-1/+1
Currently the uprobe syscall handles all errors by forcing SIGILL on the current process. As suggested by Andrii, it would be helpful for uprobe syscall detection to return an error value for the !in_uprobe_trampoline check. This way we can just call the uprobe syscall and find out from the return value whether the kernel has it. Suggested-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Jiri Olsa <jolsa@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Oleg Nesterov <oleg@redhat.com>
2025-09-15riscv, bpf: Sign extend struct ops return values properlyHengqi Chen1-1/+41
The ns_bpf_qdisc selftest triggers a kernel panic: Unable to handle kernel paging request at virtual address ffffffffa38dbf58 Current test_progs pgtable: 4K pagesize, 57-bit VAs, pgdp=0x00000001109cc000 [ffffffffa38dbf58] pgd=000000011fffd801, p4d=000000011fffd401, pud=000000011fffd001, pmd=0000000000000000 Oops [#1] Modules linked in: bpf_testmod(OE) xt_conntrack nls_iso8859_1 [...] [last unloaded: bpf_testmod(OE)] CPU: 1 UID: 0 PID: 23584 Comm: test_progs Tainted: G W OE 6.17.0-rc1-g2465bb83e0b4 #1 NONE Tainted: [W]=WARN, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2024.01+dfsg-1ubuntu5.1 01/01/2024 epc : __qdisc_run+0x82/0x6f0 ra : __qdisc_run+0x6e/0x6f0 epc : ffffffff80bd5c7a ra : ffffffff80bd5c66 sp : ff2000000eecb550 gp : ffffffff82472098 tp : ff60000096895940 t0 : ffffffff8001f180 t1 : ffffffff801e1664 t2 : 0000000000000000 s0 : ff2000000eecb5d0 s1 : ff60000093a6a600 a0 : ffffffffa38dbee8 a1 : 0000000000000001 a2 : ff2000000eecb510 a3 : 0000000000000001 a4 : 0000000000000000 a5 : 0000000000000010 a6 : 0000000000000000 a7 : 0000000000735049 s2 : ffffffffa38dbee8 s3 : 0000000000000040 s4 : ff6000008bcda000 s5 : 0000000000000008 s6 : ff60000093a6a680 s7 : ff60000093a6a6f0 s8 : ff60000093a6a6ac s9 : ff60000093140000 s10: 0000000000000000 s11: ff2000000eecb9d0 t3 : 0000000000000000 t4 : 0000000000ff0000 t5 : 0000000000000000 t6 : ff60000093a6a8b6 status: 0000000200000120 badaddr: ffffffffa38dbf58 cause: 000000000000000d [<ffffffff80bd5c7a>] __qdisc_run+0x82/0x6f0 [<ffffffff80b6fe58>] __dev_queue_xmit+0x4c0/0x1128 [<ffffffff80b80ae0>] neigh_resolve_output+0xd0/0x170 [<ffffffff80d2daf6>] ip6_finish_output2+0x226/0x6c8 [<ffffffff80d31254>] ip6_finish_output+0x10c/0x2a0 [<ffffffff80d31446>] ip6_output+0x5e/0x178 [<ffffffff80d2e232>] ip6_xmit+0x29a/0x608 [<ffffffff80d6f4c6>] inet6_csk_xmit+0xe6/0x140 [<ffffffff80c985e4>] __tcp_transmit_skb+0x45c/0xaa8 [<ffffffff80c995fe>] tcp_connect+0x9ce/0xd10 [<ffffffff80d66524>] 
tcp_v6_connect+0x4ac/0x5e8 [<ffffffff80cc19b8>] __inet_stream_connect+0xd8/0x318 [<ffffffff80cc1c36>] inet_stream_connect+0x3e/0x68 [<ffffffff80b42b20>] __sys_connect_file+0x50/0x88 [<ffffffff80b42bee>] __sys_connect+0x96/0xc8 [<ffffffff80b42c40>] __riscv_sys_connect+0x20/0x30 [<ffffffff80e5bcae>] do_trap_ecall_u+0x256/0x378 [<ffffffff80e69af2>] handle_exception+0x14a/0x156 Code: 892a 0363 1205 489c 8bc1 c7e5 2d03 084a 2703 080a (2783) 0709 ---[ end trace 0000000000000000 ]--- The bpf_fifo_dequeue prog returns an skb, which is a pointer. The pointer is treated as a 32-bit value and sign-extended to 64 bits in the epilogue. This behavior is right for most bpf prog types but wrong for struct ops, which requires following the RISC-V ABI. So let's sign-extend struct ops return values according to the function model and the RISC-V ABI ([0]). [0]: https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf Fixes: 25ad10658dc1 ("riscv, bpf: Adapt bpf trampoline to optimized riscv ftrace framework") Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Pu Lehui <pulehui@huawei.com> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20250908012448.1695-1-hengqi.chen@gmail.com
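The failure mode described in this commit can be illustrated with a small model, written in Python rather than as JIT code. It is only a sketch: the helper name sign_extend() and the example pointer value are hypothetical, chosen so that the low 32 bits match the bad a0 value in the oops above. It shows that sign-extending a 64-bit pointer as if it were a 32-bit integer discards the upper half, while sign-extending at the width declared by the function model leaves the pointer intact.

```python
MASK64 = (1 << 64) - 1


def sign_extend(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of `value` into a 64-bit register image."""
    low = value & ((1 << bits) - 1)
    if low & (1 << (bits - 1)):
        low -= 1 << bits  # negative in two's complement
    return low & MASK64


# Hypothetical kernel pointer; only its low 32 bits are taken from the oops.
skb = 0xFF600000A38DBEE8

# Return value handled as a 32-bit int: the upper half of the pointer is lost.
as_int = sign_extend(skb, 32)

# Return value handled at its declared width (64-bit pointer): unchanged.
as_ptr = sign_extend(skb, 64)

print(hex(as_int))
print(hex(as_ptr))
```

Under these assumptions the first result has all upper bits set, which is the shape of the faulting address in the trace, and the second equals the original pointer.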
2025-09-15riscv, bpf: Remove duplicated bpf_flush_icache()Hengqi Chen1-1/+0
The bpf_flush_icache() is done by bpf_arch_text_copy() already. Remove the duplicated one in arch_prepare_bpf_trampoline(). Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Reviewed-by: Pu Lehui <pulehui@huawei.com> Link: https://lore.kernel.org/bpf/20250904105119.21861-1-hengqi.chen@gmail.com
2025-09-15powerpc64/modules: replace stub allocation sentinel with an explicit counterJoe Lawrence2-18/+9
The logic for allocating ppc64_stub_entry trampolines in the .stubs section relies on an inline sentinel, where a NULL .funcdata member indicates an available slot. While preceding commits fixed the initialization bugs that led to ftrace stub corruption, the sentinel-based approach remains fragile: it depends on an implicit convention between subsystems modifying different struct types in the same memory area. Replace the sentinel with an explicit counter, module->arch.num_stubs. Instead of iterating through memory to find a NULL marker, the module loader uses this counter as the boundary for the next free slot. This simplifies the allocation code, hardens it against future changes to stub structures, and removes the need for an extra relocation slot previously reserved to terminate the sentinel search. Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Naveen N Rao (AMD) <naveen@kernel.org> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250912142740.3581368-4-joe.lawrence@redhat.com
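The difference between the two allocation schemes can be sketched in a few lines of Python. This is a simplified model, not the module loader code: the Stub and ModuleArch classes and both function names are hypothetical, and only funcdata and num_stubs are names taken from the commit message. It shows how a sentinel scan can hand out a slot that another user has reserved but not yet marked, while an explicit counter cannot.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stub:
    # None plays the role of the NULL .funcdata sentinel.
    funcdata: Optional[int] = None


def alloc_by_sentinel(stubs: List[Stub]) -> int:
    """Old scheme: scan memory for the first slot whose funcdata is unset."""
    for index, stub in enumerate(stubs):
        if stub.funcdata is None:
            return index
    raise MemoryError("no free stub slot")


@dataclass
class ModuleArch:
    stubs: List[Stub]
    num_stubs: int = 0  # explicit count of slots already handed out

    def alloc_by_counter(self) -> int:
        """New scheme: the counter is the boundary of the next free slot."""
        if self.num_stubs >= len(self.stubs):
            raise MemoryError("no free stub slot")
        index = self.num_stubs
        self.num_stubs += 1
        return index


stubs = [Stub() for _ in range(4)]
arch = ModuleArch(stubs)

first = arch.alloc_by_counter()
stubs[first].funcdata = 0x1000       # ordinary stub, marks its slot
reserved = arch.alloc_by_counter()   # slot used by a struct that never sets funcdata

# The sentinel scan would hand out the reserved slot again; the counter would not.
print(alloc_by_sentinel(stubs), arch.alloc_by_counter())
```

The counter also removes the need for a terminating empty slot, since nothing has to be found by scanning.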
2025-09-15powerpc64/modules: correctly iterate over stubs in setup_ftrace_ool_stubsJoe Lawrence1-1/+1
CONFIG_PPC_FTRACE_OUT_OF_LINE introduced setup_ftrace_ool_stubs() to extend the ppc64le module .stubs section with an array of ftrace_ool_stub structures for each patchable function. Fix its ppc64_stub_entry stub reservation loop to properly write across all of the num_stubs used and not just the first entry. Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line") Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Naveen N Rao (AMD) <naveen@kernel.org> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250912142740.3581368-3-joe.lawrence@redhat.com
2025-09-15powerpc/ftrace: ensure ftrace record ops are always set for NOPsJoe Lawrence1-2/+8
When an ftrace call site is converted to a NOP, its corresponding dyn_ftrace record should have its ftrace_ops pointer set to ftrace_nop_ops. Correct the powerpc implementation to ensure the ftrace_rec_set_nop_ops() helper is called on all successful NOP initialization paths. This ensures all ftrace records are consistent before being handled by the ftrace core. Fixes: eec37961a56a ("powerpc64/ftrace: Move ftrace sequence out of line") Suggested-by: Naveen N Rao <naveen@kernel.org> Signed-off-by: Joe Lawrence <joe.lawrence@redhat.com> Acked-by: Naveen N Rao (AMD) <naveen@kernel.org> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250912142740.3581368-2-joe.lawrence@redhat.com
2025-09-15x86/resctrl: Configure mbm_event mode if supportedBabu Moger3-0/+16
Configure mbm_event mode on AMD platforms. On AMD platforms, it is recommended to use the mbm_event mode, if supported, to prevent the hardware from resetting counters between reads. This can result in misleading values or display "Unavailable" if no counter is assigned to the event. Enable mbm_event mode, known as ABMC (Assignable Bandwidth Monitoring Counters) on AMD, by default if the system supports it. Update ABMC across all logical processors within the resctrl domain to ensure proper functionality. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15KVM: arm64: Return early from trace helpers when KVM isn't availableYingchao Deng1-11/+11
When Linux is booted at EL1, host_data_ptr() resolves to the nVHE hypervisor's copy of host data. When hyp mode isn't available for KVM the nVHE percpu bases remain uninitialized. Consequently, any usage of host_data_ptr() will result in a NULL dereference which has been observed in KVM's trace filtering helpers. Add an early return to the trace filtering helpers if KVM isn't initialized, avoiding the NULL dereference. Take this opportunity to move the TRBE-skipping checks to a common helper. Fixes: 054b88391bbe2 ("KVM: arm64: Support trace filtering for guests") Signed-off-by: Yingchao Deng <yingchao.deng@oss.qualcomm.com> Reviewed-by: James Clark <james.clark@linaro.org> [maz: repainted the helpers to be readable, and the commit message with Oliver's suggestion] Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15x86/resctrl: Implement resctrl_arch_reset_cntr() and resctrl_arch_cntr_read()Babu Moger2-0/+75
System software reads resctrl event data for a particular resource by writing the RMID and Event Identifier (EvtID) to the QM_EVTSEL register and then reading the event data from the QM_CTR register. In ABMC mode, the event data of a specific counter ID is read by setting the following fields: QM_EVTSEL.ExtendedEvtID = 1, QM_EVTSEL.EvtID = L3CacheABMC (=1) and setting QM_EVTSEL.RMID to the desired counter ID. Reading the QM_CTR then returns the contents of the specified counter ID. RMID_VAL_ERROR bit is set if the counter configuration is invalid, or if an invalid counter ID is set in the QM_EVTSEL.RMID field. RMID_VAL_UNAVAIL bit is set if the counter data is unavailable. Introduce resctrl_arch_reset_cntr() and resctrl_arch_cntr_read() to reset and read event data for a specific counter. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15x86/resctrl: Refactor resctrl_arch_rmid_read()Babu Moger1-15/+23
resctrl_arch_rmid_read() adjusts the value obtained from MSR_IA32_QM_CTR to account for the overflow for MBM events and apply counter scaling for all the events. This logic is common to both reading an RMID and reading a hardware counter directly. Refactor the hardware value adjustment logic into get_corrected_val() to prepare for support of reading a hardware counter. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
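As a rough illustration of the adjustment described here, the following Python sketch models overflow correction for a wrapping hardware counter followed by scaling. It is not the kernel's get_corrected_val(): the class and function names, the 24-bit width and the scale factor of 64 are assumptions for the example, and it applies the overflow handling unconditionally whereas the commit message says it applies to MBM events.

```python
def overflow_count(prev_raw: int, cur_raw: int, width: int) -> int:
    """Chunks counted since the previous read of a `width`-bit wrapping counter."""
    return (cur_raw - prev_raw) & ((1 << width) - 1)


class CounterState:
    """Software state kept for one hardware counter (hypothetical layout)."""

    def __init__(self, width: int, scale: int) -> None:
        self.width = width    # number of valid bits in the raw register value
        self.scale = scale    # bytes represented by one hardware chunk
        self.prev_raw = 0     # raw value seen at the previous read
        self.chunks = 0       # running total, immune to hardware wraparound

    def corrected_value(self, raw: int) -> int:
        """Fold a new raw reading into the running total and scale it."""
        self.chunks += overflow_count(self.prev_raw, raw, self.width)
        self.prev_raw = raw
        return self.chunks * self.scale


state = CounterState(width=24, scale=64)
before_wrap = state.corrected_value(0xFFFFF0)
after_wrap = state.corrected_value(0x000010)  # raw value wrapped past 2**24
print(before_wrap, after_wrap)
```

Because the same arithmetic applies whether the raw value came from an RMID read or from a directly read counter, sharing it in one helper avoids duplicating the logic.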
2025-09-15x86,fs/resctrl: Implement resctrl_arch_config_cntr() to assign a counter ↵Babu Moger1-0/+36
with ABMC The ABMC feature allows users to assign a hardware counter to an RMID, event pair and monitor bandwidth usage as long as it is assigned. The hardware continues to track the assigned counter until it is explicitly unassigned by the user. Implement an x86 architecture-specific handler to configure a counter. This architecture specific handler is called by resctrl fs when a counter is assigned or unassigned as well as when an already assigned counter's configuration should be updated. Configure counters by writing to the L3_QOS_ABMC_CFG MSR, specifying the counter ID, bandwidth source (RMID), and event configuration. The ABMC feature details are documented in APM [1] available from [2]. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth Monitoring (ABMC). Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
2025-09-15x86/resctrl: Add data structures and definitions for ABMC assignmentBabu Moger2-0/+37
The ABMC feature allows users to assign a hardware counter to an RMID, event pair and monitor bandwidth usage as long as it is assigned. The hardware continues to track the assigned counter until it is explicitly unassigned by the user. The ABMC feature implements an MSR L3_QOS_ABMC_CFG (C000_03FDh). ABMC counter assignment is done by setting the counter id, bandwidth source (RMID) and bandwidth configuration. Attempts to read or write the MSR when ABMC is not enabled will result in a #GP(0) exception. Introduce the data structures and definitions for MSR L3_QOS_ABMC_CFG (C000_03FDh):

=========================================================================
Bits    Mnemonic  Description               Access  Reset
                                            Type    Value
=========================================================================
63      CfgEn     Configuration Enable      R/W     0
62      CtrEn     Enable/disable counting   R/W     0
61:53   –         Reserved                  MBZ     0
52:48   CtrID     Counter Identifier        R/W     0
47      IsCOS     BwSrc field is a CLOSID   R/W     0
                  (not an RMID)
46:44   –         Reserved                  MBZ     0
43:32   BwSrc     Bandwidth Source          R/W     0
                  (RMID or CLOSID)
31:0    BwType    Bandwidth configuration   R/W     0
                  tracked by the CtrID
=========================================================================

The ABMC feature details are documented in APM [1] available from [2]. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth Monitoring (ABMC). [ bp: Touchups. ] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
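The bit layout listed in this commit message can be exercised with a short Python sketch that packs and unpacks a 64-bit L3_QOS_ABMC_CFG value. The field positions come from the table in the message; the function names, argument defaults and the example values are hypothetical, and the sketch does not touch any real MSR.

```python
# Reserved (must-be-zero) ranges from the table: bits 61:53 and 46:44.
RESERVED_MASK = (0x1FF << 53) | (0x7 << 44)


def pack_abmc_cfg(ctr_id: int, bw_src: int, bw_type: int,
                  cfg_en: bool = True, ctr_en: bool = True,
                  is_cos: bool = False) -> int:
    """Build a 64-bit L3_QOS_ABMC_CFG image from its fields."""
    if not 0 <= ctr_id < (1 << 5):
        raise ValueError("CtrID is a 5-bit field (bits 52:48)")
    if not 0 <= bw_src < (1 << 12):
        raise ValueError("BwSrc is a 12-bit field (bits 43:32)")
    if not 0 <= bw_type < (1 << 32):
        raise ValueError("BwType is a 32-bit field (bits 31:0)")
    return ((int(cfg_en) << 63) | (int(ctr_en) << 62) | (ctr_id << 48)
            | (int(is_cos) << 47) | (bw_src << 32) | bw_type)


def unpack_abmc_cfg(value: int) -> dict:
    """Split a 64-bit L3_QOS_ABMC_CFG image back into named fields."""
    return {
        "cfg_en": bool((value >> 63) & 1),
        "ctr_en": bool((value >> 62) & 1),
        "ctr_id": (value >> 48) & 0x1F,
        "is_cos": bool((value >> 47) & 1),
        "bw_src": (value >> 32) & 0xFFF,
        "bw_type": value & 0xFFFFFFFF,
    }


# Hypothetical assignment: counter 3 tracks RMID 0x12 with configuration 0x7f.
cfg = pack_abmc_cfg(ctr_id=3, bw_src=0x12, bw_type=0x7F)
print(hex(cfg), unpack_abmc_cfg(cfg))
```

Range checks on each field keep the reserved bits clear, which matters because the table marks them as must-be-zero.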
2025-09-15x86/resctrl: Add support to enable/disable AMD ABMC featureBabu Moger3-0/+51
Add the functionality to enable/disable the AMD ABMC feature. The AMD ABMC feature is enabled by setting the enable bit (bit 0) in the L3_QOS_EXT_CFG MSR. When the state of ABMC is changed, the MSR needs to be updated on all the logical processors in the QOS Domain. Hardware counters will reset when ABMC state is changed. [ bp: Massage commit message. ] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
2025-09-15x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature detailsBabu Moger2-5/+13
ABMC feature details are reported via CPUID Fn8000_0020_EBX_x5.

Bits  Description
15:0  MAX_ABMC Maximum Supported Assignable Bandwidth Monitoring Counter ID + 1

The ABMC feature details are documented in APM [1] available from [2]. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth Monitoring (ABMC). Detect the feature and number of assignable counters supported. For backward compatibility, upon detecting the assignable counter feature, enable the mbm_total_bytes and mbm_local_bytes events that users are familiar with as part of original L3 MBM support. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
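Because MAX_ABMC is defined as the highest counter ID plus one, the field can be read directly as a counter count. A minimal Python sketch, assuming the EBX value has already been obtained from CPUID (the function name and the sample register value are hypothetical):

```python
def num_abmc_counters(ebx: int) -> int:
    """Number of assignable counters: EBX bits 15:0 hold (max counter ID + 1)."""
    return ebx & 0xFFFF


# Hypothetical EBX value reporting 32 counters, so valid counter IDs are 0..31.
ebx = 0x00000020
count = num_abmc_counters(ebx)
print(count, list(range(count))[-1])
```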
2025-09-15x86,fs/resctrl: Consolidate monitoring related data from rdt_resourceBabu Moger2-7/+7
The cache allocation and memory bandwidth allocation feature properties are consolidated into struct resctrl_cache and struct resctrl_membw respectively. In preparation for more monitoring properties that will clobber the existing resource struct more, re-organize the monitoring specific properties to also be in a separate structure. Also convert "bandwidth sources" terminology to "memory transactions" to have consistency within resctrl for related monitoring features. [ bp: Massage commit message. ] Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15x86/resctrl: Add ABMC feature in the command line optionsBabu Moger1-0/+2
Add a kernel command-line parameter to enable or disable the exposure of the ABMC (Assignable Bandwidth Monitoring Counters) hardware feature to resctrl. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15x86/cpufeatures: Add support for Assignable Bandwidth Monitoring Counters (ABMC)Babu Moger2-0/+2
Users can create as many monitor groups as RMIDs supported by the hardware. However, the bandwidth monitoring feature on AMD only guarantees that RMIDs currently assigned to a processor will be tracked by hardware. The counters of any other RMIDs which are no longer being tracked will be reset to zero. The MBM event counters return "Unavailable" for the RMIDs that are not tracked by hardware. So, there can be only a limited number of groups that can give guaranteed monitoring numbers. With ever-changing configurations there is no way to definitively know which of these groups are being tracked during a particular time. Users do not have the option to monitor a group or set of groups for a certain period of time without worrying about the RMID being reset in between. The ABMC feature allows users to assign a hardware counter to an RMID, event pair and monitor bandwidth usage as long as it is assigned. The hardware continues to track the assigned counter until it is explicitly unassigned by the user. There is no need to worry about counters being reset during this period. Additionally, the user can specify the type of memory transactions (e.g., reads, writes) for the counter to track. Without ABMC enabled, monitoring will work in the current mode without the assignment option. The Linux resctrl subsystem provides an interface that allows monitoring of up to two memory bandwidth events per group, selected from a combination of available total and local events. When ABMC is enabled, two events will be assigned to each group by default, in line with the current interface design. Users will also have the option to configure which types of memory transactions are counted by these events. Due to the limited number of available counters (32), users may quickly exhaust them. If the system runs out of assignable ABMC counters, the kernel will report an error. In such cases, users will need to unassign one or more active counters to free up counters for new assignments. 
resctrl will provide options to assign or unassign events through the group-specific interface file. The feature is detected via CPUID_Fn80000020_EBX_x00 bit 5: ABMC (Assignable Bandwidth Monitoring Counters). The ABMC feature details are documented in APM [1] available from [2]. [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming Publication # 24593 Revision 3.41 section 19.3.3.3 Assignable Bandwidth Monitoring (ABMC). [ bp: Massage commit message, fixup enumeration due to VMSCAPE ] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
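The assignment model described in this commit message, a fixed pool of counters handed out per (RMID, event) pair until the pool is exhausted, can be sketched as follows. This is a toy model in Python, not resctrl code: the class, its methods and the event names are hypothetical, and the pool size of 32 is the figure quoted in the message.

```python
class CounterPool:
    """Toy model of a fixed pool of assignable counters."""

    def __init__(self, num_cntrs: int = 32) -> None:
        self.free = list(range(num_cntrs))
        self.assigned = {}  # (rmid, event) -> counter ID

    def assign(self, rmid: int, event: str) -> int:
        """Assign a counter to (rmid, event); fail when none are left."""
        key = (rmid, event)
        if key in self.assigned:
            return self.assigned[key]
        if not self.free:
            raise RuntimeError("out of assignable counters")
        cntr = self.free.pop(0)
        self.assigned[key] = cntr
        return cntr

    def unassign(self, rmid: int, event: str) -> None:
        """Release the counter held by (rmid, event) back to the pool."""
        self.free.append(self.assigned.pop((rmid, event)))


pool = CounterPool()
# Two events per group by default, so 32 counters cover 16 groups.
for rmid in range(16):
    pool.assign(rmid, "mbm_total_bytes")
    pool.assign(rmid, "mbm_local_bytes")

try:
    pool.assign(16, "mbm_total_bytes")
except RuntimeError as err:
    print("assignment failed:", err)

pool.unassign(0, "mbm_local_bytes")          # free one counter
print(pool.assign(16, "mbm_total_bytes"))    # now succeeds
```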
2025-09-15Merge back earlier cpufreq material for 6.18Rafael J. Wysocki1-1/+1
2025-09-15x86,fs/resctrl: Prepare for more monitor eventsTony Luck3-41/+43
There's a rule in computer programming that objects appear zero, once, or many times. So code accordingly.

There are two MBM events and resctrl is coded with a lot of

    if (local)
        do one thing
    if (total)
        do a different thing

Change the rdt_mon_domain and rdt_hw_mon_domain structures to hold arrays of pointers to per event data instead of explicit fields for total and local bandwidth.

Simplify by coding for many events, looping over those that are enabled.

Move resctrl_is_mbm_event() to <linux/resctrl.h> so it can be used more widely. Also provide a for_each_mbm_event_id() helper macro.

Clean up variable names in the functions touched to consistently use "eventid" for those with type enum resctrl_event_id.

Signed-off-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
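The shift from explicit total/local fields to an array walked by a loop macro can be sketched as follows. This is a simplified stand-in, not the kernel's definitions: the `sk_*` names, the enum values and the structure layout are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical event ids; the MBM events are assumed to be contiguous. */
enum sk_event_id {
	SK_OCCUP     = 1,
	SK_MBM_TOTAL = 2,
	SK_MBM_LOCAL = 3,
};

#define SK_NUM_MBM_EVENTS 2

/* Map an MBM event id to its slot in the per-domain array. */
#define sk_mbm_idx(evt) ((evt) - SK_MBM_TOTAL)

/* Visit every MBM event id in turn. */
#define sk_for_each_mbm_event(evt) \
	for ((evt) = SK_MBM_TOTAL; (evt) <= SK_MBM_LOCAL; (evt)++)

/* One pointer per MBM event; NULL means the event is not enabled. */
struct sk_mon_domain {
	uint64_t *mbm_bytes[SK_NUM_MBM_EVENTS];
};

/* Reset the state of all enabled events; return how many were reset. */
static int sk_reset_all(struct sk_mon_domain *d)
{
	int evt, n = 0;

	sk_for_each_mbm_event(evt) {
		uint64_t *state = d->mbm_bytes[sk_mbm_idx(evt)];

		if (!state)
			continue;	/* event not enabled on this system */
		*state = 0;
		n++;
	}
	return n;
}
```

With this shape, supporting a third event means growing the array and the loop bounds rather than adding another `if` branch at every call site.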
2025-09-15x86/resctrl: Remove the rdt_mon_features global variableTony Luck3-10/+5
rdt_mon_features is used as a bitmask of enabled monitor events. A monitor event's status is now maintained in mon_evt::enabled with all monitor events' mon_evt structures found in the filesystem's mon_event_all[] array. Remove the remaining uses of rdt_mon_features. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15KVM: Implement barriers before accessing kvm->buses[] on SRCU read pathsKeir Fraser1-0/+7
This ensures that, if a VCPU has "observed" that an IO registration has occurred, the instruction currently being trapped or emulated will also observe the IO registration. At the same time, enforce that kvm_get_bus() is used only on the update side, ensuring that a long-term reference cannot be obtained by an SRCU reader. Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: vgic: Explicitly implement vgic_dist::ready orderingKeir Fraser1-9/+2
In preparation to remove synchronize_srcu() from MMIO registration, remove the distributor's dependency on this implicit barrier by direct acquire-release synchronization on the flag write and its lock-free check. Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
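The acquire-release pairing on a "ready" flag can be illustrated in portable C11. This is a sketch under assumptions: the kernel uses its own primitives rather than `<stdatomic.h>`, and the `sk_dist` structure and function names here are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a structure guarded by a ready flag. */
struct sk_dist {
	int         base;	/* data that must be visible once ready is seen */
	atomic_bool ready;
};

/* Writer: fill in the data, then publish it with a release store. */
static void sk_publish(struct sk_dist *d, int base)
{
	d->base = base;
	atomic_store_explicit(&d->ready, true, memory_order_release);
}

/*
 * Lock-free reader: the acquire load pairs with the release store above,
 * so observing ready == true guarantees that base is also visible.
 */
static bool sk_try_read(struct sk_dist *d, int *out)
{
	if (!atomic_load_explicit(&d->ready, memory_order_acquire))
		return false;
	*out = d->base;
	return true;
}
```

The point of making the ordering explicit is that the reader no longer relies on an unrelated operation elsewhere happening to act as a barrier.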
2025-09-15KVM: arm64: vgic-init: Remove vgic_ready() macroKeir Fraser1-3/+2
It is now used only within kvm_vgic_map_resources(). vgic_dist::ready is already written directly by this function, so it is clearer to bypass the macro for reads as well. Signed-off-by: Keir Fraser <keirf@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15x86,fs/resctrl: Replace architecture event enabled checksTony Luck3-19/+4
The resctrl file system now has complete knowledge of the status of every event. So there is no need for per-event function calls to check. Replace each of the resctrl_arch_is_{event}_enabled() calls with resctrl_is_mon_event_enabled(QOS_{EVENT}). No functional change. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
2025-09-15Merge branch kvm-arm64/pkvm_vm_handle into kvmarm-master/nextMarc Zyngier9-75/+221
* kvm-arm64/pkvm_vm_handle:
  : pKVM VM handle allocation fixes, courtesy of Fuad Tabba.
  :
  : From the cover letter (20250909072437.4110547-1-tabba@google.com):
  :
  : "In pKVM, this handle is allocated when the VM is initialized at the
  : hypervisor, which is on the first vCPU run. However, the host starts
  : initializing the VM and setting up its data structures earlier. MMU
  : notifiers for the VMs are also registered before VM initialization at
  : the hypervisor, and rely on the handle to identify the VM.
  :
  : Therefore, there is a potential gap between when the VM is (partially)
  : setup at the host, but still without a valid pKVM handle to identify it
  : when communicating with the hypervisor."
  KVM: arm64: Reserve pKVM handle during pkvm_init_host_vm()
  KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initialization
  KVM: arm64: Consolidate pKVM hypervisor VM initialization logic
  KVM: arm64: Separate allocation and insertion of pKVM VM table entries
  KVM: arm64: Decouple hyp VM creation state from its handle
  KVM: arm64: Clarify comments to distinguish pKVM mode from protected VMs
  KVM: arm64: Rename 'host_kvm' to 'kvm' in pKVM host code
  KVM: arm64: Rename pkvm.enabled to pkvm.is_protected
  KVM: arm64: Add build-time check for duplicate DECLARE_REG use

Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Reserve pKVM handle during pkvm_init_host_vm()Fuad Tabba2-14/+33
When a pKVM guest is active, TLB invalidations triggered by host MMU notifiers require a valid hypervisor handle. Currently, this handle is only allocated when the first vCPU is run. However, the guest's memory is associated with the host MMU much earlier, during kvm_arch_init_vm(). This creates a window where an MMU invalidation could occur after the kvm_pgtable pointer checked by the notifiers is set but before the pKVM handle has been created. Fix this by reserving the pKVM handle when the host VM is first set up. Move the call to the __pkvm_reserve_vm hypercall from the first-vCPU-run path into pkvm_init_host_vm(), which is called during initial VM setup. This ensures the handle is available before any subsystem can trigger an MMU notification for the VM. The VM destruction path is updated to call __pkvm_unreserve_vm for cases where a VM was reserved but never fully created at the hypervisor, ensuring the handle is properly released. This fix leverages the two-stage reservation/initialization hypercall interface introduced in preceding patches. Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Introduce separate hypercalls for pKVM VM reservation and initializationFuad Tabba5-24/+108
The existing __pkvm_init_vm hypercall performs both the reservation of a VM table entry and the initialization of the hypervisor VM state in a single operation. This design prevents the host from obtaining a VM handle from the hypervisor until all preparation for the creation and the initialization of the VM is done, which is on the first vCPU run operation.

To support more flexible VM lifecycle management, the host needs the ability to reserve a handle early, before the first vCPU run.

Refactor the hypercall interface to enable this, splitting the single hypercall into a two-stage process:

- __pkvm_reserve_vm: A new hypercall that allocates a slot in the hypervisor's vm_table, marks it as reserved, and returns a unique handle to the host.
- __pkvm_unreserve_vm: A corresponding cleanup hypercall to safely release the reservation if the host fails to proceed with full initialization.
- __pkvm_init_vm: The existing hypercall is modified to no longer allocate a slot. It now expects a pre-reserved handle and commits the donated VM memory to that slot.

For now, the host-side code in __pkvm_create_hyp_vm calls the new reserve and init hypercalls back-to-back to maintain existing behavior. This paves the way for subsequent patches to separate the reservation and initialization steps in the VM's lifecycle.

Signed-off-by: Fuad Tabba <tabba@google.com>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Consolidate pKVM hypervisor VM initialization logicFuad Tabba1-23/+24
The insert_vm_table_entry() function was performing tasks beyond its primary responsibility. In addition to inserting a VM pointer into the vm_table, it was also initializing several fields within 'struct pkvm_hyp_vm', such as the VMID and stage-2 MMU pointers. This mixing of concerns made the code harder to follow.

As another preparatory step towards allowing a VM table entry to be reserved before the VM is fully created, this logic must be cleaned up. By separating table insertion from state initialization, we can control the timing of the initialization step more precisely in subsequent patches.

Refactor the code to consolidate all initialization logic into init_pkvm_hyp_vm():

- Move the initialization of the handle, VMID, and MMU fields from insert_vm_table_entry() to init_pkvm_hyp_vm().
- Simplify insert_vm_table_entry() to perform only one action: placing the provided pkvm_hyp_vm pointer into the vm_table.
- Update the calling sequence in __pkvm_init_vm() to first allocate an entry in the VM table, initialize the VM, and then insert the VM into the VM table. This is all protected by the vm_table_lock for now. Subsequent patches will adjust the sequence and not hold the vm_table_lock while initializing the VM at the hypervisor (init_pkvm_hyp_vm()).

Signed-off-by: Fuad Tabba <tabba@google.com>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Separate allocation and insertion of pKVM VM table entriesFuad Tabba1-9/+43
The current insert_vm_table_entry() function performs two actions at once: it finds a free slot in the pKVM VM table and populates it with the pkvm_hyp_vm pointer.

Refactor this function as a preparatory step for future work that will require reserving a VM slot and its corresponding handle earlier in the VM lifecycle, before the pkvm_hyp_vm structure is initialized and ready to be inserted.

Split the function into a two-phase process:

- A new allocate_vm_table_entry() function finds an empty slot, marks it as reserved with a RESERVED_ENTRY placeholder, and returns a handle derived from the slot's index.
- The insert_vm_table_entry() function is repurposed to take the handle, validate that the corresponding slot is in the reserved state, and then populate it with the pkvm_hyp_vm pointer.

Signed-off-by: Fuad Tabba <tabba@google.com>
Tested-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
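The two-phase allocate/insert scheme described above can be sketched with a plain pointer table and a sentinel value. This is an assumption-laden illustration, not the hypervisor code: the table size, handle offset, sentinel and all `sk_*` names are hypothetical, and locking is omitted.

```c
#include <errno.h>
#include <stddef.h>

#define SK_MAX_VMS        4		/* hypothetical table size */
#define SK_HANDLE_OFFSET  0x1000	/* hypothetical index-to-handle offset */

/* Any address that can never be a real VM pointer works as the marker. */
static int sk_reserved_marker;
#define SK_RESERVED ((void *)&sk_reserved_marker)

static void *sk_vm_table[SK_MAX_VMS];

/* Phase 1: claim a free slot and return a handle derived from its index. */
static int sk_allocate_entry(void)
{
	for (int i = 0; i < SK_MAX_VMS; i++) {
		if (!sk_vm_table[i]) {
			sk_vm_table[i] = SK_RESERVED;
			return i + SK_HANDLE_OFFSET;
		}
	}
	return -ENOMEM;
}

/* Phase 2: fill a slot, but only if it is currently in the reserved state. */
static int sk_insert_entry(int handle, void *vm)
{
	int idx = handle - SK_HANDLE_OFFSET;

	if (idx < 0 || idx >= SK_MAX_VMS)
		return -EINVAL;
	if (sk_vm_table[idx] != SK_RESERVED)
		return -EINVAL;	/* never reserved, or already populated */

	sk_vm_table[idx] = vm;
	return 0;
}
```

Splitting the phases lets a caller hold a valid handle well before the object it will eventually name exists, which is the property the later patches in the series rely on.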
2025-09-15KVM: arm64: Decouple hyp VM creation state from its handleFuad Tabba4-2/+12
Currently, the presence of a pKVM handle (pkvm.handle != 0) is used to determine if the corresponding hypervisor (EL2) VM has been created and initialized. This couples the handle's lifecycle with the VM's creation state. This coupling will become problematic with upcoming changes that will allocate the pKVM handle earlier in the VM's life, before the VM is instantiated at the hypervisor. To prepare for this and make the state tracking explicit, decouple the two concepts. Introduce a new boolean flag, 'pkvm.is_created', to track whether the hypervisor-side VM has been created and initialized. A new helper, pkvm_hyp_vm_is_created(), is added to check this flag. All call sites that previously checked for the handle's existence are converted to use the new, explicit check. The 'is_created' flag is set to true upon successful creation in the hypervisor (EL2) and cleared upon destruction. Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
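A minimal sketch of the decoupling, assuming a simplified structure: the field names follow the commit message, but the `sk_*` helpers and the layout are hypothetical and exist only to show that a non-zero handle no longer implies a created VM.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the per-VM pKVM state kept by the host. */
struct sk_pkvm {
	uint32_t handle;	/* may be reserved long before creation */
	bool     is_created;	/* set only once the hypervisor VM exists */
};

/* Explicit check that replaces "handle != 0" at call sites. */
static bool sk_hyp_vm_is_created(const struct sk_pkvm *p)
{
	return p->is_created;
}

static void sk_reserve(struct sk_pkvm *p, uint32_t handle)
{
	p->handle = handle;	/* handle exists, VM does not yet */
}

static void sk_mark_created(struct sk_pkvm *p)
{
	p->is_created = true;
}

static void sk_destroy(struct sk_pkvm *p)
{
	p->is_created = false;
	p->handle = 0;
}
```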
2025-09-15KVM: arm64: Clarify comments to distinguish pKVM mode from protected VMsFuad Tabba2-15/+12
The hypervisor code for protected KVM contains comments that are imprecise and at times flat-out wrong. They often refer to a "protected VM" in contexts where the code or data structure applies to _any_ VM managed by the hypervisor when pKVM is enabled. For instance, the 'vm_table' holds handles for all VMs known to the hypervisor, not exclusively for those that are configured as protected. This inaccurate terminology can make the code scope harder to understand for future (and current) developers. Clarify the comments throughout the pKVM hypervisor code to make a clear distinction between the pKVM feature itself (i.e., "protected mode") and the VMs that are specifically configured to be protected. This involves replacing ambiguous uses of "protected VM" with more accurate phrasing. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Rename 'host_kvm' to 'kvm' in pKVM host codeFuad Tabba1-23/+23
In hypervisor (EL2) code, it is important to distinguish between the host's 'struct kvm' and a protected VM's 'struct kvm'. Using 'host_kvm' as a variable name in that context makes this distinction clear. However, in the host kernel code (EL1), there is no such ambiguity. The code is only ever concerned with the host's own 'struct kvm' instance. The 'host_' prefix is therefore redundant and adds unnecessary verbosity. Simplify the code by renaming the 'host_kvm' parameter to 'kvm' in all functions within host-side kernel code (EL1). This improves readability and makes the naming consistent with other host-side kernel code. No functional change intended. Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Rename pkvm.enabled to pkvm.is_protectedFuad Tabba2-3/+3
The 'pkvm.enabled' field in struct kvm_protected_vm is confusingly named. Its purpose is to indicate whether a VM is a _protected_ VM under pKVM, and not whether the VM itself is enabled or running. For a non-protected VM, the VM can be fully active and running, yet this field would be false. This ambiguity can lead to incorrect assumptions about the VM's operational state and makes the code harder to reason about. Rename the field to 'is_protected' to make it unambiguous that the flag tracks the protected status of the VM. No functional change intended. Reviewed-by: Kunwu Chan <kunwu.chan@linux.dev> Signed-off-by: Fuad Tabba <tabba@google.com> Reviewed-by: Kunwu Chan <chentao@kylinos.cn> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-15KVM: arm64: Add build-time check for duplicate DECLARE_REG useFuad Tabba1-1/+2
The DECLARE_REG() macro provides a convenient way to create a local variable initialized from a cpu context in the hyp trap handlers. However, a common error is to use the macro multiple times in the same scope with the same register index, but for different logical purposes. This results in valid C code that compiles without error, but introduces subtle bugs where a developer expects two different variables to hold values from two different registers, when in fact they are both sourced from the same one. To prevent this entire class of bugs, modify the DECLARE_REG() macro to declare a dummy variable whose name is derived from the register index. If the macro is used again with the same index in the same scope, the compiler will fail with a "redeclaration of variable" error, turning a subtle runtime bug into an obvious build-time failure. Signed-off-by: Fuad Tabba <tabba@google.com> Tested-by: Mark Brown <broonie@kernel.org> Signed-off-by: Marc Zyngier <maz@kernel.org>
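The dummy-variable trick can be reproduced in a few lines of standalone C. This is a sketch of the technique only: the macro name, the context structure and the name of the dummy variable are hypothetical, and `__attribute__((unused))` assumes GCC or Clang.

```c
#include <stdint.h>

/* Hypothetical stand-in for a saved CPU context. */
struct sk_ctxt {
	uint64_t regs[8];
};

/*
 * Declare a local initialized from register 'reg', plus a dummy variable
 * whose name encodes the index. 'reg' must be a literal so that token
 * pasting yields a distinct identifier per register.
 */
#define SK_DECLARE_REG(type, name, ctxt, reg)			\
	int sk_reg_check_##reg __attribute__((unused));		\
	type name = (type)(ctxt)->regs[(reg)]

static uint64_t sk_handler(const struct sk_ctxt *ctxt)
{
	SK_DECLARE_REG(uint64_t, addr, ctxt, 1);
	SK_DECLARE_REG(uint64_t, size, ctxt, 2);
	/*
	 * SK_DECLARE_REG(uint64_t, flags, ctxt, 2);
	 * would not compile: sk_reg_check_2 would be declared twice
	 * in the same scope.
	 */
	return addr + size;
}
```

The dummy variable costs nothing at run time, since an unused local is dropped by the compiler, while a copy-pasted register index now stops the build.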
2025-09-15x86,fs/resctrl: Consolidate monitor event descriptionsTony Luck1-3/+9
There are currently only three monitor events, all associated with the RDT_RESOURCE_L3 resource. Growing support for additional events will be easier with some restructuring to have a single point in file system code where all attributes of all events are defined. Place all event descriptions into an array mon_event_all[]. Doing this has the beneficial side effect of removing the need for rdt_resource::evt_list. Add resctrl_event_id::QOS_FIRST_EVENT for a lower bound on range checks for event ids and as the starting index to scan mon_event_all[]. Drop the code that builds evt_list and change the two places where the list is scanned to scan mon_event_all[] instead using a new helper macro for_each_mon_event(). Architecture code now informs file system code which events are available with resctrl_enable_mon_event(). Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/cover.1757108044.git.babu.moger@amd.com
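A single table of event descriptions with an enable flag and an iteration macro might look like the sketch below. It is illustrative rather than the kernel's definition: the `sk_*` identifiers, the structure fields and the helper signatures are assumptions; only the idea of one array indexed by event id, a "first event" lower bound and an enable call from architecture code comes from the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event ids; SK_FIRST_EVENT is the lower bound for checks. */
enum sk_evt_id {
	SK_FIRST_EVENT = 1,
	SK_OCCUP = SK_FIRST_EVENT,
	SK_MBM_TOTAL,
	SK_MBM_LOCAL,
	SK_NUM_EVENTS,
};

struct sk_mon_evt {
	const char *name;
	bool        enabled;
};

/* The one place where every event and its attributes are described. */
static struct sk_mon_evt sk_event_all[SK_NUM_EVENTS] = {
	[SK_OCCUP]     = { "llc_occupancy",   false },
	[SK_MBM_TOTAL] = { "mbm_total_bytes", false },
	[SK_MBM_LOCAL] = { "mbm_local_bytes", false },
};

/* Walk all described events, skipping the unused slot 0. */
#define sk_for_each_mon_event(e)				\
	for ((e) = &sk_event_all[SK_FIRST_EVENT];		\
	     (e) < &sk_event_all[SK_NUM_EVENTS]; (e)++)

/* Called by "architecture" code to report an available event. */
static bool sk_enable_mon_event(int id)
{
	if (id < SK_FIRST_EVENT || id >= SK_NUM_EVENTS)
		return false;
	sk_event_all[id].enabled = true;
	return true;
}

static int sk_count_enabled(void)
{
	struct sk_mon_evt *e;
	int n = 0;

	sk_for_each_mon_event(e)
		if (e->enabled)
			n++;
	return n;
}
```

Adding an event then means adding one enum value and one table row; code that scans the table picks it up without further changes.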
2025-09-15KVM: arm64: Fix debug checking for np-guests using huge mappingsBen Horgan1-3/+6
When running with transparent huge pages and CONFIG_NVHE_EL2_DEBUG, the debug checking in assert_host_shared_guest() fails on the launch of an np-guest. This WARN_ON() causes a panic and generates the stack below.

In __pkvm_host_relax_perms_guest() the debug checking assumes the mapping is a single page but it may be a block map. Update the checking so that the size is no longer checked and the correct size is simply assumed.

While we're here make the same fix in __pkvm_host_mkyoung_guest().

  Info: # lkvm run -k /share/arch/arm64/boot/Image -m 704 -c 8 --name guest-128
  Info: Removed ghost socket file "/.lkvm//guest-128.sock".
  [ 1406.521757] kvm [141]: nVHE hyp BUG at: arch/arm64/kvm/hyp/nvhe/mem_protect.c:1088!
  [ 1406.521804] kvm [141]: nVHE call trace:
  [ 1406.521828] kvm [141]: [<ffff8000811676b4>] __kvm_nvhe_hyp_panic+0xb4/0xe8
  [ 1406.521946] kvm [141]: [<ffff80008116d12c>] __kvm_nvhe_assert_host_shared_guest+0xb0/0x10c
  [ 1406.522049] kvm [141]: [<ffff80008116f068>] __kvm_nvhe___pkvm_host_relax_perms_guest+0x48/0x104
  [ 1406.522157] kvm [141]: [<ffff800081169df8>] __kvm_nvhe_handle___pkvm_host_relax_perms_guest+0x64/0x7c
  [ 1406.522250] kvm [141]: [<ffff800081169f0c>] __kvm_nvhe_handle_trap+0x8c/0x1a8
  [ 1406.522333] kvm [141]: [<ffff8000811680fc>] __kvm_nvhe___skip_pauth_save+0x4/0x4
  [ 1406.522454] kvm [141]: ---[ end nVHE call trace ]---
  [ 1406.522477] kvm [141]: Hyp Offset: 0xfffece8013600000
  [ 1406.522554] Kernel panic - not syncing: HYP panic:
  [ 1406.522554] PS:834003c9 PC:0000b1806db6d170 ESR:00000000f2000800
  [ 1406.522554] FAR:ffff8000804be420 HPFAR:0000000000804be0 PAR:0000000000000000
  [ 1406.522554] VCPU:0000000000000000
  [ 1406.523337] CPU: 3 UID: 0 PID: 141 Comm: kvm-vcpu-0 Not tainted 6.16.0-rc7 #97 PREEMPT
  [ 1406.523485] Hardware name: FVP Base RevC (DT)
  [ 1406.523566] Call trace:
  [ 1406.523629] show_stack+0x18/0x24 (C)
  [ 1406.523753] dump_stack_lvl+0xd4/0x108
  [ 1406.523899] dump_stack+0x18/0x24
  [ 1406.524040] panic+0x3d8/0x448
  [ 1406.524184] nvhe_hyp_panic_handler+0x10c/0x23c
  [ 1406.524325] kvm_handle_guest_abort+0x68c/0x109c
  [ 1406.524500] handle_exit+0x60/0x17c
  [ 1406.524630] kvm_arch_vcpu_ioctl_run+0x2e0/0x8c0
  [ 1406.524794] kvm_vcpu_ioctl+0x1a8/0x9cc
  [ 1406.524919] __arm64_sys_ioctl+0xac/0x104
  [ 1406.525067] invoke_syscall+0x48/0x10c
  [ 1406.525189] el0_svc_common.constprop.0+0x40/0xe0
  [ 1406.525322] do_el0_svc+0x1c/0x28
  [ 1406.525441] el0_svc+0x38/0x120
  [ 1406.525588] el0t_64_sync_handler+0x10c/0x138
  [ 1406.525750] el0t_64_sync+0x1ac/0x1b0
  [ 1406.525876] SMP: stopping secondary CPUs
  [ 1406.525965] Kernel Offset: disabled
  [ 1406.526032] CPU features: 0x0000,00000080,8e134ca1,9446773f
  [ 1406.526130] Memory Limit: none
  [ 1406.959099] ---[ end Kernel panic - not syncing: HYP panic:
  [ 1406.959099] PS:834003c9 PC:0000b1806db6d170 ESR:00000000f2000800
  [ 1406.959099] FAR:ffff8000804be420 HPFAR:0000000000804be0 PAR:0000000000000000
  [ 1406.959099] VCPU:0000000000000000 ]

Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Fixes: f28f1d02f4eaa ("KVM: arm64: Add a range to __pkvm_host_unshare_guest()")
Cc: Vincent Donnefort <vdonnefort@google.com>
Cc: Quentin Perret <qperret@google.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: stable@vger.kernel.org
Reviewed-by: Vincent Donnefort <vdonnefort@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>