aboutsummaryrefslogtreecommitdiffstats
path: root/arch (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-09-12arm64: dts: ti: k3-am62-verdin: Add missing cfg for TI IPC FirmwareBeleswar Padhi1-1/+42
The wkup_r5fss0_core0_memory_region is used to store the text/data sections of the Device Manager (DM) firmware itself and is necessary for platform boot. Whereas the wkup_r5fss0_core0_dma_memory_region is used for allocating the Virtio buffers needed for IPC with the DM core which could be optional. The labels were incorrectly used in the k3-am62-verdin.dtsi file. Correct the firmware memory region label. Currently, only mailbox node is enabled with FIFO assignment for a single M4F remote core. However, there are no users of the enabled mailboxes. Add the missing carveouts for WKUP R5F and MCU M4F remote processors, and enable those by associating to the above carveout and mailboxes. This config aligns with other AM62 boards and can be refactored out later. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Hiago De Franco <hiago.franco@toradex.com> # Verdin AM62 Acked-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20250908142826.1828676-17-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am62p-verdin: Add missing cfg for TI IPC FirmwareBeleswar Padhi1-0/+40
The wkup_r5fss0_core0_memory_region is used to store the text/data sections of the Device Manager (DM) firmware itself and is necessary for platform boot. Whereas the wkup_r5fss0_core0_dma_memory_region is used for allocating the Virtio buffers needed for IPC with the DM core which could be optional. The labels were incorrectly used in the k3-am62p-verdin.dtsi file. Correct the firmware memory region label. Currently, only mailbox node is enabled with FIFO assignment. However, there are no users of the enabled mailboxes. Add the missing carveouts for WKUP and MCU R5F remote processors, and enable those by associating to the above carveout and mailboxes. This config aligns with other AM62P boards and can be refactored out later. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Hiago De Franco <hiago.franco@toradex.com> # Verdin AM62P Acked-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20250908142826.1828676-16-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-j721e-beagleboneai64: Add missing cfg for TI IPC FWBeleswar Padhi1-0/+29
The TI IPC Firmwares running on J721E SoCs use certain MAIN domain timers as tick. Reserve those at board level DT to avoid remote processor crashes. This config aligns with other J721E boards and can be refactored out later. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-15-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3: Rename rproc reserved-mem nodes to 'memory@addr'Beleswar Padhi29-285/+285
Currently, the reserved memory carveouts used by remote processors are named like 'rproc-name-<dma>-memory-region@addr'. While it is descriptive, the node label already serves that purpose. Rename reserved memory nodes to generic 'memory@addr' to align with the device tree specifications. This is done for all TI K3 based boards. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com> Link: https://patch.msgid.link/20250908142826.1828676-14-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am6*-boards: Add label to reserved-memory nodeBeleswar Padhi13-13/+13
Add the label name 'reserved_memory' to the reserved-memory node in all K3 AM6* board level dts files. This is done so that the node can be referenced and extended to add more carveout entries as needed in future refactoring patches. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-13-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am62a: Enable Mailbox nodes at the board levelBeleswar Padhi1-0/+4
Mailbox nodes defined in the top-level AM62A SoC dtsi files are incomplete and may not be functional unless they are extended with a chosen interrupt and connection to a remote processor. As the remote processors depend on memory nodes which are only known at the board integration level, these nodes should only be enabled when provided with the above information. Disable the Mailbox nodes in the dtsi files and only enable the ones that are actually used on a given board. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Judith Mendez <jm@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-12-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am62: Enable Mailbox nodes at the board levelBeleswar Padhi3-0/+5
Mailbox nodes defined in the top-level AM62x SoC dtsi files are incomplete and may not be functional unless they are extended with a chosen interrupt and connection to a remote processor. As the remote processors depend on memory nodes which are only known at the board integration level, these nodes should only be enabled when provided with the above information. Disable the Mailbox nodes in the dtsi files and only enable the ones that are actually used on a given board. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-11-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am65: Enable remote processors at board levelBeleswar Padhi3-0/+15
Remote Processors defined in top-level AM65x SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-10-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am64: Enable remote processors at board levelBeleswar Padhi6-0/+66
Remote Processors defined in top-level AM64x SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Wadim Egorov <w.egorov@phytec.de> # phycore-am64x Tested-by: Hari Nagalla <hnagalla@ti.com> Reviewed-by: Wadim Egorov <w.egorov@phytec.de> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-9-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am62a: Enable remote processors at board levelBeleswar Padhi5-0/+7
Remote Processors defined in top-level AM62A SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Judith Mendez <jm@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-8-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am62: Enable remote processors at board levelBeleswar Padhi3-0/+3
Remote Processors defined in top-level AM62x SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Wadim Egorov <w.egorov@phytec.de> Reviewed-by: Wadim Egorov <w.egorov@phytec.de> Reviewed-by: Dhruva Gole <d-gole@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-7-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-am62p-j722s: Enable remote processors at board levelBeleswar Padhi6-0/+11
Remote Processors defined in top-level AM62P-J722S common SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Tested-by: Judith Mendez <jm@ti.com> Reviewed-by: Dhruva Gole <d-gole@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-6-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-j784s4-j742s2: Enable remote processors at board levelBeleswar Padhi4-0/+34
Remote Processors defined in top-level J784S4-J742S2 common SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-5-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-j721s2: Enable remote processors at board levelBeleswar Padhi5-0/+45
Remote Processors defined in top-level J721S2 SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-4-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-j721e: Enable remote processors at board levelBeleswar Padhi5-0/+51
Remote Processors defined in top-level J721E SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-3-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-j7200: Enable R5F remote processors at board levelBeleswar Padhi3-0/+15
Remote Processors defined in top-level J7200 SoC dtsi files are incomplete without the memory carveouts and mailbox assignments which are only known at board integration level. Therefore, disable the remote processors at SoC level and enable them at board level where above information is available. Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Acked-by: Andrew Davis <afd@ti.com> Link: https://patch.msgid.link/20250908142826.1828676-2-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-12arm64: dts: ti: k3-j742s2-mcu-wakeup: Override firmware-name for MCU R5F coresBeleswar Padhi2-0/+18
The J742S2 SoC reuses the common k3-j784s4-j742s2-mcu-wakeup-common.dtsi for its MCU domain, but it does not override the firmware-name property for its R5F cores. This causes the wrong firmware binaries to be referenced. Introduce a new k3-j742s2-mcu-wakeup.dtsi file to override the firmware-name property with correct names for J742s2. Fixes: 38fd90a3e1ac ("arm64: dts: ti: Introduce J742S2 SoC family") Signed-off-by: Beleswar Padhi <b-padhi@ti.com> Reviewed-by: Udit Kumar <u-kumar1@ti.com> Link: https://patch.msgid.link/20250823163111.2237199-1-b-padhi@ti.com Signed-off-by: Nishanth Menon <nm@ti.com>
2025-09-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski29-143/+350
Cross-merge networking fixes after downstream PR (net-6.17-rc6). Conflicts: net/netfilter/nft_set_pipapo.c net/netfilter/nft_set_pipapo_avx2.c c4eaca2e1052 ("netfilter: nft_set_pipapo: don't check genbit from packetpath lookups") 84c1da7b38d9 ("netfilter: nft_set_pipapo: use avx2 algorithm for insertions too") Only trivial adjacent changes (in a doc and a Makefile). Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-09-12dma-mapping: convert dma_direct_*map_page to be phys_addr_t basedLeon Romanovsky1-2/+2
Convert the DMA direct mapping functions to accept physical addresses directly instead of page+offset parameters. The functions were already operating on physical addresses internally, so this change eliminates the redundant page-to-physical conversion at the API boundary. The functions dma_direct_map_page() and dma_direct_unmap_page() are renamed to dma_direct_map_phys() and dma_direct_unmap_phys() respectively, with their calling convention changed from (struct page *page, unsigned long offset) to (phys_addr_t phys). Architecture-specific functions arch_dma_map_page_direct() and arch_dma_unmap_page_direct() are similarly renamed to arch_dma_map_phys_direct() and arch_dma_unmap_phys_direct(). The is_pci_p2pdma_page() checks are replaced with DMA_ATTR_MMIO checks to allow integration with dma_direct_map_resource and dma_direct_map_phys() is extended to support MMIO path either. Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/bb15a22f76dc2e26683333ff54e789606cfbfcf0.1757423202.git.leonro@nvidia.com
2025-09-11ARM: defconfig: Remove obsolete CONFIG_USB_EHCI_MSMPetr Vorel1-1/+0
CONFIG_USB_EHCI_MSM was removed long time ago in v4.14-rc6 8b3f863033f9f ("usb: host: remove ehci-msm.c"). Signed-off-by: Petr Vorel <petr.vorel@gmail.com> Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://lore.kernel.org/r/20250829165031.110850-1-petr.vorel@gmail.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>
2025-09-11bpf: Report arena faults to BPF stderrPuranjay Mohan2-5/+140
Begin reporting arena page faults and the faulting address to BPF program's stderr, this patch adds support in the arm64 and x86-64 JITs, support for other archs can be added later. The fault handlers receive the 32 bit address in the arena region so the upper 32 bits of user_vm_start is added to it before printing the address. This is what the user would expect to see as this is what is printed by bpf_printk() is you pass it an address returned by bpf_arena_alloc_pages(); Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/20250911145808.58042-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11bpf: arm64: simplify exception table handlingPuranjay Mohan1-22/+3
BPF loads with BPF_PROBE_MEM(SX) can load from unsafe pointers and the JIT adds an exception table entry for the JITed instruction which allows the exeption handler to set the destination register of the load to zero and continue execution from the next instruction. As all arm64 instructions are AARCH64_INSN_SIZE size, the exception handler can just increment the pc by AARCH64_INSN_SIZE without needing the exact address of the instruction following the the faulting instruction. Simplify the exception table usage in arm64 JIT by only saving the destination register in ex->fixup and drop everything related to the fixup_offset. The fault handler is modified to add AARCH64_INSN_SIZE to the pc. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Yonghong Song <yonghong.song@linux.dev> Acked-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Acked-by: Xu Kuohai <xukuohai@huawei.com> Link: https://lore.kernel.org/r/20250911145808.58042-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11x86/virt/tdx: Use precalculated TDVPR page physical addressKai Huang3-13/+19
All of the x86 KVM guest types (VMX, SEV and TDX) do some special context tracking when entering guests. This means that the actual guest entry sequence must be noinstr. Part of entering a TDX guest is passing a physical address to the TDX module. Right now, that physical address is stored as a 'struct page' and converted to a physical address at guest entry. That page=>phys conversion can be complicated, can vary greatly based on kernel config, and it is definitely _not_ a noinstr path today. There have been a number of tinkering approaches to try and fix this up, but they all fall down due to some part of the page=>phys conversion infrastructure not being noinstr friendly. Precalculate the page=>phys conversion and store it in the existing 'tdx_vp' structure. Use the new field at every site that needs a tdvpr physical address. Remove the now redundant tdx_tdvpr_pa(). Remove the __flatten remnant from the tinkering. Note that only one user of the new field is actually noinstr. All others can use page_to_phys(). But, they might as well save the effort since there is a pre-calculated value sitting there for them. [ dhansen: rewrite all the text ] Signed-off-by: Kai Huang <kai.huang@intel.com> Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com> Reviewed-by: Kiryl Shutsemau <kas@kernel.org> Tested-by: Farrah Chen <farrah.chen@intel.com>
2025-09-11arm64: tegra: Add I2C nodes for Tegra264Kartik Rajput1-0/+225
Add I2C nodes for Tegra264. Signed-off-by: Kartik Rajput <kkartik@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-09-11Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf after rc5Alexei Starovoitov134-725/+1328
Cross-merge BPF and other fixes after downstream PR. No conflicts. Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-09-11ARM: tegra: add support for ASUS Eee Pad Slider SL101Svyatoslav Ryhel4-1251/+1333
Factor out common part from ASUS Eee Pad Transformer TF101 device tree into tegra20-asus-transformer-common.dtsi and add device tree fragment for ASUS Eee Pad Slider SL101. Tested-by: Winona Schroeer-Smith <wolfizen@wolfizen.net> # ASUS SL101 Tested-by: Antoni Aloy Torrens <aaloytorrens@gmail.com> # ASUS TF101 Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-09-11ARM: tegra: transformer-20: fix audio-codec interruptSvyatoslav Ryhel1-1/+1
Correct audio-codec interrupt should be PX3 while PX1 is used for external microphone detection. Tested-by: Winona Schroeer-Smith <wolfizen@wolfizen.net> # ASUS SL101 Tested-by: Antoni Aloy Torrens <aaloytorrens@gmail.com> # ASUS TF101 Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-09-11ARM: tegra: transformer-20: add missing magnetometer interruptSvyatoslav Ryhel1-0/+3
Add missing interrupt to magnetometer node. Tested-by: Winona Schroeer-Smith <wolfizen@wolfizen.net> # ASUS SL101 Tested-by: Antoni Aloy Torrens <aaloytorrens@gmail.com> # ASUS TF101 Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-09-11ARM: tegra: Add DFLL clock support for Tegra114Svyatoslav Ryhel1-0/+33
Add DFLL clock node to common Tegra114 device tree along with clocks property to cpu node. Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Reviewed-by: Mikko Perttunen <mperttunen@nvidia.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-09-11ARM: tegra: p880: set correct touchscreen clippingJonas Schwöbel1-2/+2
Existing touchscreen clipping is too small and causes problems with touchscreen accuracy. Signed-off-by: Jonas Schwöbel <jonasschwoebel@yahoo.de> Signed-off-by: Svyatoslav Ryhel <clamor95@gmail.com> Signed-off-by: Thierry Reding <treding@nvidia.com>
2025-09-11KVM: nSVM: Replace kzalloc() + copy_from_user() with memdup_user()Thorsten Blum1-11/+9
Replace kzalloc() followed by copy_from_user() with memdup_user() to improve and simplify svm_set_nested_state(). Return early if an error occurs instead of trying to allocate memory for 'save' when memory allocation for 'ctl' already failed. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Link: https://lore.kernel.org/r/20250903002951.118912-1-thorsten.blum@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11x86/kvm: Prefer native qspinlock for dedicated vCPUs irrespective of PV_UNHALTLi RongQing1-10/+10
The commit b2798ba0b876 ("KVM: X86: Choose qspinlock when dedicated physical CPUs are available") states that when PV_DEDICATED=1 (vCPU has dedicated pCPU), qspinlock should be preferred regardless of PV_UNHALT. However, the current implementation doesn't reflect this: when PV_UNHALT=0, we still use virt_spin_lock() even with dedicated pCPUs. This is suboptimal because: 1. Native qspinlocks should outperform virt_spin_lock() for dedicated vCPUs irrespective of HALT exiting 2. virt_spin_lock() should only be preferred when vCPUs may be preempted (non-dedicated case) So reorder the PV spinlock checks to: 1. First handle dedicated pCPU case (disable virt_spin_lock_key) 2. Second check single CPU, and nopvspin configuration 3. Only then check PV_UNHALT support This ensures we always use native qspinlock for dedicated vCPUs, delivering pretty performance gains at high contention levels. Signed-off-by: Li RongQing <lirongqing@baidu.com> Reviewed-by: Sean Christopherson <seanjc@google.com> Tested-by: Wangyang Guo <wangyang.guo@intel.com> Link: https://lore.kernel.org/r/20250722110005.4988-1-lirongqing@baidu.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11x86/kvm: Make kvm_async_pf_task_wake() a local static helperSean Christopherson2-4/+1
Make kvm_async_pf_task_wake() static and drop its export, as the symbol is only referenced from within kvm.c. No functional change intended. Link: https://lore.kernel.org/r/20250729153901.564123-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11x86/kvm: Force legacy PCI hole to UC when overriding MTRRs for TDX/SNPSean Christopherson1-2/+19
When running as an SNP or TDX guest under KVM, force the legacy PCI hole, i.e. memory between Top of Lower Usable DRAM and 4GiB, to be mapped as UC via a forced variable MTRR range. In most KVM-based setups, legacy devices such as the HPET and TPM are enumerated via ACPI. ACPI enumeration includes a Memory32Fixed entry, and optionally a SystemMemory descriptor for an OperationRegion, e.g. if the device needs to be accessed via a Control Method. If a SystemMemory entry is present, then the kernel's ACPI driver will auto-ioremap the region so that it can be accessed at will. However, the ACPI spec doesn't provide a way to enumerate the memory type of SystemMemory regions, i.e. there's no way to tell software that a region must be mapped as UC vs. WB, etc. As a result, Linux's ACPI driver always maps SystemMemory regions using ioremap_cache(), i.e. as WB on x86. The dedicated device drivers however, e.g. the HPET driver and TPM driver, want to map their associated memory as UC or WC, as accessing PCI devices using WB is unsupported. On bare metal and non-CoCO, the conflicting requirements "work" as firmware configures the PCI hole (and other device memory) to be UC in the MTRRs. So even though the ACPI mappings request WB, they are forced to UC- in the kernel's tracking due to the kernel properly handling the MTRR overrides, and thus are compatible with the drivers' requested WC/UC-. With force WB MTRRs on SNP and TDX guests, the ACPI mappings get their requested WB if the ACPI mappings are established before the dedicated driver code attempts to initialize the device. E.g. if acpi_init() runs before the corresponding device driver is probed, ACPI's WB mapping will "win", and result in the driver's ioremap() failing because the existing WB mapping isn't compatible with the requested WC/UC-. E.g. when a TPM is emulated by the hypervisor (ignoring the security implications of relying on what is allegedly an untrusted entity to store measurements), the TPM driver will request UC and fail: [ 1.730459] ioremap error for 0xfed40000-0xfed45000, requested 0x2, got 0x0 [ 1.732780] tpm_tis MSFT0101:00: probe with driver tpm_tis failed with error -12 Note, the '0x2' and '0x0' values refer to "enum page_cache_mode", not x86's memtypes (which frustratingly are an almost pure inversion; 2 == WB, 0 == UC). E.g. tracing mapping requests for TPM TIS yields: Mapping TPM TIS with req_type = 0 WARNING: CPU: 22 PID: 1 at arch/x86/mm/pat/memtype.c:530 memtype_reserve+0x2ab/0x460 Modules linked in: CPU: 22 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.16.0-rc7+ #2 VOLUNTARY Tainted: [W]=WARN Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/29/2025 RIP: 0010:memtype_reserve+0x2ab/0x460 __ioremap_caller+0x16d/0x3d0 ioremap_cache+0x17/0x30 x86_acpi_os_ioremap+0xe/0x20 acpi_os_map_iomem+0x1f3/0x240 acpi_os_map_memory+0xe/0x20 acpi_ex_system_memory_space_handler+0x273/0x440 acpi_ev_address_space_dispatch+0x176/0x4c0 acpi_ex_access_region+0x2ad/0x530 acpi_ex_field_datum_io+0xa2/0x4f0 acpi_ex_extract_from_field+0x296/0x3e0 acpi_ex_read_data_from_field+0xd1/0x460 acpi_ex_resolve_node_to_value+0x2ee/0x530 acpi_ex_resolve_to_value+0x1f2/0x540 acpi_ds_evaluate_name_path+0x11b/0x190 acpi_ds_exec_end_op+0x456/0x960 acpi_ps_parse_loop+0x27a/0xa50 acpi_ps_parse_aml+0x226/0x600 acpi_ps_execute_method+0x172/0x3e0 acpi_ns_evaluate+0x175/0x5f0 acpi_evaluate_object+0x213/0x490 acpi_evaluate_integer+0x6d/0x140 acpi_bus_get_status+0x93/0x150 acpi_add_single_object+0x43a/0x7c0 acpi_bus_check_add+0x149/0x3a0 acpi_bus_check_add_1+0x16/0x30 acpi_ns_walk_namespace+0x22c/0x360 acpi_walk_namespace+0x15c/0x170 acpi_bus_scan+0x1dd/0x200 acpi_scan_init+0xe5/0x2b0 acpi_init+0x264/0x5b0 do_one_initcall+0x5a/0x310 kernel_init_freeable+0x34f/0x4f0 kernel_init+0x1b/0x200 ret_from_fork+0x186/0x1b0 ret_from_fork_asm+0x1a/0x30 </TASK> The above traces are from a Google-VMM based VM, but the same behavior happens with a QEMU based VM that is modified to add a SystemMemory range for the TPM TIS address space. The only reason this doesn't cause problems for HPET, which appears to require a SystemMemory region, is because HPET gets special treatment via x86_init.timers.timer_init(), and so gets a chance to create its UC- mapping before acpi_init() clobbers things. Disabling the early call to hpet_time_init() yields the same behavior for HPET: [ 0.318264] ioremap error for 0xfed00000-0xfed01000, requested 0x2, got 0x0 Hack around the ACPI gap by forcing the legacy PCI hole to UC when overriding the (virtual) MTRRs for CoCo guest, so that ioremap handling of MTRRs naturally kicks in and forces the ACPI mappings to be UC. Note, the requested/mapped memtype doesn't actually matter in terms of accessing the device. In practically every setup, legacy PCI devices are emulated by the hypervisor, and accesses are intercepted and handled as emulated MMIO, i.e. never access physical memory and thus don't have an effective memtype. Even in a theoretical setup where such devices are passed through by the host, i.e. point at real MMIO memory, it is KVM's (as the hypervisor) responsibility to force the memory to be WC/UC, e.g. via EPT memtype under TDX or real hardware MTRRs under SNP. Not doing so cannot work, and the hypervisor is highly motivated to do the right thing as letting the guest access hardware MMIO with WB would likely result in a variety of fatal #MCs. In other words, forcing the range to be UC is all about coercing the kernel's tracking into thinking that it has established UC mappings, so that the ioremap code doesn't reject mappings from e.g. the TPM driver and thus prevent the driver from loading and the device from functioning. Note #2, relying on guest firmware to handle this scenario, e.g. by setting virtual MTRRs and then consuming them in Linux, is not a viable option, as the virtual MTRR state is managed by the untrusted hypervisor, and because OVMF at least has stopped programming virtual MTRRs when running as a TDX guest. Link: https://lore.kernel.org/all/8137d98e-8825-415b-9282-1d2a115bb51a@linux.intel.com Fixes: 8e690b817e38 ("x86/kvm: Override default caching mode for SEV-SNP and TDX") Cc: stable@vger.kernel.org Cc: Peter Gonda <pgonda@google.com> Cc: Vitaly Kuznetsov <vkuznets@redhat.com> Cc: Tom Lendacky <thomas.lendacky@amd.com> Cc: Jürgen Groß <jgross@suse.com> Cc: Korakit Seemakhupt <korakit@google.com> Cc: Jianxiong Gao <jxgao@google.com> Cc: Nikolay Borisov <nik.borisov@suse.com> Suggested-by: Binbin Wu <binbin.wu@linux.intel.com> Reviewed-by: Binbin Wu <binbin.wu@linux.intel.com> Tested-by: Korakit Seemakhupt <korakit@google.com> Link: https://lore.kernel.org/r/20250828005249.39339-1-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-09-11Merge tag 's390-6.17-4' of ↵Linus Torvalds4-7/+5
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 fixes from Alexander Gordeev: - ptep_modify_prot_start() may be called in a loop, which might lead to the preempt_count overflow due to the unnecessary preemption disabling. Do not disable preemption to prevent the overflow - Events of type PERF_TYPE_HARDWARE are not tested for sampling and return -EOPNOTSUPP eventually. Instead, deny all sampling events by CPUMF counter facility and return -ENOENT to allow other PMUs to be tried - The PAI PMU driver returns -EINVAL if an event out of its range. That aborts a search for an alternative PMU driver. Instead, return -ENOENT to allow other PMUs to be tried * tag 's390-6.17-4' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: s390/cpum_cf: Deny all sampling events by counter PMU s390/pai: Deny all events not handled by this PMU s390/mm: Prevent possible preempt_count overflow
2025-09-11arm64: entry: Switch to generic IRQ entryJinjie Ruan6-292/+156
Currently, x86, Riscv and Loongarch use the generic entry code, which makes maintainer's work easier and code more elegant. Start converting arm64 to use the generic entry infrastructure from kernel/entry/* by switching it to generic IRQ entry, which removes 100+ lines of duplicate code. arm64 will completely switch to generic entry in a later series. The changes are below: - Remove *enter_from/exit_to_kernel_mode(), and wrap with generic irqentry_enter/exit() as their code and functionality are almost identical. - Define ARCH_EXIT_TO_USER_MODE_WORK and implement arch_exit_to_user_mode_work() to check arm64-specific thread flags "_TIF_MTE_ASYNC_FAULT" and "_TIF_FOREIGN_FPSTATE". So also remove *enter_from/exit_to_user_mode(), and wrap with generic enter_from/exit_to_user_mode() because they are exactly the same. - Remove arm64_enter/exit_nmi() and use generic irqentry_nmi_enter/exit() because they're exactly the same, so the temporary arm64 version irqentry_state can also be removed. - Remove PREEMPT_DYNAMIC code, as generic irqentry_exit_cond_resched() has the same functionality. - Implement arch_irqentry_exit_need_resched() with arm64_preempt_schedule_irq() for arm64 which will allow arm64 to do its architecture specific checks. Tested-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Suggested-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: entry: Move arm64_preempt_schedule_irq() into __exit_to_kernel_mode()Jinjie Ruan1-48/+48
The arm64 entry code only preempts a kernel context upon a return from a regular IRQ exception. The generic entry code may preempt a kernel context for any exception return where irqentry_exit() is used, and so may preempt other exceptions such as faults. In preparation for moving arm64 over to the generic entry code, align arm64 with the generic behaviour by calling arm64_preempt_schedule_irq() from exit_to_kernel_mode(). To make this possible, arm64_preempt_schedule_irq() and dynamic/raw_irqentry_exit_cond_resched() are moved earlier in the file, with no changes. As Mark pointed out, this change will have the following 2 key impact: - " We'll preempt even without taking a "real" interrupt. That shouldn't result in preemption that wasn't possible before, but it does change the probability of preempting at certain points, and might have a performance impact, so probably warrants a benchmark." - " We will not preempt when taking interrupts from a region of kernel code where IRQs are enabled but RCU is not watching, matching the behaviour of the generic entry code. This has the potential to introduce livelock if we can ever have a screaming interrupt in such a region, so we'll need to go figure out whether that's actually a problem. Having this as a separate patch will make it easier to test/bisect for that specifically." Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: entry: Refactor preempt_schedule_irq() check codeJinjie Ruan2-15/+28
To align the structure of the code with irqentry_exit_cond_resched() from the generic entry code, hoist the need_irq_preemption() and IS_ENABLED() check earlier. And different preemption check functions are defined based on whether dynamic preemption is enabled. Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: entry: Use preempt_count() and need_resched() helperJinjie Ruan1-10/+4
The generic entry code uses preempt_count() and need_resched() helpers to check if it should do preempt_schedule_irq(). Currently, arm64 use its own check logic, that is "READ_ONCE(current_thread_info()->preempt_count == 0", which is equivalent to "preempt_count() == 0 && need_resched()". In preparation for moving arm64 over to the generic entry code, use these helpers to replace arm64's own code and move it ahead. No functional changes. Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: entry: Rework arm64_preempt_schedule_irq()Jinjie Ruan1-7/+10
The generic entry code has the form: | raw_irqentry_exit_cond_resched() | { | if (!preempt_count()) { | ... | if (need_resched()) | preempt_schedule_irq(); | } | } In preparation for moving arm64 over to the generic entry code, align the structure of the arm64 code with raw_irqentry_exit_cond_resched() from the generic entry code. Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: entry: Refactor the entry and exit for exceptions from EL1Jinjie Ruan2-61/+106
The generic entry code uses irqentry_state_t to track lockdep and RCU state across exception entry and return. For historical reasons, arm64 embeds similar fields within its pt_regs structure. In preparation for moving arm64 over to the generic entry code, pull these fields out of arm64's pt_regs, and use a separate structure, matching the style of the generic entry code. No functional changes. Acked-by: Mark Rutland <mark.rutland@arm.com> Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: ptrace: Replace interrupts_enabled() with regs_irqs_disabled()Jinjie Ruan7-11/+12
The generic entry code expects architecture code to provide regs_irqs_disabled(regs) function, but arm64 does not have this and provides interrupts_enabled(regs), which has the opposite polarity. In preparation for moving arm64 over to the generic entry code, relace arm64's interrupts_enabled() with regs_irqs_disabled() and update its callers under arch/arm64. For the moment, a definition of interrupts_enabled() is provided for the GICv3 driver. Once arch/arm implement regs_irqs_disabled(), this can be removed. Delete the fast_interrupts_enabled() macro as it is unused and we don't want any new users to show up. No functional changes. Reviewed-by: Ada Couprie Diaz <ada.coupriediaz@arm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Suggested-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com> Acked-by: Catalin Marinas <catalin.marinas@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: sysreg: Add validation checks to sysreg header generation scriptFuad Tabba1-0/+20
The gen_sysreg.awk script processes the system register specification in the sysreg text file to generate C macro definitions. The current script will silently accept certain errors in the specification file, leading to incorrect header generation. For example, a Sysreg or SysregFields can be accidentally duplicated, causing its macros to be emitted twice. An Enum can contain duplicate values for different items, which is architecturally incorrect. Add checks to catch these errors at build time. The script now tracks all seen Sysreg and SysregFields definitions and checks for duplicates. It also tracks values within each Enum block to ensure entries are unique. Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: sysreg: Correct sign definitions for EIESB and DoubleLockFuad Tabba1-2/+2
The `ID_AA64MMFR4_EL1.EIESB` field, is an unsigned enumeration, but was incorrectly defined as a `SignedEnum` when introduced in commit cfc680bb04c5 ("arm64: sysreg: Add layout for ID_AA64MMFR4_EL1"). This is corrected to `UnsignedEnum`. Conversely, the `ID_AA64DFR0_EL1.DoubleLock` field, is a signed enumeration, but was incorrectly defined as an `UnsignedEnum`. This is corrected to `SignedEnum`, which wasn't correctly set when annotated as such in commit ad16d4cf0b4f ("arm64/sysreg: Initial unsigned annotations for ID registers"). Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11arm64: sysreg: Fix and tidy up sysreg field definitionsFuad Tabba1-11/+3
Fix the value of ID_PFR1_EL1.Security NSACR_RFR to be 0b0010, as per DDI0601/2025-06, which wasn't correctly set when introduced in commit 1224308075f1 ("arm64/sysreg: Convert ID_PFR1_EL1 to automatic generation"). While at it, remove redundant definitions of CPACR_EL12 and RCWSMASK_EL1 and fix some typos. Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Fuad Tabba <tabba@google.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Will Deacon <will@kernel.org>
2025-09-11openrisc: Add jump label supportchenmiao8-0/+133
Supported a complete jump_label implementation based on the ARM64 and RV64 version and add the CONFIG_JUMP_LABEL=y to the defconfig. Testing was conducted using a dedicated test module jump-label-test, provided in the link below. For detailed steps, please refer to the README also at the provided link. Link: https://github.com/ChenMiaoi/GSoC-2025-Final-Report/tree/main/tests/jump-label-test Test Environment: - Hardware: QEMU emulated OR1K - Kernel Version: 6.17.0-rc3-dirty - Configs: CONFIG_MODULES=y,CONFIG_MODULE_UNLOAD=y - Toolchain: or1k-none-linux-musl-gcc 15.1.0 Test Results: $ insmod jump_label_test.ko [ 32.590000] Jump label performance test module loaded [ 35.250000] Normal branch time: 1241327150 ns (124 ns per iteration) [ 35.250000] Jump label (false) time: 706422700 ns (70 ns per iteration) [ 35.250000] Jump label (true) time: 708913450 ns (70 ns per iteration) $ rmmod jump_label_test.ko [ 72.210000] Jump label test module unloaded The results show approximately 43% improvement in branch performance when using jump labels compared to traditional branches. Link: https://lore.kernel.org/openrisc/aLsZ9S3X0OpKy1RM@antec/T/#u Signed-off-by: chenmiao <chenmiao.ku@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2025-09-11openrisc: Regenerate defconfigs.chenmiao2-13/+6
Regenerating defconfigs allows subsequent changes to the configs to be related only to the corresponding modifications, without mixing changes from other configs. Signed-off-by: chenmiao <chenmiao.ku@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2025-09-11openrisc: Add R_OR1K_32_PCREL relocation type module supportchenmiao1-0/+4
To ensure the proper functioning of the jump_label test module, this patch adds support for the R_OR1K_32_PCREL relocation type for any modules. The implementation calculates the PC-relative offset by subtracting the instruction location from the target value and stores the result at the specified location. Signed-off-by: chenmiao <chenmiao.ku@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2025-09-11openrisc: Add text patching API supportchenmiao7-2/+111
Add text patching api's to use in subsequent jump_label implementation. We use a new fixmap FIX_TEXT_POKE0 entry to temporarily override MMU mappings to allow read only text pages to be written to. Previously, __set_fix was marked with __init as it was only used during the EARLYCON stage. Now that TEXT_POKE mappings require post-init usage (e.g., FIX_TEXT_POKE0), keeping __init would cause runtime bugs whenset_fixmap accesses invalid memory. Thus, we remove the __init flag to ensure __set_fix remains valid beyond initialization. A new function patch_insn_write is exposed to allow single instruction patching. Link: https://lore.kernel.org/openrisc/aJIC8o1WmVHol9RY@antec/T/#t Signed-off-by: chenmiao <chenmiao.ku@gmail.com> Signed-off-by: Stafford Horne <shorne@gmail.com>
2025-09-11x86/mce: Add a clear_bank() helperYazen Ghannam3-5/+18
Add a helper at the end of the MCA polling function to collect vendor and/or feature actions. Start with a basic skeleton for now. Actions for AMD thresholding and deferred errors will be added later. [ bp: Drop the obvious comment too. ] Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Nikolay Borisov <nik.borisov@suse.com> Link: https://lore.kernel.org/20250908-wip-mca-updates-v6-0-eef5d6c74b9c@amd.com