aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/stackcollapse.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-09-24KVM: arm64: selftests: Enable EL2 by defaultOliver Upton3-1/+26
Take advantage of VHE to implicitly promote KVM selftests to run at EL2 with only slight modification. Update the smccc_filter test to account for this now that the EL2-ness of a VM is visible to tests. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Initialize HCR_EL2Oliver Upton1-0/+6
Initialize HCR_EL2 such that EL2&0 is considered 'InHost', allowing the use of (mostly) unmodified EL1 selftests at EL2. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Use the vCPU attr for setting nr of PMU countersOliver Upton1-31/+28
Configuring the number of implemented counters via PMCR_EL0.N was a bad idea in retrospect as it interacts poorly with nested. Migrate the selftest to use the vCPU attribute instead of the KVM_SET_ONE_REG mechanism. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Use hyp timer IRQs when test runs at EL2Oliver Upton3-8/+28
Arch timer registers are redirected to their hypervisor counterparts when running in VHE EL2. This is great, except for the fact that the hypervisor timers use different PPIs. Use the correct INTIDs when that is the case. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Select SMCCC conduit based on current ELOliver Upton5-8/+21
HVCs are taken within the VM when EL2 is in use. Ensure tests use the SMC instruction when running at EL2 to interact with the host. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Provide helper for getting default vCPU targetOliver Upton6-13/+23
The default vCPU target in KVM selftests is pretty boring in that it doesn't enable any vCPU features. Expose a helper for getting the default target to prepare for cramming in more features. Call KVM_ARM_PREFERRED_TARGET directly from get-reg-list as it needs fine-grained control over feature flags. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Itaru Kitayama <itaru.kitayama@fujitsu.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Alias EL1 registers to EL2 counterpartsOliver Upton4-11/+69
FEAT_VHE has the somewhat nice property of implicitly redirecting EL1 register aliases to their corresponding EL2 representations when E2H=1. Unfortunately, there's no such abstraction for userspace and EL2 registers are always accessed by their canonical encoding. Introduce a helper that applies EL2 redirections to sysregs and use aggressive inlining to catch misuse at compile time. Go a little past the architectural definition for ease of use for test authors (e.g. the stack pointer). Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Create a VGICv3 for 'default' VMsOliver Upton19-56/+60
Start creating a VGICv3 by default unless explicitly opted-out by the test. While having an interrupt controller is nice, the real benefit here is clearing a hurdle for EL2 VMs which mandate the presence of a VGIC. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Add unsanitised helpers for VGICv3 creationOliver Upton2-17/+35
vgic_v3_setup() has a good bit of sanity checking internally to ensure that vCPUs have actually been created and match the dimensioning of the vgic itself. Spin off an unsanitised setup and initialization helper so vgic initialization can be wired in around a 'default' VM's vCPU creation. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Add helper to check for VGICv3 supportOliver Upton7-7/+21
Introduce a proper predicate for probing VGICv3 by performing a 'test' creation of the device on a dummy VM. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Initialize VGICv3 only onceOliver Upton1-3/+0
vgic_v3_setup() unnecessarily initializes the vgic twice. Keep the initialization after configuring MMIO frames and get rid of the other. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-09-24KVM: arm64: selftests: Provide kvm_arch_vm_post_create() in library codeOliver Upton3-14/+20
In order to compel the default usage of EL2 in selftests, move kvm_arch_vm_post_create() to library code and expose an opt-in for using MTE by default. Signed-off-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org>
2025-08-31Linux 6.17-rc4v6.17-rc4Linus Torvalds1-1/+1
2025-08-30kselftest/arm64: Don't open code SVE_PT_SIZE() in fp-ptraceMark Brown1-3/+2
In fp-trace when allocating a buffer to write SVE register data we open code the addition of the header size to the VL depeendent register data size, which lead to an underallocation bug when we cut'n'pasted the code for FPSIMD format writes. Use the SVE_PT_SIZE() macro that the kernel UAPI provides for this. Fixes: b84d2b27954f ("kselftest/arm64: Test FPSIMD format data writes via NT_ARM_SVE in fp-ptrace") Signed-off-by: Mark Brown <broonie@kernel.org> Link: https://lore.kernel.org/r/20250812-arm64-fp-trace-macro-v1-1-317cfff986a5@kernel.org Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-08-30arm64: mm: Fix CFI failure due to kpti_ng_pgd_alloc function signatureKees Cook3-9/+10
Seen during KPTI initialization: CFI failure at create_kpti_ng_temp_pgd+0x124/0xce8 (target: kpti_ng_pgd_alloc+0x0/0x14; expected type: 0xd61b88b6) The call site is alloc_init_pud() at arch/arm64/mm/mmu.c: pud_phys = pgtable_alloc(TABLE_PUD); alloc_init_pud() has the prototype: static void alloc_init_pud(p4d_t *p4dp, unsigned long addr, unsigned long end, phys_addr_t phys, pgprot_t prot, phys_addr_t (*pgtable_alloc)(enum pgtable_type), int flags) where the pgtable_alloc() prototype is declared. The target (kpti_ng_pgd_alloc) is used in arch/arm64/kernel/cpufeature.c: create_kpti_ng_temp_pgd(kpti_ng_temp_pgd, __pa(alloc), KPTI_NG_TEMP_VA, PAGE_SIZE, PAGE_KERNEL, kpti_ng_pgd_alloc, 0); which is an alias for __create_pgd_mapping_locked() with prototype: extern __alias(__create_pgd_mapping_locked) void create_kpti_ng_temp_pgd(pgd_t *pgdir, phys_addr_t phys, unsigned long virt, phys_addr_t size, pgprot_t prot, phys_addr_t (*pgtable_alloc)(enum pgtable_type), int flags); __create_pgd_mapping_locked() passes the function pointer down: __create_pgd_mapping_locked() -> alloc_init_p4d() -> alloc_init_pud() But the target function (kpti_ng_pgd_alloc) has the wrong signature: static phys_addr_t __init kpti_ng_pgd_alloc(int shift); The "int" should be "enum pgtable_type". To make "enum pgtable_type" available to cpufeature.c, move enum pgtable_type definition from arch/arm64/mm/mmu.c to arch/arm64/include/asm/mmu.h. Adjust kpti_ng_pgd_alloc to use "enum pgtable_type" instead of "int". The function behavior remains identical (parameter is unused). Fixes: c64f46ee1377 ("arm64: mm: use enum to identify pgtable level instead of *_SHIFT") Cc: <stable@vger.kernel.org> # 6.16.x Signed-off-by: Kees Cook <kees@kernel.org> Acked-by: Ard Biesheuvel <ardb@kernel.org> Link: https://lore.kernel.org/r/20250829190721.it.373-kees@kernel.org Reviewed-by: Ryan Roberts <ryan.roberts@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-08-29hardening: Require clang 20.1.0 for __counted_byNathan Chancellor1-4/+5
After an innocuous change in -next that modified a structure that contains __counted_by, clang-19 start crashing when building certain files in drivers/gpu/drm/xe. When assertions are enabled, the more descriptive failure is: clang: clang/lib/AST/RecordLayoutBuilder.cpp:3335: const ASTRecordLayout &clang::ASTContext::getASTRecordLayout(const RecordDecl *) const: Assertion `D && "Cannot get layout of forward declarations!"' failed. According to a reverse bisect, a tangential change to the LLVM IR generation phase of clang during the LLVM 20 development cycle [1] resolves this problem. Bump the version of clang that enables CONFIG_CC_HAS_COUNTED_BY to 20.1.0 to ensure that this issue cannot be hit. Link: https://github.com/llvm/llvm-project/commit/160fb1121cdf703c3ef5e61fb26c5659eb581489 [1] Signed-off-by: Nathan Chancellor <nathan@kernel.org> Reviewed-by: Justin Stitt <justinstitt@google.com> Link: https://lore.kernel.org/r/20250807-fix-counted_by-clang-19-v1-1-902c86c1d515@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
2025-08-28drm/mediatek: mtk_hdmi: Fix inverted parameters in some regmap_update_bits callsLouis-Alexis Eyraud1-4/+4
In mtk_hdmi driver, a recent change replaced custom register access function calls by regmap ones, but two replacements by regmap_update_bits were done incorrectly, because original offset and mask parameters were inverted, so fix them. Fixes: d6e25b3590a0 ("drm/mediatek: hdmi: Use regmap instead of iomem for main registers") Signed-off-by: Louis-Alexis Eyraud <louisalexis.eyraud@collabora.com> Reviewed-by: CK Hu <ck.hu@mediatek.com> Link: https://patchwork.kernel.org/project/dri-devel/patch/20250818-mt8173-fix-hdmi-issue-v1-1-55aff9b0295d@collabora.com/ Signed-off-by: Chun-Kuang Hu <chunkuang.hu@kernel.org>
2025-08-28MAINTAINERS: mark bcachefs externally maintainedLinus Torvalds1-1/+1
As per many long discussion threads, public and private. Signed-off-by: Linus Torvalds <torbalds@linux-foundation.org>
2025-08-28KVM: arm64: nv: Fix ATS12 handling of single-stage translationMarc Zyngier1-3/+3
Volodymyr reports that using a Xen DomU as a nested guest (where HCR_EL2.E2H == 0), ATS12 results in a translation that stops at the L2's S1, which isn't something you'd normally expects. Comparing the code against the spec proves to be illuminating, and suggests that the author of such code must have been tired, cross-eyed, drunk, or maybe all of the above. The gist of it is that, apart from HCR_EL2.VM or HCR_EL2.DC being 0, only the use of the EL2&0 translation regime limits the walk to S1 only, and that we must finish the S2 walk in any other case. Which solves the above issue, as E2H==0 indicates that ATS12 walks the EL1&0 translation regime. Explicitly checking for EL2&0 fixes this. Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> Suggested-by: Oliver Upton <oliver.upton@linux.dev> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: be04cebf3e788 ("KVM: arm64: nv: Add emulation of AT S12E{0,1}{R,W}") Link: https://lore.kernel.org/r/20250806141707.3479194-2-volodymyr_babchuk@epam.com Link: https://lore.kernel.org/r/20250809144811.2314038-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28KVM: arm64: Remove __vcpu_{read,write}_sys_reg_{from,to}_cpu()Marc Zyngier2-111/+80
There is no point having __vcpu_{read,write}_sys_reg_{from,to}_cpu() exposed to the rest of the kernel, as the only callers are in sys_regs.c. Move them where they below, which is another opportunity to simplify things a bit. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817121926.217900-5-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28KVM: arm64: Fix vcpu_{read,write}_sys_reg() accessorsMarc Zyngier2-112/+162
Volodymyr reports (again!) that under some circumstances (E2H==0, walking S1 PTs), PAR_EL1 doesn't report the value of the latest walk in the CPU register, but that instead the value is written to the backing store. Further investigation indicates that the root cause of this is that a group of registers (PAR_EL1, TPIDR*_EL{0,1}, the *32_EL2 dregs) should always be considered as "on CPU", as they are not remapped between EL1 and EL2. We fail to treat them accordingly, and end-up considering that the register (PAR_EL1 in this example) should be written to memory instead of in the register. While it would be possible to quickly work around it, it is obvious that the way we track these things at the moment is pretty horrible, and could do with some improvement. Revamp the whole thing by: - defining a location for a register (memory, cpu), potentially depending on the state of the vcpu - define a transformation for this register (mapped register, potential translation, special register needing some particular attention) - convey this information in a structure that can be easily passed around As a result, the accessors themselves become much simpler, as the state is explicit instead of being driven by hard-to-understand conventions. We get rid of the "pure EL2 register" notion, which wasn't very useful, and add sanitisation of the values by applying the RESx masks as required, something that was missing so far. And of course, we add the missing registers to the list, with the indication that they are always loaded. Reported-by: Volodymyr Babchuk <volodymyr_babchuk@epam.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Fixes: fedc612314acf ("KVM: arm64: nv: Handle virtual EL2 registers in vcpu_read/write_sys_reg()") Link: https://lore.kernel.org/r/20250806141707.3479194-3-volodymyr_babchuk@epam.com Link: https://lore.kernel.org/r/20250817121926.217900-4-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28KVM: arm64: Simplify sysreg access on exception deliveryMarc Zyngier1-12/+4
Distinguishing between NV and VHE is slightly pointless, and only serves as an extra complication, or a way to introduce bugs, such as the way SPSR_EL1 gets written without checking for the state being resident. Get rid if this silly distinction, and fix the bug in one go. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817121926.217900-3-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28KVM: arm64: Check for SYSREGS_ON_CPU before accessing the 32bit stateMarc Zyngier1-2/+2
Just like c6e35dff58d3 ("KVM: arm64: Check for SYSREGS_ON_CPU before accessing the CPU state") fixed the 64bit state access, add a check for the 32bit state actually being on the CPU before writing it. Signed-off-by: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20250817121926.217900-2-maz@kernel.org Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
2025-08-28bcache: change maintainer's email addressColy Li1-1/+1
Change to my new email address on fnnas.com. Signed-off-by: Coly Li <colyli@fnnas.com> Link: https://lore.kernel.org/r/20250828154835.32926-1-colyli@kernel.org Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-28ublk selftests: add --no_ublk_fixed_fd for not using registered ublk char deviceMing Lei6-33/+74
Add a new command line option --no_ublk_fixed_fd that excludes the ublk control device (/dev/ublkcN) from io_uring's registered files array. When this option is used, only backing files are registered starting from index 1, while the ublk control device is accessed using its raw file descriptor. Add ublk_get_registered_fd() helper function that returns the appropriate file descriptor for use with io_uring operations. Key optimizations implemented: - Cache UBLKS_Q_NO_UBLK_FIXED_FD flag in ublk_queue.flags to avoid reading dev->no_ublk_fixed_fd in fast path - Cache ublk char device fd in ublk_queue.ublk_fd for fast access - Update ublk_get_registered_fd() to use ublk_queue * parameter - Update io_uring_prep_buf_register/unregister() to use ublk_queue * - Replace ublk_device * access with ublk_queue * access in fast paths Also pass --no_ublk_fixed_fd to test_stress_04.sh for covering plain ublk char device mode. Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250827121602.2619736-3-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-28ublk: avoid ublk_io_release() called after ublk char dev is closedMing Lei1-2/+70
When running test_stress_04.sh, the following warning is triggered: WARNING: CPU: 1 PID: 135 at drivers/block/ublk_drv.c:1933 ublk_ch_release+0x423/0x4b0 [ublk_drv] This happens when the daemon is abruptly killed: - some references may still be held, because registering IO buffer doesn't grab ublk char device reference OR - io->task_registered_buffers won't be cleared because io buffer is released from non-daemon context For zero-copy and auto buffer register modes, I/O reference crosses syscalls, so IO reference may not be dropped naturally when ublk server is killed abruptly. However, when releasing io_uring context, it is guaranteed that the reference is dropped finally, see io_sqe_buffers_unregister() from io_ring_ctx_free(). Fix this by adding ublk_drain_io_references() that: - Waits for active I/O references dropped in async way by scheduling work function, for avoiding ublk dev and io_uring file's release dependency - Reinitializes io->ref and io->task_registered_buffers to clean state This ensures the reference count state is clean when ublk_queue_reinit() is called, preventing the warning and potential use-after-free. Fixes: 1f6540e2aabb ("ublk: zc register/unregister bvec") Fixes: 1ceeedb59749 ("ublk: optimize UBLK_IO_UNREGISTER_IO_BUF on daemon task") Fixes: 8a8fe42d765b ("ublk: optimize UBLK_IO_REGISTER_IO_BUF on daemon task") Signed-off-by: Ming Lei <ming.lei@redhat.com> Link: https://lore.kernel.org/r/20250827121602.2619736-2-ming.lei@redhat.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-28io_uring/kbuf: always use READ_ONCE() to read ring provided buffer lengthsJens Axboe1-7/+13
Since the buffers are mapped from userspace, it is prudent to use READ_ONCE() to read the value into a local variable, and use that for any other actions taken. Having a stable read of the buffer length avoids worrying about it changing after checking, or being read multiple times. Similarly, the buffer may well change in between it being picked and being committed. Ensure the looping for incremental ring buffer commit stops if it hits a zero sized buffer, as no further progress can be made at that point. Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption") Link: https://lore.kernel.org/io-uring/tencent_000C02641F6250C856D0C26228DE29A3D30A@qq.com/ Reported-by: Qingyue Zhang <chunzhennn@qq.com> Reported-by: Suoxing Zhang <aftern00n@qq.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-28net: ipv4: fix regression in local-broadcast routesOscar Maes1-3/+7
Commit 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes") introduced a regression where local-broadcast packets would have their gateway set in __mkroute_output, which was caused by fi = NULL being removed. Fix this by resetting the fib_info for local-broadcast packets. This preserves the intended changes for directed-broadcast packets. Cc: stable@vger.kernel.org Fixes: 9e30ecf23b1b ("net: ipv4: fix incorrect MTU in broadcast routes") Reported-by: Brett A C Sheffield <bacs@librecast.net> Closes: https://lore.kernel.org/regressions/20250822165231.4353-4-bacs@librecast.net Signed-off-by: Oscar Maes <oscmaes92@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://patch.msgid.link/20250827062322.4807-1-oscmaes92@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-28net: macb: Disable clocks onceNeil Mandir1-5/+2
When the driver is removed the clocks are disabled twice: once in macb_remove and a second time by runtime pm. Disable wakeup in remove so all the clocks are disabled and skip the second call to macb_clks_disable. Always suspend the device as we always set it active in probe. Fixes: d54f89af6cc4 ("net: macb: Add pm runtime support") Signed-off-by: Neil Mandir <neil.mandir@seco.com> Co-developed-by: Sean Anderson <sean.anderson@linux.dev> Signed-off-by: Sean Anderson <sean.anderson@linux.dev> Link: https://patch.msgid.link/20250826143022.935521-1-sean.anderson@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-08-28efivarfs: Fix slab-out-of-bounds in efivarfs_d_compareLi Nan1-0/+4
Observed on kernel 6.6 (present on master as well): BUG: KASAN: slab-out-of-bounds in memcmp+0x98/0xd0 Call trace: kasan_check_range+0xe8/0x190 __asan_loadN+0x1c/0x28 memcmp+0x98/0xd0 efivarfs_d_compare+0x68/0xd8 __d_lookup_rcu_op_compare+0x178/0x218 __d_lookup_rcu+0x1f8/0x228 d_alloc_parallel+0x150/0x648 lookup_open.isra.0+0x5f0/0x8d0 open_last_lookups+0x264/0x828 path_openat+0x130/0x3f8 do_filp_open+0x114/0x248 do_sys_openat2+0x340/0x3c0 __arm64_sys_openat+0x120/0x1a0 If dentry->d_name.len < EFI_VARIABLE_GUID_LEN , 'guid' can become negative, leadings to oob. The issue can be triggered by parallel lookups using invalid filename: T1 T2 lookup_open ->lookup simple_lookup d_add // invalid dentry is added to hash list lookup_open d_alloc_parallel __d_lookup_rcu __d_lookup_rcu_op_compare hlist_bl_for_each_entry_rcu // invalid dentry can be retrieved ->d_compare efivarfs_d_compare // oob Fix it by checking 'guid' before cmp. Fixes: da27a24383b2 ("efivarfs: guid part of filenames are case-insensitive") Signed-off-by: Li Nan <linan122@huawei.com> Signed-off-by: Wu Guanghao <wuguanghao3@huawei.com> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
2025-08-28ata: ahci_xgene: Use int type for 'rc' to store error codesQianfeng Rong1-5/+2
Use int instead of u32 for the 'rc' variable in xgene_ahci_softreset() to store negative error codes returned by ahci_do_softreset(). In xgene_ahci_pmp_softreset(), remove the redundant 'rc' variable and directly return the result of the ahci_do_softreset() call instead. Signed-off-by: Qianfeng Rong <rongqianfeng@vivo.com> Signed-off-by: Damien Le Moal <dlemoal@kernel.org>
2025-08-27fbnic: Move phylink resume out of service_task and into open/closeAlexander Duyck2-2/+4
The fbnic driver was presenting with the following locking assert coming out of a PM resume: [ 42.208116][ T164] RTNL: assertion failed at drivers/net/phy/phylink.c (2611) [ 42.208492][ T164] WARNING: CPU: 1 PID: 164 at drivers/net/phy/phylink.c:2611 phylink_resume+0x190/0x1e0 [ 42.208872][ T164] Modules linked in: [ 42.209140][ T164] CPU: 1 UID: 0 PID: 164 Comm: bash Not tainted 6.17.0-rc2-virtme #134 PREEMPT(full) [ 42.209496][ T164] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.17.0-5.fc42 04/01/2014 [ 42.209861][ T164] RIP: 0010:phylink_resume+0x190/0x1e0 [ 42.210057][ T164] Code: 83 e5 01 0f 85 b0 fe ff ff c6 05 1c cd 3e 02 01 90 ba 33 0a 00 00 48 c7 c6 20 3a 1d a5 48 c7 c7 e0 3e 1d a5 e8 21 b8 90 fe 90 <0f> 0b 90 90 e9 86 fe ff ff e8 42 ea 1f ff e9 e2 fe ff ff 48 89 ef [ 42.210708][ T164] RSP: 0018:ffffc90000affbd8 EFLAGS: 00010296 [ 42.210983][ T164] RAX: 0000000000000000 RBX: ffff8880078d8400 RCX: 0000000000000000 [ 42.211235][ T164] RDX: 0000000000000000 RSI: 1ffffffff4f10938 RDI: 0000000000000001 [ 42.211466][ T164] RBP: 0000000000000000 R08: ffffffffa2ae79ea R09: fffffbfff4b3eb84 [ 42.211707][ T164] R10: 0000000000000003 R11: 0000000000000000 R12: ffff888007ad8000 [ 42.211997][ T164] R13: 0000000000000002 R14: ffff888006a18800 R15: ffffffffa34c59e0 [ 42.212234][ T164] FS: 00007f0dc8e39740(0000) GS:ffff88808f51f000(0000) knlGS:0000000000000000 [ 42.212505][ T164] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 42.212704][ T164] CR2: 00007f0dc8e9fe10 CR3: 000000000b56d003 CR4: 0000000000772ef0 [ 42.213227][ T164] PKRU: 55555554 [ 42.213366][ T164] Call Trace: [ 42.213483][ T164] <TASK> [ 42.213565][ T164] __fbnic_pm_attach.isra.0+0x8e/0xa0 [ 42.213725][ T164] pci_reset_function+0x116/0x1d0 [ 42.213895][ T164] reset_store+0xa0/0x100 [ 42.214025][ T164] ? pci_dev_reset_attr_is_visible+0x50/0x50 [ 42.214221][ T164] ? sysfs_file_kobj+0xc1/0x1e0 [ 42.214374][ T164] ? sysfs_kf_write+0x65/0x160 [ 42.214526][ T164] kernfs_fop_write_iter+0x2f8/0x4c0 [ 42.214677][ T164] ? kernfs_vma_page_mkwrite+0x1f0/0x1f0 [ 42.214836][ T164] new_sync_write+0x308/0x6f0 [ 42.214987][ T164] ? __lock_acquire+0x34c/0x740 [ 42.215135][ T164] ? new_sync_read+0x6f0/0x6f0 [ 42.215288][ T164] ? lock_acquire.part.0+0xbc/0x260 [ 42.215440][ T164] ? ksys_write+0xff/0x200 [ 42.215590][ T164] ? perf_trace_sched_switch+0x6d0/0x6d0 [ 42.215742][ T164] vfs_write+0x65e/0xbb0 [ 42.215876][ T164] ksys_write+0xff/0x200 [ 42.215994][ T164] ? __ia32_sys_read+0xc0/0xc0 [ 42.216141][ T164] ? do_user_addr_fault+0x269/0x9f0 [ 42.216292][ T164] ? rcu_is_watching+0x15/0xd0 [ 42.216442][ T164] do_syscall_64+0xbb/0x360 [ 42.216591][ T164] entry_SYSCALL_64_after_hwframe+0x4b/0x53 [ 42.216784][ T164] RIP: 0033:0x7f0dc8ea9986 A bit of digging showed that we were invoking the phylink_resume as a part of the fbnic_up path when we were enabling the service task while not holding the RTNL lock. We should be enabling this sooner as a part of the ndo_open path and then just letting the service task come online later. This will help to enforce the correct locking and brings the phylink interface online at the same time as the network interface, instead of at a later time. I tested this on QEMU to verify this was working by putting the system to sleep using "echo mem > /sys/power/state" to put the system to sleep in the guest and then using the command "system_wakeup" in the QEMU monitor. Fixes: 69684376eed5 ("eth: fbnic: Add link detection") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/175616257316.1963577.12238158800417771119.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27fbnic: Fixup rtnl_lock and devl_lock handling related to mailbox codeAlexander Duyck1-7/+6
The exception handling path for the __fbnic_pm_resume function had a bug in that it was taking the devlink lock and then exiting to exception handling instead of waiting until after it released the lock to do so. In order to handle that I am swapping the placement of the unlock and the exception handling jump to label so that we don't trigger a deadlock by holding the lock longer than we need to. In addition this change applies the same ordering to the rtnl_lock/unlock calls in the same function as it should make the code easier to follow if it adheres to a consistent pattern. Fixes: 82534f446daa ("eth: fbnic: Add devlink dev flash support") Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/175616256667.1963577.5543500806256052549.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27net: rose: fix a typo in rose_clear_routes()Eric Dumazet1-1/+1
syzbot crashed in rose_clear_routes(), after a recent patch typo. KASAN: null-ptr-deref in range [0x0000000000000010-0x0000000000000017] CPU: 0 UID: 0 PID: 10591 Comm: syz.3.1856 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:rose_clear_routes net/rose/rose_route.c:565 [inline] RIP: 0010:rose_rt_ioctl+0x162/0x1250 net/rose/rose_route.c:760 <TASK> rose_ioctl+0x3ce/0x8b0 net/rose/af_rose.c:1381 sock_do_ioctl+0xd9/0x300 net/socket.c:1238 sock_ioctl+0x576/0x790 net/socket.c:1359 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:598 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:584 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Fixes: da9c9c877597 ("net: rose: include node references in rose_neigh refcount") Reported-by: syzbot+2eb8d1719f7cfcfa6840@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68af3e29.a70a0220.3cafd4.002e.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Takamitsu Iwai <takamitz@amazon.co.jp> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250827172149.5359-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27l2tp: do not use sock_hold() in pppol2tp_session_get_sock()Eric Dumazet1-17/+8
pppol2tp_session_get_sock() is using RCU, it must be ready for sk_refcnt being zero. Commit ee40fb2e1eb5 ("l2tp: protect sock pointer of struct pppol2tp_session with RCU") was correct because it had a call_rcu(..., pppol2tp_put_sk) which was later removed in blamed commit. pppol2tp_recv() can use pppol2tp_session_get_sock() as well. Fixes: c5cbaef992d6 ("l2tp: refactor ppp socket/session relationship") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: James Chapman <jchapman@katalix.com> Reviewed-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20250826134435.1683435-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27sctp: initialize more fields in sctp_v6_from_sk()Eric Dumazet1-0/+2
syzbot found that sin6_scope_id was not properly initialized, leading to undefined behavior. Clear sin6_scope_id and sin6_flowinfo. BUG: KMSAN: uninit-value in __sctp_v6_cmp_addr+0x887/0x8c0 net/sctp/ipv6.c:649 __sctp_v6_cmp_addr+0x887/0x8c0 net/sctp/ipv6.c:649 sctp_inet6_cmp_addr+0x4f2/0x510 net/sctp/ipv6.c:983 sctp_bind_addr_conflict+0x22a/0x3b0 net/sctp/bind_addr.c:390 sctp_get_port_local+0x21eb/0x2440 net/sctp/socket.c:8452 sctp_get_port net/sctp/socket.c:8523 [inline] sctp_listen_start net/sctp/socket.c:8567 [inline] sctp_inet_listen+0x710/0xfd0 net/sctp/socket.c:8636 __sys_listen_socket net/socket.c:1912 [inline] __sys_listen net/socket.c:1927 [inline] __do_sys_listen net/socket.c:1932 [inline] __se_sys_listen net/socket.c:1930 [inline] __x64_sys_listen+0x343/0x4c0 net/socket.c:1930 x64_sys_call+0x271d/0x3e20 arch/x86/include/generated/asm/syscalls_64.h:51 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xd9/0x210 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f Local variable addr.i.i created at: sctp_get_port net/sctp/socket.c:8515 [inline] sctp_listen_start net/sctp/socket.c:8567 [inline] sctp_inet_listen+0x650/0xfd0 net/sctp/socket.c:8636 __sys_listen_socket net/socket.c:1912 [inline] __sys_listen net/socket.c:1927 [inline] __do_sys_listen net/socket.c:1932 [inline] __se_sys_listen net/socket.c:1930 [inline] __x64_sys_listen+0x343/0x4c0 net/socket.c:1930 Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+e69f06a0f30116c68056@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/68adc0a2.050a0220.37038e.00c4.GAE@google.com/T/#u Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Acked-by: Xin Long <lucien.xin@gmail.com> Link: https://patch.msgid.link/20250826141314.1802610-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27MAINTAINERS: rmnet: Update email addressesSubash Abhinov Kasiviswanathan1-2/+2
Switch to oss.qualcomm.com ids. Signed-off-by: Sean Tranchetti <sean.tranchetti@oss.qualcomm.com> Signed-off-by: Subash Abhinov Kasiviswanathan <subash.a.kasiviswanathan@oss.qualcomm.com> Link: https://patch.msgid.link/20250826215046.865530-1-subash.a.kasiviswanathan@oss.qualcomm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27fs/smb: Fix inconsistent refcnt updateShuhao Fu1-2/+5
A possible inconsistent update of refcount was identified in `smb2_compound_op`. Such inconsistent update could lead to possible resource leaks. Why it is a possible bug: 1. In the comment section of the function, it clearly states that the reference to `cfile` should be dropped after calling this function. 2. Every control flow path would check and drop the reference to `cfile`, except the patched one. 3. Existing callers would not handle refcount update of `cfile` if -ENOMEM is returned. To fix the bug, an extra goto label "out" is added, to make sure that the cleanup logic would always be respected. As the problem is caused by the allocation failure of `vars`, the cleanup logic between label "finished" and "out" can be safely ignored. According to the definition of function `is_replayable_error`, the error code of "-ENOMEM" is not recoverable. Therefore, the replay logic also gets ignored. Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Acked-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Cc: stable@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-08-27drm/amdgpu/userq: fix error handling of invalid doorbellAlex Deucher1-0/+1
If the doorbell is invalid, be sure to set the r to an error state so the function returns an error. Reviewed-by: David (Ming Qiang) Wu <David.Wu3@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7e2a5b0a9a165a7c51274aa01b18be29491b4345) Cc: stable@vger.kernel.org
2025-08-27drm/amdgpu: update firmware version checks for user queue supportJesse.Zhang1-3/+3
The minimum firmware versions required for user queue functionality have been increased to address an issue where the queue privilege state was lost during queue connect operations. The problem occurred because the privilege state was being restored to its initial value at the beginning of the function, overwriting the state that was properly set during the queue connect case. This commit updates the minimum version requirements: - ME firmware from 2390 to 2420 - PFP firmware from 2530 to 2580 - MEC firmware from 2600 to 2650 - MES firmware remains at 120 These updated firmware versions contain the necessary fixes to properly maintain queue privilege state throughout connect operations. Fixes: 61ca97e9590c ("drm/amdgpu: Add fw minimum version check for usermode queue") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Jesse Zhang <Jesse.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5f976c9939f0d5916d2b8ef3156a6d1799781df1) Cc: stable@vger.kernel.org
2025-08-27drm/amd/amdgpu: disable hwmon power1_cap* for gfx 11.0.3 on vf modeYang Wang1-8/+10
the PPSMC_MSG_GetPptLimit msg is not valid for gfx 11.0.3 on vf mode, so skiped to create power1_cap* hwmon sysfs node. Signed-off-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e82a8d441038d8cb10b63047a9e705c42479d156) Cc: stable@vger.kernel.org
2025-08-27Revert "drm/amdgpu: fix incorrect vm flags to map bo"Alex Deucher1-2/+2
This reverts commit b08425fa77ad2f305fe57a33dceb456be03b653f. Revert this to align with 6.17 because the fixes tag was wrong on this commit. Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit be33e8a239aac204d7e9e673c4220ef244eb1ba3)
2025-08-27drm/amdgpu/gfx12: set MQD as appriopriate for queue typesAlex Deucher1-2/+6
Set the MQD as appropriate for the kernel vs user queues. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 7b9110f2897957efd9715b52fc01986509729db3) Cc: stable@vger.kernel.org
2025-08-27drm/amdgpu/gfx11: set MQD as appriopriate for queue typesAlex Deucher1-2/+6
Set the MQD as appropriate for the kernel vs user queues. Acked-by: Christian König <christian.koenig@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 063d6683208722b1875f888a45084e3d112701ac) Cc: stable@vger.kernel.org
2025-08-27x86/bugs: Add attack vector controls for SSBDavid Kaplan2-4/+10
Attack vector controls for SSB were missed in the initial attack vector series. The default mitigation for SSB requires user-space opt-in so it is only relevant for user->user attacks. Check with attack vector controls when the command is auto - i.e., no explicit user selection has been done. Fixes: 2d31d2874663 ("x86/bugs: Define attack vectors relevant for each bug") Signed-off-by: David Kaplan <david.kaplan@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250819192200.2003074-5-david.kaplan@amd.com
2025-08-27net: rose: include node references in rose_neigh refcountTakamitsu Iwai1-2/+16
Current implementation maintains two separate reference counting mechanisms: the 'count' field in struct rose_neigh tracks references from rose_node structures, while the 'use' field (now refcount_t) tracks references from rose_sock. This patch merges these two reference counting systems using 'use' field for proper reference management. Specifically, this patch adds incrementing and decrementing of rose_neigh->use when rose_neigh->count is incremented or decremented. This patch also modifies rose_rt_free(), rose_rt_device_down() and rose_clear_route() to properly release references to rose_neigh objects before freeing a rose_node through rose_remove_node(). These changes ensure rose_neigh structures are properly freed only when all references, including those from rose_node structures, are released. As a result, this resolves a slab-use-after-free issue reported by Syzbot. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+942297eecf7d2d61d1f1@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=942297eecf7d2d61d1f1 Signed-off-by: Takamitsu Iwai <takamitz@amazon.co.jp> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250823085857.47674-4-takamitz@amazon.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27net: rose: convert 'use' field to refcount_tTakamitsu Iwai5-33/+45
The 'use' field in struct rose_neigh is used as a reference counter but lacks atomicity. This can lead to race conditions where a rose_neigh structure is freed while still being referenced by other code paths. For example, when rose_neigh->use becomes zero during an ioctl operation via rose_rt_ioctl(), the structure may be removed while its timer is still active, potentially causing use-after-free issues. This patch changes the type of 'use' from unsigned short to refcount_t and updates all code paths to use rose_neigh_hold() and rose_neigh_put() which operate reference counts atomically. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Takamitsu Iwai <takamitz@amazon.co.jp> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250823085857.47674-3-takamitz@amazon.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27net: rose: split remove and free operations in rose_remove_neigh()Takamitsu Iwai2-9/+14
The current rose_remove_neigh() performs two distinct operations: 1. Removes rose_neigh from rose_neigh_list 2. Frees the rose_neigh structure Split these operations into separate functions to improve maintainability and prepare for upcoming refcount_t conversion. The timer cleanup remains in rose_remove_neigh() because free operations can be called from timer itself. This patch introduce rose_neigh_put() to handle the freeing of rose_neigh structures and modify rose_remove_neigh() to handle removal only. Signed-off-by: Takamitsu Iwai <takamitz@amazon.co.jp> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20250823085857.47674-2-takamitz@amazon.co.jp Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-08-27io_uring/kbuf: fix signedness in this_len calculationQingyue Zhang1-1/+1
When importing and using buffers, buf->len is considered unsigned. However, buf->len is converted to signed int when committing. This can lead to unexpected behavior if the buffer is large enough to be interpreted as a negative value. Make min_t calculation unsigned. Fixes: ae98dbf43d75 ("io_uring/kbuf: add support for incremental buffer consumption") Co-developed-by: Suoxing Zhang <aftern00n@qq.com> Signed-off-by: Suoxing Zhang <aftern00n@qq.com> Signed-off-by: Qingyue Zhang <chunzhennn@qq.com> Link: https://lore.kernel.org/r/tencent_4DBB3674C0419BEC2C0C525949DA410CA307@qq.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-08-27x86/cpu/topology: Use initial APIC ID from XTOPOLOGY leaf on AMD/HYGONK Prateek Nayak1-9/+14
Prior to the topology parsing rewrite and the switchover to the new parsing logic for AMD processors in c749ce393b8f ("x86/cpu: Use common topology code for AMD"), the initial_apicid on these platforms was: - First initialized to the LocalApicId from CPUID leaf 0x1 EBX[31:24]. - Then overwritten by the ExtendedLocalApicId in CPUID leaf 0xb EDX[31:0] on processors that supported topoext. With the new parsing flow introduced in f7fb3b2dd92c ("x86/cpu: Provide an AMD/HYGON specific topology parser"), parse_8000_001e() now unconditionally overwrites the initial_apicid already parsed during cpu_parse_topology_ext(). Although this has not been a problem on baremetal platforms, on virtualized AMD guests that feature more than 255 cores, QEMU zeros out the CPUID leaf 0x8000001e on CPUs with CoreID > 255 to prevent collision of these IDs in EBX[7:0] which can only represent a maximum of 255 cores [1]. This results in the following FW_BUG being logged when booting a guest with more than 255 cores: [Firmware Bug]: CPU 512: APIC ID mismatch. CPUID: 0x0000 APIC: 0x0200 AMD64 Architecture Programmer's Manual Volume 2: System Programming Pub. 24593 Rev. 3.42 [2] Section 16.12 "x2APIC_ID" mentions the Extended Enumeration leaf 0xb (Fn0000_000B_EDX[31:0])(which was later superseded by the extended leaf 0x80000026) provides the full x2APIC ID under all circumstances unlike the one reported by CPUID leaf 0x8000001e EAX which depends on the mode in which APIC is configured. Rely on the APIC ID parsed during cpu_parse_topology_ext() from CPUID leaf 0x80000026 or 0xb and only use the APIC ID from leaf 0x8000001e if cpu_parse_topology_ext() failed (has_topoext is false). On platforms that support the 0xb leaf (Zen2 or later, AMD guests on QEMU) or the extended leaf 0x80000026 (Zen4 or later), the initial_apicid is now set to the value parsed from EDX[31:0]. On older AMD/Hygon platforms that do not support the 0xb leaf but support the TOPOEXT extension (families 0x15, 0x16, 0x17[Zen1], and Hygon), retain current behavior where the initial_apicid is set using the 0x8000001e leaf. Issue debugged by Naveen N Rao (AMD) <naveen@kernel.org> and Sairaj Kodilkar <sarunkod@amd.com>. [ bp: Massage commit message. ] Fixes: c749ce393b8f ("x86/cpu: Use common topology code for AMD") Suggested-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: K Prateek Nayak <kprateek.nayak@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Naveen N Rao (AMD) <naveen@kernel.org> Cc: stable@vger.kernel.org Link: https://github.com/qemu/qemu/commit/35ac5dfbcaa4b [1] Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 [2] Link: https://lore.kernel.org/20250825075732.10694-2-kprateek.nayak@amd.com