Age | Commit message | Author | Lines
11 days | btrfs: do not ASSERT() when the fs flips RO inside btrfs_repair_io_failure() | Qu Wenruo | -1/+7
[BUG]
There is a bug report that when btrfs hits ENOSPC error in a critical path, btrfs flips RO (this part is expected, although the ENOSPC bug still needs to be addressed).

The problem is after the RO flip, if there is a read repair pending, we can hit the ASSERT() inside btrfs_repair_io_failure() like the following:

  BTRFS info (device vdc): relocating block group 30408704 flags metadata|raid1
  ------------[ cut here ]------------
  BTRFS: Transaction aborted (error -28)
  WARNING: fs/btrfs/extent-tree.c:3235 at __btrfs_free_extent.isra.0+0x453/0xfd0, CPU#1: btrfs/383844
  Modules linked in: kvm_intel kvm irqbypass
  [...]
  ---[ end trace 0000000000000000 ]---
  BTRFS info (device vdc state EA): 2 enospc errors during balance
  BTRFS info (device vdc state EA): balance: ended with status: -30
  BTRFS error (device vdc state EA): parent transid verify failed on logical 30556160 mirror 2 wanted 8 found 6
  BTRFS error (device vdc state EA): bdev /dev/nvme0n1 errs: wr 0, rd 0, flush 0, corrupt 10, gen 0
  [...]
  assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
  ------------[ cut here ]------------
  assertion failed: !(fs_info->sb->s_flags & SB_RDONLY) :: 0, in fs/btrfs/bio.c:938
  kernel BUG at fs/btrfs/bio.c:938!
  Oops: invalid opcode: 0000 [#1] SMP NOPTI
  CPU: 0 UID: 0 PID: 868 Comm: kworker/u8:13 Tainted: G W N 6.19.0-rc6+ #4788 PREEMPT(full)
  Tainted: [W]=WARN, [N]=TEST
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.17.0-0-gb52ca86e094d-prebuilt.qemu.org 04/01/2014
  Workqueue: btrfs-endio simple_end_io_work
  RIP: 0010:btrfs_repair_io_failure.cold+0xb2/0x120
  RSP: 0000:ffffc90001d2bcf0 EFLAGS: 00010246
  RAX: 0000000000000051 RBX: 0000000000001000 RCX: 0000000000000000
  RDX: 0000000000000000 RSI: ffffffff8305cf42 RDI: 00000000ffffffff
  RBP: 0000000000000002 R08: 00000000fffeffff R09: ffffffff837fa988
  R10: ffffffff8327a9e0 R11: 6f69747265737361 R12: ffff88813018d310
  R13: ffff888168b8a000 R14: ffffc90001d2bd90 R15: ffff88810a169000
  FS: 0000000000000000(0000) GS:ffff8885e752c000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  ------------[ cut here ]------------

[CAUSE]
The cause of the -ENOSPC error during the test case btrfs/124 is still unknown, although it's known that we still have cases where metadata can be over-committed but can not be fulfilled correctly, thus if we hit such an ENOSPC error inside a critical path, we have no choice but to abort the current transaction.

This will mark the fs read-only.

The problem is inside the btrfs_repair_io_failure() path, where we require the fs not to be mounted read-only. This is normally fine, but if we are doing a read-repair while the fs flips RO due to a critical error, we can enter btrfs_repair_io_failure() with the super block set to read-only, thus triggering the above crash.

[FIX]
Just replace the ASSERT() with a proper return if the fs is already read-only.

Reported-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/linux-btrfs/20260126045555.GB31641@lst.de/
Tested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
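The idea of the fix can be illustrated with a small user-space sketch. Everything here is a hypothetical stand-in (struct mock_fs, mock_repair_io_failure(), and the choice of -EROFS as the returned error); the commit message only says the ASSERT() is replaced with "a proper return", so this is not the actual patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the filesystem state; not the kernel's struct. */
struct mock_fs {
	bool read_only;   /* models (sb->s_flags & SB_RDONLY) */
	int repairs_done; /* counts repair writes that were issued */
};

/*
 * A read-only filesystem is a state that can legitimately be reached at
 * runtime (for example after a transaction abort), so it is reported to
 * the caller rather than asserted against.
 */
int mock_repair_io_failure(struct mock_fs *fs)
{
	assert(fs != NULL); /* a caller bug, still reasonable to assert on */

	if (fs->read_only)
		return -EROFS; /* assumed error code, chosen for illustration */

	fs->repairs_done++;
	return 0;
}
```

The distinction the sketch tries to show: assertions guard against programming errors, while conditions that can arise from runtime events get an error return.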
11 days | btrfs: reset block group size class when it becomes empty | Jiasheng Jiang | -0/+10
Block group size classes are not managed consistently. Currently, btrfs_use_block_group_size_class() sets a block group's size class to specialize it for a specific allocation size. However, this size class remains "stale" even if the block group becomes completely empty (both used and reserved bytes reach zero).

This happens in two scenarios:
1. When space reservations are freed (e.g., due to errors or transaction aborts) via btrfs_free_reserved_bytes().
2. When the last extent in a block group is freed via btrfs_update_block_group().

While size classes are advisory, a stale size class can cause find_free_extent to unnecessarily skip candidate block groups during initial search loops. This undermines the purpose of size classes to reduce fragmentation by keeping block groups restricted to a specific size class when they could be reused for any size.

Fix this by resetting the size class to BTRFS_BG_SZ_NONE whenever a block group's used and reserved counts both reach zero. This ensures that empty block groups are fully available for any allocation size in the next cycle.

Fixes: 52bb7a2166af ("btrfs: introduce size class to block group allocator")
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
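A simplified user-space model of the reset rule described above. The struct, enum, and function names are hypothetical and only loosely mirror the commit text; the real accounting in btrfs involves locking and more state than shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of a block group. */
enum mock_size_class { MOCK_SZ_NONE = 0, MOCK_SZ_SMALL, MOCK_SZ_MEDIUM, MOCK_SZ_LARGE };

struct mock_block_group {
	uint64_t used;
	uint64_t reserved;
	enum mock_size_class size_class;
};

/* Reset the advisory size class once nothing is used or reserved. */
static void mock_maybe_reset_size_class(struct mock_block_group *bg)
{
	if (bg->used == 0 && bg->reserved == 0)
		bg->size_class = MOCK_SZ_NONE;
}

/* Scenario 1 from the commit message: a reservation is released. */
void mock_free_reserved(struct mock_block_group *bg, uint64_t bytes)
{
	assert(bytes <= bg->reserved);
	bg->reserved -= bytes;
	mock_maybe_reset_size_class(bg);
}

/* Scenario 2 from the commit message: an extent is freed. */
void mock_free_extent(struct mock_block_group *bg, uint64_t bytes)
{
	assert(bytes <= bg->used);
	bg->used -= bytes;
	mock_maybe_reset_size_class(bg);
}
```

Both release paths call the same helper, so the class is cleared no matter which counter is the last to reach zero.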
11 days | btrfs: replace BUG() with error handling in __btrfs_balance() | Adarsh Das | -2/+8
We search with offset (u64)-1, which should never match exactly. Previously this was handled with BUG(). Now log an error and return -EUCLEAN.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Adarsh Das <adarshdas950@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
11 days | btrfs: handle unexpected exact match in btrfs_set_inode_index_count() | Adarsh Das | -3/+12
We search with offset (u64)-1, which should never match exactly. Previously the code silently returned success without setting the index count. Now log an error and return -EUCLEAN instead.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Adarsh Das <adarshdas950@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
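The "search for the maximum key, then step back one slot" pattern behind these two commits can be sketched in user space as follows. The sorted array and mock_last_offset() are hypothetical stand-ins for the b-tree search; only the handling of an unexpected exact match follows the commit messages.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#ifndef EUCLEAN
#define EUCLEAN 117 /* Linux value, defined here only for portability */
#endif

/*
 * Hypothetical model: find the largest offset in a sorted array by
 * searching for UINT64_MAX and stepping back one slot. An exact hit on
 * UINT64_MAX should be impossible for valid data, so it is reported as
 * -EUCLEAN instead of crashing or silently succeeding.
 */
int mock_last_offset(const uint64_t *sorted, size_t n, uint64_t *last)
{
	size_t lo = 0, hi = n;

	assert(last != NULL);

	/* Lower bound: first slot whose offset is >= UINT64_MAX. */
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (sorted[mid] < UINT64_MAX)
			lo = mid + 1;
		else
			hi = mid;
	}

	if (lo < n && sorted[lo] == UINT64_MAX)
		return -EUCLEAN; /* unexpected exact match */
	if (lo == 0)
		return -ENOENT;  /* nothing precedes the search key */

	*last = sorted[lo - 1];
	return 0;
}
```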
11 days | s390/debug: Convert debug area lock from a spinlock to a raw spinlock | Benjamin Block | -32/+32
With PREEMPT_RT as potential configuration option, spinlock_t is now considered as a sleeping lock, and thus might cause issues when used in an atomic context. But even with PREEMPT_RT as potential configuration option, raw_spinlock_t remains as a true spinning lock/atomic context. This creates potential issues with the s390 debug/tracing feature. The functions to trace errors are called in various contexts, including under lock of raw_spinlock_t, and thus the used spinlock_t in each debug area is in violation of the locking semantics. Here are two examples involving failing PCI Read accesses that are traced while holding `pci_lock` in `drivers/pci/access.c`: ============================= [ BUG: Invalid wait context ] 6.19.0-devel #18 Not tainted ----------------------------- bash/3833 is trying to lock: 0000027790baee30 (&rc->lock){-.-.}-{3:3}, at: debug_event_common+0xfc/0x300 other info that might help us debug this: context-{5:5} 5 locks held by bash/3833: #0: 0000027efbb29450 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x7c/0xf0 #1: 00000277f0504a90 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x13e/0x260 #2: 00000277beed8c18 (kn->active#339){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x164/0x260 #3: 00000277e9859190 (&dev->mutex){....}-{4:4}, at: pci_dev_lock+0x2e/0x40 #4: 00000383068a7708 (pci_lock){....}-{2:2}, at: pci_bus_read_config_dword+0x4a/0xb0 stack backtrace: CPU: 6 UID: 0 PID: 3833 Comm: bash Kdump: loaded Not tainted 6.19.0-devel #18 PREEMPTLAZY Hardware name: IBM 9175 ME1 701 (LPAR) Call Trace: [<00000383048afec2>] dump_stack_lvl+0xa2/0xe8 [<00000383049ba166>] __lock_acquire+0x816/0x1660 [<00000383049bb1fa>] lock_acquire+0x24a/0x370 [<00000383059e3860>] _raw_spin_lock_irqsave+0x70/0xc0 [<00000383048bbb6c>] debug_event_common+0xfc/0x300 [<0000038304900b0a>] __zpci_load+0x17a/0x1f0 [<00000383048fad88>] pci_read+0x88/0xd0 [<00000383054cbce0>] pci_bus_read_config_dword+0x70/0xb0 [<00000383054d55e4>] pci_dev_wait+0x174/0x290 [<00000383054d5a3e>] 
__pci_reset_function_locked+0xfe/0x170 [<00000383054d9b30>] pci_reset_function+0xd0/0x100 [<00000383054ee21a>] reset_store+0x5a/0x80 [<0000038304e98758>] kernfs_fop_write_iter+0x1e8/0x260 [<0000038304d995da>] new_sync_write+0x13a/0x180 [<0000038304d9c5d0>] vfs_write+0x200/0x330 [<0000038304d9c88c>] ksys_write+0x7c/0xf0 [<00000383059cfa80>] __do_syscall+0x210/0x500 [<00000383059e4c06>] system_call+0x6e/0x90 INFO: lockdep is turned off. ============================= [ BUG: Invalid wait context ] 6.19.0-devel #3 Not tainted ----------------------------- bash/6861 is trying to lock: 0000009da05c7430 (&rc->lock){-.-.}-{3:3}, at: debug_event_common+0xfc/0x300 other info that might help us debug this: context-{5:5} 5 locks held by bash/6861: #0: 000000acff404450 (sb_writers#3){.+.+}-{0:0}, at: ksys_write+0x7c/0xf0 #1: 000000acff41c490 (&of->mutex#2){+.+.}-{4:4}, at: kernfs_fop_write_iter+0x13e/0x260 #2: 0000009da36937d8 (kn->active#75){.+.+}-{0:0}, at: kernfs_fop_write_iter+0x164/0x260 #3: 0000009dd15250d0 (&zdev->state_lock){+.+.}-{4:4}, at: enable_slot+0x2e/0xc0 #4: 000001a19682f708 (pci_lock){....}-{2:2}, at: pci_bus_read_config_byte+0x42/0xa0 stack backtrace: CPU: 16 UID: 0 PID: 6861 Comm: bash Kdump: loaded Not tainted 6.19.0-devel #3 PREEMPTLAZY Hardware name: IBM 9175 ME1 701 (LPAR) Call Trace: [<000001a194837ec2>] dump_stack_lvl+0xa2/0xe8 [<000001a194942166>] __lock_acquire+0x816/0x1660 [<000001a1949431fa>] lock_acquire+0x24a/0x370 [<000001a19596b810>] _raw_spin_lock_irqsave+0x70/0xc0 [<000001a194843b6c>] debug_event_common+0xfc/0x300 [<000001a194888b0a>] __zpci_load+0x17a/0x1f0 [<000001a194882d88>] pci_read+0x88/0xd0 [<000001a195453b88>] pci_bus_read_config_byte+0x68/0xa0 [<000001a195457bc2>] pci_setup_device+0x62/0xad0 [<000001a195458e70>] pci_scan_single_device+0x90/0xe0 [<000001a19488a0f6>] zpci_bus_scan_device+0x46/0x80 [<000001a19547f958>] enable_slot+0x98/0xc0 [<000001a19547f134>] power_write_file+0xc4/0x110 [<000001a194e20758>] 
kernfs_fop_write_iter+0x1e8/0x260 [<000001a194d215da>] new_sync_write+0x13a/0x180 [<000001a194d245d0>] vfs_write+0x200/0x330 [<000001a194d2488c>] ksys_write+0x7c/0xf0 [<000001a195957a30>] __do_syscall+0x210/0x500 [<000001a19596cbb6>] system_call+0x6e/0x90 INFO: lockdep is turned off. Since it is desired to keep it possible to create trace records in most situations, including this particular case (failing PCI config space accesses are relevant), convert the used spinlock_t in `struct debug_info` to raw_spinlock_t. The impact is small, as the debug area lock only protects bounded memory access without external dependencies, apart from one function debug_set_size() where kfree() is implicitly called with the lock held. Move debug_info_free() out of this lock to remove this external dependency. Acked-by: Heiko Carstens <hca@linux.ibm.com> Signed-off-by: Benjamin Block <bblock@linux.ibm.com> Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
11 days | efi: Align unaccepted memory range to page boundary | Kiryl Shutsemau (Meta) | -2/+8
The accept_memory() and range_contains_unaccepted_memory() functions employ a "guard page" logic to prevent crashes with load_unaligned_zeropad(). This logic extends the range to be accepted (or checked) by one unit_size if the end of the range is aligned to a unit_size boundary. However, if the caller passes a range that is not page-aligned, the 'end' of the range might not be numerically aligned to unit_size, even if it covers the last page of a unit. This causes the "if (!(end % unit_size))" check to fail, skipping the necessary extension and leaving the next unit unaccepted, which can lead to a kernel panic when accessed by load_unaligned_zeropad(). Align the start address down and the size up to the nearest page boundary before performing the unit_size alignment check. This ensures that the guard unit is correctly added when the range effectively ends on a unit boundary. Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
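The arithmetic can be illustrated with a small user-space sketch. The page size, the function name mock_accept_end(), and the example addresses are assumptions for illustration; only the end of the range is modelled, since that is what the guard-unit check looks at.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PAGE_SIZE 4096ULL /* assumed page size */

/*
 * Returns the end of the range to accept, including the guard unit when
 * the page-aligned range ends on a unit_size boundary.
 */
uint64_t mock_accept_end(uint64_t start, uint64_t size, uint64_t unit_size)
{
	uint64_t end = start + size;

	assert(unit_size != 0 && unit_size % MOCK_PAGE_SIZE == 0);

	/* Round the end up to a page boundary before the unit check. */
	end = (end + MOCK_PAGE_SIZE - 1) / MOCK_PAGE_SIZE * MOCK_PAGE_SIZE;

	/* Guard unit: extend by one unit if the range ends on a unit boundary. */
	if (end % unit_size == 0)
		end += unit_size;

	return end;
}
```

With a 2 MiB unit, a range starting at 0x1ff010 with size 0x100 ends at 0x1ff110, which is not unit-aligned, so a check on the raw end would skip the guard unit. After rounding up to the page boundary the end is 0x200000, and the guard unit is added.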
11 days | efi: Fix reservation of unaccepted memory table | Kiryl Shutsemau (Meta) | -4/+4
The reserve_unaccepted() function incorrectly calculates the size of the memblock reservation for the unaccepted memory table. It aligns the size of the table, but fails to account for cases where the table's starting physical address (efi.unaccepted) is not page-aligned. If the table starts at an offset within a page and its end crosses into a subsequent page that the aligned size does not cover, the end of the table will not be reserved. This can lead to the table being overwritten or inaccessible, causing a kernel panic in accept_memory(). This issue was observed when starting Intel TDX VMs with specific memory sizes (e.g., > 64GB). Fix this by calculating the end address first (including the unaligned start) and then aligning it up, ensuring the entire range is covered by the reservation. Fixes: 8dbe33956d96 ("efi/unaccepted: Make sure unaccepted table is mapped") Reported-by: Moritz Sanft <ms@edgeless.systems> Signed-off-by: Kiryl Shutsemau (Meta) <kas@kernel.org> Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
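A worked sketch of the corrected calculation, in user space. The page size, struct mock_range, and mock_reserve_range() are hypothetical; the sketch only shows "compute the end first, then align", as the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_PAGE_SIZE 4096ULL /* assumed page size */

struct mock_range {
	uint64_t start;
	uint64_t size;
};

/*
 * Compute a page-granular reservation that covers the whole table even
 * when the table does not start on a page boundary.
 */
struct mock_range mock_reserve_range(uint64_t table_phys, uint64_t table_size)
{
	struct mock_range r;
	uint64_t end;

	assert(table_size != 0);

	/* Align the start down to the containing page. */
	r.start = table_phys - table_phys % MOCK_PAGE_SIZE;

	/* Compute the unaligned end first, then round it up. */
	end = table_phys + table_size;
	end = (end + MOCK_PAGE_SIZE - 1) / MOCK_PAGE_SIZE * MOCK_PAGE_SIZE;

	r.size = end - r.start;
	return r;
}
```

For a table at 0x1800 with size 0x1000, aligning only the size gives 0x1000 bytes, which cannot cover a table that ends at 0x2800. Computing the end first yields a reservation of 0x2000 bytes starting at 0x1000.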
11 days | MAINTAINERS: Add a reviewer entry for EFI | Ilias Apalodimas | -0/+1
Over the years I've contributed patches to the EFI subsystem mostly around TPM and EFI variables. Add me as a reviewer. Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
11 days | efi: stmm: Constify struct efivar_operations | Krzysztof Kozlowski | -8/+9
The 'struct efivar_operations' is not modified by the driver after initialization, so it should follow typical practice of being static const for increased code safety and readability. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Reviewed-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
11 days | x86/kexec: Copy ACPI root pointer address from config table | Ard Biesheuvel | -0/+7
Dave reports that kexec may fail when the first kernel boots via the EFI stub but without EFI runtime services, as in that case, the RSDP address field in struct bootparams is never assigned. Kexec copies this value into the version of struct bootparams that it provides to the incoming kernel, which may have no other means to locate the ACPI root pointer. So take the value from the EFI config tables if no root pointer has been set in the first kernel's struct bootparams. Fixes: a1b87d54f4e4 ("x86/efistub: Avoid legacy decompressor when doing EFI boot") Cc: <stable@vger.kernel.org> # v6.1 Reported-by: Dave Young <dyoung@redhat.com> Tested-by: Dave Young <dyoung@redhat.com> Link: https://lore.kernel.org/linux-efi/aZQg_tRQmdKNadCg@darkstar.users.ipa.redhat.com/ Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
11 days | gpio: amd-fch: only return allowed values from amd_fch_gpio_get() | Dmitry Torokhov | -3/+4
As of 86ef402d805d ("gpiolib: sanitize the return value of gpio_chip::get()") gpiolib requires drivers implementing GPIOs to only return 0, 1 or negative error for the get() callbacks. Ensure that amd-fch complies with this requirement. Fixes: 86ef402d805d ("gpiolib: sanitize the return value of gpio_chip::get()") Reported-and-tested-by: Tj <tj.iam.tj@proton.me> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com> Link: https://patch.msgid.link/aZTlwnvHt2Gho4yN@google.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
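A common way to satisfy this kind of contract is to normalize a masked register value to 0 or 1. The sketch below is a user-space illustration with a hypothetical function name and register layout; it is not the amd-fch driver code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return 1 if any bit selected by mask is set in reg_value, else 0.
 * Returning (reg_value & mask) directly could yield values such as 0x40,
 * which fall outside the allowed {0, 1, negative errno} set.
 */
int mock_gpio_get(uint32_t reg_value, uint32_t mask)
{
	assert(mask != 0);
	return !!(reg_value & mask);
}
```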
11 days | x86/xen: Fix Xen PV guest boot | Juergen Gross | -1/+5
A recent patch moving the call of sparse_init() to common mm code broke booting as a Xen PV guest. Reason is that the Xen PV specific boot code relied on struct page area being accessible rather early, but this changed by the move of the call of sparse_init(). Fortunately the fix is rather easy: there is a static branch available indicating whether struct page contents are usable by Xen. This static branch just needs to be tested in some places for avoiding the access of struct page. Fixes: 4267739cabb8 ("arch, mm: consolidate initialization of SPARSE memory model") Signed-off-by: Juergen Gross <jgross@suse.com> Message-ID: <20260214135035.119357-1-jgross@suse.com>
11 days | gpio: sysfs: fix chip removal with GPIOs exported over sysfs | Bartosz Golaszewski | -51/+55
Currently if we export a GPIO over sysfs and unbind the parent GPIO controller, the exported attribute will remain under /sys/class/gpio because once we remove the parent device, we can no longer associate the descriptor with it in gpiod_unexport() and never drop the final reference. Rework the teardown code: provide an unlocked variant of gpiod_unexport() and remove all exported GPIOs with the sysfs_lock taken before unregistering the parent device itself. This is done to prevent any new exports happening before we unregister the device completely. Cc: stable@vger.kernel.org Fixes: 1cd53df733c2 ("gpio: sysfs: don't look up exported lines as class devices") Link: https://patch.msgid.link/20260212133505.81516-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
11 days | gpio: swnode: restore the swnode-name-against-chip-label matching | Bartosz Golaszewski | -0/+19
Using the remote firmware node for software node lookup is the right thing to do. The GPIO controller we want to resolve should have the software node we scooped out of the reference attached to it. However, there are existing users who abuse the software node API by creating dummy swnodes whose name is set to the expected label string of the GPIO controller whose pins they want to control and use them in their local swnode references as GPIO properties. This used to work when we compared the software node's name to the chip's label. When we switched to using a real fwnode lookup, these users broke down because the firmware nodes in question were never attached to the controllers they were looking for. Restore the label matching as a fallback to fix the broken users but add a big FIXME urging for a better solution. Cc: stable@vger.kernel.org # v6.18, v6.19 Fixes: 216c12047571 ("gpio: swnode: allow referencing GPIO chips by firmware nodes") Link: https://lore.kernel.org/all/aYkdKfP5fg6iywgr@jekhomev/ Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Hans de Goede <johannes.goede@oss.qualcomm.com> Link: https://patch.msgid.link/20260211085313.16792-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
11 days | ALSA: echoaudio: Add SPDX ids to some files | Tim Bird | -339/+23
Add SPDX-License-Identifier lines to some files in the sound subsystem - mostly in the echoaudio drivers. Remove boilerplate GPL headers. Signed-off-by: Tim Bird <tim.bird@sony.com> Link: https://patch.msgid.link/20260212234928.3739815-1-tim.bird@sony.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 days | ALSA: isa: Add SPDX id lines to some files | Tim Bird | -6/+7
Add SPDX-License-Identifier lines to several files where they are missing, mostly in the sound/isa subdir. Use GPL-2.0 as the id. [ note: the same change applied to sound/hda/core/trace.c, too -- tiwai ] Signed-off-by: Tim Bird <tim.bird@sony.com> Link: https://patch.msgid.link/20260212195905.3726149-1-tim.bird@sony.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 days | ALSA: core: Add SPDX license id to files | Tim Bird | -128/+9
Add an SPDX id of LGPL-2.0+ to files in the sound core sub-system that are missing ids. Remove boilerplate text. These files were originally submitted in a big commit for the ALSA sound system for kernel version 2.5.4, by Jaroslav Kysela, in Feb 2002. Signed-off-by: Tim Bird <tim.bird@sony.com> Link: https://patch.msgid.link/20260212183103.3720788-1-tim.bird@sony.com Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 days | Drivers: hv: vmbus: Use kthread for vmbus interrupts on PREEMPT_RT | Jan Kiszka | -1/+65
Resolves the following lockdep report when booting PREEMPT_RT on Hyper-V with related guest support enabled:

[ 1.127941] hv_vmbus: registering driver hyperv_drm
[ 1.132518] =============================
[ 1.132519] [ BUG: Invalid wait context ]
[ 1.132521] 6.19.0-rc8+ #9 Not tainted
[ 1.132524] -----------------------------
[ 1.132525] swapper/0/0 is trying to lock:
[ 1.132526] ffff8b9381bb3c90 (&channel->sched_lock){....}-{3:3}, at: vmbus_chan_sched+0xc4/0x2b0
[ 1.132543] other info that might help us debug this:
[ 1.132544] context-{2:2}
[ 1.132545] 1 lock held by swapper/0/0:
[ 1.132547] #0: ffffffffa010c4c0 (rcu_read_lock){....}-{1:3}, at: vmbus_chan_sched+0x31/0x2b0
[ 1.132557] stack backtrace:
[ 1.132560] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.19.0-rc8+ #9 PREEMPT_{RT,(lazy)}
[ 1.132565] Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS Hyper-V UEFI Release v4.1 09/25/2025
[ 1.132567] Call Trace:
[ 1.132570] <IRQ>
[ 1.132573] dump_stack_lvl+0x6e/0xa0
[ 1.132581] __lock_acquire+0xee0/0x21b0
[ 1.132592] lock_acquire+0xd5/0x2d0
[ 1.132598] ? vmbus_chan_sched+0xc4/0x2b0
[ 1.132606] ? lock_acquire+0xd5/0x2d0
[ 1.132613] ? vmbus_chan_sched+0x31/0x2b0
[ 1.132619] rt_spin_lock+0x3f/0x1f0
[ 1.132623] ? vmbus_chan_sched+0xc4/0x2b0
[ 1.132629] ? vmbus_chan_sched+0x31/0x2b0
[ 1.132634] vmbus_chan_sched+0xc4/0x2b0
[ 1.132641] vmbus_isr+0x2c/0x150
[ 1.132648] __sysvec_hyperv_callback+0x5f/0xa0
[ 1.132654] sysvec_hyperv_callback+0x88/0xb0
[ 1.132658] </IRQ>
[ 1.132659] <TASK>
[ 1.132660] asm_sysvec_hyperv_callback+0x1a/0x20

As code paths that handle vmbus IRQs use sleepy locks under PREEMPT_RT, the vmbus_isr execution needs to be moved into thread context. Open-coding this allows us to skip the IPI that irq_work would additionally bring and which we do not need, being an IRQ, never an NMI.

This affects both x86 and arm64, therefore hook into the common driver logic.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com> Reviewed-by: Florian Bezdeka <florian.bezdeka@siemens.com> Tested-by: Florian Bezdeka <florian.bezdeka@siemens.com> Reviewed-by: Michael Kelley <mhklinux@outlook.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
11 days | fsverity: fix build error by adding fsverity_readahead() stub | Eric Biggers | -2/+7
hppa-linux-gcc 9.5.0 generates a call to fsverity_readahead() in f2fs_readahead() when CONFIG_FS_VERITY=n, because it fails to do the expected dead code elimination based on vi always being NULL. Fix the build error by adding an inline stub for fsverity_readahead(). Since it's just for opportunistic readahead, just make it a no-op. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602180838.pwICdY2r-lkp@intel.com/ Fixes: 45dcb3ac9832 ("f2fs: consolidate fsverity_info lookup") Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260218012244.18536-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
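The general "inline no-op stub when a feature is compiled out" pattern looks roughly like the following user-space sketch. MOCK_CONFIG_FS_VERITY, the function names, and the parameter list are hypothetical and do not reflect the real fsverity_readahead() signature.

```c
#include <assert.h>
#include <stddef.h>

/* MOCK_CONFIG_FS_VERITY is a hypothetical stand-in for CONFIG_FS_VERITY. */
#ifdef MOCK_CONFIG_FS_VERITY
void mock_fsverity_readahead(const void *vi, unsigned long index,
			     unsigned long nr_pages);
#else
/* Feature compiled out: provide a no-op so callers still link. */
static inline void mock_fsverity_readahead(const void *vi, unsigned long index,
					   unsigned long nr_pages)
{
	(void)vi;
	(void)index;
	(void)nr_pages;
}
#endif

/*
 * Caller that relies on vi being NULL when the feature is off. Without
 * the stub, a compiler that fails to eliminate the dead branch would emit
 * a call to an undefined symbol and the link would fail.
 */
int mock_fs_readahead(const void *vi, unsigned long index, unsigned long nr_pages)
{
	assert(nr_pages > 0);

	if (vi)
		mock_fsverity_readahead(vi, index, nr_pages);
	return 0;
}
```

The stub makes the build independent of how aggressively a given compiler eliminates dead code.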
11 days | fsverity: remove fsverity_verify_page() | Eric Biggers | -8/+2
Now that fsverity_verify_page() has no callers, remove it. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260218010630.7407-4-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
11 days | f2fs: make f2fs_verify_cluster() partially large-folio-aware | Eric Biggers | -4/+5
f2fs_verify_cluster() is the only remaining caller of the non-large-folio-aware function fsverity_verify_page(). To unblock the removal of that function, change f2fs_verify_cluster() to verify the entire folio of each page and mark it up-to-date. Note that this doesn't actually make f2fs_verify_cluster() large-folio-aware, as it is still passed an array of pages. Currently, it's never called with large folios. Suggested-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260218010630.7407-3-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
11 days | f2fs: remove unnecessary ClearPageUptodate in f2fs_verify_cluster() | Eric Biggers | -2/+0
Remove the unnecessary clearing of PG_uptodate. It's guaranteed to already be clear. Suggested-by: Matthew Wilcox <willy@infradead.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Link: https://lore.kernel.org/r/20260218010630.7407-2-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
11 days | x86/hyperv: Remove ASM_CALL_CONSTRAINT with VMMCALL insn | Uros Bizjak | -1/+1
Unlike CALL instruction, VMMCALL does not push to the stack, so it's OK to allow the compiler to insert it before the frame pointer gets set up by the containing function. ASM_CALL_CONSTRAINT is for CALLs that must be inserted after the frame pointer is set up, so it is over-constraining here and can be removed. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Tested-by: Michael Kelley <mhklinux@outlook.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Wei Liu <wei.liu@kernel.org> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
11 days | x86/hyperv: Use savesegment() instead of inline asm() to save segment registers | Uros Bizjak | -4/+5
Use standard savesegment() utility macro to save segment registers. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Acked-by: Wei Liu <wei.liu@kernel.org> Tested-by: Michael Kelley <mhklinux@outlook.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Cc: Dexuan Cui <decui@microsoft.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Wei Liu <wei.liu@kernel.org>
11 days | eth: fbnic: Add validation for MTU changes | Dimitri Daskalakis | -0/+18
Increasing the MTU beyond the HDS threshold causes the hardware to fragment packets across multiple buffers. If a single-buffer XDP program is attached, the driver will drop all multi-frag frames. While we can't prevent a remote sender from sending non-TCP packets larger than the MTU, this will prevent users from inadvertently breaking new TCP streams. Traditionally, drivers supported XDP with MTU less than 4Kb (packet per page). Fbnic currently prevents attaching XDP when MTU is too high. But it does not prevent increasing MTU after XDP is attached. Fixes: 1b0a3950dbd4 ("eth: fbnic: Add XDP pass, drop, abort support") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>
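The kind of check described above might look like this in a simplified user-space model. struct mock_nic, its fields, the direct comparison of the MTU against the HDS threshold, and the -EINVAL return are all assumptions made for illustration; the real driver logic may compute the limit differently.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state. */
struct mock_nic {
	unsigned int mtu;
	unsigned int hds_threshold; /* frames above this are split across buffers */
	bool single_buf_xdp;        /* a single-buffer XDP program is attached */
};

/*
 * Reject an MTU that would produce multi-buffer frames while a
 * single-buffer XDP program is attached; otherwise apply it.
 */
int mock_change_mtu(struct mock_nic *nic, unsigned int new_mtu)
{
	assert(nic != NULL);

	if (nic->single_buf_xdp && new_mtu > nic->hds_threshold)
		return -EINVAL;

	nic->mtu = new_mtu;
	return 0;
}
```

This mirrors the check that already exists at XDP attach time, applied to the opposite ordering of the two operations.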
11 days | tools/power turbostat: Fix AMD RAPL regression | Len Brown | -2/+1
turbostat.c:8688: rapl_perf_init: Assertion `next_domain < num_domains' failed. Two recent cleanup patches that were not supposed to change anything broke the core_id code needed for AMD RAPL initialization: commit 070e92361eec ("tools/power turbostat: Enhance HT enumeration") commit ddf60e38ca04 ("tools/power turbostat: Simplify global core_id calculation") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Len Brown <len.brown@intel.com>
11 days | Merge tag 'amd-drm-next-6.20-2026-02-13' of https://gitlab.freedesktop.org/agd5f/linux into drm-next | Dave Airlie | -390/+612
amd-drm-next-6.20-2026-02-13:

amdgpu:
- SMU 13.x fixes
- DC resume lag fix
- MPO fixes
- DCN 3.6 fix
- VSDB fixes
- HWSS clean up
- Replay fixes
- DCE cursor fixes
- DCN 3.5 SR DDR5 latency fixes
- HPD fixes
- Error path unwind fixes
- SMU13/14 mode1 reset fixes
- PSP 15 updates
- SMU 15 updates
- RAS fixes
- Sync fix in amdgpu_dma_buf_move_notify()
- HAINAN fix
- PSP 13.x fix
- GPUVM locking fix

amdkfd:
- APU GTT as VRAM fix

radeon:
- HAINAN fix

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20260213220825.1454189-1-alexander.deucher@amd.com
11 days | Merge tag 'drm-intel-next-fixes-2026-02-13' of https://gitlab.freedesktop.org/drm/i915/kernel into drm-next | Dave Airlie | -4/+19
- Regression fix for HDR 4k displays (#15503)
- Fixup for Dell XPS 13 7390 eDP rate limit
- Memory leak fix on ACPI _DSM handling

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patch.msgid.link/aY8CtbhijtetQ6P3@jlahtine-mobl
11 days | net/sched: act_skbedit: fix divide-by-zero in tcf_skbedit_hash() | Ruitong Liu | -1/+5
Commit 38a6f0865796 ("net: sched: support hash selecting tx queue") added SKBEDIT_F_TXQ_SKBHASH support. The inclusive range size is computed as: mapping_mod = queue_mapping_max - queue_mapping + 1; The range size can be 65536 when the requested range covers all possible u16 queue IDs (e.g. queue_mapping=0 and queue_mapping_max=U16_MAX). That value cannot be represented in a u16 and previously wrapped to 0, so tcf_skbedit_hash() could trigger a divide-by-zero: queue_mapping += skb_get_hash(skb) % params->mapping_mod; Compute mapping_mod in a wider type and reject ranges larger than U16_MAX to prevent params->mapping_mod from becoming 0 and avoid the crash. Fixes: 38a6f0865796 ("net: sched: support hash selecting tx queue") Cc: stable@vger.kernel.org # 6.12+ Signed-off-by: Ruitong Liu <cnitlrt@gmail.com> Link: https://patch.msgid.link/20260213175948.1505257-1-cnitlrt@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
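The wrap-around and the fix can be reproduced in a few lines of user-space C. mock_set_mapping_mod() and its -EINVAL return are hypothetical; the arithmetic follows the commit message: the inclusive range size is computed in a wider type and anything above U16_MAX is rejected.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Compute the inclusive range size [queue_mapping, queue_mapping_max] in a
 * 32-bit type so that a full u16 range (65536 entries) does not wrap to 0
 * and later become a zero divisor in "hash % mapping_mod".
 */
int mock_set_mapping_mod(uint16_t queue_mapping, uint16_t queue_mapping_max,
			 uint16_t *mapping_mod)
{
	uint32_t range;

	assert(mapping_mod != NULL);

	if (queue_mapping_max < queue_mapping)
		return -EINVAL;

	range = (uint32_t)queue_mapping_max - queue_mapping + 1;
	if (range > UINT16_MAX)
		return -EINVAL; /* 65536 does not fit in a u16 */

	*mapping_mod = (uint16_t)range;
	return 0;
}
```

For queue_mapping = 0 and queue_mapping_max = 65535 the range is 65536, which truncates to 0 when stored in a u16. Rejecting it up front keeps the modulus non-zero.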
11 days | vsock: document namespace mode sysctls | Stefano Garzarella | -2/+50
Add documentation for the vsock per-namespace sysctls (`ns_mode` and `child_ns_mode`) to Documentation/admin-guide/sysctl/net.rst. These sysctls were introduced by commit eafb64f40ca4 ("vsock: add netns to vsock core"). Document the two namespace modes (`global` and `local`), the inheritance behavior of `child_ns_mode`, and the restriction preventing local namespaces from setting `child_ns_mode` to `global`. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Tested-by: Randy Dunlap <rdunlap@infradead.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260216163147.236844-1-sgarzare@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | net: ethernet: ec_bhf: Fix dma_free_coherent() dma handle | Thomas Fourier | -1/+1
dma_free_coherent() in error path takes priv->rx_buf.alloc_len as the dma handle. This would lead to improper unmapping of the buffer. Change the dma handle to priv->rx_buf.alloc_phys. Fixes: 6af55ff52b02 ("Driver for Beckhoff CX5020 EtherCAT master module.") Cc: <stable@vger.kernel.org> Signed-off-by: Thomas Fourier <fourier.thomas@gmail.com> Link: https://patch.msgid.link/20260213164340.77272-2-fourier.thomas@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | macvlan: observe an RCU grace period in macvlan_common_newlink() error path | Eric Dumazet | -0/+5
valis reported that a race condition still happens after my prior patch. macvlan_common_newlink() might have made @dev visible before detecting an error, and its caller will directly call free_netdev(dev). We must respect an RCU period, either in macvlan or the core networking stack. After adding a temporary mdelay(1000) in macvlan_forward_source_one() to open the race window, valis repro was: ip link add p1 type veth peer p2 ip link set address 00:00:00:00:00:20 dev p1 ip link set up dev p1 ip link set up dev p2 ip link add mv0 link p2 type macvlan mode source (ip link add invalid% link p2 type macvlan mode source macaddr add 00:00:00:00:00:20 &) ; sleep 0.5 ; ping -c1 -I p1 1.2.3.4 PING 1.2.3.4 (1.2.3.4): 56 data bytes RTNETLINK answers: Invalid argument BUG: KASAN: slab-use-after-free in macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) Read of size 8 at addr ffff888016bb89c0 by task e/175 CPU: 1 UID: 1000 PID: 175 Comm: e Not tainted 6.19.0-rc8+ #33 NONE Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014 Call Trace: <IRQ> dump_stack_lvl (lib/dump_stack.c:123) print_report (mm/kasan/report.c:379 mm/kasan/report.c:482) ? macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) kasan_report (mm/kasan/report.c:597) ? macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) macvlan_forward_source (drivers/net/macvlan.c:408 drivers/net/macvlan.c:444) ? 
tasklet_init (kernel/softirq.c:983) macvlan_handle_frame (drivers/net/macvlan.c:501) Allocated by task 169: kasan_save_stack (mm/kasan/common.c:58) kasan_save_track (./arch/x86/include/asm/current.h:25 mm/kasan/common.c:70 mm/kasan/common.c:79) __kasan_kmalloc (mm/kasan/common.c:419) __kvmalloc_node_noprof (./include/linux/kasan.h:263 mm/slub.c:5657 mm/slub.c:7140) alloc_netdev_mqs (net/core/dev.c:12012) rtnl_create_link (net/core/rtnetlink.c:3648) rtnl_newlink (net/core/rtnetlink.c:3830 net/core/rtnetlink.c:3957 net/core/rtnetlink.c:4072) rtnetlink_rcv_msg (net/core/rtnetlink.c:6958) netlink_rcv_skb (net/netlink/af_netlink.c:2550) netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) netlink_sendmsg (net/netlink/af_netlink.c:1894) __sys_sendto (net/socket.c:727 net/socket.c:742 net/socket.c:2206) __x64_sys_sendto (net/socket.c:2209) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131) Freed by task 169: kasan_save_stack (mm/kasan/common.c:58) kasan_save_track (./arch/x86/include/asm/current.h:25 mm/kasan/common.c:70 mm/kasan/common.c:79) kasan_save_free_info (mm/kasan/generic.c:587) __kasan_slab_free (mm/kasan/common.c:287) kfree (mm/slub.c:6674 mm/slub.c:6882) rtnl_newlink (net/core/rtnetlink.c:3845 net/core/rtnetlink.c:3957 net/core/rtnetlink.c:4072) rtnetlink_rcv_msg (net/core/rtnetlink.c:6958) netlink_rcv_skb (net/netlink/af_netlink.c:2550) netlink_unicast (net/netlink/af_netlink.c:1319 net/netlink/af_netlink.c:1344) netlink_sendmsg (net/netlink/af_netlink.c:1894) __sys_sendto (net/socket.c:727 net/socket.c:742 net/socket.c:2206) __x64_sys_sendto (net/socket.c:2209) do_syscall_64 (arch/x86/entry/syscall_64.c:63 arch/x86/entry/syscall_64.c:94) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131) Fixes: f8db6475a836 ("macvlan: fix error recovery in macvlan_common_newlink()") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: valis 
<sec@valis.email> Link: https://patch.msgid.link/20260213142557.3059043-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysselftests: tc_actions: don't dump 2MB of \0 to stdoutJakub Kicinski-1/+1
Since we started running selftests in NIPA we have been seeing tc_actions.sh generate a soft lockup warning on ~20% of the runs. On the pre-netdev foundation setup it was actually a missed irq splat from the console. Now it's either that or a lockup. I initially suspected a socket locking issue since the test is exercising local loopback with act_mirred. After hours of staring at this I noticed in strace that ncat when -o $file is specified _both_ saves the output to the file and still prints it to stdout. Because the file being sent is constructed with: dd conv=sparse status=none if=/dev/zero bs=1M count=2 of=$mirred ^^^^^^^^^ the data printed is all \0. Most terminals don't display nul characters (and neither does vng output capture save them). But QEMU's serial console still has to poke them thru which is very slow and causes the lockup (if the file is >600kB). Replace the '-o $file' with '> $file'. This speeds the test up from 2m20s to 18s on debug kernels, and prevents the warnings. Fixes: ca22da2fbd69 ("act_mirred: use the backlog for nested calls to mirred ingress") Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260214035159.2119699-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysipv6: addrconf: reduce default temp_valid_lft to 2 daysFernando Fernandez Mancera-1/+2
This is a recommendation from RFC 8981 and it was intended to be changed by commit 969c54646af0 ("ipv6: Implement draft-ietf-6man-rfc4941bis") but it only changed the sysctl documentation. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260214172543.5783-1-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysping: annotate data-races in ping_lookup()Eric Dumazet-12/+19
isk->inet_num, isk->inet_rcv_saddr and sk->sk_bound_dev_if are read locklessly in ping_lookup(). Add READ_ONCE()/WRITE_ONCE() annotations. The race on isk->inet_rcv_saddr is probably coming from IPv6 support, but does not deserve a specific backport. Fixes: dbca1596bbb0 ("ping: convert to RCU lookups, get rid of rwlock") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260216100149.3319315-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: dsa: MxL862xx: don't force-enable MAXLINEAR_GPHYArnd Bergmann-1/+0
The newly added dsa driver attempts to enable the corresponding PHY driver, but that one has additional dependencies that may not be available: WARNING: unmet direct dependencies detected for MAXLINEAR_GPHY Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && (HWMON [=m] || HWMON [=m]=n [=n]) Selected by [y]: - NET_DSA_MXL862 [=y] && NETDEVICES [=y] && NET_DSA [=y] aarch64-linux-ld: drivers/net/phy/mxl-gpy.o: in function `gpy_probe': mxl-gpy.c:(.text.gpy_probe+0x13c): undefined reference to `devm_hwmon_device_register_with_info' aarch64-linux-ld: drivers/net/phy/mxl-gpy.o: in function `gpy_hwmon_read': mxl-gpy.c:(.text.gpy_hwmon_read+0x48): undefined reference to `polynomial_calc' There is actually no compile-time dependency, as DSA correctly uses the PHY abstractions. Remove the 'select' statement to reduce the complexity. Fixes: 23794bec1cb6 ("net: dsa: add basic initial driver for MxL862xx switches") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Daniel Golle <daniel@makrotopia.org> Link: https://patch.msgid.link/20260216105522.2382373-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysdpll: zl3073x: Fix ref frequency settingIvan Vecera-0/+2
The frequency for an input reference is computed as: frequency = freq_base * freq_mult * freq_ratio_m / freq_ratio_n Before commit 5bc02b190a3fb ("dpll: zl3073x: Cache all reference properties in zl3073x_ref"), zl3073x_dpll_input_pin_frequency_set() explicitly wrote 1 to both the REF_RATIO_M and REF_RATIO_N hardware registers whenever a new frequency was set. This ensured the FEC ratio was always reset to 1:1 alongside the new base/multiplier values. The refactoring in that commit introduced zl3073x_ref_freq_set() to update the cached ref state, but this helper only sets freq_base and freq_mult without resetting freq_ratio_m and freq_ratio_n to 1. Because zl3073x_ref_state_set() uses a compare-and-write strategy, unchanged ratio fields are never written to the hardware. If the device previously had non-unity FEC ratio values, they remain in effect after a frequency change, resulting in an incorrect computed frequency. Explicitly set freq_ratio_m and freq_ratio_n to 1 in zl3073x_ref_freq_set() to restore the original behavior. Fixes: 5bc02b190a3fb ("dpll: zl3073x: Cache all reference properties in zl3073x_ref") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260216194007.680416-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
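The effect of a stale ratio under a compare-and-write update can be illustrated with a small model. This is only a sketch: `RefState`, `state_set()`, `freq_set()`, the `hw` dict and all numeric values are hypothetical stand-ins for the cached state and the hardware registers, not the driver's actual data structures.

```python
from dataclasses import dataclass, fields


@dataclass
class RefState:
    """Hypothetical stand-in for the cached reference properties."""
    freq_base: int
    freq_mult: int
    freq_ratio_m: int = 1
    freq_ratio_n: int = 1

    def frequency(self) -> int:
        # frequency = freq_base * freq_mult * freq_ratio_m / freq_ratio_n
        return (self.freq_base * self.freq_mult * self.freq_ratio_m
                // self.freq_ratio_n)


def state_set(hw: dict, cached: RefState, new: RefState) -> list:
    """Compare-and-write: only fields that differ from the cache reach `hw`."""
    written = []
    for f in fields(RefState):
        value = getattr(new, f.name)
        if getattr(cached, f.name) != value:
            hw[f.name] = value
            written.append(f.name)
    return written


def freq_set(state: RefState, base: int, mult: int, reset_ratio: bool) -> RefState:
    """Return an updated copy; reset_ratio=True models the fixed behavior."""
    if reset_ratio:
        m, n = 1, 1
    else:
        m, n = state.freq_ratio_m, state.freq_ratio_n
    return RefState(base, mult, m, n)


# Device previously configured with a non-unity ratio (made-up values).
cached = RefState(freq_base=25, freq_mult=1_000_000,
                  freq_ratio_m=66, freq_ratio_n=64)

buggy = freq_set(cached, 10, 1_000_000, reset_ratio=False)
fixed = freq_set(cached, 10, 1_000_000, reset_ratio=True)
```

In this model the unfixed helper leaves the 66/64 ratio in the new state, so `state_set()` has nothing to write for the ratio fields and the computed frequency stays scaled by the old ratio. With the reset, the ratio fields differ from the cache and are written out together with the new base.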
11 daysnet: do not delay zero-copy skbs in skb_attempt_defer_free()Eric Dumazet-1/+6
After the blamed commit, TCP tx zero copy notifications could be arbitrarily delayed and cause regressions in applications waiting for them. Signed-off-by: Eric Dumazet <edumazet@google.com> Fixes: e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs") Reviewed-by: Jason Xing <kerneljasonxing@gmail.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20260216193653.627617-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysnet: psp: select CONFIG_SKB_EXTENSIONSArnd Bergmann-0/+1
psp now uses skb extensions, failing to build when that is disabled: In file included from include/net/psp.h:7, from net/psp/psp_sock.c:9: include/net/psp/functions.h: In function '__psp_skb_coalesce_diff': include/net/psp/functions.h:60:13: error: implicit declaration of function 'skb_ext_find'; did you mean 'skb_ext_copy'? [-Wimplicit-function-declaration] 60 | a = skb_ext_find(one, SKB_EXT_PSP); | ^~~~~~~~~~~~ | skb_ext_copy include/net/psp/functions.h:60:31: error: 'SKB_EXT_PSP' undeclared (first use in this function) 60 | a = skb_ext_find(one, SKB_EXT_PSP); | ^~~~~~~~~~~ include/net/psp/functions.h:60:31: note: each undeclared identifier is reported only once for each function it appears in include/net/psp/functions.h: In function '__psp_sk_rx_policy_check': include/net/psp/functions.h:94:53: error: 'SKB_EXT_PSP' undeclared (first use in this function) 94 | struct psp_skb_ext *pse = skb_ext_find(skb, SKB_EXT_PSP); | ^~~~~~~~~~~ net/psp/psp_sock.c: In function 'psp_sock_recv_queue_check': net/psp/psp_sock.c:164:41: error: 'SKB_EXT_PSP' undeclared (first use in this function) 164 | pse = skb_ext_find(skb, SKB_EXT_PSP); | ^~~~~~~~~~~ Select the Kconfig symbol as we do from its other users. Fixes: 6b46ca260e22 ("net: psp: add socket security association code") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20260216105500.2382181-1-arnd@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 daysbpftool: Fix truncated netlink dumpsJakub Kicinski-2/+7
Netlink requires that the recv buffer used during dumps is at least min(PAGE_SIZE, 8k) (see the man page). Otherwise the messages will get truncated. Make sure bpftool follows this requirement, so that information is not lost on systems with large pages. Acked-by: Quentin Monnet <qmo@kernel.org> Fixes: 7084566a236f ("tools/bpftool: Remove libbpf_internal.h usage in bpftool") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20260217194150.734701-1-kuba@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
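The size rule quoted above can be written down as a short check. This is a sketch under assumptions: the helper names and the 4096-byte example buffer are hypothetical and are not taken from bpftool.

```python
import mmap

NL_DUMP_CAP = 8192  # the "8k" term of min(PAGE_SIZE, 8k)


def min_dump_recv_buf(page_size: int = mmap.PAGESIZE) -> int:
    """Smallest receive buffer that satisfies min(PAGE_SIZE, 8k)."""
    return min(page_size, NL_DUMP_CAP)


def is_large_enough(buf_len: int, page_size: int = mmap.PAGESIZE) -> bool:
    """True if a buffer of buf_len bytes meets the rule for this page size."""
    return buf_len >= min_dump_recv_buf(page_size)
```

For example, a fixed 4096-byte buffer passes the check with 4 KiB pages but fails it with 64 KiB pages, where the rule asks for 8192 bytes.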
11 daysipv6: fix a race in ip6_sock_set_v6only()Eric Dumazet-4/+7
It is unlikely that this function will ever be called with a non-zero isk->inet_num. Perform the check on isk->inet_num inside the locked section for complete safety. Fixes: 9b115749acb24 ("ipv6: add ip6_sock_set_v6only") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260216102202.3343588-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
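The pattern of checking state inside the locked section can be sketched as follows. The `Sock` class, its fields and the -EINVAL return value are assumptions made for illustration; this is a user-space model, not the kernel function.

```python
import errno
import threading


class Sock:
    """Hypothetical socket model with only the fields needed here."""

    def __init__(self, inet_num: int = 0):
        self.lock = threading.Lock()
        self.inet_num = inet_num  # non-zero once the socket is bound to a port
        self.v6only = False


def sock_set_v6only(sk: Sock) -> int:
    """Check and update under one lock so the bound state cannot change in between."""
    with sk.lock:
        if sk.inet_num:
            return -errno.EINVAL
        sk.v6only = True
        return 0
```

Doing the check before taking the lock would leave a window in which another thread could bind the socket between the test and the update.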
11 daysdrm/xe: Prevent VFs from exposing the CCS mode sysfs fileNareshkumar Gollakoti-1/+1
Skip creating CCS sysfs files in VF mode to ensure VFs do not try to change CCS mode, as it is predefined and immutable in the SR-IOV mode. Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Nareshkumar Gollakoti <naresh.kumar.g@intel.com> Signed-off-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Link: https://patch.msgid.link/20260202170810.1393147-5-naresh.kumar.g@intel.com (cherry picked from commit 4e8f602ac3574cf1ebc7acfb6624d06e04b30c91) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
11 daysdrm/xe/hwmon: Prevent unintended VRAM channel creationKarthik Poosa-3/+3
Remove the unnecessary VRAM channel entry introduced in xe_hwmon_channel. Without this, adding any new hwmon channel causes an extra VRAM channel to appear. This remained unnoticed earlier because VRAM was the final xe hwmon channel. v2: Use MAX_VRAM_CHANNELS with in_range() instead of CHANNEL_VRAM_N_MAX. (Raag) Fixes: 49a498338417 ("drm/xe/hwmon: Expose individual VRAM channel temperature") Signed-off-by: Karthik Poosa <karthik.poosa@intel.com> Reviewed-by: Raag Jadav <raag.jadav@intel.com> Link: https://patch.msgid.link/20260206081655.2115439-1-karthik.poosa@intel.com Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com> (cherry picked from commit 48eb073c7d95883eca2789447f94e1e8cafbabe5) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
11 daysdrm/pagemap: pass pagemap_addr by referenceArnd Bergmann-7/+7
Passing a structure by value into a function is sometimes problematic, for a number of reasons. One of these is a warning from the 32-bit arm compiler: drivers/gpu/drm/drm_gpusvm.c: In function '__drm_gpusvm_unmap_pages': drivers/gpu/drm/drm_gpusvm.c:1152:33: note: parameter passing for argument of type 'struct drm_pagemap_addr' changed in GCC 9.1 1152 | dpagemap->ops->device_unmap(dpagemap, | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 1153 | dev, *addr); | ~~~~~~~~~~~ This particular problem is harmless since we are not mixing compiler versions inside of the compiler. However, passing this by reference avoids the warning along with providing slightly better calling conventions as it avoids an extra copy on the stack. Fixes: 75af93b3f5d0 ("drm/pagemap, drm/xe: Support destination migration over interconnect") Fixes: 2df55d9e66a2 ("drm/xe: Support pcie p2p dma as a fast interconnect") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> Link: https://patch.msgid.link/20260216134644.1025365-1-arnd@kernel.org Acked-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com> (cherry picked from commit 95162db0208aee122d10ac1342fe97a1721cd258) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
11 daysdrm/xe/bo: Redirect faults to dummy page for wedged deviceRaag Jadav-1/+1
As per uapi documentation[1], the prerequisite for wedged device is to redirect page faults to a dummy page. Follow it. [1] Documentation/gpu/drm-uapi.rst v2: Add uapi reference and fixes tag (Matthew Brost) Fixes: 7bc00751f877 ("drm/xe: Use device wedged event") Signed-off-by: Raag Jadav <raag.jadav@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patch.msgid.link/20260212055622.2054991-1-raag.jadav@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit c020fff70d757612933711dd3cc3751d7d782d3c) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
11 daysdrm/xe: Make xe_modparam.force_vram_bar_size signedShuicheng Lin-1/+1
vram_bar_size is registered as an int module parameter and is documented to accept negative values to disable BAR resizing. Store it as an int in xe_modparam as well, so negative values work as intended and the module_param type matches. Fixes: 80742a1aa26e ("drm/xe: Allow to drop vram resizing") Reviewed-by: Michal Wajdeczko <michal.wajdeczko@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260202181853.1095736-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit 25c9aa4dcb5ef2ad9f354d19f8f1eeb690d1c161) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
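Why the storage type matters can be shown with fixed-width integers. This is a sketch assuming 32-bit fields; `resize_disabled()` is a hypothetical policy check, not the driver's code.

```python
import ctypes


def store_unsigned(value: int) -> int:
    """What an unsigned 32-bit field holds after being assigned `value`."""
    return ctypes.c_uint32(value).value


def store_signed(value: int) -> int:
    """What a signed 32-bit field holds after being assigned `value`."""
    return ctypes.c_int32(value).value


def resize_disabled(stored: int) -> bool:
    # Hypothetical check: a negative value means "do not resize the BAR".
    return stored < 0
```

Storing -1 in the unsigned field yields 4294967295, so a "negative means disabled" test can never succeed, while the signed field keeps -1 and the test behaves as documented.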
11 daysdrm/xe/vf: Avoid reading media version when media GT is disabledPiotr Piórkowski-0/+6
When the media GT is not allowed, a VF must not attempt to read the media version from the GuC. The GuC may not be loaded, and any attempt to communicate with it would result in a timeout and a VF probe failure: (...) [ 1912.406046] xe 0000:01:00.1: [drm] *ERROR* Tile0: GT1: GuC mmio request 0x5507: no reply 0x5507 [ 1912.407277] xe 0000:01:00.1: [drm] *ERROR* Tile0: GT1: [GUC COMMUNICATION] MMIO send failed (-ETIMEDOUT) [ 1912.408689] xe 0000:01:00.1: [drm] *ERROR* VF: Tile0: GT1: Failed to reset GuC state (-ETIMEDOUT) [ 1912.413986] xe 0000:01:00.1: probe with driver xe failed with error -110 Let's skip reading the media version for VFs when the media GT is not allowed. v2: move the condition directly to the VF path Fixes: 7abd69278bb5 ("drm/xe/configfs: Add attribute to disable GT types") Signed-off-by: Piotr Piórkowski <piotr.piorkowski@intel.com> Cc: Matt Roper <matthew.d.roper@intel.com> Cc: Michal Wajdeczko <michal.wajdeczko@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Reviewed-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260202115041.2863357-1-piotr.piorkowski@intel.com Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> (cherry picked from commit 0bcacf56dc0b265f9c47056c6a4f0c1394a8a3f0) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
11 daysdrm/xe/xe2_hpg: Fix handling of Wa_14019988906 & Wa_14019877138Matt Roper-10/+8
The PSS_CHICKEN register has been part of the RCS engine's LRC since it was first introduced in Xe_LP. That means that any workarounds that adjust its value (such as Wa_14019988906 and Wa_14019877138) need to be implemented in the lrc_was[] table so that they become part of the default LRC from which all subsequent LRCs are copied. Although these workarounds were implemented correctly on most platforms, they were incorrectly placed on the engine_was[] table for Xe2_HPG. Move the workarounds to the proper lrc_was[] table and switch the 'xe_rtp_match_first_render_or_compute' rule to specifically match the RCS since that's the engine whose LRC manages the register. Bspec: 65182 Fixes: 7f3ee7d88058 ("drm/xe/xe2hpg: Add initial GT workarounds") Reviewed-by: Shekhar Chauhan <shekhar.chauhan@intel.com> Link: https://patch.msgid.link/20260205220508.51905-2-matthew.d.roper@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit e04c609eedf4d6748ac0bcada4de1275b034fed6) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
11 daysdrm/xe/mmio: Avoid double-adjust in 64-bit readsShuicheng Lin-5/+5
xe_mmio_read64_2x32() was adjusting register addresses and then calling xe_mmio_read32(), which applies the adjustment again. This may shift accesses twice if adj_offset < adj_limit. There is no issue currently because, for the media GT, adj_offset > adj_limit, so the second adjustment is a no-op, but that may not hold in the future. To fix it, replace the adjusted-address comparison with a direct sanity check that ensures the MMIO address adjustment cutoff never falls within the 8-byte range of a 64-bit register, and let xe_mmio_read32() handle address translation. v2: rewrite the sanity check in a more natural way. (Matt) v3: Add Fixes tag. (Jani) Fixes: 07431945d8ae ("drm/xe: Avoid 64-bit register reads") Reviewed-by: Matt Roper <matthew.d.roper@intel.com> Cc: Jani Nikula <jani.nikula@intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com> Link: https://patch.msgid.link/20260130165621.471408-2-shuicheng.lin@intel.com Signed-off-by: Matt Roper <matthew.d.roper@intel.com> (cherry picked from commit a30f999681126b128a43137793ac84b6a5b7443f) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
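The double adjustment can be reproduced with a toy address model. Everything here is an assumption made for illustration: the function names, the rule "addresses below adj_limit are shifted by adj_offset", and the simplified sanity check that both 32-bit halves lie on the same side of the cutoff.

```python
def adjust(addr: int, adj_limit: int, adj_offset: int) -> int:
    """Shift addresses below the cutoff; leave the rest untouched."""
    return addr + adj_offset if addr < adj_limit else addr


def read32_addr(addr: int, adj_limit: int, adj_offset: int) -> int:
    """Address a 32-bit read would touch (it adjusts internally)."""
    return adjust(addr, adj_limit, adj_offset)


def read64_addrs_old(addr: int, adj_limit: int, adj_offset: int) -> tuple:
    """Old pattern: adjust first, then call the 32-bit read, which adjusts again."""
    pre = adjust(addr, adj_limit, adj_offset)
    return (read32_addr(pre, adj_limit, adj_offset),
            read32_addr(pre + 4, adj_limit, adj_offset))


def read64_addrs_new(addr: int, adj_limit: int, adj_offset: int) -> tuple:
    """New pattern: sanity-check the raw address, let the 32-bit read adjust."""
    # The cutoff must not split the 64-bit register into two differently
    # translated halves.
    assert (addr < adj_limit) == (addr + 4 < adj_limit), "cutoff splits register"
    return (read32_addr(addr, adj_limit, adj_offset),
            read32_addr(addr + 4, adj_limit, adj_offset))
```

With the made-up values adj_limit = 0x1000 and adj_offset = 0x100, a read at 0x200 lands at 0x400 in the old pattern and at 0x300 in the new one. With adj_offset = 0x2000, larger than the limit, both patterns agree, which matches the observation that the second adjustment is currently a no-op.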
11 daysdrm/xe: Add bounds check on pat_index to prevent OOB kernel read in madviseJia Yao-1/+6
When a user provides a bogus pat_index value through the madvise IOCTL, the xe_pat_index_get_coh_mode() function performs an array access without validating bounds. This allows a malicious user to trigger an out-of-bounds kernel read from the xe->pat.table array. The vulnerability exists because the validation in madvise_args_are_sane() directly calls xe_pat_index_get_coh_mode(xe, args->pat_index.val) without first checking if pat_index is within [0, xe->pat.n_entries). Although xe_pat_index_get_coh_mode() has a WARN_ON to catch this in debug builds, it still performs the unsafe array access in production kernels. v2(Matthew Auld) - Using array_index_nospec() to mitigate spectre attacks when the value is used v3(Matthew Auld) - Put the declarations at the start of the block Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe") Reviewed-by: Matthew Auld <matthew.auld@intel.com> Cc: <stable@vger.kernel.org> # v6.18+ Cc: Matthew Brost <matthew.brost@intel.com> Cc: Shuicheng Lin <shuicheng.lin@intel.com> Cc: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com> Cc: "Thomas Hellström" <thomas.hellstrom@linux.intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Matthew Auld <matthew.auld@intel.com> Signed-off-by: Jia Yao <jia.yao@intel.com> Signed-off-by: Matthew Auld <matthew.auld@intel.com> Link: https://patch.msgid.link/20260205161529.1819276-1-jia.yao@intel.com (cherry picked from commit 944a3329b05510d55c69c2ef455136e2fc02de29) Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
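The validate-before-lookup order described above can be sketched in a few lines. The helper name, the list standing in for the PAT table and the -EINVAL return value are hypothetical, and the speculation barrier provided by array_index_nospec() has no counterpart in this user-space model.

```python
import errno


def coh_mode_checked(pat_table: list, pat_index: int) -> tuple:
    """Return (0, entry) for a valid index and (-EINVAL, None) otherwise."""
    # Validate against [0, n_entries) before touching the table.
    if not 0 <= pat_index < len(pat_table):
        return -errno.EINVAL, None
    return 0, pat_table[pat_index]
```

The point of the ordering is that the table is never indexed with an unchecked, user-supplied value; an out-of-range index is rejected with an error instead of being read.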