linux - Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Age	Commit message (Collapse)	Author	Files	Lines
2 days	PCI: dwc: ep: Return after clearing BAR-match inbound mapping	Koichiro Den	1	-0/+1
	dw_pcie_ep_clear_ib_maps() first checks whether the inbound mapping for a BAR is in BAR Match Mode (tracked via ep_func->bar_to_atu[bar]). Once found, the iATU region is disabled and the bookkeeping is cleared. BAR Match Mode and Address Match Mode mappings are mutually exclusive for a given BAR, so there is nothing left for the Address Match Mode teardown path to do after the BAR Match Mode mapping has been removed. Return early after clearing the BAR Match Mode mapping to avoid running the Address Match Mode teardown path. This makes the helper's intention explicit and helps detect incorrect use of pci_epc_set_bar(). Suggested-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Koichiro Den <den@valinux.co.jp> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Link: https://patch.msgid.link/20260202145407.503348-2-den@valinux.co.jp
2 days	PCI: endpoint: pci-epf-test: Select configfs	Arnd Bergmann	1	-0/+1
	Like some of the other endpoint modules, pci-epf-test now also uses configfs, but is missing an indication in Kconfig: arm-linux-gnueabi-ld: drivers/pci/endpoint/functions/pci-epf-test.o: in function `pci_epf_test_add_cfs': pci-epf-test.c:(.text.pci_epf_test_add_cfs+0x2c): undefined reference to `config_group_init_type_name' Select the symbol as needed. Fixes: ffcc4850a161 ("PCI: endpoint: pci-epf-test: Allow overriding default BAR sizes") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202602180706.VtXkmtqL-lkp@intel.com/ Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Niklas Cassel <cassel@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Link: https://patch.msgid.link/20260211070812.4087119-1-arnd@kernel.org
2 days	PCI: Account fully optional bridge windows correctly	Ilpo Järvinen	1	-13/+16
	pbus_size_mem_optional() adds dev_res->add_size of a bridge window into children_add_size when the window has a non-optional part. However, if the bridge window is fully optional, only r_size is added (which is zero for such a window). Also, a second dev_res entry will be added by pci_dev_res_add_to_list() into realloc_head for the bridge window (resulting in triggering the realloc_head-must-be-fully-consumed sanity check after a single pass of the resource assignment algorithm): WARNING: drivers/pci/setup-bus.c:2153 at pci_assign_unassigned_root_bus_resources+0xa5/0x260 Correct these problems by always adding dev_res->add_size for bridge windows and not calling pci_dev_res_add_to_list() if the dev_res entry exists. Fixes: 6a5e64c75e82 ("PCI: Add pbus_mem_size_optional() to handle optional sizes") Reported-by: RavitejaX Veesam <ravitejax.veesam@intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: RavitejaX Veesam <ravitejax.veesam@intel.com> Link: https://patch.msgid.link/20260218223419.22366-1-ilpo.jarvinen@linux.intel.com
2 days	tracing: Wake up poll waiters for hist files when removing an event	Petr Pavlu	2	-0/+8
	The event_hist_poll() function attempts to verify whether an event file is being removed, but this check may not occur or could be unnecessarily delayed. This happens because hist_poll_wakeup() is currently invoked only from event_hist_trigger() when a hist command is triggered. If the event file is being removed, no associated hist command will be triggered and a waiter will be woken up only after an unrelated hist command is triggered. Fix the issue by adding a call to hist_poll_wakeup() in remove_event_file_dir() after setting the EVENT_FILE_FL_FREED flag. This ensures that a task polling on a hist file is woken up and receives EPOLLERR. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://patch.msgid.link/20260219162737.314231-3-petr.pavlu@suse.com Fixes: 1bd13edbbed6 ("tracing/hist: Add poll(POLLIN) support on hist file") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 days	tracing: Fix checking of freed trace_event_file for hist files	Petr Pavlu	1	-2/+2
	The event_hist_open() and event_hist_poll() functions currently retrieve a trace_event_file pointer from a file struct by invoking event_file_data(), which simply returns file->f_inode->i_private. The functions then check if the pointer is NULL to determine whether the event is still valid. This approach is flawed because i_private is assigned when an eventfs inode is allocated and remains set throughout its lifetime. Instead, the code should call event_file_file(), which checks for EVENT_FILE_FL_FREED. Using the incorrect access function may result in the code potentially opening a hist file for an event that is being removed or becoming stuck while polling on this file. Correct the access method to event_file_file() in both functions. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Tom Zanussi <zanussi@kernel.org> Link: https://patch.msgid.link/20260219162737.314231-2-petr.pavlu@suse.com Fixes: 1bd13edbbed6 ("tracing/hist: Add poll(POLLIN) support on hist file") Signed-off-by: Petr Pavlu <petr.pavlu@suse.com> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 days	fgraph: Do not call handlers direct when not using ftrace_ops	Steven Rostedt	2	-4/+21
	The function graph tracer was modified to us the ftrace_ops of the function tracer. This simplified the code as well as allowed more features of the function graph tracer. Not all architectures were converted over as it required the implementation of HAVE_DYNAMIC_FTRACE_WITH_ARGS to implement. For those architectures, it still did it the old way where the function graph tracer handle was called by the function tracer trampoline. The handler then had to check the hash to see if the registered handlers wanted to be called by that function or not. In order to speed up the function graph tracer that used ftrace_ops, if only one callback was registered with function graph, it would call its function directly via a static call. Now, if the architecture does not support the use of using ftrace_ops and still has the ftrace function trampoline calling the function graph handler, then by doing a direct call it removes the check against the handler's hash (list of functions it wants callbacks to), and it may call that handler for functions that the handler did not request calls for. On 32bit x86, which does not support the ftrace_ops use with function graph tracer, it shows the issue: ~# trace-cmd start -p function -l schedule ~# trace-cmd show # tracer: function_graph # # CPU DURATION FUNCTION CALLS # \| \| \| \| \| \| \| 2) * 11898.94 us \| schedule(); 3) # 1783.041 us \| schedule(); 1) \| schedule() { ------------------------------------------ 1) bash-8369 => kworker-7669 ------------------------------------------ 1) \| schedule() { ------------------------------------------ 1) kworker-7669 => bash-8369 ------------------------------------------ 1) + 97.004 us \| } 1) \| schedule() { [..] Now by starting the function tracer is another instance: ~# trace-cmd start -B foo -p function This causes the function graph tracer to trace all functions (because the function trace calls the function graph tracer for each on, and the function graph trace is doing a direct call): ~# trace-cmd show # tracer: function_graph # # CPU DURATION FUNCTION CALLS # \| \| \| \| \| \| \| 1) 1.669 us \| } /* preempt_count_sub / 1) + 10.443 us \| } / _raw_spin_unlock_irqrestore / 1) \| tick_program_event() { 1) \| clockevents_program_event() { 1) 1.044 us \| ktime_get(); 1) 6.481 us \| lapic_next_event(); 1) + 10.114 us \| } 1) + 11.790 us \| } 1) ! 181.223 us \| } / hrtimer_interrupt / 1) ! 184.624 us \| } / __sysvec_apic_timer_interrupt */ 1) \| irq_exit_rcu() { 1) 0.678 us \| preempt_count_sub(); When it should still only be tracing the schedule() function. To fix this, add a macro FGRAPH_NO_DIRECT to be set to 0 when the architecture does not support function graph use of ftrace_ops, and set to 1 otherwise. Then use this macro to know to allow function graph tracer to call the handlers directly or not. Cc: stable@vger.kernel.org Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Mark Rutland <mark.rutland@arm.com> Link: https://patch.msgid.link/20260218104244.5f14dade@gandalf.local.home Fixes: cc60ee813b503 ("function_graph: Use static_call and branch to optimize entry function") Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 days	tracing: ring-buffer: Fix to check event length before using	Masami Hiramatsu (Google)	1	-1/+5
	Check the event length before adding it for accessing next index in rb_read_data_buffer(). Since this function is used for validating possibly broken ring buffers, the length of the event could be broken. In that case, the new event (e + len) can point a wrong address. To avoid invalid memory access at boot, check whether the length of each event is in the possible range before using it. Cc: stable@vger.kernel.org Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Fixes: 5f3b6e839f3c ("ring-buffer: Validate boot range memory events") Link: https://patch.msgid.link/177123421541.142205.9414352170164678966.stgit@devnote2 Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 days	ring-buffer: Fix possible dereference of uninitialized pointer	Daniil Dulov	1	-1/+2
	There is a pointer head_page in rb_meta_validate_events() which is not initialized at the beginning of a function. This pointer can be dereferenced if there is a failure during reader page validation. In this case the control is passed to "invalid" label where the pointer is dereferenced in a loop. To fix the issue initialize orig_head and head_page before calling rb_validate_buffer. Found by Linux Verification Center (linuxtesting.org) with SVACE. Cc: stable@vger.kernel.org Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Link: https://patch.msgid.link/20260213100130.2013839-1-d.dulov@aladdin.ru Closes: https://lore.kernel.org/r/202406130130.JtTGRf7W-lkp@intel.com/ Fixes: 5f3b6e839f3c ("ring-buffer: Validate boot range memory events") Signed-off-by: Daniil Dulov <d.dulov@aladdin.ru> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2 days	KVM: arm64: vgic: Handle const qualifier from gic_kvm_info allocation type	Kees Cook	1	-1/+1
	In preparation for making the kmalloc family of allocators type aware, we need to make sure that the returned type from the allocation matches the type of the variable being assigned. (Before, the allocator would always return "void *", which can be implicitly cast to any pointer type.) The assigned type is "struct gic_kvm_info", but the returned type, while matching, is const qualified. To get them exactly matching, just use the dereferenced pointer for the sizeof(). Link: https://patch.msgid.link/20260206223022.it.052-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
3 days	drm/msm: Adjust msm_iommu_pagetable_prealloc_allocate() allocation type	Kees Cook	1	-1/+1
	In preparation for making the kmalloc family of allocators type aware, we need to make sure that the returned type from the allocation matches the type of the variable being assigned. (Before, the allocator would always return "void ", which can be implicitly cast to any pointer type.) The assigned type is "void " but the returned type will be "void **". These are the same allocation size (pointer size), but the types do not match. Adjust the allocation type to match the assignment. Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com> Link: https://patch.msgid.link/20260206222151.work.016-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
3 days	dm: dm-zoned: Adjust dmz_load_mapping() allocation type	Kees Cook	1	-1/+1
	In preparation for making the kmalloc family of allocators type aware, we need to make sure that the returned type from the allocation matches the type of the variable being assigned. (Before, the allocator would always return "void ", which can be implicitly cast to any pointer type.) The assigned type is "struct dmz_mblock " but the returned type will be "struct dmz_mblk *". These are the same allocation size (pointer size), but the types do not match. Adjust the allocation type to match the assignment. Link: https://patch.msgid.link/20250426061707.work.587-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
3 days	dm-crypt: Adjust crypt_alloc_tfms_aead() allocation type	Kees Cook	1	-1/+1
	In preparation for making the kmalloc family of allocators type aware, we need to make sure that the returned type from the allocation matches the type of the variable being assigned. (Before, the allocator would always return "void ", which can be implicitly cast to any pointer type.) The assigned type is "struct crypto_skcipher " but the returned type will be "struct crypto_aead *". These are the same allocation size (pointer size), but the types don't match. Adjust the allocation type to match the assignment. Link: https://patch.msgid.link/20250426061629.work.266-kees@kernel.org Signed-off-by: Kees Cook <kees@kernel.org>
3 days	net: nfc: nci: Fix parameter validation for packet data	Michael Thalmeier	1	-18/+141
	Since commit 9c328f54741b ("net: nfc: nci: Add parameter validation for packet data") communication with nci nfc chips is not working any more. The mentioned commit tries to fix access of uninitialized data, but failed to understand that in some cases the data packet is of variable length and can therefore not be compared to the maximum packet length given by the sizeof(struct). Fixes: 9c328f54741b ("net: nfc: nci: Add parameter validation for packet data") Cc: stable@vger.kernel.org Signed-off-by: Michael Thalmeier <michael.thalmeier@hale.at> Reported-by: syzbot+740e04c2a93467a0f8c8@syzkaller.appspotmail.com Link: https://patch.msgid.link/20260218083000.301354-1-michael.thalmeier@hale.at Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	net/mlx5e: Use unsigned for mlx5e_get_max_num_channels	Cosmin Ratiu	1	-1/+2
	The max number of channels is always an unsigned int, use the correct type to fix compilation errors done with strict type checking, e.g.: error: call to ‘__compiletime_assert_1110’ declared with attribute error: min(mlx5e_get_devlink_param_num_doorbells(mdev), mlx5e_get_max_num_channels(mdev)) signedness error Fixes: 74a8dadac17e ("net/mlx5e: Preparations for supporting larger number of channels") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-7-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	net/mlx5e: Fix deadlocks between devlink and netdev instance locks	Cosmin Ratiu	4	-58/+61
	In the mentioned "Fixes" commit, various work tasks triggering devlink health reporter recovery were switched to use netdev_trylock to protect against concurrent tear down of the channels being recovered. But this had the side effect of introducing potential deadlocks because of incorrect lock ordering. The correct lock order is described by the init flow: probe_one -> mlx5_init_one (acquires devlink lock) -> mlx5_init_one_devl_locked -> mlx5_register_device -> mlx5_rescan_drivers_locked -...-> mlx5e_probe -> _mlx5e_probe -> register_netdev (acquires rtnl lock) -> register_netdevice (acquires netdev lock) => devlink lock -> rtnl lock -> netdev lock. But in the current recovery flow, the order is wrong: mlx5e_tx_err_cqe_work (acquires netdev lock) -> mlx5e_reporter_tx_err_cqe -> mlx5e_health_report -> devlink_health_report (acquires devlink lock => boom!) -> devlink_health_reporter_recover -> mlx5e_tx_reporter_recover -> mlx5e_tx_reporter_recover_from_ctx -> mlx5e_tx_reporter_err_cqe_recover The same pattern exists in: mlx5e_reporter_rx_timeout mlx5e_reporter_tx_ptpsq_unhealthy mlx5e_reporter_tx_timeout Fix these by moving the netdev_trylock calls from the work handlers lower in the call stack, in the respective recovery functions, where they are actually necessary. Fixes: 8f7b00307bf1 ("net/mlx5e: Convert mlx5 netdevs to instance locking") Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-6-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	net/mlx5e: MACsec, add ASO poll loop in macsec_aso_set_arm_event	Gal Pressman	1	-1/+2
	The macsec_aso_set_arm_event function calls mlx5_aso_poll_cq once without a retry loop. If the CQE is not immediately available after posting the WQE, the function fails unnecessarily. Use read_poll_timeout() to poll 3-10 usecs for CQE, consistent with other ASO polling code paths in the driver. Fixes: 739cfa34518e ("net/mlx5: Make ASO poll CQ usable in atomic context") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-5-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	net/mlx5: Fix misidentification of write combining CQE during poll loop	Gal Pressman	1	-9/+5
	The write combining completion poll loop uses usleep_range() which can sleep much longer than requested due to scheduler latency. Under load, we witnessed a 20ms+ delay until the process was rescheduled, causing the jiffies based timeout to expire while the thread is sleeping. The original do-while loop structure (poll, sleep, check timeout) would exit without a final poll when waking after timeout, missing a CQE that arrived during sleep. Instead of the open-coded while loop, use the kernel's poll_timeout_us() which always performs an additional check after the sleep expiration, and is less error-prone. Note: poll_timeout_us() doesn't accept a sleep range, by passing 10 sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs. Fixes: d98995b4bf98 ("net/mlx5: Reimplement write combining test") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-4-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	net/mlx5e: Fix misidentification of ASO CQE during poll loop	Gal Pressman	2	-14/+6
	The ASO completion poll loop uses usleep_range() which can sleep much longer than requested due to scheduler latency. Under load, we witnessed a 20ms+ delay until the process was rescheduled, causing the jiffies based timeout to expire while the thread is sleeping. The original do-while loop structure (poll, sleep, check timeout) would exit without a final poll when waking after timeout, missing a CQE that arrived during sleep. Instead of the open-coded while loop, use the kernel's read_poll_timeout() which always performs an additional check after the sleep expiration, and is less error-prone. Note: read_poll_timeout() doesn't accept a sleep range, by passing 10 sleep_us the sleep range effectively changes from 2-10 to 3-10 usecs. Fixes: 739cfa34518e ("net/mlx5: Make ASO poll CQ usable in atomic context") Fixes: 7e3fce82d945 ("net/mlx5e: Overcome slow response for first macsec ASO WQE") Signed-off-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Jianbo Liu <jianbol@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-3-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	net/mlx5: Fix multiport device check over light SFs	Shay Drory	1	-2/+2
	Driver is using num_vhca_ports capability to distinguish between multiport master device and multiport slave device. num_vhca_ports is a capability the driver sets according to the MAX num_vhca_ports capability reported by FW. On the other hand, light SFs doesn't set the above capbility. This leads to wrong results whenever light SFs is checking whether he is a multiport master or slave. Therefore, use the MAX capability to distinguish between master and slave devices. Fixes: e71383fb9cd1 ("net/mlx5: Light probe local SFs") Signed-off-by: Shay Drory <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Jacob Keller <Jacob.e.keller@intel.com> Link: https://patch.msgid.link/20260218072904.1764634-2-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	bonding: alb: fix UAF in rlb_arp_recv during bond up/down	Hangbin Liu	1	-1/+5
	The ALB RX path may access rx_hashtbl concurrently with bond teardown. During rapid bond up/down cycles, rlb_deinitialize() frees rx_hashtbl while RX handlers are still running, leading to a null pointer dereference detected by KASAN. However, the root cause is that rlb_arp_recv() can still be accessed after setting recv_probe to NULL, which is actually a use-after-free (UAF) issue. That is the reason for using the referenced commit in the Fixes tag. [ 214.174138] Oops: general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] SMP KASAN PTI [ 214.186478] KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef] [ 214.194933] CPU: 30 UID: 0 PID: 2375 Comm: ping Kdump: loaded Not tainted 6.19.0-rc8+ #2 PREEMPT(voluntary) [ 214.205907] Hardware name: Dell Inc. PowerEdge R730/0WCJNT, BIOS 2.14.0 01/14/2022 [ 214.214357] RIP: 0010:rlb_arp_recv+0x505/0xab0 [bonding] [ 214.220320] Code: 0f 85 2b 05 00 00 48 b8 00 00 00 00 00 fc ff df 40 0f b6 ed 48 c1 e5 06 49 03 ad 78 01 00 00 48 8d 7d 28 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 06 0f 8e 12 05 00 00 80 7d 28 00 0f 84 8c 00 [ 214.241280] RSP: 0018:ffffc900073d8870 EFLAGS: 00010206 [ 214.247116] RAX: dffffc0000000000 RBX: ffff888168556822 RCX: ffff88816855681e [ 214.255082] RDX: 000000000000001d RSI: dffffc0000000000 RDI: 00000000000000e8 [ 214.263048] RBP: 00000000000000c0 R08: 0000000000000002 R09: ffffed11192021c8 [ 214.271013] R10: ffff8888c9010e43 R11: 0000000000000001 R12: 1ffff92000e7b119 [ 214.278978] R13: ffff8888c9010e00 R14: ffff888168556822 R15: ffff888168556810 [ 214.286943] FS: 00007f85d2d9cb80(0000) GS:ffff88886ccb3000(0000) knlGS:0000000000000000 [ 214.295966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 214.302380] CR2: 00007f0d047b5e34 CR3: 00000008a1c2e002 CR4: 00000000001726f0 [ 214.310347] Call Trace: [ 214.313070] <IRQ> [ 214.315318] ? __pfx_rlb_arp_recv+0x10/0x10 [bonding] [ 214.320975] bond_handle_frame+0x166/0xb60 [bonding] [ 214.326537] ? __pfx_bond_handle_frame+0x10/0x10 [bonding] [ 214.332680] __netif_receive_skb_core.constprop.0+0x576/0x2710 [ 214.339199] ? __pfx_arp_process+0x10/0x10 [ 214.343775] ? sched_balance_find_src_group+0x98/0x630 [ 214.349513] ? __pfx___netif_receive_skb_core.constprop.0+0x10/0x10 [ 214.356513] ? arp_rcv+0x307/0x690 [ 214.360311] ? __pfx_arp_rcv+0x10/0x10 [ 214.364499] ? __lock_acquire+0x58c/0xbd0 [ 214.368975] __netif_receive_skb_one_core+0xae/0x1b0 [ 214.374518] ? __pfx___netif_receive_skb_one_core+0x10/0x10 [ 214.380743] ? lock_acquire+0x10b/0x140 [ 214.385026] process_backlog+0x3f1/0x13a0 [ 214.389502] ? process_backlog+0x3aa/0x13a0 [ 214.394174] __napi_poll.constprop.0+0x9f/0x370 [ 214.399233] net_rx_action+0x8c1/0xe60 [ 214.403423] ? __pfx_net_rx_action+0x10/0x10 [ 214.408193] ? lock_acquire.part.0+0xbd/0x260 [ 214.413058] ? sched_clock_cpu+0x6c/0x540 [ 214.417540] ? mark_held_locks+0x40/0x70 [ 214.421920] handle_softirqs+0x1fd/0x860 [ 214.426302] ? __pfx_handle_softirqs+0x10/0x10 [ 214.431264] ? __neigh_event_send+0x2d6/0xf50 [ 214.436131] do_softirq+0xb1/0xf0 [ 214.439830] </IRQ> The issue is reproducible by repeatedly running ip link set bond0 up/down while receiving ARP messages, where rlb_arp_recv() can race with rlb_deinitialize() and dereference a freed rx_hashtbl entry. Fix this by setting recv_probe to NULL and then calling synchronize_net() to wait for any concurrent RX processing to finish. This ensures that no RX handler can access rx_hashtbl after it is freed in bond_alb_deinitialize(). Reported-by: Liang Li <liali@redhat.com> Fixes: 3aba891dde38 ("bonding: move processing of recv handlers into handle_frame()") Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com> Acked-by: Jay Vosburgh <jv@jvosburgh.net> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20260218060919.101574-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	bnge: fix reserving resources from FW	Vikas Gupta	1	-1/+1
	HWRM_FUNC_CFG is used to reserve resources, whereas HWRM_FUNC_QCFG is intended for querying resource information from the firmware. Since __bnge_hwrm_reserve_pf_rings() reserves resources for a specific PF, the command type should be HWRM_FUNC_CFG. Fixes: 627c67f038d2 ("bng_en: Add resource management support") Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260218052755.4097468-1-vikas.gupta@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
3 days	drm/amd/display: Remove unneeded DAC link encoder register	Timur Kristóf	4	-5/+4
	Not needed anymore since we use the VBIOS function. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Enable DAC in DCE link encoder	Timur Kristóf	6	-17/+50
	Ensure that the DAC output is enabled at the correct time by moving it to the DCE link encoder similarly to how digital outputs are enabled. This also removes the call to DAC1EncoderControl from the DCE HWSS, which always felt like it was a hacky solution. Fixes: 0fbe321a93ce ("drm/amd/display: Implement DCE analog link encoders (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Set CRTC source for DAC using registers	Timur Kristóf	6	-37/+43
	Apparently the VBIOS SelectCRTC_Source function overwrites a few registers (such as FMT_*) which DC writes in a different place, which can cause problems. Instead of using the SelectCRTC_Source function from the VBIOS, use the DAC_SOURCE_SELECT register directly, similarly to how it is done for digital link encoders. Fixes: 3be26d81b150 ("drm/amd/display: Support DAC in dce110_hwseq") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Initialize DAC in DCE link encoder using VBIOS	Timur Kristóf	2	-2/+11
	The VBIOS DAC1EncoderControl() function can initialize the DAC, by writing board-specific values to certain registers. Call this at link encoder hardware initialization time similarly to how the equivalent UNIPHYTransmitterControl initialization is done. This fixes DAC output on the Radeon HD 7790. Also remove the ENCODER_CONTROL_SETUP enum from the dac_encoder_control_prepare_params function which is actually not a supported operation for DAC encoders. Fixes: 0fbe321a93ce ("drm/amd/display: Implement DCE analog link encoders (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Turn off DAC in DCE link encoder using VBIOS	Timur Kristóf	2	-16/+17
	Apparently, the VBIOS DAC1EncoderControl function is much more graceful about turning off the DAC. It writes various DAC registers in a specific sequence. Use that instead of just clearing the DAC_ENABLE register. Do this in just the dce110_link_encoder_disable_output function and remove it from the HWSS. Fixes: 0fbe321a93ce ("drm/amd/display: Implement DCE analog link encoders (v2)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Suggested-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Don't call find_analog_engine() twice	Timur Kristóf	1	-1/+0
	The analog engine is already there in the link_analog_engine variable and assigned to enc_init_data.analog_engine already. I suspect this was a rebase mistake. Fixes: 436d0d22aa70 ("drm/amd/display: Pass proper DAC encoder ID to VBIOS") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Tested-by: Mauro Rossi <issor.oruam@gmail.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amdgpu: fix 4-level paging if GMC supports 57-bit VA v2	Christian König	3	-4/+6
	It turned that using 4 level page tables on GMC generations which support 57bit VAs actually doesn't work at all. Background is that the GMC actually can't switch between 4 and 5 levels, but rather just uses a subset of address space when less than 5 levels are selected. Philip already removed the automatically switch to 4levels, now fix it as well should it be enabled by module parameters. v2: fix AMDGPU_GMC_HOLE_MASK as well, fix off by one issue pointed out by Philip Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Philip Yang <philip.yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amdgpu: keep vga memory on MacBooks with switchable graphics	Alex Deucher	1	-0/+10
	On Intel MacBookPros with switchable graphics, when the iGPU is enabled, the address of VRAM gets put at 0 in the dGPU's virtual address space. This is non-standard and seems to cause issues with the cursor if it ends up at 0. We have the framework to reserve memory at 0 in the address space, so enable it here if the vram start address is 0. Reviewed-and-tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4302 Cc: stable@vger.kernel.org Cc: Mario Kleiner <mario.kleiner.de@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amdgpu: Set atomics to true for xgmi	Harish Kasiviswanathan	1	-3/+4
	xgmi support atomics between links. Set them to true. This only set for GFX12 onwards to avoid regression on older generations v2: Use correct xgmi flag that indicates CPU connection Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amdkfd: Check for NULL return values	Andrew Martin	4	-8/+14
	This patch fixes issues when the code moves forward with a potential NULL pointer, without checking. Removed one redundant NULL check for a function parameter. This check is already done in the only caller. Signed-off-by: Andrew Martin <andrew.martin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Use same max plane scaling limits for all 64 bpp formats	Mario Kleiner	1	-0/+5
	The plane scaling hw seems to have the same min/max plane scaling limits for all 16 bpc / 64 bpp interleaved pixel color formats. Therefore add cases to amdgpu_dm_plane_get_min_max_dc_plane_scaling() for all the 16 bpc fixed-point / unorm formats to use the same .fp16 up/downscaling factor limits as used by the fp16 floating point formats. So far, 16 bpc unorm formats were not handled, and the default: path returned max/min factors for 32 bpp argb8888 formats, which were wrong and bigger than what many DCE / DCN hw generations could handle. The result sometimes was misscaling of framebuffers with DRM_FORMAT_XRGB16161616, DRM_FORMAT_ARGB16161616, DRM_FORMAT_XBGR16161616, DRM_FORMAT_ABGR16161616, leading to very wrong looking display, as tested on Polaris11 / DCE-11.2. So far this went unnoticed, because only few userspace clients used such 16 bpc unorm framebuffers, and those didn't use hw plane scaling, so they did not experience this issue. With upcoming Mesa 26 exposing 16 bpc unorm formats under both OpenGL and Vulkan under Wayland, and the upcoming GNOME 50 Mutter Wayland compositor allowing for direct scanout of these formats, the scaling hw will be used on these formats if possible for HiDPI display scaling, so it is important to use the correct hw scaling limits to avoid wrong display. Tested on AMD Polaris 11 / DCE 11.2 with upcoming Mesa 26 and GNOME 50 on HiDPI displays with scaling enabled. The mutter Wayland compositor now correctly falls back to scaling via desktop compositing instead of direct scanout, thereby avoiding wrong image display. For unscaled mode, it correctly uses direct scanout. Fixes: 580204038f5b ("drm/amd/display: Enable support for 16 bpc fixed-point framebuffers.") Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: Alex Deucher <alexander.deucher@amd.com> Cc: Harry Wentland <harry.wentland@amd.com> Cc: Leo Li <sunpeng.li@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amdgpu: Set vmid0 PAGE_TABLE_DEPTH for GFX12.1	Harish Kasiviswanathan	1	-1/+4
	GFX12.1 uses 2 level gart table. Set the context register appropriately Signed-off-by: Feifei Xu <Feifei.Xu@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Reviewed-by: Mukul Joshi <mukul.joshi@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amdkfd: Disable MQD queue priority	Andrew Martin	7	-7/+7
	This solves a priority inversion issue, caused by the language runtime making high-priority queues wait for activity on lower-priority queues. Signed-off-by: Andrew Martin <andrew.martin@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Remove conditional for shaper 3DLUT power-on	Alex Hung	1	-2/+1
	[Why] Shaper programming has high chance to fail on first time after power-on or reboot. This can be verified by running IGT's kms_colorop. [How] Always power on the shaper and 3DLUT before programming by removing the debug flag of low power mode. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Check return of shaper curve to HW format	Alex Hung	1	-3/+3
	[Why & How] Check return of cm3_helper_translate_curve_to_hw_format. This is reported as a CHECKED_RETURN error by Coverity. Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Correct logic check error for fastboot	Charlene Liu	1	-2/+2
	[Why] Fix fastboot broken in driver. This is caused by an open source backport change 7495962c. from the comment, the intended check is to disable fastboot for pre-DCN10. but the logic check is reversed, and causes fastboot to be disabled on all DCN10 and after. fastboot is for driver trying to pick up bios used hw setting and bypass reprogramming the hw if dc_validate_boot_timing() condition meets. Fixes: 7495962cbceb ("drm/amd/display: Disable fastboot on DCE 6 too") Cc: stable@vger.kernel.org Reviewed-by: Mario Limonciello <Mario.Limonciello@amd.com> Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Skip eDP detection when no sink	Saidireddy Yenugu	1	-0/+5
	[Why & How] When there is no eDP panel connected and during s0ix resume, unnecessary eDP power sequence and HPD happening, resulting in ~2 seconds delay. Fixed the issue by avoiding link detect for eDP connection with no sink in dm_resume. Reviewed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Saidireddy Yenugu <Saidireddy.Yenugu@amd.com> Co-developed-by: ThummarDip Kishorbhai <ThummarDip.Kishorbhai@amd.com> Signed-off-by: ThummarDip Kishorbhai <ThummarDip.Kishorbhai@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	Revert "drm/amd/display: Add Gfx Base Case For Linear Tiling Handling"	Nicholas Carbones	13	-33/+3
	This reverts commit 08a01ec306db ("drm/amd/display: Add Gfx Base Case For Linear Tiling Handling") Reason for revert: Got blank screen issues while doing PNP Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Nicholas Carbones <Nicholas.Carbones@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	Revert "drm/amd/display: Correct hubp GfxVersion verification"	Nicholas Carbones	3	-52/+39
	This reverts commit 3303aa64e7a6 ("drm/amd/display: Correct hubp GfxVersion verification") Reason for revert: Got blank screen issues while doing PNP Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Nicholas Carbones <Nicholas.Carbones@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	Revert "drm/amd/display: Add Handling for gfxversion DcGfxBase"	Nicholas Carbones	1	-3/+0
	This reverts commit 2e193f5b1b4f ("drm/amd/display: Add Handling for gfxversion DcGfxBase") Reason for revert: Cause some regressions Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Nicholas Carbones <Nicholas.Carbones@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Migrate DCCG registers access from hwseq to dccg component.	Bhuvanachandra Pinninti	21	-55/+143
	[Why] Direct DCCG register access in hwseq layer was creating register conflicts. [How] Migrated DCCG registers from hwseq-dccg component. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Signed-off-by: Bhuvanachandra Pinninti <BhuvanaChandra.Pinninti@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Implementing ramless idle mouse trigger	Muaaz Nisar	3	-0/+28
	[Why & How] Adding mouse trigger in dc_stream to recover from low refresh rate idle state upon mouse movement without vsync interrupts. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Muaaz Nisar <muaaz.nisar@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Disable SR feature on eDP1 by default	Charlene Liu	5	-5/+20
	[Why & How] Disable SR feature on eDP1 by default. Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	drm/amd/display: Expose functions of other dcn use	Dmytro Laktyushkin	5	-18/+32
	[Why & HOw] Expose some functions for later dcns to reuse Reviewed-by: Charlene Liu <charlene.liu@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com> Signed-off-by: Ray Wu <ray.wu@amd.com> Tested-by: Daniel Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
3 days	eth: fbnic: Advertise supported XDP features.	Dimitri Daskalakis	1	-0/+2
	Drivers are supposed to advertise the XDP features they support. This was missed while adding XDP support. Before: $ ynl --family netdev --dump dev-get ... {'ifindex': 3, 'xdp-features': set(), 'xdp-rx-metadata-features': set(), 'xsk-features': set()}, ... After: $ ynl --family netdev --dump dev-get ... {'ifindex': 3, 'xdp-features': {'basic', 'rx-sg'}, 'xdp-rx-metadata-features': set(), 'xsk-features': set()}, ... Fixes: 168deb7b31b2 ("eth: fbnic: Add support for XDP_TX action") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260218030620.3329608-1-dimitri.daskalakis1@gmail.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
3 days	fbdev: au1100fb: Replace license boilerplate by SPDX header	Uwe Kleine-König	1	-20/+1
	This also gets rid of an old address of the FSF. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Helge Deller <deller@gmx.de>
3 days	fbdev: au1100fb: Fold au1100fb.h into its only user	Uwe Kleine-König	2	-373/+338
	This gets rid of a header that is only used once. The copyrights and license specifications are all already covered in the au1100fb.c file. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Helge Deller <deller@gmx.de>
3 days	fbdev: au1100fb: Replace custom printk wrappers by pr_*	Uwe Kleine-König	2	-30/+21
	The global wrappers also have the advantage to do stricter format checking, so the pr_devel formats are also checked if DEBUG is not defined. The global variants only check for DEBUG being defined and not its actual value, so the #define to zero is dropped, too. There is only a slight semantic change as the (by default disabled) debug output doesn't contain __FILE__ any more. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Helge Deller <deller@gmx.de>
3 days	fbdev: au1100fb: Make driver compilable on non-mips platforms	Uwe Kleine-König	3	-5/+12
	The header asm/mach-au1x00/au1000.h is unused apart from pulling in <linux/delay.h> (for mdelay()) and <linux/io.h> (for KSEG1ADDR()). Then the only platform specific part in the driver is the usage of the KSEG1ADDR macro, which for the non-mips case can be stubbed. Signed-off-by: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Helge Deller <deller@gmx.de>