summaryrefslogtreecommitdiffstats
path: root/drivers/gpu
AgeCommit message (Collapse)AuthorLines
2026-03-30drm/amd/pm/ci: Clear EnabledForActivity field for memory levelsTimur Kristóf-1/+1
Follow what radeon did and what amdgpu does for other GPUs with SMU7. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Fix powertune defaults for Hawaii 0x67B0Timur Kristóf-1/+1
There is no AMD GPU with the ID 0x66B0, this looks like a typo. It should be 0x67B0 which is actually part of the PCI ID list, and should use the Hawaii XT powertune defaults according to the old radeon driver. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/smu7: Remove non-functional SMU7 voltage dependency on DALTimur Kristóf-95/+0
It looks like this was written for an old version of DC (DAL) and was never adapted afterwards. This was non-functional because it relied on the "dal_power_level" field which was never assigned anywhere in the code base. Also, it was not implemented for CI ASICs. Now superseded by the newer voltage dependency on display clock table added by the previous commit, let's remove. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/smu7: Fix SMU7 voltage dependency on display clockTimur Kristóf-3/+86
The DCE (display controller engine) requires a minimum voltage in order to function correctly, depending on which clock level it currently uses. Add a new table that contains display clock frequency levels and the corresponding required voltages. The clock frequency levels are taken from DC (and the old radeon driver's voltage dependency table for CI in cases where its values were lower). The voltage levels are taken from the following function: phm_initializa_dynamic_state_adjustment_rule_settings(). Furthermore, in case of CI, call smu7_patch_vddc() on the new table to account for leakage voltage (like in radeon). Use the display clock value from amd_pp_display_configuration to look up the voltage level needed by the DCE. Send the voltage to the SMU via the PPSMC_MSG_VddC_Request command. The previous implementation of this feature was non-functional because it relied on a "dal_power_level" field which was never assigned; and it was not at all implemented for CI ASICs. I verified this on a Radeon R9 M380 which previously booted to a black screen with DC enabled (default since Linux 6.19), but now works correctly. Fixes: 599a7e9fe1b6 ("drm/amd/powerplay: implement smu7 hwmgr to manager asics with smu ip version 7.") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Disable MCLK DPM on problematic CI ASICsTimur Kristóf-0/+15
There are two known cases where MCLK DPM can causes issues: Radeon R9 M380 found in iMac computers from 2015. The SMU in this GPU just hangs as soon as we send it the PPSMC_MSG_MCLKDPM_Enable command, even when MCLK switching is disabled, and even when we only populate one MCLK DPM level. Apply workaround to all devices with the same subsystem ID. Radeon R7 260X due to old memory controller microcode. We only flash the MC ucode when it isn't set up by the VBIOS, therefore there is no way to make sure that it has the correct ucode version. I verified that this patch fixes the SMU hang on the R9 M380 which would previously fail to boot. This also fixes the UVD initialization error on that GPU which happened because the SMU couldn't ungate the UVD after it hung. Fixes: 86457c3b21cb ("drm/amd/powerplay: Add support for CI asics to hwmgr") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm/ci: Use highest MCLK on CI when MCLK DPM is disabledTimur Kristóf-0/+8
When MCLK DPM is disabled for any reason, populate the MCLK table with the highest MCLK DPM level, so that the ASIC can use the highest possible memory clock to get good performance even when MCLK DPM is disabled. Fixes: 9f4b35411cfe ("drm/amd/powerplay: add CI asics support to smumgr (v3)") Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack ↵Donet Tom-3/+4
size to GPU page size The control stack size is calculated based on the number of CUs and waves, and is then aligned to PAGE_SIZE. When the resulting control stack size is aligned to 64 KB, GPU hangs and queue preemption failures are observed while running RCCL unit tests on systems with more than two GPUs. amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with doorbell_id: 80030008 amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4 amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with doorbell_id: 80030008 amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues This issue is observed on both 4 KB and 64 KB system page-size configurations. This patch fixes the issue by aligning the control stack size to AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size will not be 64 KB on systems with a 64 KB page size and queue preemption works correctly. Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE, which can waste memory if the system page size is large. In this patch, wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated from wg_data_size and the control stack size, is aligned to PAGE_SIZE. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a3e14436304392fbada359edd0f1d1659850c9b7)
2026-03-30drm/amdgpu: Fix wait after reset sequence in S4Lijo Lazar-3/+8
For a mode-1 reset done at the end of S4 on PSPv11 dGPUs, only check if TOS is unloaded. Fixes: 32f73741d6ee ("drm/amdgpu: Wait for bootloader after PSPv11 reset") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4853 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 2fb4883b884a437d760bd7bdf7695a7e5a60bba3) Cc: stable@vger.kernel.org
2026-03-30drm/amd/display: Fix NULL pointer dereference in dcn401_init_hw()Srinivasan Shanmugam-6/+11
dcn401_init_hw() assumes that update_bw_bounding_box() is valid when entering the update path. However, the existing condition: ((!fams2_enable && update_bw_bounding_box) || freq_changed) does not guarantee this, as the freq_changed branch can evaluate to true independently of the callback pointer. This can result in calling update_bw_bounding_box() when it is NULL. Fix this by separating the update condition from the pointer checks and ensuring the callback, dc->clk_mgr, and bw_params are validated before use. Fixes the below: ../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362) Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU") Cc: Daniel Sa <Daniel.Sa@amd.com> Cc: Alvin Lee <alvin.lee2@amd.com> Cc: Roman Li <roman.li@amd.com> Cc: Alex Hung <alex.hung@amd.com> Cc: Tom Chung <chiahsuan.chung@amd.com> Cc: Dan Carpenter <dan.carpenter@linaro.org> Cc: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 86117c5ab42f21562fedb0a64bffea3ee5fcd477) Cc: stable@vger.kernel.org
2026-03-30drm/amdgpu: Change AMDGPU_VA_RESERVED_TRAP_SIZE to 64KBDonet Tom-3/+3
Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with 4K pages, both values match (8KB), so allocation and reserved space are consistent. However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB, while the reserved trap area remains 8KB. This mismatch causes the kernel to crash when running rocminfo or rccl unit tests. Kernel attempted to read user page (2) - exploit attempt? (uid: 1001) BUG: Kernel NULL pointer dereference on read at 0x00000002 Faulting instruction address: 0xc0000000002c8a64 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E 6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY Tainted: [E]=UNSIGNED_MODULE Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006 of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730 REGS: c0000001e0957580 TRAP: 0300 Tainted: G E MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268 XER: 00000036 CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000 IRQMASK: 1 GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100 c00000013d814540 GPR04: 0000000000000002 c00000013d814550 0000000000000045 0000000000000000 GPR08: c00000013444d000 c00000013d814538 c00000013d814538 0000000084002268 GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff 0000000000020000 GPR16: 0000000000000000 0000000000000002 c00000015f653000 0000000000000000 GPR20: c000000138662400 c00000013d814540 0000000000000000 c00000013d814500 GPR24: 0000000000000000 0000000000000002 c0000001e0957888 c0000001e0957878 GPR28: c00000013d814548 0000000000000000 c00000013d814540 c0000001e0957888 NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0 LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00 Call Trace: 0xc0000001e0957890 (unreliable) __mutex_lock.constprop.0+0x58/0xd00 amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu] kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu] kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu] kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu] kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu] kfd_ioctl+0x514/0x670 [amdgpu] sys_ioctl+0x134/0x180 system_call_exception+0x114/0x300 system_call_vectored_common+0x15c/0x2ec This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve 64 KB for the trap in the address space, but only allocate 8 KB within it. With this approach, the allocation size never exceeds the reserved area. Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole") Reviewed-by: Christian König <christian.koenig@amd.com> Suggested-by: Felix Kuehling <felix.kuehling@amd.com> Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 31b8de5e55666f26ea7ece5f412b83eab3f56dbb) Cc: stable@vger.kernel.org
2026-03-30drm/amdgpu/userq: fix memory leak in MQD creation error pathsJunrui Luo-4/+12
In mes_userq_mqd_create(), the memdup_user() allocations for IP-specific MQD structs are not freed when subsequent VA validation fails. The goto free_mqd label only cleans up the MQD BO object and userq_props. Fix by adding kfree() before each goto free_mqd on VA validation failure in the COMPUTE, GFX, and SDMA branches. Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Reviewed-by: Prike Liang <Prike.Liang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 27f5ff9e4a4150d7cf8b4085aedd3b77ddcc5d08) Cc: stable@vger.kernel.org
2026-03-30drm/amd: Fix MQD and control stack alignment for non-4KDonet Tom-21/+64
For gfxV9, due to a hardware bug ("based on the comments in the code here [1]"), the control stack of a user-mode compute queue must be allocated immediately after the page boundary of its regular MQD buffer. To handle this, we allocate an enlarged MQD buffer where the first page is used as the MQD and the remaining pages store the control stack. Although these regions share the same BO, they require different memory types: the MQD must be UC (uncached), while the control stack must be NC (non-coherent), matching the behavior when the control stack is allocated in user space. This logic works correctly on systems where the CPU page size matches the GPU page size (4K). However, the current implementation aligns both the MQD and the control stack to the CPU PAGE_SIZE. On systems with a larger CPU page size, the entire first CPU page is marked UC—even though that page may contain multiple GPU pages. The GPU treats the second 4K GPU page inside that CPU page as part of the control stack, but it is incorrectly mapped as UC. This patch fixes the issue by aligning both the MQD and control stack sizes to the GPU page size (4K). The first 4K page is correctly marked as UC for the MQD, and the remaining GPU pages are marked NC for the control stack. This ensures proper memory type assignment on systems with larger CPU page sizes. [1]: https://elixir.bootlin.com/linux/v6.18/source/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c#L118 Acked-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 998d6781410de1c4b787fdbf6c56e851ea7fa553)
2026-03-30drm/amdkfd: Align expected_queue_size to PAGE_SIZEDonet Tom-2/+2
The AQL queue size can be 4K, but the minimum buffer object (BO) allocation size is PAGE_SIZE. On systems with a page size larger than 4K, the expected queue size does not match the allocated BO size, causing queue creation to fail. Align the expected queue size to PAGE_SIZE so that it matches the allocated BO size and allows queue creation to succeed. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b01cd158a2f5230b137396c5f8cda3fc780abbc2)
2026-03-30drm/amdgpu: fix the idr allocation flagsPrike Liang-1/+4
Fix the IDR allocation flags by using atomic GFP flags in non‑sleepable contexts to avoid the __might_sleep() complaint. 268.290239] [drm] Initialized amdgpu 3.64.0 for 0000:03:00.0 on minor 0 [ 268.294900] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:323 [ 268.295355] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1744, name: modprobe [ 268.295705] preempt_count: 1, expected: 0 [ 268.295886] RCU nest depth: 0, expected: 0 [ 268.296072] 2 locks held by modprobe/1744: [ 268.296077] #0: ffff8c3a44abd1b8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0xe4/0x210 [ 268.296100] #1: ffffffffc1a6ea78 (amdgpu_pasid_idr_lock){+.+.}-{3:3}, at: amdgpu_pasid_alloc+0x26/0xe0 [amdgpu] [ 268.296494] CPU: 12 UID: 0 PID: 1744 Comm: modprobe Tainted: G U OE 6.19.0-custom #16 PREEMPT(voluntary) [ 268.296498] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE [ 268.296499] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021 [ 268.296501] Call Trace: Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case") Tested-by: Borislav Petkov (AMD) <bp@alien8.de> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ea56aa2625708eaf96f310032391ff37746310ef) Cc: stable@vger.kernel.org
2026-03-30drm/amdgpu: validate doorbell_offset in user queue creationJunrui Luo-0/+7
amdgpu_userq_get_doorbell_index() passes the user-provided doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds checking. An arbitrarily large doorbell_offset can cause the calculated doorbell index to fall outside the allocated doorbell BO, potentially corrupting kernel doorbell space. Validate that doorbell_offset falls within the doorbell BO before computing the BAR index, using u64 arithmetic to prevent overflow. Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue") Reported-by: Yuhao Jiang <danisjiang@gmail.com> Signed-off-by: Junrui Luo <moonafterrain@outlook.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit de1ef4ffd70e1d15f0bf584fd22b1f28cbd5e2ec) Cc: stable@vger.kernel.org
2026-03-30drm/amdgpu/pm: drop SMU driver if version not matched messagesAlex Deucher-3/+0
It just leads to user confusion. Cc: Yang Wang <kevinyang.wang@amd.com> Cc: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e471627d56272a791972f25e467348b611c31713) Cc: stable@vger.kernel.org
2026-03-30drm/amdgpu: use multiple entities in amdgpu_move_blitPierre-Eric Pelloux-Prayer-4/+10
Thanks to "drm/ttm: rework pipelined eviction fence handling", ttm can deal correctly with moves and evictions being executed from different contexts. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdkfd: Fix queue preemption/eviction failures by aligning control stack ↵Donet Tom-3/+4
size to GPU page size The control stack size is calculated based on the number of CUs and waves, and is then aligned to PAGE_SIZE. When the resulting control stack size is aligned to 64 KB, GPU hangs and queue preemption failures are observed while running RCCL unit tests on systems with more than two GPUs. amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with doorbell_id: 80030008 amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues amdgpu 0048:0f:00.0: amdgpu: GPU reset begin!. Source: 4 amdgpu 0048:0f:00.0: amdgpu: Queue preemption failed for queue with doorbell_id: 80030008 amdgpu 0048:0f:00.0: amdgpu: Failed to evict process queues amdgpu 0048:0f:00.0: amdgpu: Failed to restore process queues This issue is observed on both 4 KB and 64 KB system page-size configurations. This patch fixes the issue by aligning the control stack size to AMDGPU_GPU_PAGE_SIZE instead of PAGE_SIZE, so the control stack size will not be 64 KB on systems with a 64 KB page size and queue preemption works correctly. Additionally, In the current code, wg_data_size is aligned to PAGE_SIZE, which can waste memory if the system page size is large. In this patch, wg_data_size is aligned to AMDGPU_GPU_PAGE_SIZE. The cwsr_size, calculated from wg_data_size and the control stack size, is aligned to PAGE_SIZE. Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Donet Tom <donettom@linux.ibm.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu: use TTM_NUM_MOVE_FENCES when reserving fencesPierre-Eric Pelloux-Prayer-17/+9
Use TTM_NUM_MOVE_FENCES as an upperbound of how many fences ttm might need to deal with moves/evictions. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu: round robin through clear_entities in amdgpu_fill_bufferPierre-Eric Pelloux-Prayer-5/+20
This makes clear of different BOs run in parallel. Partial jobs to clear a single BO still execute sequentially. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu: allocate move entities dynamicallyPierre-Eric Pelloux-Prayer-15/+25
No functional change for now, as we always allocate a single entity. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu: allocate clear entities dynamicallyPierre-Eric Pelloux-Prayer-18/+42
No functional change for now, as we always allocate a single entity and use it everywhere. Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdkfd: fix kernel crash on releasing NULL sysfs entryEric Huang-1/+2
there is an abnormal case that When a process re-opens kfd with different mm_struct(execve() called by user), the allocated p->kobj will be freed, but missed setting it to NULL, that will cause sysfs/kernel crash with NULL pointers in p->kobj on kfd_process_remove_sysfs() when releasing process, and the similar error on kfd_procfs_del_queue() as well. Signed-off-by: Eric Huang <jinhuieric.huang@amd.com> Reviewed-by: Kent Russell <kent.russell@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Unify version check in SMUv14Lijo Lazar-69/+21
Use common helper function for firmware version check and logging in SMUv14 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Add Idle state manager(ISM)Ray Wu-63/+903
[Why] Rapid allow/disallow of idle optimization calls, whether it be IPS or self-refresh features, can end up using more power if actual time-in-idle is low. It can also spam DMUB command submission in a way that prevents it from servicing other requestors. [How] Introduce the Idle State Manager (ISM) to amdgpu. It maintains a finite state machine that uses a hysteresis to determine if a delay should be inserted between a caller allowing idle, and when the actual idle optimizations are programmed. A second timer is also introduced to enable static screen optimizations (SSO) such as PSR1 and Replay low HZ idle mode. Rapid SSO enable/disable can have a negative power impact on some low hz video playback, and can introduce user lag for PSR1 (due to up to 3 frames of sync latency). This effectively rate-limits idle optimizations, based on hysteresis. This also replaces the existing delay logic used for PSR1, allowing drm_vblank_crtc_config.disable_immediate = true, and thus allowing drm_crtc_vblank_restore(). v2: * Loosen criteria for ISM to exit idle optimizations; it failed to exit idle correctly on cursor updates when there are no drm_vblank requestors, * Document default_ism_config * Convert pr_debug to trace events to reduce overhead on frequent codepaths * checkpatch.pl fixes Link: https://gitlab.freedesktop.org/drm/amd/-/issues/4527 Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3709 Fixes: 58a261bfc967 ("drm/amd/display: use a more lax vblank enable policy for older ASICs") Signed-off-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Leo Li <sunpeng.li@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu/gfx11: Add Cleaner Shader Support for GFX11.5.4Srinivasan Shanmugam-0/+14
The Cleaner Shader is responsible for clearing LDS, VGPRs and SGPRs between GPU workloads to enforce process isolation and avoid data leakage. The cleaner shader clears per-wave GPU state (LDS, VGPRs and SGPRs) between workloads, improving process isolation and preventing stale data from being observed by subsequent tasks. This reuses the existing cleaner shader used on GFX11.0.3 and enables it for GFX11.5.4 GPUs when firmware requirements are met. Cc: Muhammad Adam <muhammad.adam@amd.com> Cc: Mario Sopena-Novales <mario.novales@amd.com> Cc: Tom Wu <Tom.Wu@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu: Fix wait after reset sequence in S4Lijo Lazar-3/+8
For a mode-1 reset done at the end of S4 on PSPv11 dGPUs, only check if TOS is unloaded. Fixes: 32f73741d6ee ("drm/amdgpu: Wait for bootloader after PSPv11 reset") Closes: https://gitlab.freedesktop.org/drm/amd/-/work_items/4853 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdkfd: Switch to dev_* printk stuff in kfd_int_process_v12_1.cLang Yu-12/+16
dev_* printk stuff is multi-GPU friendly. Use dev_warn_ratelimited() for print_sq_intr_info_error() which is consistent with previous IPs. Use dev_dbg_ratelimited() for irrelevant node interrupt print to avoid too much noise. Signed-off-by: Lang Yu <lang.yu@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Unify version check in SMUv12Lijo Lazar-39/+1
Use common helper function for firmware version check and logging in SMUv12. Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/pm: Unify version check in SMUv11Lijo Lazar-94/+52
Use common helper function for firmware version check and logging in SMUv11 Signed-off-by: Lijo Lazar <lijo.lazar@amd.com> Reviewed-by: Asad Kamal <asad.kamal@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Promote DC to 3.2.376Taimur Hassan-1/+1
This version brings along following fixes: - correct unknown plane state patch - Revert "Refactor DC update checks" - Revert "Add 3DLUT DMA broadcast support" - Remove invalid DPSTREAMCLK mask usage - enable eDP DSC seamless boot support - Revert "Rework HDMI link training and YCbCr422 with DSC policy" - Disable PSR & Replay CRTC disable by default - Fix Silence Compiler Warnings - Add link output control for DPIA - eliminate clock manager code duplication - Don't set 4to1MPC config dynamically - Merge pipes for validate - Fix bounds checking in dml2_0 clock table array - Avoid turning off the PHY when OTG is running for DVI - Should support p-state under dcn21 - Enable Replay support for dcn42 - Remove check for DC_DMCUB_ENABLE on DCN42 Acked-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: [FW Promotion] Release 0.1.53.0Taimur Hassan-0/+1
[Why] dmu: Parse freesync mccs vcp code Acked-by: Wayne Lin <Wayne.Lin@amd.com> Signed-off-by: Taimur Hassan <Syed.Hassan@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Silence type conversion warnings in dml2Gaghik Khachatrian-19/+41
[Why] Compiler build generates type conversion warnings throughout dc/dml2_0 where values are implicitly narrowed (e.g. int/uint32_t/uint64_t assigned to uint8_t, unsigned char, char, bool, or dml_bool_t), cluttering build output and masking genuine issues. [How] Add explicit casts at each narrowing assignment with ASSERT guards to catch out-of-range values in debug builds: - uint8_t: otg_inst, num_planes, pipe_idx, vblank_index fields - unsigned char: pipe_dlg_param.otg_inst from tg->inst - char: mcache num_pipes from num_dpps_required - bool/dml_bool_t: INTERLACE bitfield and fams2 enable flag use != 0 - uint64_t: widen min_hardware_refresh_in_uhz to hold div64_u64 result, then cast to unsigned long for min_refresh_uhz with ASSERT Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Fix Compiler warnings in dmubGaghik Khachatrian-0/+11
[Why] Resolve compiler warnings by marking unused parameters explicitly. [How] In .c and .h files, keep parameter names in signatures and add a line with`(void)param;` inside the function body Preserved function signatures and avoids breaking code paths that may reference the parameter under conditional compilation. Reviewed-by: Aric Cyr <aric.cyr@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Fixed Silence complier warnings in dcGaghik Khachatrian-46/+750
[Why] Resolve compiler warnings by marking unused parameters explicitly. [How] In .c and .h function definitions, keep parameter names in signatures and add a line with `(void)param;` in function body Preserved function signatures and avoids breaking code paths that may reference the parameter under conditional compilation. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Move FPU Guards From DML To DC - Part 3Rafal Ostrowski-5/+9
[Why] FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that can manipulates floats. To do this properly, the FPU guards must be used in a file that is not compiled as a FPU unit. If the guards are used in a file that is a FPU unit, other sections in the file that aren't guarded may be end up being compiled to use FPU operations. [How] Added DC_FP_START and DC_FP_END to DC functions that call DML functions using FPU. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Move FPU Guards From DML To DC - Part 2Rafal Ostrowski-471/+484
[Why] FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that can manipulates floats. To do this properly, the FPU guards must be used in a file that is not compiled as a FPU unit. If the guards are used in a file that is a FPU unit, other sections in the file that aren't guarded may be end up being compiled to use FPU operations. [How] Removed DC_FP_START and DC_FP_END. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Move FPU Guards From DML To DC - Part 1Rafal Ostrowski-53/+170
[Why] FPU guards (DC_FP_START/DC_FP_END) are required to wrap around code that can manipulates floats. To do this properly, the FPU guards must be used in a file that is not compiled as a FPU unit. If the guards are used in a file that is a FPU unit, other sections in the file that aren't guarded may be end up being compiled to use FPU operations. [How] Added DC_FP_START and DC_FP_END to DC functions that call DML functions using FPU. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Rafal Ostrowski <rafal.ostrowski@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amdgpu: add support to query vram info from firmwareGangliang Xie-243/+315
add support to query vram info from firmware v2: change APU vram type, add multi-aid check v3: seperate vram info query function into 3 parts and call them in a helper func when requirements are met. v4: calculate vram_width for v9.x Signed-off-by: Gangliang Xie <ganglxie@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: correct unknown plane state patchCharlene Liu-1/+1
[why] dcn42x is using same gfx as dcn35, i.e. not use gfx_address3. Reviewed-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Charlene Liu <Charlene.Liu@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30Revert "drm/amd/display: Refactor DC update checks"Dillon Varone-204/+308
Revert commit c24bb00cc6cf ("drm/amd/display: Refactor DC update checks") [WHY] Causing issues with PSR/Replay, reverting until those can be fixed. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30Revert "drm/amd/display: Add 3DLUT DMA broadcast support"Dillon Varone-67/+49
Revert commit 7d59465de38e ("drm/amd/display: Add 3DLUT DMA broadcast support") [WHY&HOW] Dependencies of this change are still causing issues, so reverting until those can be fixed. Reviewed-by: Martin Leung <Martin.Leung@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Remove invalid DPSTREAMCLK mask usageRoman Li-4/+0
[Why] The invalid register field access causes ASSERT(mask != 0) to fire in set_reg_field_values() during display enable. WARNING: at drivers/gpu/drm/amd/amdgpu/../display/dc/dc_helper.c:100 set_reg_field_values.isra.0+0xcf/0xf0 [amdgpu] Call Trace: <TASK> generic_reg_update_ex+0x66/0x1d0 [amdgpu] dccg401_set_dpstreamclk+0xed/0x350 [amdgpu] dcn401_enable_stream+0x165/0x370 [amdgpu] link_set_dpms_on+0x6e9/0xe90 [amdgpu] dce110_apply_single_controller_ctx_to_hw+0x343/0x530 [amdgpu] dce110_apply_ctx_to_hw+0x1f6/0x2d0 [amdgpu] dc_commit_state_no_check+0x49a/0xe20 [amdgpu] dc_commit_streams+0x354/0x570 [amdgpu] amdgpu_dm_atomic_commit_tail+0x6f8/0x3fc0 [amdgpu] DCN4.x hardware does not have DPSTREAMCLK_GATE_DISABLE and DPSTREAMCLK_ROOT_GATE_DISABLE fields in DCCG_GATE_DISABLE_CNTL3. These global fields only exist in DCN3.1.x hardware. [How] Remove the call that tries to update non-existent fields in CNTL3. DCN4.x uses per-instance fields in CNTL5 instead, which are already correctly programmed in the switch cases above. Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Roman Li <Roman.Li@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: using cm structure for lut3d related infoDillon Varone-10/+16
[Why] Using the alternative implementation via cm structure of config lut3d data Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Dillon Varone <Dillon.Varone@amd.com> Signed-off-by: ChuanYu Tseng <ChuanYu.Tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: enable eDP DSC seamless boot supportMohit Bawa-3/+80
[Why] VBIOS supports DSC for seamless boot on newer hardware. Reading hardware state allows proper DSC validation without breaking existing boot display. [What] Remove DSC block for boot timing validation and implement hardware state reading to populate DSC configuration from VBIOS-configured state. Enhance dsc_read_state function in DCN401 to read additional DSC parameters. Reviewed-by: Yihan Zhu <yihan.zhu@amd.com> Signed-off-by: Mohit Bawa <Mohit.Bawa@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30Revert "drm/amd/display: Rework YCbCr422 DSC policy"Relja Vojvodic-22/+18
Revert commit 19b79e4f2182 ("drm/amd/display: Rework YCbCr422 DSC policy") Reason for Revert: This commit is causing compliance failures Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Relja Vojvodic <Relja.Vojvodic@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/dc: Disable PSR & Replay CRTC disable by defaultOvidiu Bunea-0/+2
[why & how] Let IPS FSM handle OTG disable. Reviewed-by: Leo Chen <leo.chen@amd.com> Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com> Signed-off-by: Ovidiu Bunea <ovidiu.bunea@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Fixed silence signed/unsigned mismatch warningsClay King-23/+19
Fix compiler warnings by consistently use the same signedness for a given value Reviewed-by: Joshua Aberback <joshua.aberback@amd.com> Signed-off-by: Clay King <clayking@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/dc: Add link output control for DPIALincheng Ku-7/+11
[Why] To support specific sequencing requirements for DPIA link output [How] Implement the dpia_link_hwss structure and define the necessary control function pointers. The initialization order is aligned with the core link_hwss definition to ensure consistency Reviewed-by: Wenjing Liu <wenjing.liu@amd.com> Signed-off-by: Lincheng Ku <LinCheng.Ku@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-03-30drm/amd/display: Fix silence signed/unsigned mismatch warnings in dmlClay King-7/+8
[Why & How] Fix signed/unsigned mismatch warnings by using the same signedness for a given value Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Clay King <clayking@amd.com> Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>