linux/drivers/gpu/drm/amd/amdgpu, branch master

drm/amdgpu/gfx_v12_0: set gfx.rs64_enable from PFP header on GFX12

2026-05-11T21:54:44Z

gfx_v12_0_init_microcode() always loads RS64 CP ucode but never sets adev->gfx.rs64_enable, so the flag stays false and code that branches on it (e.g. MEC pipe reset) incorrectly takes the legacy CP_MEC_CNTL path. Match GFX11: derive RS64 mode from the PFP firmware header (v2.0) via amdgpu_ucode_hdr_version(). Log at debug level when RS64 is enabled.

Reviewed-by: Alex Deucher
Signed-off-by: Jesse Zhang
Signed-off-by: Alex Deucher
(cherry picked from commit b03d53598b0d2048e8fa7303b8d0784768ec4fa6)
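
A minimal userspace sketch of the version check this commit describes, assuming a simplified header layout. The names `struct fw_header`, `struct gfx_state`, `fw_hdr_version_is()` and `gfx_set_rs64_from_pfp()` are hypothetical stand-ins for illustration; they are not the driver's types and do not reproduce the signature of amdgpu_ucode_hdr_version().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the version fields of a firmware header. */
struct fw_header {
    uint16_t header_version_major;
    uint16_t header_version_minor;
};

/* Hypothetical stand-in for the piece of adev->gfx state the commit sets. */
struct gfx_state {
    bool rs64_enable;
};

/* Returns true when the header reports exactly major.minor. */
bool fw_hdr_version_is(const struct fw_header *hdr,
                       uint16_t major, uint16_t minor)
{
    return hdr->header_version_major == major &&
           hdr->header_version_minor == minor;
}

/* Derive RS64 mode from the PFP header: v2.0 is taken to mean RS64 ucode. */
void gfx_set_rs64_from_pfp(struct gfx_state *gfx,
                           const struct fw_header *pfp_hdr)
{
    gfx->rs64_enable = fw_hdr_version_is(pfp_hdr, 2, 0);
}
```

The point of the sketch is only that the flag is computed from the firmware header at microcode load time, so later code that branches on it sees a value that matches the loaded ucode.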

drm/amd/ras: Fix CPER ring debugfs read overflow

2026-05-11T21:54:28Z

The legacy CPER debugfs reader can reach the payload path without a valid pointer snapshot. The remaining user byte count is also treated as the ring occupancy in dwords, so reads past the header can copy more than requested. Take the CPER lock before sampling pointers. Resample rptr/wptr for payload reads, bound the payload copy by available dwords and the remaining user size, and advance the file position for each dword copied.

Signed-off-by: Xiang Liu
Reviewed-by: Tao Zhou
Signed-off-by: Alex Deucher
(cherry picked from commit 1e40ef87ffdc291e05ccdade8b9170cc9c1c4249)
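
The bounding rule in this message can be sketched in plain userspace C. This is an illustration under stated assumptions, not the driver's code: `struct ring_snapshot`, `ring_available_dw()` and `ring_read_payload()` are hypothetical names, the ring size is assumed to be a power of two in dwords, and the locking the commit adds around pointer sampling is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical pointer snapshot; ring_dw is a power of two, in dwords. */
struct ring_snapshot {
    const uint32_t *ring;
    uint32_t ring_dw;
    uint32_t rptr;
    uint32_t wptr;
};

/* Dwords readable between rptr and wptr, accounting for wraparound. */
uint32_t ring_available_dw(const struct ring_snapshot *s)
{
    return (s->wptr - s->rptr) & (s->ring_dw - 1);
}

/*
 * Copy payload dwords into out, bounded by BOTH the ring occupancy and the
 * remaining user byte count. Advances *pos by 4 for each dword copied and
 * returns the number of bytes copied.
 */
size_t ring_read_payload(const struct ring_snapshot *s, uint8_t *out,
                         size_t remaining_bytes, uint64_t *pos)
{
    uint32_t avail = ring_available_dw(s);
    size_t want = remaining_bytes / sizeof(uint32_t);
    size_t n = want < avail ? want : avail;
    uint32_t r = s->rptr;
    size_t i;

    for (i = 0; i < n; i++) {
        memcpy(out + i * sizeof(uint32_t), &s->ring[r], sizeof(uint32_t));
        r = (r + 1) & (s->ring_dw - 1);
        *pos += sizeof(uint32_t);
    }
    return n * sizeof(uint32_t);
}
```

Keeping the user byte count and the dword occupancy as separate quantities, and taking the minimum only after converting to the same unit, is what prevents a read from copying more than was requested.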

drm/amdgpu: fix userq hang detection and reset

2026-05-11T21:47:11Z

Fix lock inversions pointed out by Prike and Sunil. The hang detection timeout *CAN'T* grab locks under which we wait for fences, especially not the userq_mutex lock. Also, instead of the completely broken handling with the hang_detect_fence, just cancel the work when fences are processed and restart it if necessary.

Signed-off-by: Christian König
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
(cherry picked from commit 1b62077f045ac6ffde7c97005c6659569ac5c1ec)

drm/amdgpu: remove almost all calls to amdgpu_userq_detect_and_reset_queues

2026-05-11T21:47:04Z

Well, the reset handling seems broken on multiple levels. As a first step in fixing this, remove most calls to the hang detection. That function should only be called after we run into a timeout, and *NOT* as a random check spread over the code in multiple places.

Signed-off-by: Christian König
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
(cherry picked from commit 71bea36b54ccfb14cbc90f94267af6369af4e702)

drm/amdgpu: rework amdgpu_userq_signal_ioctl v3

2026-05-11T21:46:43Z

This one was fortunately not looking as bad as the wait ioctl path, but there were still a few things which could be fixed/improved:

1. Allocating with GFP_ATOMIC was quite unnecessary; we can do that before taking the userq_lock.
2. Use a new mutex as protection for the fence_drv_xa so that we can do memory allocations while holding it.
3. Starting the reset timer is unnecessary when the fence is already signaled when we create it.
4. Clean up error handling; avoid trying to free the queue when we don't even have one.

v2: fix incorrect usage of xa_find, destroy the new mutex on error
v3: clean up ref ordering

Signed-off-by: Christian König
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
(cherry picked from commit 1609eb0f81a609d350169839128cecf298c84e7a)

drm/amdgpu: remove deadlocks from amdgpu_userq_pre_reset

2026-05-11T21:46:34Z

The purpose of a GPU reset is to make sure that fences can be signaled again and that the signal and resume workers can make progress again. So waiting for the resume worker or any fence in the GPU reset path is just utter nonsense.

Signed-off-by: Christian König
Reviewed-by: Prike Liang
Signed-off-by: Alex Deucher
(cherry picked from commit fcd5f065eab46993af43442fd77ee8d9eb9c5bdf)

drm/amdgpu: nuke amdgpu_userq_fence_slab v2

2026-05-05T14:23:06Z

As preparation for independent fences, remove the extra slab; kmalloc should do just fine.

v2: use GFP_KERNEL instead of GFP_ATOMIC

Signed-off-by: Christian König
Reviewed-by: Prike Liang
Reviewed-by: Sunil Khatri
Signed-off-by: Alex Deucher
(cherry picked from commit 0d831487b5be0ae59cac865a0aa87b0acc3dc717)

drm/amdgpu/userq: fix access to stale wptr mapping

2026-05-05T14:22:13Z

Use drm_exec to take both locks, i.e. the VM root BO and the wptr_obj BO, so that the mapping data is accessed properly. This fixes a security issue in which the wptr_obj is unmapped while queue creation is in progress and another BO is passed at the same address.

Signed-off-by: Sunil Khatri
Reviewed-by: Christian König
Signed-off-by: Alex Deucher
(cherry picked from commit 1fc6c8ab45dbee096469c08c13f6099d57a52d6c)
Cc: stable@vger.kernel.org

drm/amdgpu: zero-initialize GART table on allocation

2026-05-05T14:17:22Z

The GART TLB is flushed after unmapping but not after mapping. Since amdgpu_bo_create_kernel() does not zero-initialize the buffer, when a single PTE is written the TLB may speculatively load other uninitialized entries from the same cacheline. Those garbage entries can appear valid, and a subsequent write to another PTE in the same cacheline may cause the GPU to use a stale garbage PTE from the TLB. Fix this by calling memset_io() to zero-initialize the GART table with gart_pte_flags immediately after allocation. Using AMDGPU_GEM_CREATE_VRAM_CLEARED instead will not work, because the SDMA-based clear it relies on needs GART to be initialized first.

Suggested-by: Felix Kuehling
Signed-off-by: Philip Yang
Reviewed-by: Christian König
Signed-off-by: Alex Deucher
(cherry picked from commit d9af8263b82b6eaa60c5718e0c6631c5037e4b24)
Cc: stable@vger.kernel.org
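
The hazard and the fix can be illustrated with a small userspace sketch. Everything here is an assumption made for illustration: the table is ordinary memory cleared with plain memset() rather than I/O memory cleared with memset_io(), `PTE_VALID` is an assumed valid-bit position, gart_pte_flags is ignored, and `gart_table_clear()` and `gart_count_valid()` are hypothetical helpers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PTE_VALID (1ULL << 0) /* assumed valid-bit position for this sketch */

/* Clear every entry so no leftover bit pattern can look like a valid PTE. */
void gart_table_clear(uint64_t *table, size_t num_entries)
{
    memset(table, 0, num_entries * sizeof(*table));
}

/* Count entries that a speculative cacheline load would treat as valid. */
size_t gart_count_valid(const uint64_t *table, size_t num_entries)
{
    size_t i, n = 0;

    for (i = 0; i < num_entries; i++)
        if (table[i] & PTE_VALID)
            n++;
    return n;
}
```

With an uncleared table, writing one entry leaves its cacheline neighbours holding whatever the allocation contained; after the clear, only entries that were explicitly written can carry the valid bit.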

drm/amdgpu/sdma4: replace BUG_ON with WARN_ON in fence emission

2026-05-05T14:16:09Z

sdma_v4_0_ring_emit_fence() contains two BUG_ON(addr & 0x3) assertions that verify fence writeback addresses are dword-aligned. These assertions can be reached from unprivileged userspace via crafted DRM_IOCTL_AMDGPU_CS submissions, causing a fatal kernel panic in a scheduler worker thread. Replace both BUG_ON() calls with WARN_ON() to log the condition without crashing the kernel. A misaligned fence address at this point indicates a driver bug, but crashing the kernel is never the correct response when the assertion is reachable from userspace. The CS IOCTL path is the correct place to filter invalid submissions; the ring emission callback is too late to do anything about it.

Fixes: 2130f89ced2c ("drm/amdgpu: add SDMA v4.0 implementation (v2)")
Reviewed-by: Christian König
Signed-off-by: John B. Moore
Signed-off-by: Alex Deucher
(cherry picked from commit b90250bd933afd1ba94d86d6b13821997b22b18e)
Cc: stable@vger.kernel.org
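
A userspace analogue of the warn-and-continue behaviour, as a sketch only. `fence_addr_misaligned()` and `warn_on_misaligned()` are hypothetical helpers, and printing to stderr stands in for the kernel's WARN_ON(); the only detail taken from the commit is the `addr & 0x3` dword-alignment test.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Returns true if addr is not dword-aligned (either of the low two bits set). */
bool fence_addr_misaligned(uint64_t addr)
{
    return (addr & 0x3) != 0;
}

/*
 * Report a misaligned address on stderr and return the condition to the
 * caller, instead of aborting the whole process.
 */
bool warn_on_misaligned(uint64_t addr)
{
    bool bad = fence_addr_misaligned(addr);

    if (bad)
        fprintf(stderr,
                "warning: fence address 0x%llx is not dword-aligned\n",
                (unsigned long long)addr);
    return bad;
}
```

Returning the condition lets a caller decide how to proceed, which mirrors the commit's argument that a check reachable from userspace should be logged rather than treated as fatal.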