<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu, branch master</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=master</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=master'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2026-03-04T18:15:00Z</updated>
<entry>
<title>drm/amdgpu/userq: refcount userqueues to avoid any race conditions</title>
<updated>2026-03-04T18:15:00Z</updated>
<author>
<name>Sunil Khatri</name>
<email>sunil.khatri@amd.com</email>
</author>
<published>2026-03-02T13:20:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=65b5c326ce4103620c977b8dcb1699bdac4da143'/>
<id>urn:sha1:65b5c326ce4103620c977b8dcb1699bdac4da143</id>
<content type='text'>
To avoid race conditions and use-after-free (UAF) cases, implement
kref-based queues and protect the operations below using the xa lock:
a. Getting a queue from the xarray
b. Incrementing/decrementing its refcount

Every time someone wants to access a queue, always get it via
amdgpu_userq_get to make sure the locks are in place and that we
get the object only if it is active.

A userqueue is destroyed when the last refcount is dropped, which
typically happens via IOCTL or during fini.

v2: Add the missing drop in one of the conditions in the signal ioctl [Alex]

v3: remove the queue from the xarray first in the free queue ioctl path
    [Christian]

- Pass the queue to amdgpu_userq_put directly.
- Make amdgpu_userq_put xa_lock free, since a put is done only for each
  get, the final put is done via destroy, and the queue is removed from
  the xarray under the lock.
- Use userq_put in fini too so that cleanup is done fully.

v4: Use xa_erase directly rather than doing load and erase in free
    ioctl. Also remove some of the error logs which could be exploited
    by the user to flood the logs [Christian]

Signed-off-by: Sunil Khatri &lt;sunil.khatri@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 4952189b284d4d847f92636bb42dd747747129c0)
Cc: &lt;stable@vger.kernel.org&gt; # 048c1c4e5171: drm/amdgpu/userq: Consolidate wait ioctl exit path
Cc: &lt;stable@vger.kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/userq: Consolidate wait ioctl exit path</title>
<updated>2026-03-04T18:15:00Z</updated>
<author>
<name>Tvrtko Ursulin</name>
<email>tvrtko.ursulin@igalia.com</email>
</author>
<published>2026-02-23T12:41:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=048c1c4e51715ffddd4189745c07f530f34fbe37'/>
<id>urn:sha1:048c1c4e51715ffddd4189745c07f530f34fbe37</id>
<content type='text'>
If we gate the fence destruction with a check telling us whether there are
valid pointers in there we can eliminate the need for dual, basically
identical, exit paths.

Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Tvrtko Ursulin &lt;tvrtko.ursulin@igalia.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit bea29bb0dd29012949cd44fdb122465a9fd5cf91)
</content>
</entry>
<entry>
<title>drm/amdgpu/psp: Use Indirect access address for GFX to PSP mailbox</title>
<updated>2026-03-04T18:15:00Z</updated>
<author>
<name>sguttula</name>
<email>suresh.guttula@amd.com</email>
</author>
<published>2026-02-25T08:27:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a145bbff6f53ab80757a15eba5ad2ba8e3bdc9dc'/>
<id>urn:sha1:a145bbff6f53ab80757a15eba5ad2ba8e3bdc9dc</id>
<content type='text'>
The reason the RAP is not granting access to 0x58200 is that
a dedicated RSMU slot would have to be spent for this address range,
and MPASP is close to running out of RSMU slots.

This helps fix the PSP TOC load failure during secure boot.
The GFX driver needs to use indirect access for SMN address registers.

Signed-off-by: sguttula &lt;suresh.guttula@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 9b822e26eea3899003aa8a89d5e2c4408e066e20)
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix use-after-free race in VM acquire</title>
<updated>2026-03-04T18:15:00Z</updated>
<author>
<name>Alysa Liu</name>
<email>Alysa.Liu@amd.com</email>
</author>
<published>2026-02-05T16:21:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2c1030f2e84885cc58bffef6af67d5b9d2e7098f'/>
<id>urn:sha1:2c1030f2e84885cc58bffef6af67d5b9d2e7098f</id>
<content type='text'>
Replace non-atomic vm-&gt;process_info assignment with cmpxchg()
to prevent a race when parent/child processes sharing a drm_file
both try to acquire the same VM after fork().

Reviewed-by: Harish Kasiviswanathan &lt;Harish.Kasiviswanathan@amd.com&gt;
Signed-off-by: Alysa Liu &lt;Alysa.Liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit c7c573275ec20db05be769288a3e3bb2250ec618)
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: Enable DPG support for VCN5</title>
<updated>2026-03-02T22:13:29Z</updated>
<author>
<name>sguttula</name>
<email>suresh.guttula@amd.com</email>
</author>
<published>2026-02-21T05:17:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=389c2024cab817366e6b8345f679f41064fa94d6'/>
<id>urn:sha1:389c2024cab817366e6b8345f679f41064fa94d6</id>
<content type='text'>
Set the DPG flags to enable power gating on GFX11_5_4.

Signed-off-by: sguttula &lt;suresh.guttula@amd.com&gt;
Reviewed-by: Pratik Vishwakarma &lt;Pratik.Vishwakarma@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit a503c266d70d3363ba6bffb883cd6ecdb092670c)
</content>
</entry>
<entry>
<title>drm/amd: Disable MES LR compute W/A</title>
<updated>2026-02-25T22:58:06Z</updated>
<author>
<name>Mario Limonciello</name>
<email>mario.limonciello@amd.com</email>
</author>
<published>2026-02-25T16:51:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6b0d812971370c64b837a2db4275410f478272fe'/>
<id>urn:sha1:6b0d812971370c64b837a2db4275410f478272fe</id>
<content type='text'>
A workaround was introduced in commit 1fb710793ce2 ("drm/amdgpu: Enable
MES lr_compute_wa by default") to help with some hangs observed in gfx1151.

This WA didn't fully fix the issue.  It was actually fixed by adjusting
the VGPR size to the correct value that matched the hardware in commit
b42f3bf9536c ("drm/amdkfd: bump minimum vgpr size for gfx1151").

There are reports of instability on other products with newer GC microcode
versions, and I believe they're caused by this workaround. As we don't
need the workaround any more, remove it.

Fixes: b42f3bf9536c ("drm/amdkfd: bump minimum vgpr size for gfx1151")
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 9973e64bd6ee7642860a6f3b6958cbf14e89cabd)
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix error handling in slot reset</title>
<updated>2026-02-25T22:57:55Z</updated>
<author>
<name>Lijo Lazar</name>
<email>lijo.lazar@amd.com</email>
</author>
<published>2026-02-24T04:48:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b57c4ec98c17789136a4db948aec6daadceb5024'/>
<id>urn:sha1:b57c4ec98c17789136a4db948aec6daadceb5024</id>
<content type='text'>
If the device has not recovered after slot reset is called, it goes to
the out label for error handling. There it could make a decision based on
an uninitialized hive pointer, which could result in accessing an
uninitialized list.

Initialize the list and hive properly so that the error situation is
handled and the reset domain lock, which is acquired during the
error_detected callback, is also released.

Fixes: 732c6cefc1ec ("drm/amdgpu: Replace tmp_adev with hive in amdgpu_pci_slot_reset")
Signed-off-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Reviewed-by: Ce Sun &lt;cesun102@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit bb71362182e59caa227e4192da5a612b09349696)
</content>
</entry>
<entry>
<title>drm/amdgpu/vcn5: Add SMU dpm interface type</title>
<updated>2026-02-25T22:57:06Z</updated>
<author>
<name>sguttula</name>
<email>suresh.guttula@amd.com</email>
</author>
<published>2026-02-21T04:33:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a5fe1a54513196e4bc8f9170006057dc31e7155e'/>
<id>urn:sha1:a5fe1a54513196e4bc8f9170006057dc31e7155e</id>
<content type='text'>
Set the AMDGPU_VCN_SMU_DPM_INTERFACE_* smu_type based on the
SoC type. This fixes the ring timeout issue seen in the
DPM-enabled case.

Signed-off-by: sguttula &lt;suresh.guttula@amd.com&gt;
Reviewed-by: Pratik Vishwakarma &lt;Pratik.Vishwakarma@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit f0f23c315b38c55e8ce9484cf59b65811f350630)
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix locking bugs in error paths</title>
<updated>2026-02-25T22:56:50Z</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2026-02-23T21:50:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=480ad5f6ead4a47b969aab6618573cd6822bb6a4'/>
<id>urn:sha1:480ad5f6ead4a47b969aab6618573cd6822bb6a4</id>
<content type='text'>
Do not unlock psp-&gt;ras_context.mutex if it has not been locked. This has
been detected by the Clang thread-safety analyzer.

Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: YiPeng Chai &lt;YiPeng.Chai@amd.com&gt;
Cc: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Cc: amd-gfx@lists.freedesktop.org
Fixes: b3fb79cda568 ("drm/amdgpu: add mutex to protect ras shared memory")
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 6fa01b4335978051d2cd80841728fd63cc597970)
</content>
</entry>
<entry>
<title>drm/amdgpu: Unlock a mutex before destroying it</title>
<updated>2026-02-25T22:56:43Z</updated>
<author>
<name>Bart Van Assche</name>
<email>bvanassche@acm.org</email>
</author>
<published>2026-02-23T22:00:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5e0bcc7b88bcd081aaae6f481b10d9ab294fcb69'/>
<id>urn:sha1:5e0bcc7b88bcd081aaae6f481b10d9ab294fcb69</id>
<content type='text'>
Mutexes must be unlocked before they are destroyed. This has been detected
by the Clang thread-safety analyzer.

Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Cc: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Cc: amd-gfx@lists.freedesktop.org
Fixes: f5e4cc8461c4 ("drm/amdgpu: implement RAS ACA driver framework")
Reviewed-by: Yang Wang &lt;kevinyang.wang@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
(cherry picked from commit 270258ba320beb99648dceffb67e86ac76786e55)
</content>
</entry>
</feed>
