<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c, branch v6.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-12-13T21:50:35Z</updated>
<entry>
<title>drm/amdgpu: fix tear down order in amdgpu_vm_pt_free</title>
<updated>2023-12-13T21:50:35Z</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2023-12-08T12:43:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ceb9a321e7639700844aa3bf234a4e0884f13b77'/>
<id>urn:sha1:ceb9a321e7639700844aa3bf234a4e0884f13b77</id>
<content type='text'>
When freeing PD/PT with shadows it can happen that the shadow
destruction races with detaching the PD/PT from the VM causing a NULL
pointer dereference in the invalidation code.

Fix this by detaching the the PD/PT from the VM first and then
freeing the shadow instead.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Fixes: https://gitlab.freedesktop.org/drm/amd/-/issues/2867
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add EXT_COHERENT support for APU and NUMA systems</title>
<updated>2023-10-27T18:15:16Z</updated>
<author>
<name>David Francis</name>
<email>David.Francis@amd.com</email>
</author>
<published>2023-10-12T14:35:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=142262a1c02ad4d334ca1152dc4a0f6db3ef3bfc'/>
<id>urn:sha1:142262a1c02ad4d334ca1152dc4a0f6db3ef3bfc</id>
<content type='text'>
On gfx943 APU, EXT_COHERENT should give MTYPE_CC for local and
MTYPE_UC for nonlocal memory.

On NUMA systems, local memory gets the local mtype, set by an
override callback. If EXT_COHERENT is set, memory will be set as
MTYPE_UC by default, with local memory MTYPE_CC.

Add an option in the override function for this case, and
add a check to ensure it is not used on UNCACHED memory.

V2: Combined APU and NUMA code into one patch
V3: Fixed a potential nullptr in amdgpu_vm_bo_update

Signed-off-by: David Francis &lt;David.Francis@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: ratelimited override pte flags messages</title>
<updated>2023-10-04T22:40:11Z</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2023-09-28T22:17:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fdac89096666ef80691994391c7ba7f03520797a'/>
<id>urn:sha1:fdac89096666ef80691994391c7ba7f03520797a</id>
<content type='text'>
Use ratelimited version of dev_dbg to avoid flooding dmesg log. No
functional change.

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix one kernel-doc comment</title>
<updated>2023-07-21T20:52:25Z</updated>
<author>
<name>Yang Li</name>
<email>yang.lee@linux.alibaba.com</email>
</author>
<published>2023-07-20T01:05:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f135b0fc3110e39b7ac79a932bd0b7eadb337896'/>
<id>urn:sha1:f135b0fc3110e39b7ac79a932bd0b7eadb337896</id>
<content type='text'>
Use colon to separate parameter name from their specific meaning.
silence the warning:

drivers/gpu/drm/amd/amdgpu/amdgpu_vm_pt.c:793: warning: Function parameter or member 'adev' not described in 'amdgpu_vm_pte_update_noretry_flags'

Reviewed-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Yang Li &lt;yang.lee@linux.alibaba.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/vm: use the same xcp_id from root PD</title>
<updated>2023-07-18T15:18:53Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2023-07-13T07:55:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e379b5e7dc7e934940290b38d033907930d37c99'/>
<id>urn:sha1:e379b5e7dc7e934940290b38d033907930d37c99</id>
<content type='text'>
Other PDs/PTs allocation should just use the same xcp_id as that
stored in root PD.

Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create</title>
<updated>2023-07-18T15:18:16Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2023-07-13T07:09:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5003ca63bce63b20c02c8049be46c44135939a64'/>
<id>urn:sha1:5003ca63bce63b20c02c8049be46c44135939a64</id>
<content type='text'>
Recent code set xcp_id stored from file private data when opening
device to amdgpu bo for accounting memory usage etc, but not all
VMs are attached to this fpriv structure like the vm cases in
amdgpu_mes_self_test, otherwise, KASAN will complain below out
of bound access. And more importantly, VM code should not touch
fpriv structure, so drop fpriv code handling from amdgpu_vm_pt.

[   77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069
[   77.294146] Call Trace:
[   77.294178]  &lt;TASK&gt;
[   77.294208]  dump_stack_lvl+0x49/0x63
[   77.294260]  print_report+0x16f/0x4a6
[   77.294307]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.295979]  ? kasan_complete_mode_report_info+0x3c/0x200
[   77.296057]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.297556]  kasan_report+0xb4/0x130
[   77.297609]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.299202]  __asan_load4+0x6f/0x90
[   77.299272]  amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.300796]  ? amdgpu_init+0x6e/0x1000 [amdgpu]
[   77.302222]  ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu]
[   77.303721]  ? preempt_count_sub+0x18/0xc0
[   77.303786]  amdgpu_vm_init+0x39e/0x870 [amdgpu]
[   77.305186]  ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu]
[   77.306683]  ? kasan_set_track+0x25/0x30
[   77.306737]  ? kasan_save_alloc_info+0x1b/0x30
[   77.306795]  ? __kasan_kmalloc+0x87/0xa0
[   77.306852]  amdgpu_mes_self_test+0x169/0x620 [amdgpu]

v2: without specifying xcp partition for PD/PT bo, the xcp id is -1.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686
Fixes: 3ebfd221c1a8 ("drm/amdkfd: Store xcp partition id to amdgpu bo")
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Tested-by: Mikhail Gavrilov &lt;mikhail.v.gavrilov@gmail.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: have bos for PDs/PTS cpu accessible when kfd uses cpu to update vm</title>
<updated>2023-07-07T17:51:48Z</updated>
<author>
<name>Xiaogang Chen</name>
<email>xiaogang.chen@amd.com</email>
</author>
<published>2023-06-30T16:38:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eb58ad143dab0c9d649d702cc929f6bd4b62b455'/>
<id>urn:sha1:eb58ad143dab0c9d649d702cc929f6bd4b62b455</id>
<content type='text'>
When kfd uses cpu to update vm iterates all current PDs/PTs bos, adds
AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED flag and kmap them to kernel virtual
address space before kfd updates the vm that was created by gfx.

Signed-off-by: Xiaogang Chen &lt;Xiaogang.Chen@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Update invalid PTE flag setting</title>
<updated>2023-07-07T17:51:47Z</updated>
<author>
<name>Mukul Joshi</name>
<email>mukul.joshi@amd.com</email>
</author>
<published>2023-06-09T15:11:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e77673d14f2cec6d47d2da4e58dce87c2d66e54f'/>
<id>urn:sha1:e77673d14f2cec6d47d2da4e58dce87c2d66e54f</id>
<content type='text'>
Update the invalid PTE flag setting with TF enabled.
This is to ensure, in addition to transitioning the
retry fault to a no-retry fault, it also causes the
wavefront to enter the trap handler. With the current
setting, the fault only transitions to a no-retry fault.
Additionally, have 2 sets of invalid PTE settings, one for
TF enabled, the other for TF disabled. The setting with
TF disabled, doesn't work with TF enabled.

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix Null pointer dereference error in amdgpu_device_recover_vram</title>
<updated>2023-06-09T16:34:02Z</updated>
<author>
<name>Horatio Zhang</name>
<email>Hongkun.Zhang@amd.com</email>
</author>
<published>2023-05-29T18:23:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cbb63eccc05626d0d111b335e44f111a3bb92871'/>
<id>urn:sha1:cbb63eccc05626d0d111b335e44f111a3bb92871</id>
<content type='text'>
Use the function of amdgpu_bo_vm_destroy to handle the resource release
of shadow bo. During the amdgpu_mes_self_test, shadow bo released, but
vmbo-&gt;shadow_list was not, which caused a null pointer reference error
in amdgpu_device_recover_vram when GPU reset.

Fixes: 6c032c37ac3e ("drm/amdgpu: Fix vram recover doesn't work after whole GPU reset (v2)")
Signed-off-by: xinhui pan &lt;xinhui.pan@amd.com&gt;
Signed-off-by: Horatio Zhang &lt;Hongkun.Zhang@amd.com&gt;
Acked-by: Feifei Xu &lt;Feifei.Xu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Store xcp partition id to amdgpu bo</title>
<updated>2023-06-09T14:36:38Z</updated>
<author>
<name>Philip Yang</name>
<email>Philip.Yang@amd.com</email>
</author>
<published>2023-03-08T16:57:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3ebfd221c1a83e5f0edadb87d173d8fd93d1d125'/>
<id>urn:sha1:3ebfd221c1a83e5f0edadb87d173d8fd93d1d125</id>
<content type='text'>
For memory accounting per compute partition and export drm amdgpu bo and
then import to KFD, we need the xcp id to account the memory usage or
find the KFD node of the original amdgpu bo to create the KFD bo on the
correct adev KFD node.

Set xcp_id_plus1 of amdgpu_bo_param to create bo and store xcp_id to
amddgpu bo. Add helper macro to get the mem_id from adev and xcp_id.

v2: squash in fix ("drm/amdgpu: Fix BO creation failure on GFX 9.4.3 dGPU")

Signed-off-by: Philip Yang &lt;Philip.Yang@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
