<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_mes.c, branch v6.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-11-03T16:18:32Z</updated>
<entry>
<title>drm/amdgpu: fix GRBM read timeout when do mes_self_test</title>
<updated>2023-11-03T16:18:32Z</updated>
<author>
<name>Tim Huang</name>
<email>Tim.Huang@amd.com</email>
</author>
<published>2023-11-01T06:22:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=36e7ff5c13cb15cb7b06c76d42bb76cbf6b7ea75'/>
<id>urn:sha1:36e7ff5c13cb15cb7b06c76d42bb76cbf6b7ea75</id>
<content type='text'>
Use a proper MEID to make sure the CP_HQD_* and CP_GFX_HQD_* registers
can be touched when initializing the compute and gfx mqd in mes_self_test.
Otherwise, we expect no response from CP and an eventual GRBM timeout.

Signed-off-by: Tim Huang &lt;Tim.Huang@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: Use function for IP version check</title>
<updated>2023-09-20T16:23:28Z</updated>
<author>
<name>Lijo Lazar</name>
<email>lijo.lazar@amd.com</email>
</author>
<published>2023-09-11T08:18:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e8303cf2c4dd27374a16a8881ec1a1cd5baf86f'/>
<id>urn:sha1:4e8303cf2c4dd27374a16a8881ec1a1cd5baf86f</id>
<content type='text'>
Use an inline function for the IP version check. This gives more
flexibility to handle any format changes.

Signed-off-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: cleanup MES process level doorbells</title>
<updated>2023-08-07T21:14:07Z</updated>
<author>
<name>Shashank Sharma</name>
<email>shashank.sharma@amd.com</email>
</author>
<published>2023-03-21T16:25:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=664c3b03f9ca97302b4d832d7972326eb5fde3b4'/>
<id>urn:sha1:664c3b03f9ca97302b4d832d7972326eb5fde3b4</id>
<content type='text'>
MES allocates process level doorbells, but there is no userspace
client to consume them. They were only being used for the MES ring
tests (in kernel), and were written by kernel doorbell writes.

The previous patch of this series has changed the MES ring test code to
use kernel level MES doorbells. This patch now cleans up the process level
doorbell allocation code which is not required.

Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Shashank Sharma &lt;shashank.sharma@amd.com&gt;
Signed-off-by: Arvind Yadav &lt;arvind.yadav@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: use doorbell mgr for MES kernel doorbells</title>
<updated>2023-08-07T21:14:07Z</updated>
<author>
<name>Shashank Sharma</name>
<email>shashank.sharma@amd.com</email>
</author>
<published>2023-07-14T14:22:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e3cbb1f404b65211218002df00aead255dfb1c04'/>
<id>urn:sha1:e3cbb1f404b65211218002df00aead255dfb1c04</id>
<content type='text'>
This patch:
- Removes the existing doorbell management code and its variables
  from the doorbell_init function; this is now done in the doorbell
  manager.
- Uses the doorbell page created for MES kernel level needs (doorbells
  for MES self tests).
- The current MES code was allocating MES doorbells in MES process context,
  but those were getting written using kernel doorbell calls. This patch
  instead allocates a MES kernel doorbell for this (in add_hw_queue).

V2: Create an extra page of doorbells for MES during kernel doorbell
    creation (Alex)
V4: Move MES doorbell size and page offset objects in this patch from
    patch 6.

Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: Christian Koenig &lt;christian.koenig@amd.com&gt;
Reviewed-by: Christian Koenig &lt;christian.koenig@amd.com&gt;
Signed-off-by: Shashank Sharma &lt;shashank.sharma@amd.com&gt;
Signed-off-by: Arvind Yadav &lt;arvind.yadav@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-6.6-2023-07-28' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2023-08-04T09:10:18Z</updated>
<author>
<name>Daniel Vetter</name>
<email>daniel.vetter@ffwll.ch</email>
</author>
<published>2023-08-04T09:10:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3d00c59d147724e536b415e389445ece6fcda42f'/>
<id>urn:sha1:3d00c59d147724e536b415e389445ece6fcda42f</id>
<content type='text'>
amd-drm-next-6.6-2023-07-28:

amdgpu:
- Lots of checkpatch cleanups
- GFX 9.4.3 updates
- Add USB PD and IFWI flashing documentation
- GPUVM updates
- RAS fixes
- DRR fixes
- FAMS fixes
- Virtual display fixes
- Soft IH fixes
- SMU13 fixes
- Rework PSP firmware loading for other IPs
- Kernel doc fixes
- DCN 3.0.1 fixes
- LTTPR fixes
- DP MST fixes
- DCN 3.1.6 fixes
- SubVP fixes
- Display bandwidth calculation fixes
- VCN4 secure submission fixes
- Allow building DC on RISC-V
- Add visible FB info to bo_print_info
- HBR3 fixes
- Add PSP 14.0 support
- GFX9 MCBP fix
- GMC10 vmhub index fix
- GMC11 vmhub index fix
- Create a new doorbell manager
- SR-IOV fixes

amdkfd:
- Cleanup CRIU dma-buf handling
- Use KIQ to unmap HIQ
- GFX 9.4.3 debugger updates
- GFX 9.4.2 debugger fixes
- Enable cooperative groups for gfx11
- SVM fixes

radeon:
- Lots of checkpatch cleanups

Merge conflicts:
- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
	The switch to drm_exec helpers in 8a206685d36f ("drm/amdgpu: use
	drm_exec for GEM and CSA handling v2") clashed with the
	cosmetic cleanups from 30953c4d000b ("drm/amdgpu: Fix style
	issues in amdgpu_gem.c"). I
	kept the former since the cleaned-up code is gone.
- drivers/gpu/drm/amd/amdgpu/atom.c.
	adf64e214280 ("drm/amd: Avoid reading the VBIOS part number
	twice") removed code that 992b8fe106ab ("drm/radeon: Replace
	all non-returning strlcpy with strscpy") polished.

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20230728214228.8102-1-alexander.deucher@amd.com
[sima: some merge conflict wrangling as noted]
Signed-off-by: Daniel Vetter &lt;daniel.vetter@intel.com&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: enable cooperative groups for gfx11</title>
<updated>2023-07-25T17:35:43Z</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-07-12T20:58:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7a1c5c6753858cbbf0b073eaa9b53d8f56ee0927'/>
<id>urn:sha1:7a1c5c6753858cbbf0b073eaa9b53d8f56ee0927</id>
<content type='text'>
MES can concurrently schedule queues on the device that require
exclusive device access if marked exclusively_scheduled without the
requirement of GWS.  Similar to the F32 HWS, MES will manage
quality of service for these queues.
Use this for cooperative groups since cooperative groups are device
occupancy limited.

Since some GFX11 devices can only be debugged with partial CUs, do not
allow the debugging of cooperative groups on these devices as the CU
occupancy limit will change on attach.

In addition, zero initialize the MES add queue submission vector for MES
initialization tests as we do not want these to be cooperative
dispatches.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix slab-out-of-bounds issue in amdgpu_vm_pt_create</title>
<updated>2023-07-18T15:18:16Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2023-07-13T07:09:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5003ca63bce63b20c02c8049be46c44135939a64'/>
<id>urn:sha1:5003ca63bce63b20c02c8049be46c44135939a64</id>
<content type='text'>
Recent code copies the xcp_id stored in the file private data at device
open into the amdgpu bo, for memory usage accounting etc. However, not
all VMs are attached to an fpriv structure, for example the VMs in
amdgpu_mes_self_test, and in that case KASAN complains about the
out-of-bounds access below. More importantly, VM code should not touch
the fpriv structure, so drop the fpriv handling from amdgpu_vm_pt.

[   77.292314] BUG: KASAN: slab-out-of-bounds in amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.293845] Read of size 4 at addr ffff888102c48a48 by task modprobe/1069
[   77.294146] Call Trace:
[   77.294178]  &lt;TASK&gt;
[   77.294208]  dump_stack_lvl+0x49/0x63
[   77.294260]  print_report+0x16f/0x4a6
[   77.294307]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.295979]  ? kasan_complete_mode_report_info+0x3c/0x200
[   77.296057]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.297556]  kasan_report+0xb4/0x130
[   77.297609]  ? amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.299202]  __asan_load4+0x6f/0x90
[   77.299272]  amdgpu_vm_pt_create+0x17e/0x4b0 [amdgpu]
[   77.300796]  ? amdgpu_init+0x6e/0x1000 [amdgpu]
[   77.302222]  ? amdgpu_vm_pt_clear+0x750/0x750 [amdgpu]
[   77.303721]  ? preempt_count_sub+0x18/0xc0
[   77.303786]  amdgpu_vm_init+0x39e/0x870 [amdgpu]
[   77.305186]  ? amdgpu_vm_wait_idle+0x90/0x90 [amdgpu]
[   77.306683]  ? kasan_set_track+0x25/0x30
[   77.306737]  ? kasan_save_alloc_info+0x1b/0x30
[   77.306795]  ? __kasan_kmalloc+0x87/0xa0
[   77.306852]  amdgpu_mes_self_test+0x169/0x620 [amdgpu]

v2: without specifying xcp partition for PD/PT bo, the xcp id is -1.

Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2686
Fixes: 3ebfd221c1a8 ("drm/amdkfd: Store xcp partition id to amdgpu bo")
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Tested-by: Mikhail Gavrilov &lt;mikhail.v.gavrilov@gmail.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: use drm_exec for MES testing</title>
<updated>2023-07-12T12:14:44Z</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2022-08-16T13:32:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2acc73f81f2500e1bf03e2fbba8b733c0817dbb9'/>
<id>urn:sha1:2acc73f81f2500e1bf03e2fbba8b733c0817dbb9</id>
<content type='text'>
Start using the new component here as well.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20230711133122.3710-6-christian.koenig@amd.com
</content>
</entry>
<entry>
<title>drm/amdkfd: fix and enable debugging for gfx11</title>
<updated>2023-06-09T16:48:19Z</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2023-05-23T15:57:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=09d49e14ea6fd125a21f89b80f888c09be32a174'/>
<id>urn:sha1:09d49e14ea6fd125a21f89b80f888c09be32a174</id>
<content type='text'>
There are a couple of fixes required to enable gfx11 debugging.

First, ADD_QUEUE.trap_en is an inappropriate place to toggle
a per-process register so move it to SET_SHADER_DEBUGGER.trap_en.
When ADD_QUEUE.skip_process_ctx_clear is set, MES will prioritize
the SET_SHADER_DEBUGGER.trap_en setting.

Second, to preserve correct save/restore of privileged wave states
in coordination with the trap enablement setting, resume suspended
waves early in the disable call.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: expose debug api for mes</title>
<updated>2023-06-09T16:35:43Z</updated>
<author>
<name>Jonathan Kim</name>
<email>jonathan.kim@amd.com</email>
</author>
<published>2022-08-27T02:04:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a9818854ea7870ec5464d37b72c89f5fc198708e'/>
<id>urn:sha1:a9818854ea7870ec5464d37b72c89f5fc198708e</id>
<content type='text'>
Similar to the F32 HWS, the RS64 HWS for GFX11 now supports a multi-process
debug API.

The skip_process_ctx_clear ADD_QUEUE requirement is to prevent the MES
from clearing the process context when the first queue is added to the
scheduler in order to maintain debug mode settings during queue preemption
and restore.  The MES clears the process context in this case due to an
unresolved FW caching bug during normal mode operations.
During debug mode, the KFD will hold a reference to the target process
so the process context should never go stale and MES can afford to skip
this requirement.

Signed-off-by: Jonathan Kim &lt;jonathan.kim@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
