<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd.c, branch v6.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-07-13T15:25:17Z</updated>
<entry>
<title>drm/amdgpu: support reset flag set for gpu reset</title>
<updated>2022-07-13T15:25:17Z</updated>
<author>
<name>Likun Gao</name>
<email>Likun.Gao@amd.com</email>
</author>
<published>2022-07-08T03:14:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f1549c09c520877be211d483d3c6f4e7f77d2588'/>
<id>urn:sha1:f1549c09c520877be211d483d3c6f4e7f77d2588</id>
<content type='text'>
Move reset_context out of gpu recover function to make it configurable
for different reset purpose.
For the reset way of call gpu_recovery sysfs, force to use full reset
method. Otherwise, try soft reset by default if the related ASIC
supportted, if soft reset failed, will use full reset.

Signed-off-by: Likun Gao &lt;Likun.Gao@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Follow up change to previous drm scheduler change.</title>
<updated>2022-06-28T15:24:41Z</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2022-06-20T21:33:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9ae55f030dc523fc4dc6069557e4a887ea815453'/>
<id>urn:sha1:9ae55f030dc523fc4dc6069557e4a887ea815453</id>
<content type='text'>
Align refcount behaviour for amdgpu_job embedded HW fence with
classic pointer style HW fences by increasing refcount each
time emit is called so amdgpu code doesn't need to make workarounds
using amdgpu_job.job_run_counter to keep the HW fence refcount balanced.

Also since in the previous patch we resumed setting s_fence-&gt;parent to NULL
in drm_sched_stop switch to directly checking if job-&gt;hw_fence is
signaled to short circuit reset if already signed.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Tested-by: Yiqing Yao &lt;yiqing.yao@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: To flush tlb for MMHUB of RAVEN series</title>
<updated>2022-06-23T21:21:53Z</updated>
<author>
<name>Ruili Ji</name>
<email>ruiliji2@amd.com</email>
</author>
<published>2022-06-22T06:20:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=508f748b03949143ccda614b900e3f7d842251e5'/>
<id>urn:sha1:508f748b03949143ccda614b900e3f7d842251e5</id>
<content type='text'>
amdgpu: [mmhub0] no-retry page fault (src_id:0 ring:40 vmid:8 pasid:32769, for process test_basic pid 3305 thread test_basic pid 3305)
amdgpu: in page starting at address 0x00007ff990003000 from IH client 0x12 (VMC)
amdgpu: VM_L2_PROTECTION_FAULT_STATUS:0x00840051
amdgpu: Faulty UTCL2 client ID: MP1 (0x0)
amdgpu: MORE_FAULTS: 0x1
amdgpu: WALKER_ERROR: 0x0
amdgpu: PERMISSION_FAULTS: 0x5
amdgpu: MAPPING_ERROR: 0x0
amdgpu: RW: 0x1

When memory is allocated by kfd, no one triggers the tlb flush for MMHUB0.
There is page fault from MMHUB0.

v2:fix indentation
v3:change subject and fix indentation

Signed-off-by: Ruili Ji &lt;ruiliji2@amd.com&gt;
Reviewed-by: Philip Yang &lt;philip.yang@amd.com&gt;
Reviewed-by: Aaron Liu &lt;aaron.liu@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Rename amdgpu_device_gpu_recover_imp back to amdgpu_device_gpu_recover</title>
<updated>2022-06-10T19:26:12Z</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2022-05-17T18:27:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cf727044144d47c3e8482b9a7775bd3f04a87341'/>
<id>urn:sha1:cf727044144d47c3e8482b9a7775bd3f04a87341</id>
<content type='text'>
We removed the wrapper that was queueing the recover function
into reset domain queue who was using this name.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add work_struct for GPU reset from kfd.</title>
<updated>2022-06-10T19:26:07Z</updated>
<author>
<name>Andrey Grodzovsky</name>
<email>andrey.grodzovsky@amd.com</email>
</author>
<published>2022-05-17T18:25:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b5fd0cf3ea377a7332721df8a8c8e7715f93c8d4'/>
<id>urn:sha1:b5fd0cf3ea377a7332721df8a8c8e7715f93c8d4</id>
<content type='text'>
We need to have a work_struct to cancel this reset if another
already in progress.

Signed-off-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Add KFD support for soc21 v3</title>
<updated>2022-05-04T14:43:54Z</updated>
<author>
<name>Mukul Joshi</name>
<email>mukul.joshi@amd.com</email>
</author>
<published>2022-04-26T17:00:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cc009e613de6560eb499f8bc92c80a737752cb30'/>
<id>urn:sha1:cc009e613de6560eb499f8bc92c80a737752cb30</id>
<content type='text'>
Add initial support for soc21 in KFD compute
driver (Mukul)
- Add new definition for soc21 device.
- Add new file for amdgpu-kfd interface for GFX11 family.
- Add new file for queue management, interrupt handling,
  mqd management for GFX11 family in KFD driver.
- Related changes/updates for soc21 device in
  KFD driver.
- Repurpose last 2 entries of SDMA MQD for driver use.

v2: Add an optional argument into update queue operation (Mukul)

v3: Switch to ip version check, replace kgd_dev with
    amdgpu_device (Hawking)

Signed-off-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Signed-off-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Reviewed-by: Oak Zeng &lt;Oak.Zeng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: Fix NULL pointer dereference</title>
<updated>2022-04-07T20:37:39Z</updated>
<author>
<name>Felix Kuehling</name>
<email>Felix.Kuehling@amd.com</email>
</author>
<published>2022-04-06T20:58:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3cd3e731f34ff2f021165aeefd640acba9dd0993'/>
<id>urn:sha1:3cd3e731f34ff2f021165aeefd640acba9dd0993</id>
<content type='text'>
Check that adev-&gt;gfx.ras is valid before using it.

Fixes: 6475ae2b742876 ("drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2)")
CC: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Reviewed-by: Mukul Joshi &lt;mukul.joshi@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add UTCL2 RAS poison query for Aldebaran (v2)</title>
<updated>2022-03-25T16:40:26Z</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2022-03-15T09:48:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6475ae2b742876aa9b2a0aff7ba60f5c81917614'/>
<id>urn:sha1:6475ae2b742876aa9b2a0aff7ba60f5c81917614</id>
<content type='text'>
Add help functions to query and reset RAS UTCL2 poison status.

v2: implement it on amdgpu side and kfd only calls it.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdkfd: remove unused function</title>
<updated>2022-01-11T20:44:26Z</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.das@amd.com</email>
</author>
<published>2022-01-07T08:51:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ffb378fb3069520da3c2be3c1269250ec9c028ab'/>
<id>urn:sha1:ffb378fb3069520da3c2be3c1269250ec9c028ab</id>
<content type='text'>
Remove unused amdgpu_amdkfd_get_vram_usage()

CC: Felix.Kuehling@amd.com

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Fixes: dfcbe6d5f4a340 ("drm/amdgpu: Remove unused function pointers")
</content>
</entry>
<entry>
<title>drm/amdgpu: save error count in RAS poison handler</title>
<updated>2021-12-30T13:54:45Z</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2021-12-20T08:36:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fec8c5244fc07b1f6a3249a8714489f594ff5c4f'/>
<id>urn:sha1:fec8c5244fc07b1f6a3249a8714489f594ff5c4f</id>
<content type='text'>
Otherwise the RAS error count couldn't be queried from sysfs.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Stanley.Yang &lt;Stanley.Yang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
