<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c, branch v5.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-11-02T20:35:50Z</updated>
<entry>
<title>drm/amdgpu/amdgpu: use "*" adjacent to data name</title>
<updated>2020-11-02T20:35:50Z</updated>
<author>
<name>Deepak R Varma</name>
<email>mh12gx2825@gmail.com</email>
</author>
<published>2020-11-02T19:37:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c4c5ae67d17976d058341b24908d5f562f3ce933'/>
<id>urn:sha1:c4c5ae67d17976d058341b24908d5f562f3ce933</id>
<content type='text'>
When declaring pointer data, the "*" symbol should be used adjacent to
the data name as per the coding standards. This resolves following
issues reported by checkpatch script:
	ERROR: "foo *   bar" should be "foo *bar"
	ERROR: "foo * bar" should be "foo *bar"
	ERROR: "foo*            bar" should be "foo *bar"
	ERROR: "(foo*)" should be "(foo *)"

Signed-off-by: Deepak R Varma &lt;mh12gx2825@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>amd/amdgpu_ctx: Use struct_size() helper and kmalloc() (v2)</title>
<updated>2020-10-09T18:44:39Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-10-08T14:34:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=201a4eb9dc960b6c081936e0990e289959707996'/>
<id>urn:sha1:201a4eb9dc960b6c081936e0990e289959707996</id>
<content type='text'>
Make use of the new struct_size() helper instead of the offsetof() idiom.
Also, use kmalloc() instead of kcalloc().

v2: squash in kzalloc fix

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: disable gpu-sched load balance for uvd</title>
<updated>2020-09-03T18:46:54Z</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.das@amd.com</email>
</author>
<published>2020-08-29T04:51:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bc21585f3ff050f3fc7faaefc5e5c0541cec7aae'/>
<id>urn:sha1:bc21585f3ff050f3fc7faaefc5e5c0541cec7aae</id>
<content type='text'>
On hardware with multiple uvd instances, dependent uvd jobs
may get scheduled to different uvd instances. Because uvd_enc
jobs retain hw context, dependent jobs should always run on the
same uvd instance. This patch disables GPU scheduler's load balancer
for a context that binds jobs from the same context to a uvd
instance.

v2: Squash in uvd_enc fix

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: drm_device to amdgpu_device by inline-f (v2)</title>
<updated>2020-08-24T17:06:06Z</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-08-24T16:27:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1348969ab68cb864034ec4fe48a86c157cc4e10d'/>
<id>urn:sha1:1348969ab68cb864034ec4fe48a86c157cc4e10d</id>
<content type='text'>
Get the amdgpu_device from the DRM device by use
of an inline function, drm_to_adev(). The inline
function resolves a pointer to struct drm_device
to a pointer to struct amdgpu_device.

v2: Use a typed visible static inline function
    instead of an invisible macro.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: Remove priority macro INVALID (v2)</title>
<updated>2020-08-18T22:20:26Z</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-08-12T00:56:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9af5e21dace795891544042abda877ada39abacc'/>
<id>urn:sha1:9af5e21dace795891544042abda877ada39abacc</id>
<content type='text'>
Remove DRM_SCHED_PRIORITY_INVALID. We no longer
carry around an invalid priority and cut it off
at the source.

Backwards compatibility behaviour of AMDGPU CTX
IOCTL passing in garbage for context priority
from user space and then mapping that to
DRM_SCHED_PRIORITY_NORMAL is preserved.

v2: Revert "res"  --&gt; "r" and
           "prio" --&gt; "priority".

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/scheduler: Scheduler priority fixes (v2)</title>
<updated>2020-08-18T22:20:17Z</updated>
<author>
<name>Luben Tuikov</name>
<email>luben.tuikov@amd.com</email>
</author>
<published>2020-08-11T23:59:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e2d732fdb7a9e421720a644580cd6a9400f97f60'/>
<id>urn:sha1:e2d732fdb7a9e421720a644580cd6a9400f97f60</id>
<content type='text'>
Remove DRM_SCHED_PRIORITY_LOW, as it was used
in only one place.

Rename and separate by a line
DRM_SCHED_PRIORITY_MAX to DRM_SCHED_PRIORITY_COUNT
as it represents a (total) count of said
priorities and it is used as such in loops
throughout the code. (0-based indexing is the
the count number.)

Remove redundant word HIGH in priority names,
and rename *KERNEL* to *HIGH*, as it really
means that, high.

v2: Add back KERNEL and remove SW and HW,
    in lieu of a single HIGH between NORMAL and KERNEL.

Signed-off-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: revert "fix system hang issue during GPU reset"</title>
<updated>2020-08-14T20:22:40Z</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2020-08-12T15:48:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f1403342ebdfcff3c3cf57ae476f19d3078f2767'/>
<id>urn:sha1:f1403342ebdfcff3c3cf57ae476f19d3078f2767</id>
<content type='text'>
The whole approach wasn't thought through till the end.

We already had a reset lock like this in the past and it caused the same problems like this one.

Completely revert the patch for now and add individual trylock protection to the hardware access functions as necessary.

This reverts commit df9c8d1aa278c435c30a69b8f2418b4a52fcb929.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix system hang issue during GPU reset</title>
<updated>2020-07-27T20:21:37Z</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-07-08T07:07:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=df9c8d1aa278c435c30a69b8f2418b4a52fcb929'/>
<id>urn:sha1:df9c8d1aa278c435c30a69b8f2418b4a52fcb929</id>
<content type='text'>
when GPU hang, driver has multi-paths to enter amdgpu_device_gpu_recover,
the atomic adev-&gt;in_gpu_reset and hive-&gt;in_reset are used to avoid
re-entering GPU recovery.

During GPU reset and resume, it is unsafe that other threads access GPU,
which maybe cause GPU reset failed. Therefore the new rw_semaphore
adev-&gt;reset_sem is introduced, which protect GPU from being accessed by
external threads during recovery.

v2:
1. add rwlock for some ioctls, debugfs and file-close function.
2. change to use dqm-&gt;is_resetting and dqm_lock for protection in kfd
driver.
3. remove try_lock and change adev-&gt;in_gpu_reset as atomic, to avoid
re-enter GPU recovery for the same GPU hang.

v3:
1. change back to use adev-&gt;reset_sem to protect kfd callback
functions, because dqm_lock couldn't protect all codes, for example:
free_mqd must be called outside of dqm_lock;

[ 1230.176199] Hardware name: Supermicro SYS-7049GP-TRT/X11DPG-QT, BIOS 3.1 05/23/2019
[ 1230.177221] Call Trace:
[ 1230.178249]  dump_stack+0x98/0xd5
[ 1230.179443]  amdgpu_virt_kiq_reg_write_reg_wait+0x181/0x190 [amdgpu]
[ 1230.180673]  gmc_v9_0_flush_gpu_tlb+0xcc/0x310 [amdgpu]
[ 1230.181882]  amdgpu_gart_unbind+0xa9/0xe0 [amdgpu]
[ 1230.183098]  amdgpu_ttm_backend_unbind+0x46/0x180 [amdgpu]
[ 1230.184239]  ? ttm_bo_put+0x171/0x5f0 [ttm]
[ 1230.185394]  ttm_tt_unbind+0x21/0x40 [ttm]
[ 1230.186558]  ttm_tt_destroy.part.12+0x12/0x60 [ttm]
[ 1230.187707]  ttm_tt_destroy+0x13/0x20 [ttm]
[ 1230.188832]  ttm_bo_cleanup_memtype_use+0x36/0x80 [ttm]
[ 1230.189979]  ttm_bo_put+0x1be/0x5f0 [ttm]
[ 1230.191230]  amdgpu_bo_unref+0x1e/0x30 [amdgpu]
[ 1230.192522]  amdgpu_amdkfd_free_gtt_mem+0xaf/0x140 [amdgpu]
[ 1230.193833]  free_mqd+0x25/0x40 [amdgpu]
[ 1230.195143]  destroy_queue_cpsch+0x1a7/0x270 [amdgpu]
[ 1230.196475]  pqm_destroy_queue+0x105/0x260 [amdgpu]
[ 1230.197819]  kfd_ioctl_destroy_queue+0x37/0x70 [amdgpu]
[ 1230.199154]  kfd_ioctl+0x277/0x500 [amdgpu]
[ 1230.200458]  ? kfd_ioctl_get_clock_counters+0x60/0x60 [amdgpu]
[ 1230.201656]  ? tomoyo_file_ioctl+0x19/0x20
[ 1230.202831]  ksys_ioctl+0x98/0xb0
[ 1230.204004]  __x64_sys_ioctl+0x1a/0x20
[ 1230.205174]  do_syscall_64+0x5f/0x250
[ 1230.206339]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

2. remove try_lock and introduce atomic hive-&gt;in_reset, to avoid
re-enter GPU recovery.

v4:
1. remove an unnecessary whitespace change in kfd_chardev.c
2. remove comment codes in amdgpu_device.c
3. add more detailed comment in commit message
4. define a wrap function amdgpu_in_reset

v5:
1. Fix some style issues.

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Suggested-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Suggested-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Suggested-by: Lijo Lazar &lt;Lijo.Lazar@amd.com&gt;
Suggested-by: Luben Tukov &lt;luben.tuikov@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: rework sched_list generation</title>
<updated>2020-04-09T14:43:14Z</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.das@amd.com</email>
</author>
<published>2020-04-01T09:46:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1c6d567bdf73a207f51ef2e5745854ba7daa22c7'/>
<id>urn:sha1:1c6d567bdf73a207f51ef2e5745854ba7daa22c7</id>
<content type='text'>
Generate HW IP's sched_list in amdgpu_ring_init() instead of
amdgpu_ctx.c. This makes amdgpu_ctx_init_compute_sched(),
ring.has_high_prio and amdgpu_ctx_init_sched() unnecessary.
This patch also stores sched_list for all HW IPs in one big
array in struct amdgpu_device which makes amdgpu_ctx_init_entity()
much more leaner.

v2:
fix a coding style issue
do not use drm hw_ip const to populate amdgpu_ring_type enum

v3:
remove ctx reference and move sched array and num_sched to a struct
use num_scheds to detect uninitialized scheduler list

v4:
use array_index_nospec for user space controlled variables
fix possible checkpatch.pl warnings

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: disable gpu_sched load balancer for vcn jobs</title>
<updated>2020-03-16T20:21:32Z</updated>
<author>
<name>Nirmoy Das</name>
<email>nirmoy.das@amd.com</email>
</author>
<published>2020-03-13T14:26:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4ff7d8ba4c80b8198bd4a0cc339373eb28540552'/>
<id>urn:sha1:4ff7d8ba4c80b8198bd4a0cc339373eb28540552</id>
<content type='text'>
VCN HW doesn't support dynamic load balance on multiple instances
for a context. This patch initializes VNC entities with only one
drm_gpu_scheduler picked by drm_sched_pick_best(). Picking a
drm_gpu_scheduler using drm_sched_pick_best() ensures that we
do load balance among multiple contexts but not among multiple
jobs in a context.

Signed-off-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
