<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c, branch v5.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-10-05T17:02:31Z</updated>
<entry>
<title>drm/amdgpu: handle the case of pci_channel_io_frozen only in amdgpu_pci_resume</title>
<updated>2021-10-05T17:02:31Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2021-10-01T01:48:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=248b061689a40f4fed05252ee2c89f87cf26d7d8'/>
<id>urn:sha1:248b061689a40f4fed05252ee2c89f87cf26d7d8</id>
<content type='text'>
In current code, when a PCI error state pci_channel_io_normal is detectd,
it will report PCI_ERS_RESULT_CAN_RECOVER status to PCI driver, and PCI
driver will continue the execution of PCI resume callback report_resume by
pci_walk_bridge, and the callback will go into amdgpu_pci_resume
finally, where write lock is releasd unconditionally without acquiring
such lock first. In this case, a deadlock will happen when other threads
start to acquire the read lock.

To fix this, add a member in amdgpu_device strucutre to cache
pci_channel_state, and only continue the execution in amdgpu_pci_resume
when it's pci_channel_io_frozen.

Fixes: c9a6b82f45e2 ("drm/amdgpu: Implement DPC recovery")
Suggested-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: init iommu after amdkfd device init</title>
<updated>2021-10-05T17:02:20Z</updated>
<author>
<name>Yifan Zhang</name>
<email>yifan1.zhang@amd.com</email>
</author>
<published>2021-09-28T07:42:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=714d9e4574d54596973ee3b0624ee4a16264d700'/>
<id>urn:sha1:714d9e4574d54596973ee3b0624ee4a16264d700</id>
<content type='text'>
This patch is to fix clinfo failure in Raven/Picasso:

Number of platforms: 1
  Platform Profile: FULL_PROFILE
  Platform Version: OpenCL 2.2 AMD-APP (3364.0)
  Platform Name: AMD Accelerated Parallel Processing
  Platform Vendor: Advanced Micro Devices, Inc.
  Platform Extensions: cl_khr_icd cl_amd_event_callback

  Platform Name: AMD Accelerated Parallel Processing Number of devices: 0

Signed-off-by: Yifan Zhang &lt;yifan1.zhang@amd.com&gt;
Reviewed-by: James Zhu &lt;James.Zhu@amd.com&gt;
Tested-by: James Zhu &lt;James.Zhu@amd.com&gt;
Acked-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: move iommu_resume before ip init/resume</title>
<updated>2021-09-16T13:56:24Z</updated>
<author>
<name>James Zhu</name>
<email>James.Zhu@amd.com</email>
</author>
<published>2021-09-07T15:32:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f02abeb0779700c308e661a412451b38962b8a0b'/>
<id>urn:sha1:f02abeb0779700c308e661a412451b38962b8a0b</id>
<content type='text'>
Separate iommu_resume from kfd_resume, and move it before
other amdgpu ip init/resume.

Bug: https://bugzilla.kernel.org/show_bug.cgi?id=211277
Signed-off-by: James Zhu &lt;James.Zhu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: stable@vger.kernel.org
</content>
</entry>
<entry>
<title>drm/amdgpu: Cancel delayed work when GFXOFF is disabled</title>
<updated>2021-08-20T16:09:44Z</updated>
<author>
<name>Michel Dänzer</name>
<email>mdaenzer@redhat.com</email>
</author>
<published>2021-08-17T08:23:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=90a9266269eb9f71af1f323c33e1dca53527bd22'/>
<id>urn:sha1:90a9266269eb9f71af1f323c33e1dca53527bd22</id>
<content type='text'>
schedule_delayed_work does not push back the work if it was already
scheduled before, so amdgpu_device_delay_enable_gfx_off ran ~100 ms
after the first time GFXOFF was disabled and re-enabled, even if GFXOFF
was disabled and re-enabled again during those 100 ms.

This resulted in frame drops / stutter with the upcoming mutter 41
release on Navi 14, due to constantly enabling GFXOFF in the HW and
disabling it again (for getting the GPU clock counter).

To fix this, call cancel_delayed_work_sync when the disable count
transitions from 0 to 1, and only schedule the delayed work on the
reverse transition, not if the disable count was already 0. This makes
sure the delayed work doesn't run at unexpected times, and allows it to
be lock-free.

v2:
* Use cancel_delayed_work_sync &amp; mutex_trylock instead of
  mod_delayed_work.
v3:
* Make amdgpu_device_delay_enable_gfx_off lock-free (Christian König)
v4:
* Fix race condition between amdgpu_gfx_off_ctrl incrementing
  adev-&gt;gfx.gfx_off_req_count and amdgpu_device_delay_enable_gfx_off
  checking for it to be 0 (Evan Quan)

Cc: stable@vger.kernel.org
Reviewed-by: Evan Quan &lt;evan.quan@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt; # v3
Acked-by: Christian König &lt;christian.koenig@amd.com&gt; # v3
Signed-off-by: Michel Dänzer &lt;mdaenzer@redhat.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu:flush ttm delayed work before cancel_sync</title>
<updated>2021-08-18T22:23:00Z</updated>
<author>
<name>YuBiao Wang</name>
<email>YuBiao.Wang@amd.com</email>
</author>
<published>2021-08-17T09:36:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=691191a2f458e0176414cb5b3993b0c018cdc58c'/>
<id>urn:sha1:691191a2f458e0176414cb5b3993b0c018cdc58c</id>
<content type='text'>
[Why]
In some cases when we unload driver, warning call trace
will show up in vram_mgr_fini which claims that LRU is not empty, caused
by the ttm bo inside delay deleted queue.

[How]
We should flush delayed work to make sure the delay deleting is done.

Signed-off-by: YuBiao Wang &lt;YuBiao.Wang@amd.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu embed hw_fence into amdgpu_job</title>
<updated>2021-08-16T19:16:58Z</updated>
<author>
<name>Jack Zhang</name>
<email>Jack.Zhang1@amd.com</email>
</author>
<published>2021-05-12T07:06:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c530b02f39850a639b72d01ebbf7e5d745c60831'/>
<id>urn:sha1:c530b02f39850a639b72d01ebbf7e5d745c60831</id>
<content type='text'>
Why: Previously hw fence is alloced separately with job.
It caused historical lifetime issues and corner cases.
The ideal situation is to take fence to manage both job
and fence's lifetime, and simplify the design of gpu-scheduler.

How:
We propose to embed hw_fence into amdgpu_job.
1. We cover the normal job submission by this method.
2. For ib_test, and submit without a parent job keep the
legacy way to create a hw fence separately.
v2:
use AMDGPU_FENCE_FLAG_EMBED_IN_JOB_BIT to show that the fence is
embedded in a job.
v3:
remove redundant variable ring in amdgpu_job
v4:
add tdr sequence support for this feature. Add a job_run_counter to
indicate whether this job is a resubmit job.
v5
add missing handling in amdgpu_fence_enable_signaling

Signed-off-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Signed-off-by: Jack Zhang &lt;Jack.Zhang7@hotmail.com&gt;
Reviewed-by: Andrey Grodzovsky &lt;andrey.grodzovsky@amd.com&gt;
Reviewed by: Monk Liu &lt;monk.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: skip locking delayed work if not initialized.</title>
<updated>2021-08-09T19:42:30Z</updated>
<author>
<name>YuBiao Wang</name>
<email>YuBiao.Wang@amd.com</email>
</author>
<published>2021-08-05T02:32:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e78b3197dbf73fc0695dd019e388576d0a551830'/>
<id>urn:sha1:e78b3197dbf73fc0695dd019e388576d0a551830</id>
<content type='text'>
When init failed in early init stage, amdgpu_object has
not been initialized, so hasn't the ttm delayed queue functions.

Signed-off-by: YuBiao Wang &lt;YuBiao.Wang@amd.com&gt;
Reviewed-by: Emily.Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: avoid over-handle of fence driver fini in s3 test (v2)</title>
<updated>2021-08-06T01:17:58Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2021-07-29T10:35:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=067f44c8b4590c3f24d21a037578a478590f2175'/>
<id>urn:sha1:067f44c8b4590c3f24d21a037578a478590f2175</id>
<content type='text'>
In amdgpu_fence_driver_hw_fini, no need to call drm_sched_fini to stop
scheduler in s3 test, otherwise, fence related failure will arrive
after resume. To fix this and for a better clean up, move drm_sched_fini
from fence_hw_fini to fence_sw_fini, as it's part of driver shutdown, and
should never be called in hw_fini.

v2: rename amdgpu_fence_driver_init to amdgpu_fence_driver_sw_init,
to keep sw_init and sw_fini paired.

Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1668
Fixes: 8d35a2596164c1 ("drm/amdgpu: adjust fence driver enable sequence")
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Tested-by: Mike Lothian &lt;mike@fireburn.co.uk&gt;
Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-5.15-2021-07-29' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2021-07-30T06:48:35Z</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2021-07-30T06:48:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=04d505de7f82c8f2daa6139b460b05dc01e354e0'/>
<id>urn:sha1:04d505de7f82c8f2daa6139b460b05dc01e354e0</id>
<content type='text'>
amd-drm-next-5.15-2021-07-29:

amdgpu:
- VCN/JPEG power down sequencing fixes
- Various navi pcie link handling fixes
- Clockgating fixes
- Yellow Carp fixes
- Beige Goby fixes
- Misc code cleanups
- S0ix fixes
- SMU i2c bus rework
- EEPROM handling rework
- PSP ucode handling cleanup
- SMU error handling rework
- AMD HDMI freesync fixes
- USB PD firmware update rework
- MMIO based vram access rework
- Misc display fixes
- Backlight fixes
- Add initial Cyan Skillfish support
- Overclocking fixes suspend/resume

amdkfd:
- Sysfs leak fix
- Add counters for vm faults and migration
- GPUVM TLB optimizations

radeon:
- Misc fixes

Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20210730033455.3852-1-alexander.deucher@amd.com
</content>
</entry>
<entry>
<title>drm/amdgpu: adjust fence driver enable sequence</title>
<updated>2021-07-29T02:15:44Z</updated>
<author>
<name>Likun Gao</name>
<email>Likun.Gao@amd.com</email>
</author>
<published>2021-07-26T09:17:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8d35a2596164c1c9d34d4656fd42b445cd1e247f'/>
<id>urn:sha1:8d35a2596164c1c9d34d4656fd42b445cd1e247f</id>
<content type='text'>
Fence driver was enabled per ring when sw init on per IP block before.
Change to enable all the fence driver at the same time after
amdgpu_device_ip_init finished.
Rename some function related to fence to make it reasonable for read.

Signed-off-by: Likun Gao &lt;Likun.Gao@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
