<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/mxgpu_ai.c, branch v5.17</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.17</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.17'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-01-07T22:19:34Z</updated>
<entry>
<title>drm/amdgpu: add dummy event6 for vega10</title>
<updated>2022-01-07T22:19:34Z</updated>
<author>
<name>James Yao</name>
<email>yiqing.yao@amd.com</email>
</author>
<published>2021-12-29T10:10:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=216a9873198bdc5c670a9f71d58fafd30227c9c8'/>
<id>urn:sha1:216a9873198bdc5c670a9f71d58fafd30227c9c8</id>
<content type='text'>
[why]
Malicious mailbox event1 fails driver loading on vega10.
A dummy event6 prevent driver from taking response from malicious event1 as its own.

[how]
On vega10, send a mailbox event6 before sending event1.

Signed-off-by: James Yao &lt;yiqing.yao@amd.com&gt;
Reviewed-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: SRIOV flr_work should use down_write</title>
<updated>2021-12-14T21:09:02Z</updated>
<author>
<name>Victor Skvortsov</name>
<email>victor.skvortsov@amd.com</email>
</author>
<published>2021-12-13T21:38:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fa4a427d84f9b797970a3d5139d7645403e4e989'/>
<id>urn:sha1:fa4a427d84f9b797970a3d5139d7645403e4e989</id>
<content type='text'>
Host initiated VF FLR may fail if someone else is
already holding a read_lock. Change from down_write_trylock
to down_write to guarantee the reset goes through.

Signed-off-by: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Reviewed by: Shaoyun.liu &lt;Shaoyun.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: Add ready_to_reset resp for vega10</title>
<updated>2021-08-30T18:59:33Z</updated>
<author>
<name>YuBiao Wang</name>
<email>YuBiao.Wang@amd.com</email>
</author>
<published>2021-08-27T06:48:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=64261a0d0600ab335677073c54b1989565ceddad'/>
<id>urn:sha1:64261a0d0600ab335677073c54b1989565ceddad</id>
<content type='text'>
Send response to host after received the flr notification from host.
Port NV change to vega10.

Signed-off-by: YuBiao Wang &lt;YuBiao.Wang@amd.com&gt;
Reviewed-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: SRIOV flr_work should take write_lock</title>
<updated>2021-07-13T15:48:09Z</updated>
<author>
<name>Jingwen Chen</name>
<email>Jingwen.Chen2@amd.com</email>
</author>
<published>2021-07-01T02:19:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=798c511548b946ae9ec123b0dfe197a5f29e63ec'/>
<id>urn:sha1:798c511548b946ae9ec123b0dfe197a5f29e63ec</id>
<content type='text'>
[Why]
If flr_work takes read_lock, then other threads who takes
read_lock can access hardware when host is doing vf flr.

[How]
flr_work should take write_lock to avoid this case.

Signed-off-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Reviewed-by: Monk Liu &lt;monk.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/sriov Stop data exchange for wholegpu reset</title>
<updated>2021-01-14T04:47:39Z</updated>
<author>
<name>Jack Zhang</name>
<email>Jack.Zhang1@amd.com</email>
</author>
<published>2021-01-07T10:38:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3c2a01cb0fc567c18b802f25d619e31c196294ce'/>
<id>urn:sha1:3c2a01cb0fc567c18b802f25d619e31c196294ce</id>
<content type='text'>
[Why]
When host trigger a whole gpu reset, guest will keep
waiting till host finish reset. But there's a work
queue in guest exchanging data between vf&amp;pf which need
to access frame buffer. During whole gpu reset, frame
buffer is not accessable, and this causes the call trace.

[How]
After vf get reset notification from pf, stop data exchange.

Signed-off-by: Jingwen Chen &lt;Jingwen.Chen2@amd.com&gt;
Signed-off-by: Jack Zhang &lt;Jack.Zhang1@amd.com&gt;
Reviewed-by: Monk Liu &lt;monk.liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/SRIOV: Extend VF reset request wait period</title>
<updated>2020-12-15T16:35:35Z</updated>
<author>
<name>Jiange Zhao</name>
<email>Jiange.Zhao@amd.com</email>
</author>
<published>2020-11-25T13:56:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3aa883ac8eea38281f97a7409d2922e6f343bf6c'/>
<id>urn:sha1:3aa883ac8eea38281f97a7409d2922e6f343bf6c</id>
<content type='text'>
In Virtualization case, when one VF is sending too many
FLR requests, hypervisor would stop responding to this
VF's request for a long period of time. This is called
event guard. During this period of cooling time, guest
driver should wait instead of doing other things. After
this period of time, guest driver would resume reset
process and return to normal.

Currently, guest driver would wait 12 seconds and return fail
if it doesn't get response from host.

Solution: extend this waiting time in guest driver and poll
response periodically. Poll happens every 6 seconds and it will
last for 60 seconds.

v2: change the max repetition times from number to macro.

Signed-off-by: Jiange Zhao &lt;Jiange.Zhao@amd.com&gt;
Acked-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Do gpu recovery when no job is running</title>
<updated>2020-09-15T21:24:18Z</updated>
<author>
<name>Liu ChengZhe</name>
<email>ChengZhe.Liu@amd.com</email>
</author>
<published>2020-09-09T08:00:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2a9787dcf537d0e4f1fa41cbd883abe9d70b9fcb'/>
<id>urn:sha1:2a9787dcf537d0e4f1fa41cbd883abe9d70b9fcb</id>
<content type='text'>
In function flr_work, we should do gpu recovery when no job
is running. Fix the logic by inverting it.

v2: modify the description

Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Liu ChengZhe &lt;ChengZhe.Liu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: change reset lock from mutex to rw_semaphore</title>
<updated>2020-08-24T16:23:48Z</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-08-20T02:06:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6049db43d6dd9cbb9d7f897839d15b0bc600e23c'/>
<id>urn:sha1:6049db43d6dd9cbb9d7f897839d15b0bc600e23c</id>
<content type='text'>
clients don't need reset-lock for synchronization when no
GPU recovery.

v2:
change to return the return value of down_read_killable.

v3:
if GPU recovery begin, VF ignore FLR notification.

Reviewed-by: Monk Liu &lt;monk.liu@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: refine codes to avoid reentering GPU recovery</title>
<updated>2020-08-24T16:22:56Z</updated>
<author>
<name>Dennis Li</name>
<email>Dennis.Li@amd.com</email>
</author>
<published>2020-08-19T09:23:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=53b3f8f40e6cff36ae12b11e6d6b308af3c7e53f'/>
<id>urn:sha1:53b3f8f40e6cff36ae12b11e6d6b308af3c7e53f</id>
<content type='text'>
if other threads have holden the reset lock, recovery will
fail to try_lock. Therefore we introduce atomic hive-&gt;in_reset
and adev-&gt;in_gpu_reset, to avoid reentering GPU recovery.

v2:
drop "? true : false" in the definition of amdgpu_in_reset

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix repeatly flr issue</title>
<updated>2020-08-18T22:22:02Z</updated>
<author>
<name>jqdeng</name>
<email>Emily.Deng@amd.com</email>
</author>
<published>2020-08-07T09:31:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a1cddd6374fb1eb166f40f10d5411e3699066d4'/>
<id>urn:sha1:9a1cddd6374fb1eb166f40f10d5411e3699066d4</id>
<content type='text'>
Only for no job running test case need to do recover in
flr notification.
For having job in mirror list, then let guest driver to
hit job timeout, and then do recover.

Signed-off-by: jqdeng &lt;Emily.Deng@amd.com&gt;
Acked-by: Nirmoy Das &lt;nirmoy.das@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
