<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.h, branch v5.10</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.10</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.10'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-12-09T15:06:39Z</updated>
<entry>
<title>drm/amdgpu: fix debugfs creation/removal, again</title>
<updated>2020-12-09T15:06:39Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2020-12-03T23:06:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2343e9d2c5a94459b9de92649f1650e36eb79a10'/>
<id>urn:sha1:2343e9d2c5a94459b9de92649f1650e36eb79a10</id>
<content type='text'>
There is still a warning when CONFIG_DEBUG_FS is disabled:

drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c:1145:13: error: 'amdgpu_ras_debugfs_create_ctrl_node' defined but not used [-Werror=unused-function]
 1145 | static void amdgpu_ras_debugfs_create_ctrl_node(struct amdgpu_device *adev)

Change the code again to make the compiler actually drop
this code but not warn about it.

Fixes: ae2bf61ff39e ("drm/amdgpu: guard ras debugfs creation/removal based on CONFIG_DEBUG_FS")
Reviewed-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: bypass querying ras error count registers</title>
<updated>2020-08-14T20:12:22Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-08-04T07:00:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f75e94d86829e92a758a26fc5bbdb4c9eba86260'/>
<id>urn:sha1:f75e94d86829e92a758a26fc5bbdb4c9eba86260</id>
<content type='text'>
Once ras recovery is issued by ras sync flood interrupt or
ras controller interrupt, add this guard to bypass or execute
ras error count register harvest of all IPs.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Reviewed-by: Dennis Li &lt;Dennis.Li@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: break GPU recovery once it's in bad state(v4)</title>
<updated>2020-08-04T21:26:54Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-23T08:20:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e8fbaf03429d085b05e66b153a8363b746c94b43'/>
<id>urn:sha1:e8fbaf03429d085b05e66b153a8363b746c94b43</id>
<content type='text'>
When GPU executes recovery and retriving bad GPU tag
from external eerpom device, the recovery will be broken
and error message is printed as well for user's awareness.

v2: Refine warning message in threshold reaching case, and
    fix spelling typo.

v3: Fix explicit calling of bad gpu.

v4: Rename function names.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: skip bad page reservation once issuing from eeprom write</title>
<updated>2020-08-04T21:26:38Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-23T07:50:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35cd2cdadbccf08e9b4d5a8fca4914dc37ab48e0'/>
<id>urn:sha1:35cd2cdadbccf08e9b4d5a8fca4914dc37ab48e0</id>
<content type='text'>
Once the ras recovery is issued from eeprom write itself,
bad page reservation should be ignored, otherwise, recursive
calling of writting to eeprom would happen.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: validate bad page threshold in ras(v3)</title>
<updated>2020-08-04T21:25:58Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2020-07-22T02:00:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c84d46707ebbfc303268e27d5aeb8bb4e6d5b1ff'/>
<id>urn:sha1:c84d46707ebbfc303268e27d5aeb8bb4e6d5b1ff</id>
<content type='text'>
Bad page threshold value should be valid in the range between
-1 and max records length of eeprom. It could determine when
saved bad pages exceed threshold value, and proceed corresponding
actions.

v2: When using the default typical value, it should be min
value between typical value and eeprom max records length.

v3: drop the case of setting bad_page_cnt_threshold to be
    0xFFFFFFFF, as it confuses user.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: RAS emergency restart logic refine</title>
<updated>2020-07-15T16:41:47Z</updated>
<author>
<name>Wenhui Sheng</name>
<email>Wenhui.Sheng@amd.com</email>
</author>
<published>2020-07-13T07:14:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bb5c7235eaafb4e2f957e9f0f71a187db5cf525a'/>
<id>urn:sha1:bb5c7235eaafb4e2f957e9f0f71a187db5cf525a</id>
<content type='text'>
If we are in RAS triggered situation and
BACO isn't support, emergency restart is needed,
and this code is only needed for some specific
cases(vega20 with given smu fw version).

After we add smu mode1 reset for sienna cichlid, we
need to share AMD_RESET_METHOD_MODE1 with psp mode1 reset,
so in amdgpu_device_gpu_recover, we need differentiate
which mode1 reset we are using, then decide if it's
a full reset and then decide if emergency restart is needed,
the logic will become much more complex.

After discussion with Hawking, move emergency restart logic
to an independent function.

Signed-off-by: Likun Gao &lt;Likun.Gao@amd.com&gt;
Signed-off-by: Wenhui Sheng &lt;Wenhui.Sheng@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: disable ras query and iject during gpu reset</title>
<updated>2020-04-01T18:44:42Z</updated>
<author>
<name>John Clements</name>
<email>john.clements@amd.com</email>
</author>
<published>2020-03-25T08:01:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=61380faa4b4cc577df8a7ff5db5859bac6b351f7'/>
<id>urn:sha1:61380faa4b4cc577df8a7ff5db5859bac6b351f7</id>
<content type='text'>
added flag to ras context to indicate if ras query functionality is ready

Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: John Clements &lt;john.clements@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add function to creat all ras debugfs node</title>
<updated>2020-03-10T19:55:02Z</updated>
<author>
<name>Tao Zhou</name>
<email>tao.zhou1@amd.com</email>
</author>
<published>2020-03-06T03:59:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f9317014ea517e13a59b9d437948c5ab17ce4ed7'/>
<id>urn:sha1:f9317014ea517e13a59b9d437948c5ab17ce4ed7</id>
<content type='text'>
centralize all debugfs creation in one place for ras

this is required to fix ras when the driver does not use the drm load
and unload callbacks due to ordering issues with the drm device node.

Signed-off-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Signed-off-by: Stanley.Yang &lt;Stanley.Yang@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: drop useless BACO arg in amdgpu_ras_reset_gpu</title>
<updated>2019-12-18T21:09:06Z</updated>
<author>
<name>Guchun Chen</name>
<email>guchun.chen@amd.com</email>
</author>
<published>2019-12-13T08:46:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=619346240932ac86a0f0c42f887827ae759eda47'/>
<id>urn:sha1:619346240932ac86a0f0c42f887827ae759eda47</id>
<content type='text'>
BACO reset mode strategy is determined by latter func when
calling amdgpu_ras_reset_gpu. So not to confuse audience, drop
it.

Signed-off-by: Guchun Chen &lt;guchun.chen@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: clear err_event_athub flag after reset exit</title>
<updated>2019-12-05T21:26:11Z</updated>
<author>
<name>Le Ma</name>
<email>le.ma@amd.com</email>
</author>
<published>2019-10-25T09:19:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=00eaa57172a02edddbf445112409e807e0caacd9'/>
<id>urn:sha1:00eaa57172a02edddbf445112409e807e0caacd9</id>
<content type='text'>
Otherwise next err_event_athub error cannot call gpu reset. And following
resume sequence will not be affected by this flag.

v2: create function to clear amdgpu_ras_in_intr for modularity of ras driver

Signed-off-by: Le Ma &lt;le.ma@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
