<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c, branch v6.14</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.14</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.14'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-12-18T17:39:08Z</updated>
<entry>
<title>drm/admgpu: replace kmalloc() and memcpy() with kmemdup()</title>
<updated>2024-12-18T17:39:08Z</updated>
<author>
<name>Mirsad Todorovac</name>
<email>mtodorovac69@gmail.com</email>
</author>
<published>2024-12-17T22:58:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a21ab06b8c2d8d25c4a83bdf39542834b1f3beae'/>
<id>urn:sha1:a21ab06b8c2d8d25c4a83bdf39542834b1f3beae</id>
<content type='text'>
The static analyser tool gave the following advice:

./drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c:1266:7-14: WARNING opportunity for kmemdup

 → 1266         tmp = kmalloc(used_size, GFP_KERNEL);
   1267         if (!tmp)
   1268                 return -ENOMEM;
   1269
 → 1270         memcpy(tmp, &amp;host_telemetry-&gt;body.error_count, used_size);

Replacing kmalloc() + memcpy() with kmemdump() doesn't change semantics.
Original code works without fault, so this is not a bug fix but proposed improvement.

Link: https://lwn.net/Articles/198928/
Fixes: 84a2947ecc85 ("drm/amdgpu: Implement virt req_ras_err_count")
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Cc: "Christian König" &lt;christian.koenig@amd.com&gt;
Cc: Xinhui Pan &lt;Xinhui.Pan@amd.com&gt;
Cc: David Airlie &lt;airlied@gmail.com&gt;
Cc: Simona Vetter &lt;simona@ffwll.ch&gt;
Cc: Zhigang Luo &lt;Zhigang.Luo@amd.com&gt;
Cc: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Cc: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Cc: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Cc: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Cc: Jack Xiao &lt;Jack.Xiao@amd.com&gt;
Cc: Vignesh Chander &lt;Vignesh.Chander@amd.com&gt;
Cc: Danijel Slivka &lt;danijel.slivka@amd.com&gt;
Cc: amd-gfx@lists.freedesktop.org
Cc: dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Mirsad Todorovac &lt;mtodorovac69@gmail.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Implement virt req_ras_err_count</title>
<updated>2024-11-11T16:55:42Z</updated>
<author>
<name>Victor Skvortsov</name>
<email>victor.skvortsov@amd.com</email>
</author>
<published>2024-10-30T14:18:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=84a2947ecc85c67f433f2cc2186e54cdb9047b61'/>
<id>urn:sha1:84a2947ecc85c67f433f2cc2186e54cdb9047b61</id>
<content type='text'>
Enable RAS late init  if VF RAS Telemetry is supported.

When enabled, the VF can use this interface to query total
RAS error counts from the host.

The VF FB access may abruptly end due to a fatal error,
therefore the VF must cache and sanitize the input.

The Host allows 15 Telemetry messages every 60 seconds, afterwhich
the host will ignore any more in-coming telemetry messages. The VF will
rate limit its msg calling to once every 5 seconds (12 times in 60 seconds).
While the VF is rate limited, it will continue to report the last
good cached data.

v2: Flip generate report &amp; update statistics order for VF

Signed-off-by: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Acked-by: Tao Zhou &lt;tao.zhou1@amd.com&gt;
Reviewed-by: Zhigang Luo &lt;zhigang.luo@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: VF Query RAS Caps from Host if supported</title>
<updated>2024-11-11T16:55:36Z</updated>
<author>
<name>Victor Skvortsov</name>
<email>victor.skvortsov@amd.com</email>
</author>
<published>2024-10-30T13:58:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=907fec2dfd061ca422d8b121f4af1b6062e098ba'/>
<id>urn:sha1:907fec2dfd061ca422d8b121f4af1b6062e098ba</id>
<content type='text'>
If VF RAS Capability support is enabled, guest is able to
retrieve the real RAS support from the host.

Signed-off-by: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Reviewed-by: Zhigang Luo &lt;zhigang.luo@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Disable dpm_enabled flag while VF is in reset</title>
<updated>2024-08-13T16:12:52Z</updated>
<author>
<name>Victor Skvortsov</name>
<email>victor.skvortsov@amd.com</email>
</author>
<published>2024-08-08T17:22:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f83cec3b3a7c968bbceb810b7acd1baf3fe8cd87'/>
<id>urn:sha1:f83cec3b3a7c968bbceb810b7acd1baf3fe8cd87</id>
<content type='text'>
VFs do not perform HW fini/suspend in FLR, so the dpm_enabled
is incorrectly kept enabled. Add interface to disable it in
virt_pre_reset call.

v2: Made implementation generic for all asics
v3: Re-order conditionals so PP_MP1_STATE_FLR is only evaluated on VF

Signed-off-by: Victor Skvortsov &lt;victor.skvortsov@amd.com&gt;
Reviewed-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/mes: add multiple mes ring instances support</title>
<updated>2024-08-13T14:29:25Z</updated>
<author>
<name>Jack Xiao</name>
<email>Jack.Xiao@amd.com</email>
</author>
<published>2024-08-07T03:53:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c7d4355648ffa02a1551495b05c71ea6c884d29c'/>
<id>urn:sha1:c7d4355648ffa02a1551495b05c71ea6c884d29c</id>
<content type='text'>
Add multiple mes ring instances in mes structure to support
multiple mes pipes.

Signed-off-by: Jack Xiao &lt;Jack.Xiao@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Hawking Zhang &lt;Hawking.Zhang@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Set no_hw_access when VF request full GPU fails</title>
<updated>2024-07-08T20:46:56Z</updated>
<author>
<name>Yifan Zha</name>
<email>Yifan.Zha@amd.com</email>
</author>
<published>2024-06-27T07:06:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=33f23fc3155b13c4a96d94a0a22dc26db767440b'/>
<id>urn:sha1:33f23fc3155b13c4a96d94a0a22dc26db767440b</id>
<content type='text'>
[Why]
If VF request full GPU access and the request failed,
the VF driver can get stuck accessing registers for an extended period during
the unload of KMS.

[How]
Set no_hw_access flag when VF request for full GPU access fails
This prevents further hardware access attempts, avoiding the prolonged
stuck state.

Signed-off-by: Yifan Zha &lt;Yifan.Zha@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: process RAS fatal error MB notification</title>
<updated>2024-06-27T21:31:37Z</updated>
<author>
<name>Vignesh Chander</name>
<email>Vignesh.Chander@amd.com</email>
</author>
<published>2024-06-24T21:44:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cbda2758d8bfae323b846210a3e52f0ad5fe7164'/>
<id>urn:sha1:cbda2758d8bfae323b846210a3e52f0ad5fe7164</id>
<content type='text'>
For RAS error scenario, VF guest driver will check mailbox
and set fed flag to avoid unnecessary HW accesses.
additionally, poll for reset completion message first
to avoid accidentally spamming multiple reset requests to host.

v2: add another mailbox check for handling case where kfd detects
timeout first

v3: set host_flr bit and use wait_for_reset

Signed-off-by: Vignesh Chander &lt;Vignesh.Chander@amd.com&gt;
Reviewed-by: Zhigang Luo &lt;Zhigang.Luo@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix sriov host flr handler</title>
<updated>2024-06-14T20:15:58Z</updated>
<author>
<name>Yunxiang Li</name>
<email>Yunxiang.Li@amd.com</email>
</author>
<published>2024-05-24T20:22:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5c0a1cdd17ce9eb315102c65084af899622ed268'/>
<id>urn:sha1:5c0a1cdd17ce9eb315102c65084af899622ed268</id>
<content type='text'>
We send back the ready to reset message before we stop anything. This is
wrong. Move it to when we are actually ready for the FLR to happen.

In the current state since we take tens of seconds to stop everything,
it is very likely that host would give up waiting and reset the GPU
before we send ready, so it would be the same as before. But this gets
rid of the hack with reset_domain locking and also let us tell how slow
ready to reset actually is from the host. The ready to reset speed can
be improved later.

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Acked-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Emily Deng &lt;Emily.Deng@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: add skip_hw_access checks for sriov</title>
<updated>2024-06-14T20:15:58Z</updated>
<author>
<name>Yunxiang Li</name>
<email>Yunxiang.Li@amd.com</email>
</author>
<published>2024-05-24T20:14:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b3948ad1ac582f560e1f3aeaecf384619921c48d'/>
<id>urn:sha1:b3948ad1ac582f560e1f3aeaecf384619921c48d</id>
<content type='text'>
Accessing registers via host is missing the check for skip_hw_access and
the lockdep check that comes with it.

Signed-off-by: Yunxiang Li &lt;Yunxiang.Li@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: fix failure mapping legacy queue when FLR</title>
<updated>2024-06-05T15:25:14Z</updated>
<author>
<name>Lin.Cao</name>
<email>lincao12@amd.com</email>
</author>
<published>2024-05-31T06:02:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c8ad1bbbc2751063c7a5825911e58996ef849628'/>
<id>urn:sha1:c8ad1bbbc2751063c7a5825911e58996ef849628</id>
<content type='text'>
Flag "mes.ring.shced.ready" will be set as true after mes hw init and set
as false when mes hw fini to avoid duplicate initialization. But hw fini
will not be called when function level reset, which will cause mes hw
init be skipped during FLR, which will leads to mapping legacy queue
fail. Set this flag as false when post reset will fix this issue.

Signed-off-by: Lin.Cao &lt;lincao12@amd.com&gt;
Acked-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
