<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c, branch v6.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-09-18T20:15:06Z</updated>
<entry>
<title>drm/amdgpu: nuke the VM PD/PT shadow handling</title>
<updated>2024-09-18T20:15:06Z</updated>
<author>
<name>Christian König</name>
<email>christian.koenig@amd.com</email>
</author>
<published>2024-08-27T14:12:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7181faaa4703705939580abffaf9cb5d6b50dbb7'/>
<id>urn:sha1:7181faaa4703705939580abffaf9cb5d6b50dbb7</id>
<content type='text'>
This was only used as workaround for recovering the page tables after
VRAM was lost and is no longer necessary after the function
amdgpu_vm_bo_reset_state_machine() started to do the same.

Compute never used shadows either, so the only proplematic case left is
SVM and that is most likely not recoverable in any way when VRAM is
lost.

Signed-off-by: Christian König &lt;christian.koenig@amd.com&gt;
Acked-by: Lijo Lazar &lt;lijo.lazar@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix missing check pcie_p2p module param</title>
<updated>2024-09-17T14:04:57Z</updated>
<author>
<name>Bob Zhou</name>
<email>bob.zhou@amd.com</email>
</author>
<published>2024-09-06T09:48:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=28b0ef922738b74335c20b8ed4bf8e259353a3a3'/>
<id>urn:sha1:28b0ef922738b74335c20b8ed4bf8e259353a3a3</id>
<content type='text'>
The module param pcie_p2p should be checked for kfd p2p feature, so add it.

Fixes: 75f0efbc4b3b ("drm/amdgpu: Take IOMMU remapping into account for p2p checks")
Signed-off-by: Bob Zhou &lt;bob.zhou@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amd/amdgpu: move drain_workqueue before shutdown is set</title>
<updated>2024-08-29T17:39:07Z</updated>
<author>
<name>Victor Zhao</name>
<email>Victor.Zhao@amd.com</email>
</author>
<published>2024-08-25T16:14:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=af76ca8e180f38a7d874c18cf810707762766627'/>
<id>urn:sha1:af76ca8e180f38a7d874c18cf810707762766627</id>
<content type='text'>
[background] when unloading amdgpu driver right after running a
workload, drain_workqueue is causing "Fence fallback timer
expired on ring sdma0.0". Under sriov, this issue will cause sriov
full access timeout and a reset happening.

move drain_workqueue before shutdown is set to allow ih process and
before enter full access under sriov to avoid full access time cost.

Signed-off-by: Victor Zhao &lt;Victor.Zhao@amd.com&gt;
Reviewed-by: Feifei Xu &lt;Feifei.Xu@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: skip printing vram_lost if needed</title>
<updated>2024-08-29T17:38:53Z</updated>
<author>
<name>Trigger Huang</name>
<email>Trigger.Huang@amd.com</email>
</author>
<published>2024-08-19T07:53:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6122f5c72e38a88eda13c7168e2ebbd3bd80b681'/>
<id>urn:sha1:6122f5c72e38a88eda13c7168e2ebbd3bd80b681</id>
<content type='text'>
The vm lost status can only be obtained after a GPU reset occurs, but
sometimes a dev core dump can be happened before GPU reset. So a new
argument is added to tell the dev core dump implementation whether to
skip printing the vram_lost status in the dump.
And this patch is also trying to decouple the core dump function from
the GPU reset function, by replacing the argument amdgpu_reset_context
with amdgpu_job to specify the context for core dump.

V2: Inform user if VRAM lost check is skipped so users don't assume
VRAM wasn't lost (Alex)

Signed-off-by: Trigger Huang &lt;Trigger.Huang@amd.com&gt;
Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-6.12-2024-08-26' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2024-08-27T12:33:12Z</updated>
<author>
<name>Daniel Vetter</name>
<email>daniel.vetter@ffwll.ch</email>
</author>
<published>2024-08-27T12:33:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e55ef65510a401862b902dc979441ea10ae25c61'/>
<id>urn:sha1:e55ef65510a401862b902dc979441ea10ae25c61</id>
<content type='text'>
amd-drm-next-6.12-2024-08-26:

amdgpu:
- SDMA devcoredump support
- DCN 4.0.1 updates
- DC SUBVP fixes
- Refactor OPP in DC
- Refactor MMHUBBUB in DC
- DC DML 2.1 updates
- DC FAMS2 updates
- RAS updates
- GFX12 updates
- VCN 4.0.3 updates
- JPEG 4.0.3 updates
- Enable wave kill (soft recovery) for compute queues
- Clean up CP error interrupt handling
- Enable CP bad opcode interrupts
- VCN 4.x fixes
- VCN 5.x fixes
- GPU reset fixes
- Fix vbios embedded EDID size handling
- SMU 14.x updates
- Misc code cleanups and spelling fixes
- VCN devcoredump support
- ISP MFD i2c support
- DC vblank fixes
- GFX 12 fixes
- PSR fixes
- Convert vbios embedded EDID to drm_edid
- DCN 3.5 updates
- DMCUB updates
- Cursor fixes
- Overdrive support for SMU 14.x
- GFX CP padding optimizations
- DCC fixes
- DSC fixes
- Preliminary per queue reset infrastructure
- Initial per queue reset support for GFX 9
- Initial per queue reset support for GFX 7, 8
- DCN 3.2 fixes
- DP MST fixes
- SR-IOV fixes
- GFX 9.4.3/4 devcoredump support
- Add process isolation framework
- Enable process isolation support for GFX 9.4.3/4
- Take IOMMU remapping into account for P2P DMA checks

amdkfd:
- CRIU fixes
- Improved input validation for user queues
- HMM fix
- Enable process isolation support for GFX 9.4.3/4
- Initial per queue reset support for GFX 9
- Allow users to target recommended SDMA engines

radeon:
- remove .load and drm_dev_alloc
- Fix vbios embedded EDID size handling
- Convert vbios embedded EDID to drm_edid
- Use GEM references instead of TTM
- r100 cp init cleanup
- Fix potential overflows in evergreen CS offset tracking

UAPI:
- KFD support for targetting queues on recommended SDMA engines
  Proposed userspace:
  https://github.com/ROCm/ROCR-Runtime/commit/2f588a24065f41c208c3701945e20be746d8faf7
  https://github.com/ROCm/ROCR-Runtime/commit/eb30a5bbc7719c6ffcf2d2dd2878bc53a47b3f30

drm/buddy:
- Add start address support for trim function

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20240826201528.55307-1-alexander.deucher@amd.com
</content>
</entry>
<entry>
<title>drm/amdgpu: Take IOMMU remapping into account for p2p checks</title>
<updated>2024-08-23T14:53:05Z</updated>
<author>
<name>Rahul Jain</name>
<email>Rahul.Jain@amd.com</email>
</author>
<published>2024-08-13T08:11:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=75f0efbc4b3b088cca20864d055b3854a51b5af0'/>
<id>urn:sha1:75f0efbc4b3b088cca20864d055b3854a51b5af0</id>
<content type='text'>
when trying to enable p2p the amdgpu_device_is_peer_accessible()
checks the condition where address_mask overlaps the aper_base
and hence returns 0, due to which the p2p disables for this platform

IOMMU should remap the BAR addresses so the device can access
them. Hence check if peer_adev is remapping DMA

v5: (Felix, Alex)
- fixing comment as per Alex feedback
- refactor code as per Felix

v4: (Alex)
- fix the comment and description

v3:
- remove iommu_remap variable

v2: (Alex)
- Fix as per review comments
- add new function amdgpu_device_check_iommu_remap to check if iommu
  remap

Signed-off-by: Rahul Jain &lt;Rahul.Jain@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;felix.kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Implement Enforce Isolation Handler for KGD/KFD serialization</title>
<updated>2024-08-21T02:07:35Z</updated>
<author>
<name>Srinivasan Shanmugam</name>
<email>srinivasan.shanmugam@amd.com</email>
</author>
<published>2024-06-06T07:58:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=afefd6f245024684fff75100052065d6a9e8f75f'/>
<id>urn:sha1:afefd6f245024684fff75100052065d6a9e8f75f</id>
<content type='text'>
This commit introduces the Enforce Isolation Handler designed to enforce
shader isolation on AMD GPUs, which helps to prevent data leakage
between different processes.

The handler counts the number of emitted fences for each GFX and compute
ring. If there are any fences, it schedules the `enforce_isolation_work`
to be run after a delay of `GFX_SLICE_PERIOD`. If there are no fences,
it signals the Kernel Fusion Driver (KFD) to resume the runqueue.

The function is synchronized using the `enforce_isolation_mutex`.

This commit also introduces a reference count mechanism
(kfd_sch_req_count) to keep track of the number of requests to enable
the KFD scheduler. When a request to enable the KFD scheduler is made,
the reference count is decremented. When the reference count reaches
zero, a delayed work is scheduled to enforce isolation after a delay of
GFX_SLICE_PERIOD.

When a request to disable the KFD scheduler is made, the function first
checks if the reference count is zero. If it is, it cancels the delayed
work for enforcing isolation and checks if the KFD scheduler is active.
If the KFD scheduler is active, it sends a request to stop the KFD
scheduler and sets the KFD scheduler state to inactive. Then, it
increments the reference count.

The function is synchronized using the kfd_sch_mutex to ensure that the
KFD scheduler state and reference count are updated atomically.

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add enforce_isolation sysfs attribute</title>
<updated>2024-08-21T02:06:52Z</updated>
<author>
<name>Srinivasan Shanmugam</name>
<email>srinivasan.shanmugam@amd.com</email>
</author>
<published>2024-05-27T02:00:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e189be9b2e3820c88164d95090f1fd6343cd77fc'/>
<id>urn:sha1:e189be9b2e3820c88164d95090f1fd6343cd77fc</id>
<content type='text'>
This commit adds a new sysfs attribute 'enforce_isolation' to control
the 'enforce_isolation' setting per GPU. The attribute can be read and
written, and accepts values 0 (disabled) and 1 (enabled).

When 'enforce_isolation' is enabled, reserved VMIDs are allocated for
each ring. When it's disabled, the reserved VMIDs are freed.

The set function locks a mutex before changing the 'enforce_isolation'
flag and the VMIDs, and unlocks it afterwards. This ensures that these
operations are atomic and prevents race conditions and other concurrency
issues.

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Suggested-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Make enforce_isolation setting per GPU</title>
<updated>2024-08-16T18:27:45Z</updated>
<author>
<name>Srinivasan Shanmugam</name>
<email>srinivasan.shanmugam@amd.com</email>
</author>
<published>2024-07-29T16:05:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=96595204195d7e13736a84295e217316610d4cdb'/>
<id>urn:sha1:96595204195d7e13736a84295e217316610d4cdb</id>
<content type='text'>
This commit makes enforce_isolation setting to be per GPU and per
partition by adding the enforce_isolation array to the adev structure.
The adev variable is set based on the global enforce_isolation module
parameter during device initialization.

In amdgpu_ids.c, the adev-&gt;enforce_isolation value for the current GPU
is used to determine whether to enforce isolation between graphics and
compute processes on that GPU.

In amdgpu_ids.c, the adev-&gt;enforce_isolation value for the current GPU
and partition is used to determine whether to enforce isolation between
graphics and compute processes on that GPU and partition.

This allows the enforce_isolation setting to be controlled individually
for each GPU and each partition, which is useful in a system with
multiple GPUs and partitions where different isolation settings might be
desired for different GPUs and partitions.

v2: fix loop in amdgpu_vmid_mgr_init() (Alex)

Cc: Christian König &lt;christian.koenig@amd.com&gt;
Cc: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Srinivasan Shanmugam &lt;srinivasan.shanmugam@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Suggested-by: Christian König &lt;christian.koenig@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/gfx11: add a mutex for the gfx semaphore</title>
<updated>2024-08-16T18:24:33Z</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2024-07-12T20:37:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=76acba7b7f12517990f326fabfecb6f55e334233'/>
<id>urn:sha1:76acba7b7f12517990f326fabfecb6f55e334233</id>
<content type='text'>
This will be used in more places in the future so
add a mutex.

Acked-by: Vitaly Prosyak &lt;vitaly.prosyak@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
