<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c, branch v6.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2024-01-31T22:34:05Z</updated>
<entry>
<title>drm/amdgpu: Fix the warning info in mode1 reset</title>
<updated>2024-01-31T22:34:05Z</updated>
<author>
<name>Ma Jun</name>
<email>Jun.Ma2@amd.com</email>
</author>
<published>2024-01-05T06:05:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bb34bc2cd3ee284d7992df24a3f7d24f61a59268'/>
<id>urn:sha1:bb34bc2cd3ee284d7992df24a3f7d24f61a59268</id>
<content type='text'>
Fix the warning info below during mode1 reset.
[  +0.000004] Call Trace:
[  +0.000004]  &lt;TASK&gt;
[  +0.000006]  ? show_regs+0x6e/0x80
[  +0.000011]  ? __flush_work.isra.0+0x2e8/0x390
[  +0.000005]  ? __warn+0x91/0x150
[  +0.000009]  ? __flush_work.isra.0+0x2e8/0x390
[  +0.000006]  ? report_bug+0x19d/0x1b0
[  +0.000013]  ? handle_bug+0x46/0x80
[  +0.000012]  ? exc_invalid_op+0x1d/0x80
[  +0.000011]  ? asm_exc_invalid_op+0x1f/0x30
[  +0.000014]  ? __flush_work.isra.0+0x2e8/0x390
[  +0.000007]  ? __flush_work.isra.0+0x208/0x390
[  +0.000007]  ? _prb_read_valid+0x216/0x290
[  +0.000008]  __cancel_work_timer+0x11d/0x1a0
[  +0.000007]  ? try_to_grab_pending+0xe8/0x190
[  +0.000012]  cancel_work_sync+0x14/0x20
[  +0.000008]  amddrm_sched_stop+0x3c/0x1d0 [amd_sched]
[  +0.000032]  amdgpu_device_gpu_recover+0x29a/0xe90 [amdgpu]

This warning info was printed after applying the patch
"drm/sched: Convert drm scheduler to use a work queue rather than kthread".
The root cause is that amdgpu driver tries to use the uninitialized
work_struct in the struct drm_gpu_scheduler

v2:
 - Rename the function to amdgpu_ring_sched_ready and move it to
amdgpu_ring.c (Alex)
v3:
- Fix a few more checks based on Vitaly's patch (Alex)
v4:
- squash in fix noticed by Bert in
https://gitlab.freedesktop.org/drm/amd/-/issues/3139

Fixes: 11b3b9f461c5 ("drm/sched: Check scheduler ready before calling timeout handling")
Reviewed-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Signed-off-by: Vitaly Prosyak &lt;vitaly.prosyak@amd.com&gt;
Signed-off-by: Ma Jun &lt;Jun.Ma2@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Add register read/write debugfs support for AID's</title>
<updated>2024-01-03T15:30:19Z</updated>
<author>
<name>Mangesh Gadre</name>
<email>Mangesh.Gadre@amd.com</email>
</author>
<published>2023-12-19T14:59:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5eb8094a9b05ae5b3e49376a6e5a7a004cd0514f'/>
<id>urn:sha1:5eb8094a9b05ae5b3e49376a6e5a7a004cd0514f</id>
<content type='text'>
SMN address is larger than 32 bits for registers on different AID's
Updating existing interface to support access to such registers.

Signed-off-by: Mangesh Gadre &lt;Mangesh.Gadre@amd.com&gt;
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Reviewed-by: Asad Kamal &lt;asad.kamal@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu/debugfs: fix error code when smc register accessors are NULL</title>
<updated>2023-12-14T20:27:42Z</updated>
<author>
<name>Alex Deucher</name>
<email>alexander.deucher@amd.com</email>
</author>
<published>2023-11-27T22:26:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=afe58346d5d3887b3e49ff623d2f2e471f232a8d'/>
<id>urn:sha1:afe58346d5d3887b3e49ff623d2f2e471f232a8d</id>
<content type='text'>
Should be -EOPNOTSUPP.

Fixes: 5104fdf50d32 ("drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL")
Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: SW part of MES event log enablement</title>
<updated>2023-12-07T22:43:13Z</updated>
<author>
<name>shaoyunl</name>
<email>shaoyun.liu@amd.com</email>
</author>
<published>2023-11-23T18:59:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b2662d4cc4ce2db4bd55e00a528b1d35be82c6c3'/>
<id>urn:sha1:b2662d4cc4ce2db4bd55e00a528b1d35be82c6c3</id>
<content type='text'>
This is the generic SW part, prepare the event log buffer and dump it through debugfs

Signed-off-by: shaoyunl &lt;shaoyun.liu@amd.com&gt;
Reviewed-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'amd-drm-next-6.8-2023-12-01' of https://gitlab.freedesktop.org/agd5f/linux into drm-next</title>
<updated>2023-12-05T02:11:41Z</updated>
<author>
<name>Dave Airlie</name>
<email>airlied@redhat.com</email>
</author>
<published>2023-12-05T02:11:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5edfd7d94b0310b74136b666551f1d23711ed445'/>
<id>urn:sha1:5edfd7d94b0310b74136b666551f1d23711ed445</id>
<content type='text'>
amd-drm-next-6.8-2023-12-01:

amdgpu:
- Add new 64 bit sequence number infrastructure.
  This will ultimately be used for user queue synchronization.
- GPUVM updates
- Misc code cleanups
- RAS updates
- DCN 3.5 updates
- Rework PCIe link speed handling
- Document GPU reset types
- DMUB fixes
- eDP fixes
- NBIO 7.9 updates
- NBIO 7.11 updates
- SubVP updates
- DCN 3.1.4 fixes
- ABM fixes
- AGP aperture fix
- DCN 3.1.5 fix
- Fix some potential error path memory leaks
- Enable PCIe PMEs
- Add XGMI, PCIe state dumping for aqua vanjaram
- GFX11 golden register updates
- Misc display fixes

amdkfd:
- Migrate TLB flushing logic to amdgpu
- Trap handler fixes
- Fix restore workers handling on suspend and reset
- Fix possible memory leak in pqm_uninit()

radeon:
- Fix some possible overflows in command buffer checking
- Check for errors in ring_lock

From: Alex Deucher &lt;alexander.deucher@amd.com&gt;
Link: https://patchwork.freedesktop.org/patch/msgid/20231201181743.5313-1-alexander.deucher@amd.com
Signed-off-by: Dave Airlie &lt;airlied@redhat.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix cat debugfs amdgpu_regs_didt causes kernel null pointer</title>
<updated>2023-11-29T21:49:23Z</updated>
<author>
<name>Lu Yao</name>
<email>yaolu@kylinos.cn</email>
</author>
<published>2023-11-23T01:22:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7b194fdccb8458779687063e582cf218a0920c29'/>
<id>urn:sha1:7b194fdccb8458779687063e582cf218a0920c29</id>
<content type='text'>
For 'AMDGPU_FAMILY_SI' family cards, in 'si_common_early_init' func, init
'didt_rreg' and 'didt_wreg' to 'NULL'. But in func
'amdgpu_debugfs_regs_didt_read/write', using 'RREG32_DIDT' 'WREG32_DIDT'
lacks of relevant judgment. And other 'amdgpu_ip_block_version' that use
these two definitions won't be added for 'AMDGPU_FAMILY_SI'.

So, add null pointer judgment before calling.

Reviewed-by: Christian König &lt;christian.koenig@amd.com&gt;
Signed-off-by: Lu Yao &lt;yaolu@kylinos.cn&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>Merge drm/drm-next into drm-misc-next</title>
<updated>2023-11-15T09:56:44Z</updated>
<author>
<name>Maxime Ripard</name>
<email>mripard@kernel.org</email>
</author>
<published>2023-11-15T09:45:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3bf3e21c15d4386a5f15118ec39bbc1b67ea5759'/>
<id>urn:sha1:3bf3e21c15d4386a5f15118ec39bbc1b67ea5759</id>
<content type='text'>
Let's kickstart the v6.8 release cycle.

Signed-off-by: Maxime Ripard &lt;mripard@kernel.org&gt;
</content>
</entry>
<entry>
<title>drm/sched: Add drm_sched_wqueue_* helpers</title>
<updated>2023-11-01T21:29:20Z</updated>
<author>
<name>Matthew Brost</name>
<email>matthew.brost@intel.com</email>
</author>
<published>2023-10-31T03:24:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35963cf2cd25eeea8bdb4d02853dac1e66fb13a0'/>
<id>urn:sha1:35963cf2cd25eeea8bdb4d02853dac1e66fb13a0</id>
<content type='text'>
Add scheduler wqueue ready, stop, and start helpers to hide the
implementation details of the scheduler from the drivers.

v2:
  - s/sched_wqueue/sched_wqueue (Luben)
  - Remove the extra white line after the return-statement (Luben)
  - update drm_sched_wqueue_ready comment (Luben)

Cc: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Signed-off-by: Matthew Brost &lt;matthew.brost@intel.com&gt;
Reviewed-by: Luben Tuikov &lt;luben.tuikov@amd.com&gt;
Link: https://lore.kernel.org/r/20231031032439.1558703-2-matthew.brost@intel.com
Signed-off-by: Luben Tuikov &lt;ltuikov89@gmail.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL</title>
<updated>2023-10-26T22:41:22Z</updated>
<author>
<name>Qu Huang</name>
<email>qu.huang@linux.dev</email>
</author>
<published>2023-10-23T12:56:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5104fdf50d326db2c1a994f8b35dcd46e63ae4ad'/>
<id>urn:sha1:5104fdf50d326db2c1a994f8b35dcd46e63ae4ad</id>
<content type='text'>
In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log:

1. Navigate to the directory: /sys/kernel/debug/dri/0
2. Execute command: cat amdgpu_regs_smc
3. Exception Log::
[4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000
[4005007.702562] #PF: supervisor instruction fetch in kernel mode
[4005007.702567] #PF: error_code(0x0010) - not-present page
[4005007.702570] PGD 0 P4D 0
[4005007.702576] Oops: 0010 [#1] SMP NOPTI
[4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G           OE     5.15.0-43-generic #46-Ubunt       u
[4005007.702590] RIP: 0010:0x0
[4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206
[4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68
[4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000
[4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980
[4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000
[4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000
[4005007.702622] FS:  00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000
[4005007.702626] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0
[4005007.702633] Call Trace:
[4005007.702636]  &lt;TASK&gt;
[4005007.702640]  amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu]
[4005007.703002]  full_proxy_read+0x5c/0x80
[4005007.703011]  vfs_read+0x9f/0x1a0
[4005007.703019]  ksys_read+0x67/0xe0
[4005007.703023]  __x64_sys_read+0x19/0x20
[4005007.703028]  do_syscall_64+0x5c/0xc0
[4005007.703034]  ? do_user_addr_fault+0x1e3/0x670
[4005007.703040]  ? exit_to_user_mode_prepare+0x37/0xb0
[4005007.703047]  ? irqentry_exit_to_user_mode+0x9/0x20
[4005007.703052]  ? irqentry_exit+0x19/0x30
[4005007.703057]  ? exc_page_fault+0x89/0x160
[4005007.703062]  ? asm_exc_page_fault+0x8/0x30
[4005007.703068]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[4005007.703075] RIP: 0033:0x7f5e07672992
[4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f        1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 &lt;48&gt; 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 e       c 28 48 89 54 24
[4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
[4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992
[4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003
[4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010
[4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000
[4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000
[4005007.703105]  &lt;/TASK&gt;
[4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_       iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac t       tm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipm       i_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solo       mon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v       2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core        drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca
[4005007.703184] CR2: 0000000000000000
[4005007.703188] ---[ end trace ac65a538d240da39 ]---
[4005007.800865] RIP: 0010:0x0
[4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206
[4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68
[4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000
[4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980
[4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000
[4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000
[4005007.800891] FS:  00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000
[4005007.800895] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0

Signed-off-by: Qu Huang &lt;qu.huang@linux.dev&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
<entry>
<title>drm/amdgpu: Encapsulate all device reset info</title>
<updated>2023-10-20T19:11:28Z</updated>
<author>
<name>André Almeida</name>
<email>andrealmeid@igalia.com</email>
</author>
<published>2023-09-15T14:38:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d6a2a28cdeade75021503f86e57e7ebce7eb74c'/>
<id>urn:sha1:2d6a2a28cdeade75021503f86e57e7ebce7eb74c</id>
<content type='text'>
To better organize struct amdgpu_device, keep all reset information
related fields together in a separated struct.

Signed-off-by: André Almeida &lt;andrealmeid@igalia.com&gt;
Signed-off-by: Shashank Sharma &lt;shashank.sharma@amd.com&gt;
Reviewed-by: Shashank Sharma &lt;shashank.sharma@amd.com&gt;
Signed-off-by: Alex Deucher &lt;alexander.deucher@amd.com&gt;
</content>
</entry>
</feed>
