| Age | Commit message (Collapse) | Author | Lines |
|
[Why]
Implicit signed-to-unsigned conversions caused compiler
warnings in DC paths.
[How]
Added explicit (unsigned int)/(uint32_t) casts for sentinel -1
assignments and IRQ ~MASK initializers, with small cast alignment
in logging/DPCD code.
Functionality and behavior is unchanged; only type intent is explicit.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Clock manager contained significant duplicate code between
variants with identical logic for functions using only SMU
calls or shared registers. This increases maintenance overhead
and potential for bugs.
[How]
Expose clock constants and internal functions in header for
sharing. Remove duplicate implementations and update function
pointers to use shared functions. Refactor remaining
variant-specific functions to use shared constants and helper
functions. Add compatibility comments for hardware differences.
Reviewed-by: Dmytro Laktyushkin <dmytro.laktyushkin@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix Conversion that might result in a loss of data warnings in dmub/src/:
- dmub_dcn20/31/32/35/42/60/401.c: Add ASSERT(value <= 0xFF) and
explicit (uint8_t) cast when storing REG_GET results into uint8_t
debug struct fields. Add != 0 for bool assignments from uint32_t
bitfield reads.
- dmub_reg.c: Cast va_arg shift value to uint8_t with ASSERT guard
before passing to set_reg_field_value_masks().
- dmub_srv.c: Widen num_pending to uint64_t to match uint64_t
arithmetic; use != 0 for bool assignments from unsigned expressions.
No functional change intended.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
We were previously modifying the global dc->config.enable_4to1MPC
dynamically. These variables are meant as global configs, not to
by dynamically modified. Modifying them dynamically prevents us
from enabling/disabling functionality for debug purposes and can
easily lead to bad things since we're not operating on the current
state but on DC-wide variables.
Instead we should look at the existing split4mpc decision in
dcn20_validate_apply_split_flags and make the decision there,
if the global config.enable_4to1MPC is set to true for the
DCN version we're running.
This fixes corruption that is observed when running a new IGT
kms_colorop test for color-space-conversion that uses a
YUV plane and outputs to a writeback connector.
Co-developed by Claude Sonnet 4.5.
Assisted-by: Claude:claude-sonnet-4.5
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
dcn401_init_hw() assumes that update_bw_bounding_box() is valid when
entering the update path. However, the existing condition:
((!fams2_enable && update_bw_bounding_box) || freq_changed)
does not guarantee this, as the freq_changed branch can evaluate to true
independently of the callback pointer.
This can result in calling update_bw_bounding_box() when it is NULL.
Fix this by separating the update condition from the pointer checks and
ensuring the callback, dc->clk_mgr, and bw_params are validated before
use.
Fixes the below:
../dc/hwss/dcn401/dcn401_hwseq.c:367 dcn401_init_hw() error: we previously assumed 'dc->res_pool->funcs->update_bw_bounding_box' could be null (see line 362)
Fixes: ca0fb243c3bb ("drm/amd/display: Underflow Seen on DCN401 eGPU")
Cc: Daniel Sa <Daniel.Sa@amd.com>
Cc: Alvin Lee <alvin.lee2@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Validation expects to operate on non-split pipes. This is
seen in dcn20_fast_validate_bw, which merges pipes for
validation. We weren't doing that in the non-fast path
which lead to validation failures when operating with
4-to-1 MPC and a writeback connector.
Co-developed by Claude Sonnet 4.5
Assisted-by: Claude:claude-sonnet-4.5
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Harry Wentland <harry.wentland@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
'update_planes_and_stream_state'
Add missing info for the update_descriptor parameter in
update_planes_and_stream_state().
Fixes the below with gcc W=1:
../display/dc/core/dc.c:3630 function parameter 'update_descriptor' not described in 'update_planes_and_stream_state'
Fixes: c24bb00cc6cf ("drm/amd/display: Refactor DC update checks")
Cc: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Cc: Dillon Varone <Dillon.Varone@amd.com>
Cc: Chuanyu Tseng <chuanyu.tseng@amd.com>
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
clk_mgr_construct() initializes display clock and memory bandwidth
settings during driver bring-up.
As part of this, the driver selects a watermark table based on the
memory type (DDR4, LPDDR4, LPDDR5) from ctx->dc_bios->integrated_info.
The display pipeline continuously reads pixel data from memory,
processes it (such as scaling, color conversion, and blending), and
sends it to the screen. To keep this pipeline running smoothly, the
driver must ensure there is enough memory bandwidth and that clocks are
increased when needed.
Watermark tables define when the GPU should increase clocks to ensure
there is enough bandwidth to feed pixel data without underflow.
However, ctx->dc_bios->integrated_info is dereferenced without checking
for NULL in multiple clk_mgr_construct() implementations. On some
platforms, BIOS may not provide this information, and accessing it
directly can cause a NULL pointer dereference during initialization.
Fix this by adding a NULL check before accessing integrated_info.
If integrated_info is not available, the driver safely falls back to
default watermark tables.
Fixes:
../dcn21/rn_clk_mgr.c:775 rn_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 743)
../dcn301/vg_clk_mgr.c:750 vg_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 736)
../dcn31/dcn31_clk_mgr.c:789 dcn31_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 728)
../dcn314/dcn314_clk_mgr.c:906 dcn314_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 845)
../dcn315/dcn315_clk_mgr.c:716 dcn315_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 655)
../dcn316/dcn316_clk_mgr.c:660 dcn316_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 639)
../dcn35/dcn35_clk_mgr.c:1540 dcn35_clk_mgr_construct() warn: variable dereferenced before check 'ctx->dc_bios->integrated_info' (see line 1467)
Fixes: 25879d7b4986 ("drm/amd/display: Clean FPGA code in dc")
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In dc_dmub_srv_log_diagnostic_data() and
dc_dmub_srv_enable_dpia_trace().
Both functions check:
if (!dc_dmub_srv || !dc_dmub_srv->dmub)
and then call DC_LOG_ERROR() inside that block.
DC_LOG_ERROR() uses dc_dmub_srv->ctx internally. So if
dc_dmub_srv is NULL, the logging itself can dereference a
NULL pointer and cause a crash.
Fix this by splitting the checks.
First check if dc_dmub_srv is NULL and return immediately.
Then check dc_dmub_srv->dmub and log the error only when
dc_dmub_srv is valid.
Fixes the below:
../display/dc/dc_dmub_srv.c:962 dc_dmub_srv_log_diagnostic_data() error: we previously assumed 'dc_dmub_srv' could be null (see line 961)
../display/dc/dc_dmub_srv.c:1167 dc_dmub_srv_enable_dpia_trace() error: we previously assumed 'dc_dmub_srv' could be null (see line 1166)
Fixes: 2631ac1ac328 ("drm/amd/display: add DMUB registers to crash dump diagnostic data.")
Fixes: 71ba6b577a35 ("drm/amd/display: Add interface to enable DPIA trace")
Cc: Roman Li <roman.li@amd.com>
Cc: Alex Hung <alex.hung@amd.com>
Cc: Tom Chung <chiahsuan.chung@amd.com>
Cc: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Multiple locations in dml2_0 used num_clk_values-1 as array index
without checking if num_clk_values > 0. When num_clk_values is 0,
this results in accessing array index -1, which wraps to 255 for
unsigned types, causing out-of-bounds memory access and potential
crashes.
[How]
Add proper bounds checking using ternary operators to guard all
num_clk_values-1 array accesses. When num_clk_values is 0, return 0
as fallback value instead of accessing invalid memory. This prevents
buffer overflows while maintaining backward compatibility and provides
sensible default behavior for empty clock tables.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
The OTG's virtual pixel clock source for DVI comes from the PHY.
If the signal type is DVI then the OTG can become stuck on pre DCN401
ASIC when DPMS off occurs because the OTG remains running but the
PHY transmitter is disabled.
[How]
There exists logic to keep track of the OTG running refcount on the
link to determine if the link needs to go to PLL_EN instead of TX_EN
but the logic only checks for HDMI TMDS on older ASIC.
DVI is still a TMDS signal type so the constraint should also apply.
Replace the checks for dc_is_hdmi_tmds_signal with dc_is_tmds_signal to
cover both HDMI and DVI for the symclk refcount workaround.
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Reviewed-by: Charlene Liu <charlene.liu@amd.com>
Signed-off-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Handling unused function parameter due to cause compiler warning
Reviewed-by: Clayton King <clayton.king@amd.com>
Reviewed-by: Dillon Varone <dillon.varone@amd.com>
Signed-off-by: Gaghik Khachatrian <gaghik.khachatrian@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[Why]
Under DCN21, observe flip_done timeout issue while
running 3D benchmark under MPO case. Timeout is caused
by driver fails validate_bandwidth() during
atomic_commit_tail but passes atomic_check.
Under further analysis, indicates the delta of
atomic_check and atomic_commit_tail are
dc->current_state->bw_ctx.dml.soc.sr_exit_time_us and
dc->current_state->bw_ctx.dml.soc.sr_enter_plus_exit_time_us.
We set validate_mode as DC_VALIDATE_MODE_ONLY while calling
dc_validate_global_state() at atomic_check, but set mode as
DC_VALIDATE_MODE_AND_PROGRAMMING during atomic_commit_tail.
If dc_validate_mode set as DC_VALIDATE_MODE_ONLY,
validate_bandwidth() will skip the wm and dlg calculation.
During commit_tail, validate_bandwidth() is called with
dc_validate_mode set as DC_VALIDATE_MODE_AND_PROGRAMMING and
dc_state->bw_ctx.dml.soc.sr_exit_time_us might get modified
after the wm_calculation and stored into dc->current_state.
Which means dc->current_state->bw_ctx.dml.soc.sr_exit_time_us
might not aligned with the one stored in dm_state->context.
That causes duplicated dm_state->context not aligned with
dc->current_state, and might have bandwidth validation pass
in atomic_check and fail in commit_tail later.
[How]
When the issue occurs, it fails dml_get_voltage_level() with
the condition dm_allow_self_refresh_and_mclk_switch but pass
with the condition dm_allow_self_refresh. However, we should
support p-state. So we should not pass validate_bandwidth by
allowing self refresh only. Change the policy under DCN21.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Wayne Lin <Wayne.Lin@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Add DCN4.2 to the list that supports
Panel Replay feature.
Reviewed-by: Alex Hung <Alex.Hung@amd.com>
Signed-off-by: Roman Li <Roman.Li@amd.com>
Signed-off-by: Chuanyu Tseng <Chuanyu.Tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
[why]
DCN without DMCUB is not a supported configuration on DCN42.
[how]
Remove the DC_DMCUB_ENABLE fuse register check and remove the
corresponding entries in the DCN42 DMUB register list.
Reviewed-by: Nicholas Kazlauskas <nicholas.kazlauskas@amd.com>
Signed-off-by: Gabe Teeger <gabe.teeger@amd.com>
Signed-off-by: Matthew Stewart <Matthew.Stewart2@amd.com>
Signed-off-by: Roman Li <roman.li@amd.com>
Signed-off-by: Chuanyu Tseng <chuanyu.tseng@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
get_gpio_i2c_info() computes the number of GPIO I2C assignment records
present in the BIOS table and then uses bfI2C_LineMux as an array index
into header->asGPIO_Info[]. The current check only rejects values
strictly larger than the record count, so an index equal to count still
falls through and reaches the fixed table one element past the end.
Reject indices at or above the number of available records before using
them as an array index.
Signed-off-by: Pengpeng Hou <pengpeng@iscas.ac.cn>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Currently, AMDGPU_VA_RESERVED_TRAP_SIZE is hardcoded to 8KB, while
KFD_CWSR_TBA_TMA_SIZE is defined as 2 * PAGE_SIZE. On systems with
4K pages, both values match (8KB), so allocation and reserved space
are consistent.
However, on 64K page-size systems, KFD_CWSR_TBA_TMA_SIZE becomes 128KB,
while the reserved trap area remains 8KB. This mismatch causes the
kernel to crash when running rocminfo or rccl unit tests.
Kernel attempted to read user page (2) - exploit attempt? (uid: 1001)
BUG: Kernel NULL pointer dereference on read at 0x00000002
Faulting instruction address: 0xc0000000002c8a64
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
CPU: 34 UID: 1001 PID: 9379 Comm: rocminfo Tainted: G E
6.19.0-rc4-amdgpu-00320-gf23176405700 #56 VOLUNTARY
Tainted: [E]=UNSIGNED_MODULE
Hardware name: IBM,9105-42A POWER10 (architected) 0x800200 0xf000006
of:IBM,FW1060.30 (ML1060_896) hv:phyp pSeries
NIP: c0000000002c8a64 LR: c00000000125dbc8 CTR: c00000000125e730
REGS: c0000001e0957580 TRAP: 0300 Tainted: G E
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24008268
XER: 00000036
CFAR: c00000000125dbc4 DAR: 0000000000000002 DSISR: 40000000
IRQMASK: 1
GPR00: c00000000125d908 c0000001e0957820 c0000000016e8100
c00000013d814540
GPR04: 0000000000000002 c00000013d814550 0000000000000045
0000000000000000
GPR08: c00000013444d000 c00000013d814538 c00000013d814538
0000000084002268
GPR12: c00000000125e730 c000007e2ffd5f00 ffffffffffffffff
0000000000020000
GPR16: 0000000000000000 0000000000000002 c00000015f653000
0000000000000000
GPR20: c000000138662400 c00000013d814540 0000000000000000
c00000013d814500
GPR24: 0000000000000000 0000000000000002 c0000001e0957888
c0000001e0957878
GPR28: c00000013d814548 0000000000000000 c00000013d814540
c0000001e0957888
NIP [c0000000002c8a64] __mutex_add_waiter+0x24/0xc0
LR [c00000000125dbc8] __mutex_lock.constprop.0+0x318/0xd00
Call Trace:
0xc0000001e0957890 (unreliable)
__mutex_lock.constprop.0+0x58/0xd00
amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu+0x6fc/0xb60 [amdgpu]
kfd_process_alloc_gpuvm+0x54/0x1f0 [amdgpu]
kfd_process_device_init_cwsr_dgpu+0xa4/0x1a0 [amdgpu]
kfd_process_device_init_vm+0xd8/0x2e0 [amdgpu]
kfd_ioctl_acquire_vm+0xd0/0x130 [amdgpu]
kfd_ioctl+0x514/0x670 [amdgpu]
sys_ioctl+0x134/0x180
system_call_exception+0x114/0x300
system_call_vectored_common+0x15c/0x2ec
This patch changes AMDGPU_VA_RESERVED_TRAP_SIZE to 64 KB and
KFD_CWSR_TBA_TMA_SIZE to the AMD GPU page size. This means we reserve
64 KB for the trap in the address space, but only allocate 8 KB within
it. With this approach, the allocation size never exceeds the reserved
area.
Fixes: 34a1de0f7935 ("drm/amdkfd: Relocate TBA/TMA to opposite side of VM hole")
Reviewed-by: Christian König <christian.koenig@amd.com>
Suggested-by: Felix Kuehling <felix.kuehling@amd.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix the code alignment for if condition and also provide
a line space between multiline if condition and next
statement.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
reset ras eeprom table when it is invalid
Signed-off-by: Gangliang Xie <ganglxie@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In mes_userq_mqd_create(), the memdup_user() allocations for
IP-specific MQD structs are not freed when subsequent VA validation
fails. The goto free_mqd label only cleans up the MQD BO object and
userq_props.
Fix by adding kfree() before each goto free_mqd on VA validation
failure in the COMPUTE, GFX, and SDMA branches.
Fixes: 9e46b8bb0539 ("drm/amdgpu: validate userq buffer virtual address and size")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Reviewed-by: Prike Liang <Prike.Liang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
For gfxV9, due to a hardware bug ("based on the comments in the code
here [1]"), the control stack of a user-mode compute queue must be
allocated immediately after the page boundary of its regular MQD buffer.
To handle this, we allocate an enlarged MQD buffer where the first page
is used as the MQD and the remaining pages store the control stack.
Although these regions share the same BO, they require different memory
types: the MQD must be UC (uncached), while the control stack must be
NC (non-coherent), matching the behavior when the control stack is
allocated in user space.
This logic works correctly on systems where the CPU page size matches
the GPU page size (4K). However, the current implementation aligns both
the MQD and the control stack to the CPU PAGE_SIZE. On systems with a
larger CPU page size, the entire first CPU page is marked UC—even though
that page may contain multiple GPU pages. The GPU treats the second 4K
GPU page inside that CPU page as part of the control stack, but it is
incorrectly mapped as UC.
This patch fixes the issue by aligning both the MQD and control stack
sizes to the GPU page size (4K). The first 4K page is correctly marked
as UC for the MQD, and the remaining GPU pages are marked NC for the
control stack. This ensures proper memory type assignment on systems
with larger CPU page sizes.
[1]: https://elixir.bootlin.com/linux/v6.18/source/drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_v9.c#L118
Acked-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
The AQL queue size can be 4K, but the minimum buffer object (BO)
allocation size is PAGE_SIZE. On systems with a page size larger
than 4K, the expected queue size does not match the allocated BO
size, causing queue creation to fail.
Align the expected queue size to PAGE_SIZE so that it matches the
allocated BO size and allows queue creation to succeed.
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Coccinelle flags hand-rolled "enabled"/"disabled" strings; use the shared
str_enabled_disabled() helper from string_choices.h for npm_status and
thermal throttling logging sysfs text.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202603251434.zIN2QYWn-lkp@intel.com/
Signed-off-by: Asad Kamal <asad.kamal@amd.com>
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
Fix the IDR allocation flags by using atomic GFP
flags in non‑sleepable contexts to avoid the __might_sleep()
complaint.
268.290239] [drm] Initialized amdgpu 3.64.0 for 0000:03:00.0 on minor 0
[ 268.294900] BUG: sleeping function called from invalid context at ./include/linux/sched/mm.h:323
[ 268.295355] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1744, name: modprobe
[ 268.295705] preempt_count: 1, expected: 0
[ 268.295886] RCU nest depth: 0, expected: 0
[ 268.296072] 2 locks held by modprobe/1744:
[ 268.296077] #0: ffff8c3a44abd1b8 (&dev->mutex){....}-{4:4}, at: __driver_attach+0xe4/0x210
[ 268.296100] #1: ffffffffc1a6ea78 (amdgpu_pasid_idr_lock){+.+.}-{3:3}, at: amdgpu_pasid_alloc+0x26/0xe0 [amdgpu]
[ 268.296494] CPU: 12 UID: 0 PID: 1744 Comm: modprobe Tainted: G U OE 6.19.0-custom #16 PREEMPT(voluntary)
[ 268.296498] Tainted: [U]=USER, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
[ 268.296499] Hardware name: AMD Majolica-RN/Majolica-RN, BIOS RMJ1009A 06/13/2021
[ 268.296501] Call Trace:
Fixes: 8f1de51f49be ("drm/amdgpu: prevent immediate PASID reuse case")
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
In amdgpu_device_fini_hw(), deferred coredump formatting work may still
be pending when hardware and IP components are being torn down. Since
the work may access device registers and memory that will be freed or
powered off, it must be completed before proceeding.
Add a flush_work() call for adev->coredump_work, guarded by
CONFIG_DEV_COREDUMP, to ensure any pending coredump work finishes
before the device enters the early IP fini stage.
This avoids potential use-after-free or accessing hardware resources
that are no longer available.
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
During GPU reset coredump generation, amdgpu_devcoredump_fw_info() unconditionally
dereferences adev->mode_info.atom_context to print VBIOS fields. On reset/teardown
paths this pointer can be NULL, causing a kernel page fault from the deferred
coredump workqueue.
Fix by checking ctx before printing VBIOS fields:
if ctx is valid, print full VBIOS information as before;
This prevents NULL-dereference crashes while preserving coredump output.
Observed page fault log:
[ 667.933329] RIP: 0010:amdgpu_devcoredump_format+0x780/0xc00 [amdgpu]
[ 667.941517] amdgpu 0002:01:00.0: Dumping IP State
[ 667.949660] Code: 8d 57 74 48 c7 c6 01 65 9f c2 48 8d 7d 98 e8 97 96 7a ff 49 8d 97 b4 00 00 00 48 c7 c6 18 65 9f c2 48 8d 7d 98 e8 80 96 7a ff <41> 8b 97 f4 00 00 00 48 c7 c6 2f 65 9f c2 48 8d 7d 98 e8 69 96 7a
[ 667.949666] RSP: 0018:ffffc9002302bd50 EFLAGS: 00010246
[ 667.949673] RAX: 0000000000000000 RBX: ffff888110600000 RCX: 0000000000000000
[ 667.949676] RDX: 000000000000a9b5 RSI: 0000000000000405 RDI: 000000000000a999
[ 667.949680] RBP: ffffc9002302be00 R08: ffffffffc09c3084 R09: ffffffffc09c3085
[ 667.949684] R10: 0000000000000000 R11: 0000000000000004 R12: 00000000000048e0
[ 667.993908] amdgpu 0002:01:00.0: Dumping IP State Completed
[ 667.994229] R13: 0000000000000025 R14: 000000000000000c R15: 0000000000000000
[ 667.994233] FS: 0000000000000000(0000) GS:ffff88c44c2c9000(0000) knlGS:0000000000000000
[ 668.000076] amdgpu 0002:01:00.0: [drm] AMDGPU device coredump file has been created
[ 668.008025] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 668.008030] CR2: 00000000000000f4 CR3: 000000011195f001 CR4: 0000000000770ef0
[ 668.008035] PKRU: 55555554
[ 668.008040] Call Trace:
[ 668.008045] <TASK>
[ 668.016010] amdgpu 0002:01:00.0: [drm] Check your /sys/class/drm/card16/device/devcoredump/data
[ 668.023967] ? srso_alias_return_thunk+0x5/0xfbef5
[ 668.023988] ? __pfx___drm_printfn_coredump+0x10/0x10 [drm]
[ 668.031950] amdgpu 0003:01:00.0: Dumping IP State
[ 668.038159] ? __pfx___drm_puts_coredump+0x10/0x10 [drm]
[ 668.083017] amdgpu 0003:01:00.0: Dumping IP State Completed
[ 668.083824] amdgpu_devcoredump_deferred_work+0x26/0xc0 [amdgpu]
[ 668.086163] amdgpu 0003:01:00.0: [drm] AMDGPU device coredump file has been created
[ 668.095863] process_scheduled_works+0xa6/0x420
[ 668.095880] worker_thread+0x12a/0x270
[ 668.101223] amdgpu 0003:01:00.0: [drm] Check your /sys/class/drm/card24/device/devcoredump/data
[ 668.107441] kthread+0x10d/0x230
[ 668.107451] ? __pfx_worker_thread+0x10/0x10
[ 668.107458] ? __pfx_kthread+0x10/0x10
[ 668.112709] amdgpu 0000:01:00.0: ring vcn_unified_1 timeout, signaled seq=9, emitted seq=10
[ 668.118630] ret_from_fork+0x17c/0x1f0
[ 668.118640] ? __pfx_kthread+0x10/0x10
[ 668.118647] ret_from_fork_asm+0x1a/0x30
Reviewed-by: Lijo Lazar <lijo.lazar@amd.com>
Suggested-by: Lijo Lazar <lijo.lazar@amd.com>
Signed-off-by: Jesse Zhang <jesse.zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_userq_vm_validate function does not need userq_mutex and exec
lock is good enough to locking all bos and updating the eviction fence.
Also since we only need userq_mutex for amdgpu_userq_restore_all
so move the locks in the function itself.
Signed-off-by: Sunil Khatri <sunil.khatri@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
amdgpu_userq_get_doorbell_index() passes the user-provided
doorbell_offset to amdgpu_doorbell_index_on_bar() without bounds
checking. An arbitrarily large doorbell_offset can cause the
calculated doorbell index to fall outside the allocated doorbell BO,
potentially corrupting kernel doorbell space.
Validate that doorbell_offset falls within the doorbell BO before
computing the BAR index, using u64 arithmetic to prevent overflow.
Fixes: f09c1e6077ab ("drm/amdgpu: generate doorbell index for userqueue")
Reported-by: Yuhao Jiang <danisjiang@gmail.com>
Signed-off-by: Junrui Luo <moonafterrain@outlook.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
|
|
xe_device_declare_wedged() runs in the DMA-fence signaling path, where
GFP_KERNEL memory allocations are not allowed. However, registering
xe_device_wedged_fini via drmm_add_action_or_reset() triggers a
GFP_KERNEL allocation.
Fix this by deferring the registration of xe_device_wedged_fini until
late in the driver load sequence. Additionally, drop the wedged PM
reference only if the device is actually wedged in
xe_device_wedged_fini.
Fixes: 452bca0edbd0 ("drm/xe: Don't suspend device upon wedge")
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260326210116.202585-2-matthew.brost@intel.com
(cherry picked from commit b08ceb443866808b881b12d4183008d214d816c1)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
When an SVM is closed, the garbage collector work item must be stopped
synchronously and any future queuing must be prevented. Replace
flush_work() with disable_work_sync() to ensure both conditions are
met.
Fixes: 63f6e480d115 ("drm/xe: Add SVM garbage collector")
Cc: stable@vger.kernel.org
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Reviewed-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Link: https://patch.msgid.link/20260227015225.3081787-1-matthew.brost@intel.com
(cherry picked from commit 2247feb9badca5a4774df9a437bfc44fba4f22de)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
On PTL, older GSC FWs have a bug that can cause them to crash during
PXP invalidation events, which leads to a complete loss of power
management on the media GT. Therefore, we can't use PXP on FWs that
have this bug, which was fixed in PTL GSC build 1396.
Fixes: b1dcec9bd8a1 ("drm/xe/ptl: Enable PXP for PTL")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Acked-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/20260324153718.3155504-10-daniele.ceraolospurio@intel.com
(cherry picked from commit 6eb04caaa972934c9b6cea0e0c29e466bf9a346f)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
If we don't clear the flag we'll keep jumping back at the beginning of
the function once we reach the end.
Fixes: ccd3c6820a90 ("drm/xe/pxp: Decouple queue addition from PXP start")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patch.msgid.link/20260324153718.3155504-9-daniele.ceraolospurio@intel.com
(cherry picked from commit 0850ec7bb2459602351639dccf7a68a03c9d1ee0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The default case of the PXP suspend switch is incorrectly exiting
without releasing the lock. However, this case is impossible to hit
because we're switching on an enum and all the valid enum values have
their own cases. Therefore, we can just get rid of the default case
and rely on the compiler to warn us if a new enum value is added and
we forget to add it to the switch.
Fixes: 51462211f4a9 ("drm/xe/pxp: add PXP PM support")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patch.msgid.link/20260324153718.3155504-8-daniele.ceraolospurio@intel.com
(cherry picked from commit f1b5a77fc9b6a90cd9a5e3db9d4c73ae1edfcfac)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
If the PXP HW termination fails during PXP start, the normal completion
code won't be called, so the termination will remain uncomplete. To avoid
unnecessary waits, mark the termination as completed from the error path.
Note that we already do this if the termination fails when handling a
termination irq from the HW.
Fixes: f8caa80154c4 ("drm/xe/pxp: Add PXP queue tracking and session start")
Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com>
Cc: Alan Previn Teres Alexis <alan.previn.teres.alexis@intel.com>
Cc: Julia Filipchuk <julia.filipchuk@intel.com>
Reviewed-by: Julia Filipchuk <julia.filipchuk@intel.com>
Link: https://patch.msgid.link/20260324153718.3155504-7-daniele.ceraolospurio@intel.com
(cherry picked from commit 5d9e708d2a69ab1f64a17aec810cd7c70c5b9fab)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Userspace passes canonical (sign-extended) GPU addresses where bits 63:48
mirror bit 47. The internal GPUVM uses non-canonical form (upper bits
zeroed), so passing raw canonical addresses into GPUVM lookups causes
mismatches for addresses above 128TiB.
Strip the sign extension with xe_device_uncanonicalize_addr() at the
top of xe_vm_madvise_ioctl(). Non-canonical addresses are unaffected.
Fixes: ada7486c5668 ("drm/xe: Implement madvise ioctl for xe")
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>
Signed-off-by: Arvind Yadav <arvind.yadav@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260326130843.3545241-13-arvind.yadav@intel.com
(cherry picked from commit 05c8b1cdc54036465ea457a0501a8c2f9409fce7)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
The page fault handler should reject write/atomic access to read only
VMAs. Add code to handle this in xe_pagefault_service after the VMA
lookup.
v2:
- Apply max line length (Matthew)
Fixes: fb544b844508 ("drm/xe: Implement xe_pagefault_queue_work")
Signed-off-by: Jonathan Cavitt <jonathan.cavitt@intel.com>
Suggested-by: Matthew Brost <matthew.brost@intel.com>
Cc: Shuicheng Lin <shuicheng.lin@intel.com>
Reviewed-by: Matthew Brost <matthew.brost@intel.com>
Signed-off-by: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/20260324152935.72444-7-jonathan.cavitt@intel.com
(cherry picked from commit 714ee6754ac5fa3dc078856a196a6b124cd797a0)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@intel.com>
|
|
Ast's DP501 initialization reads the register SCU2C at offset 0x1202c
and tries to set it to source data from VGA. But writes the update to
offset 0x0, with unknown results. Write the result to SCU instead.
The bug only happens in ast_init_analog(). There's similar code in
ast_init_dvo(), which works correctly.
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Fixes: 83c6620bae3f ("drm/ast: initial DP501 support (v0.2)")
Reviewed-by: Jocelyn Falempe <jfalempe@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Thomas Zimmermann <tzimmermann@suse.de>
Cc: Jocelyn Falempe <jfalempe@redhat.com>
Cc: dri-devel@lists.freedesktop.org
Cc: <stable@vger.kernel.org> # v3.16+
Link: https://patch.msgid.link/20260327133532.79696-2-tzimmermann@suse.de
|
|
Boris needs 7.0-rc6 for a shmem helper fix.
Signed-off-by: Maxime Ripard <mripard@kernel.org>
|
|
Stop adjusting the horizontal timing values based on the
compression ratio in command mode. Bspec seems to be telling
us to do this only in video mode, and this is also how the
Windows driver does things.
This should also fix a div-by-zero on some machines because
the adjusted htotal ends up being so small that we end up with
line_time_us==0 when trying to determine the vtotal value in
command mode.
Note that this doesn't actually make the display on the
Huawei Matebook E work, but at least the kernel no longer
explodes when the driver loads.
Cc: stable@vger.kernel.org
Closes: https://gitlab.freedesktop.org/drm/i915/kernel/-/issues/12045
Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com>
Link: https://patch.msgid.link/20260326111814.9800-2-ville.syrjala@linux.intel.com
Fixes: 53693f02d80e ("drm/i915/dsi: account for DSC in horizontal timings")
Reviewed-by: Jani Nikula <jani.nikula@intel.com>
(cherry picked from commit 0b475e91ecc2313207196c6d7fd5c53e1a878525)
Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
|
|
Factor out a chunk of complexity into a new subroutine. This is an
incremental step in adding ELF32 support to the existing ELF64 section
support, for handling GPU firmware.
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260326013902.588242-9-jhubbard@nvidia.com
[acourbot: use fuller prefix in commit message.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Up until now, only the GSP required parsing of its firmware headers.
However, upcoming support for Hopper/Blackwell+ adds another firmware
image (FMC), along with another format (ELF32).
Therefore, the current ELF64 section parsing support needs to be moved
up a level, so that both of the above can use it.
There are no functional changes. This is pure code movement.
Reviewed-by: Gary Guo <gary@garyguo.net>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Acked-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260326013902.588242-8-jhubbard@nvidia.com
[acourbot: use fuller prefix in commit message.]
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-next
Hi Dave and Sima,
Here goes our late, final drm-xe-next PR towards 7.1. We just purgeable
BO uAPI in today, hence the late pull.
In the big things we have:
- Add support for purgeable buffer objects
Thanks,
Matt
UAPI Changes:
- Add support for purgeable buffer objects (Arvind, Himal)
Driver Changes:
- Remove useless comment (Maarten)
- Issue GGTT invalidation under lock in ggtt_node_remove (Brost, Fixes)
- Fix mismatched include guards in header files (Shuicheng)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Matthew Brost <matthew.brost@intel.com>
Link: https://patch.msgid.link/acX4fWxPkZrrfwnT@gsse-cloud1.jf.intel.com
|
|
Replace the nova-core local `DmaObject` with a `CoherentBox` that can
fulfill the same role.
Since `CoherentBox` is more flexible than `DmaObject`, we can use the
native `u64` type for page table entries instead of messing with bytes.
The `dma` module becomes unused with that change, so remove it as well.
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260327-b4-nova-dma-removal-v2-7-616e1d0b5cb3@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Replace the nova-core local `DmaObject` with a `Coherent` that can
fulfill the same role.
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260327-b4-nova-dma-removal-v2-6-616e1d0b5cb3@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Replace the nova-core local `DmaObject` with a `CoherentHandle` that can
fulfill the same role.
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260327-b4-nova-dma-removal-v2-5-616e1d0b5cb3@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Replace the nova-core local `DmaObject` with a `Coherent` that can
fulfill the same role.
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260327-b4-nova-dma-removal-v2-4-616e1d0b5cb3@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Replace the nova-core local `DmaObject` with a `Coherent` that can
fulfill the same role.
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260327-b4-nova-dma-removal-v2-3-616e1d0b5cb3@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
Replace the nova-core local `DmaObject` with a `Coherent` that can
fulfill the same role.
Reviewed-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Danilo Krummrich <dakr@kernel.org>
Link: https://patch.msgid.link/20260327-b4-nova-dma-removal-v2-2-616e1d0b5cb3@nvidia.com
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
|
|
https://git.kernel.org/pub/scm/linux/kernel/git/chunkuang.hu/linux into drm-fixes
Mediatek DRM Fixes - 20260323
1. dsi: Store driver data before invoking mipi_dsi_host_register
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Chun-Kuang Hu <chunkuang.hu@kernel.org>
Link: https://patch.msgid.link/20260323160135.39609-1-chunkuang.hu@kernel.org
|
|
Switch LogBuffer from Coherent<[u8]> (unsized) to
Coherent<[u8; LOG_BUFFER_SIZE]> (sized). The buffer size is a
compile-time constant (RM_LOG_BUFFER_NUM_PAGES * GSP_PAGE_SIZE), so a
fixed-size array is more precise and avoids the need for the runtime
length parameter of zeroed_slice().
Acked-by: Gary Guo <gary@garyguo.net>
Reviewed-by: Alexandre Courbot <acourbot@nvidia.com>
Link: https://patch.msgid.link/20260325003921.3420-3-dakr@kernel.org
Signed-off-by: Danilo Krummrich <dakr@kernel.org>
|