path: root/drivers/gpu/drm/amd/ras
Age / Commit message / Author / Lines
11 daysConvert more 'alloc_obj' cases to default GFP_KERNEL argumentsLinus Torvalds-2/+1
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones where it is easy to verify the resulting diff, by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that for a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 daysConvert 'alloc_obj' family to use the new default GFP_KERNEL argumentLinus Torvalds-9/+9
This was done entirely with mindless brute force, using

    git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
        xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'

to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument.

Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically.

For the same reason the 'flex' versions will be done as a separate conversion.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
12 daystreewide: Replace kmalloc with kmalloc_obj for non-scalar typesKees Cook-11/+11
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances:

Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)

Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)

Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)

(where TYPE may also be *VAR)

The resulting allocations no longer return "void *", instead returning "TYPE *".

Signed-off-by: Kees Cook <kees@kernel.org>
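The shape of the single and array conversions described above can be sketched in userspace C. This is only an illustration under stated assumptions: alloc_obj_sketch() and alloc_objs_sketch() are hypothetical stand-ins built on malloc()/calloc(), they take no GFP flags, they accept only type names (not the *VAR form), and struct ras_record_sketch is invented for the example. The kernel's real macros are defined differently.

```c
#include <stdlib.h>

/*
 * Hypothetical userspace stand-ins, NOT the kernel macros: they only
 * show the idea of naming the type once and getting a typed pointer back.
 * Unlike kmalloc_objs(), the array variant here zeroes memory because it
 * uses calloc() for its built-in overflow check.
 */
#define alloc_obj_sketch(TYPE)         ((TYPE *)malloc(sizeof(TYPE)))
#define alloc_objs_sketch(TYPE, COUNT) ((TYPE *)calloc((COUNT), sizeof(TYPE)))

/* Invented record type, used only for this example. */
struct ras_record_sketch {
    unsigned long long addr;
    unsigned int err_type;
};

/* Old style: size spelled out by hand, result is an untyped "void *". */
struct ras_record_sketch *record_alloc_old(void)
{
    return malloc(sizeof(struct ras_record_sketch));
}

/* New style: the macro derives the size and the result type from TYPE. */
struct ras_record_sketch *record_alloc_new(void)
{
    return alloc_obj_sketch(struct ras_record_sketch);
}

/* Array form: element type first, then the element count. */
struct ras_record_sketch *record_array_alloc_new(size_t count)
{
    return alloc_objs_sketch(struct ras_record_sketch, count);
}
```

Because the macro result already has type "TYPE *", assigning it to a pointer of a different struct type would draw a compiler diagnostic, which is the property the commit message highlights.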
2026-02-05drm/amd/ras: statistic xgmi training error countStanley.Yang-1/+1
Report xgmi training error uncorrectable error count. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd/ras: Replace NPS flags in ras moduleJinzhou Su-1/+1
Replace AMDGPU_NPS8_PARTITION_MODE with UMC_MEMORY_PARTITION_MODE_NPS8 to pass sriov compilation. Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2026-01-05drm/amd/ras: Support physical address convertJinzhou Su-11/+68
Support converting a physical address to pages in the current NPS mode in uniras. Signed-off-by: Jinzhou Su <jinzhou.su@amd.com> Reviewed-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-10drm/amd/ras: Add vram_type to ras_ta_init_flagsCandice Li-0/+4
Add vram_type to ras_ta_init_flags. Signed-off-by: Candice Li <candice.li@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-10drm/amd/ras: Reduce stack usage in amdgpu_virt_ras_get_cper_records()Srinivasan Shanmugam-4/+13
amdgpu_virt_ras_get_cper_records() was using a large stack array of ras_log_info pointers. This contributed to the frame size warning on this function. Replace the fixed-size stack array: struct ras_log_info *trace[MAX_RECORD_PER_BATCH]; with a heap-allocated array using kcalloc(). We free the trace buffer together with out_buf on all exit paths. If allocation of trace or out_buf fails, we return a generic RAS error code. This reduces stack usage and keeps the runtime behaviour unchanged. Fixes: stack frame size: 1112 bytes (limit: 1024) Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Reduce stack usage in ras_umc_handle_bad_pages()Srinivasan Shanmugam-8/+21
ras_umc_handle_bad_pages() function used a large local array: struct eeprom_umc_record records[MAX_ECC_NUM_PER_RETIREMENT]; Move this array off the stack by allocating it with kcalloc() and freeing it before return. This reduces the stack frame size of ras_umc_handle_bad_pages() and avoids the frame size warning. Fixes the below: drivers/gpu/drm/amd/amdgpu/../ras/rascore/ras_umc.c:498:5: warning: stack frame size (1208) exceeds limit (1024) in 'ras_umc_handle_bad_pages' [-Wframe-larger-than] v2: Removed the duplicate ras_umc_get_new_records() invocation. (Lijo) Cc: Tao Zhou <tao.zhou1@amd.com> Cc: Hawking Zhang <Hawking.Zhang@amd.com> Cc: Christian König <christian.koenig@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Srinivasan Shanmugam <srinivasan.shanmugam@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
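The stack-to-heap change described above follows a common pattern, sketched here in userspace C. The names handle_bad_pages_sketch(), struct umc_record_sketch and MAX_RECORDS_SKETCH are hypothetical, and calloc()/free() stand in for the kernel's kcalloc()/kfree(); the real function body is not shown in this log.

```c
#include <stdlib.h>

#define MAX_RECORDS_SKETCH 64 /* assumed bound, not the driver's value */

/* Invented record layout, used only for this example. */
struct umc_record_sketch {
    unsigned long long retired_page;
    unsigned int channel;
};

/*
 * Returns the number of pages handled, or -1 on bad input or allocation
 * failure. The scratch array lives on the heap, so the stack frame stays
 * small no matter how large MAX_RECORDS_SKETCH is.
 */
int handle_bad_pages_sketch(const unsigned long long *pages, int count)
{
    struct umc_record_sketch *records;
    int handled = 0;
    int i;

    if (count < 0 || count > MAX_RECORDS_SKETCH)
        return -1;

    /* Was (conceptually): struct umc_record_sketch records[MAX_RECORDS_SKETCH]; */
    records = calloc(MAX_RECORDS_SKETCH, sizeof(*records));
    if (!records)
        return -1;

    for (i = 0; i < count; i++) {
        records[i].retired_page = pages[i];
        handled++;
    }

    /* Free the scratch buffer before returning. */
    free(records);
    return handled;
}
```

The trade-off is an extra failure path: the function can now fail on allocation, so callers need to treat the error return as a possibility.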
2025-12-08drm/amd/ras: Compatible with legacy sriov hostYiPeng Chai-0/+36
If sriov host is legacy, the guest uniras will be disabled. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Add sriov ras preprocessing before gpu resetYiPeng Chai-1/+23
The sriov host may clear all VF commands registered in the auto-update list during VF reset. Set the ecc.auto_uUpdate block to false before VF reset; after VF reset is complete, the RAS_CMD__GET_ALL_BLOCK_ECC_STATUS command is re-registered in the auto-update list of the sriov host. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Support high-frequency querying sriov ras block error countYiPeng Chai-0/+154
Support high-frequency querying of the sriov ras block error count:
1. Create shared memory and fill it with the RAS_CMD__GET_ALL_BLOCK_ECC_STATUS ras command.
2. Register the RAS_CMD_GET_ALL_BLOCK_ECC_STATUS command and the shared memory in the sriov host ras auto-update list via the RAS_CMD_SET_CMD_AUTO_UPDATE command.
3. Once the sriov host detects a ras error, it automatically executes the RAS_CMD__GET_ALL_BLOCK_ECC_STATUS command and writes the result to shared memory.
Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: Add ras command to retrieve cper data from sriov hostYiPeng Chai-1/+173
To reduce the number of interactions with the sriov host and the amount of data exchanged, a set of ras commands is first used to obtain from the host the raw data needed to generate cper. The guest driver then generates cper from that raw data. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-12-08drm/amd/ras: sriov supports handling VF ras commands.YiPeng Chai-10/+207
Add basic framework code to sriov to handle VF ras commands. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-06drm/amd/ras: ras supports i2c eeprom for mp1 v13_0_12YiPeng Chai-0/+1
ras supports i2c eeprom for mp1 v13_0_12. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amdgpu: suspend ras module before gpu resetYiPeng Chai-0/+110
During gpu reset, all GPU-related resources are inaccessible. To avoid affecting ras functionality, suspend ras module before gpu reset and resume it after gpu reset is complete. V2: Rename functions to avoid misunderstanding. V3: Move flush_delayed_work to amdgpu_ras_process_pause, Move schedule_delayed_work to amdgpu_ras_process_unpause. V4: Rename functions. V5: Move the function to amdgpu_ras.c. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Acked-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Add ras support for umc v12_5_0YiPeng Chai-1/+3
Add ras support for umc v12_5_0. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Add ras support for nbio v7_9_1YiPeng Chai-1/+3
Add ras support for nbio v7_9_1. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Increase ras switch control rangeYiPeng Chai-6/+19
Increase ras switch control range. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Fix format truncationXiang Liu-2/+2
../ras/rascore/ras_cper.c: In function ‘cper_generate_fatal_record.isra’:
../ras/rascore/ras_cper.c:75:36: error: ‘%llX’ directive output may be truncated writing between 1 and 14 bytes into a region of size between 0 and 7 [-Werror=format-truncation=]
   75 |         snprintf(record_id, 9, "%d:%llX", dev_info.socket_id,
      |                                    ^~~~
../ras/rascore/ras_cper.c:75:32: note: directive argument in the range [0, 72057594037927935]
   75 |         snprintf(record_id, 9, "%d:%llX", dev_info.socket_id,
      |                                ^~~~~~~~~
../ras/rascore/ras_cper.c:75:9: note: ‘snprintf’ output between 4 and 27 bytes into a destination of size 9
   75 |         snprintf(record_id, 9, "%d:%llX", dev_info.socket_id,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   76 |                  RAS_LOG_SEQNO_TO_BATCH_IDX(trace->seqno));
      |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../ras/rascore/ras_cper.c: In function ‘cper_generate_runtime_record.isra’:
../ras/rascore/ras_cper.c:75:36: error: ‘%llX’ directive output may be truncated writing between 1 and 14 bytes into a region of size between 0 and 7 [-Werror=format-truncation=]
   75 |         snprintf(record_id, 9, "%d:%llX", dev_info.socket_id,
      |                                    ^~~~
../ras/rascore/ras_cper.c:75:32: note: directive argument in the range [0, 72057594037927935]
   75 |         snprintf(record_id, 9, "%d:%llX", dev_info.socket_id,
      |                                ^~~~~~~~~
../ras/rascore/ras_cper.c:75:9: note: ‘snprintf’ output between 4 and 27 bytes into a destination of size 9
   75 |         snprintf(record_id, 9, "%d:%llX", dev_info.socket_id,
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   76 |                  RAS_LOG_SEQNO_TO_BATCH_IDX(trace->seqno));
      |                  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Lijo Lazar <lijo.lazar@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
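The diagnostic above says the "%d:%llX" output can need up to 27 bytes (including the terminating NUL) while the destination holds 9. The log does not show how the commit resolved this, so the following is only a sketch of one general way to detect truncation: check the return value of snprintf(). The helper name format_record_id_sketch() is hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Formats "<socket_id>:<batch_idx in upper-case hex>" into buf.
 * Returns 0 on success, or -1 if the text (plus its NUL terminator)
 * does not fit in len bytes. snprintf() returns the length the full
 * output would have had, so a result >= len means it was truncated.
 */
int format_record_id_sketch(char *buf, size_t len, int socket_id,
                            unsigned long long batch_idx)
{
    int n = snprintf(buf, len, "%d:%llX", socket_id, batch_idx);

    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

With a 9-byte buffer, a small index such as 0x1F fits, while a 14-digit hex index does not; a 27-byte buffer covers the whole range reported by the compiler.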
2025-11-04drm/amd/ras: Use correct severity for BP threshold exceed eventXiang Liu-2/+2
The severity of CPER for BP threshold exceed event should be set as FATAL to match the OOB implementation. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Correct info field of bad page threshold exceed CPERXiang Liu-3/+8
Correct valid_bits and ms_chk_bits of section info field for bad page threshold exceed CPER to match OOB's behavior. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Update IPID value for bad page threshold CPERXiang Liu-1/+8
The IPID register value for bad page threshold CPER holds socket_id info now according to the latest definition. Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-11-04drm/amd/ras: Fix the error of undefined reference to `__udivdi3'YiPeng Chai-2/+2
Fix the error: drivers/gpu/drm/amd/amdgpu/../ras/ras_mgr/amdgpu_ras_mgr.c:132:undefined reference to `__udivdi3' Fixes: fa0b203cd902 ("drm/amd/ras: Add amdgpu ras management function.") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202510272144.6SUHUoWx-lkp@intel.com/ Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
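For background: an undefined reference to `__udivdi3' typically appears when a plain 64-bit `/` is compiled for a 32-bit target, because the compiler emits a call to a libgcc routine that the kernel does not link. Kernel code normally uses helpers such as div_u64() from <linux/math64.h> instead; which helper this commit used is not shown in the log. The userspace sketch below, with the hypothetical name div_u64_sketch(), only illustrates that a 64-by-32 division can be done without the 64-bit division operator.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bit-by-bit (restoring) long division of a 64-bit dividend by a
 * 32-bit divisor, written without the 64-bit '/' or '%' operators.
 * The caller must pass a non-zero divisor. If remainder is not NULL,
 * the remainder is stored through it.
 */
uint64_t div_u64_sketch(uint64_t dividend, uint32_t divisor,
                        uint32_t *remainder)
{
    uint64_t quotient = 0;
    uint64_t rem = 0;
    int bit;

    for (bit = 63; bit >= 0; bit--) {
        /* Bring down the next dividend bit. */
        rem = (rem << 1) | ((dividend >> bit) & 1);
        if (rem >= divisor) {
            rem -= divisor;
            quotient |= (uint64_t)1 << bit;
        }
    }

    if (remainder != NULL)
        *remainder = (uint32_t)rem;
    return quotient;
}
```

Real kernel helpers are considerably faster than this loop; the sketch trades speed for clarity.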
2025-10-20drm/amd/ras: Update function and remove redundant codeYiPeng Chai-127/+55
Update function and remove redundant code: 1. Update function to prepare for internal use. 2. Remove unused function code previously prepared for ioctl. V2: Update commit message content. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-20drm/amd/ras: Update ras command context structure nameYiPeng Chai-22/+22
Given how this structure is actually used, "context" is a more appropriate name; a structure name containing "ioctl" is easily misunderstood. V2: Update commit message content. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-20drm/amdgpu: Enable ras moduleYiPeng Chai-0/+12
Enable ras module, disabled by default. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-20drm/amdgpu: Add ras module ip block to amdgpu discoveryYiPeng Chai-0/+10
Add ras module ip block to amdgpu discovery. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-20drm/amdgpu: Improve ras fatal error handling functionYiPeng Chai-0/+5
In multi-gpu case, a fatal error will generate several fatal error interrupts. After improving this function, the ras module can reuse this function to only handle the first interrupt. V3: Initialize event_id using RAS_EVENT_INVALID_ID. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
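The "only handle the first interrupt" idea can be sketched as a claim on a shared event slot. Everything here is an assumption made for illustration: the struct and function names are hypothetical, C11 atomics stand in for whatever synchronization the driver uses, and EVENT_INVALID_ID_SKETCH is an invented sentinel (the value of RAS_EVENT_INVALID_ID is not shown in this log).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed sentinel meaning "no fatal event is being handled". */
#define EVENT_INVALID_ID_SKETCH UINT64_MAX

/* Hypothetical state shared by every GPU in the same group. */
struct fatal_event_state_sketch {
    _Atomic uint64_t active_event_id;
};

void fatal_event_state_init(struct fatal_event_state_sketch *s)
{
    atomic_init(&s->active_event_id, EVENT_INVALID_ID_SKETCH);
}

/*
 * Returns true only for the first caller that claims the slot. Later
 * interrupts for the same fatal error see a valid ID already stored
 * and return false, so they can be ignored.
 */
bool fatal_event_claim(struct fatal_event_state_sketch *s, uint64_t event_id)
{
    uint64_t expected = EVENT_INVALID_ID_SKETCH;

    return atomic_compare_exchange_strong(&s->active_event_id,
                                          &expected, event_id);
}
```

A complete design would also reset the slot to the sentinel once recovery finishes; that step is left out of this sketch.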
2025-10-20drm/amdgpu: Intercept ras interrupts to ras moduleYiPeng Chai-4/+4
Intercept ras interrupts to ras module. V2: Change function names in ras module. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amdgpu: Add ras module files into amdgpuYiPeng Chai-17/+12
Add ras module files into amdgpu. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add unified ras module top-level makefileYiPeng Chai-0/+34
Add unified ras module top-level makefile. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add files to amdgpu ras manager makefileYiPeng Chai-0/+33
Add files to amdgpu ras manager makefile. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add amdgpu ras management function.YiPeng Chai-0/+633
Add amdgpu system configuration parameters and functions needed by rascore. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Amdgpu preprocesses ras interruptsYiPeng Chai-0/+163
Amdgpu preprocesses ras interrupts. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add amdgpu ras system functionsYiPeng Chai-0/+377
Add amdgpu ras system functions. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Amdgpu handle ras ioctl commandYiPeng Chai-0/+418
Amdgpu handle ras ioctl command. V2: Remove non-standard device information. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add amdgpu eeprom i2c configuration functionYiPeng Chai-0/+208
Add amdgpu eeprom i2c configuration function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add amdgpu mp1 v13_0 configuration functionYiPeng Chai-0/+124
Add amdgpu mp1 v13_0 configuration function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add amdgpu nbio v7_9 configuration functionYiPeng Chai-0/+155
Add amdgpu nbio v7_9 configuration function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add files to ras core MakefileYiPeng Chai-0/+44
Add files to ras core Makefile. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add rascore unified interface functionYiPeng Chai-0/+971
1. Complete the initialization call of all sub-functions. 2. Export common interfaces. V2: Remove the use of typedef to define function pointer. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add cper conversion functionYiPeng Chai-0/+614
Add cper conversion function. V3: Change commit message and update the calling function. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Signed-off-by: Xiang Liu <xiang.liu@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Use ring buffer to record ras ecc dataYiPeng Chai-0/+403
Use ring buffer to record ras ecc data. V3: Change commit message and rename the file and function names. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
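As a rough illustration of the ring-buffer idea, the sketch below keeps the most recent entries in a fixed array and overwrites the oldest one when full. The entry layout, the capacity, the overwrite-on-full policy and all names are assumptions made for this example; the driver's actual data structures are not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>

#define RING_CAPACITY_SKETCH 4 /* tiny on purpose, for demonstration */

/* Invented entry layout. */
struct ecc_entry_sketch {
    uint64_t seqno;
    uint64_t addr;
};

struct ecc_ring_sketch {
    struct ecc_entry_sketch slots[RING_CAPACITY_SKETCH];
    size_t head;  /* index of the next slot to write */
    size_t count; /* number of valid entries, at most the capacity */
};

/* Appends an entry; when the ring is full the oldest entry is overwritten. */
void ecc_ring_push(struct ecc_ring_sketch *r, struct ecc_entry_sketch e)
{
    r->slots[r->head] = e;
    r->head = (r->head + 1) % RING_CAPACITY_SKETCH;
    if (r->count < RING_CAPACITY_SKETCH)
        r->count++;
}

/*
 * Copies the idx-th oldest entry into *out.
 * Returns 0 on success, or -1 if idx is out of range.
 */
int ecc_ring_peek(const struct ecc_ring_sketch *r, size_t idx,
                  struct ecc_entry_sketch *out)
{
    size_t oldest;

    if (idx >= r->count)
        return -1;
    oldest = (r->head + RING_CAPACITY_SKETCH - r->count) % RING_CAPACITY_SKETCH;
    *out = r->slots[(oldest + idx) % RING_CAPACITY_SKETCH];
    return 0;
}
```

A zero-initialized struct ecc_ring_sketch is a valid empty ring. Concurrent producers and consumers would need locking or atomics, which this single-threaded sketch omits.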
2025-10-13drm/amd/ras: Add thread to handle ras eventsYiPeng Chai-0/+368
Add thread to handle ras events. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add ras ioctl command handlerYiPeng Chai-0/+952
Add ras ioctl command handler. V2: Remove ras global device list. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add psp ras common functionsYiPeng Chai-0/+1126
Add psp ras common functions. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add psp v13_0 ras functionsYiPeng Chai-0/+77
Add psp v13_0 ras functions. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add eeprom ras functionsYiPeng Chai-0/+1565
Add eeprom ras functions. V5: Remove duplicate data structure definition. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2025-10-13drm/amd/ras: Add gfx common ras functionsYiPeng Chai-0/+113
Add gfx common ras functions. Signed-off-by: YiPeng Chai <YiPeng.Chai@amd.com> Reviewed-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>