path: root/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
Age | Commit message | Author | Lines
2025-01-06 | drm/amdgpu: Fix error handling in amdgpu_ras_add_bad_pages | Srinivasan Shanmugam | -5/+16
2024-12-18 | drm/amdgpu: Enable psp v14_0_3 RAS support for non-SRIOV configurations. | Candice Li | -1/+1
2024-12-10 | drm/amdgpu: Support nbif v6_3_1 fatal error handling | Candice Li | -0/+12
2024-12-10 | drm/amdgpu: Add psp v14_0_3 ras support | Candice Li | -0/+1
2024-12-10 | drm/amdgpu: Enable RAS for psp v13_0_12 | Hawking Zhang | -0/+5
2024-12-10 | drm/amdgpu: correct the calculation of RAS bad page | Tao Zhou | -8/+2
2024-12-10 | drm/amdgpu: split ras_eeprom_init into init and check functions | Tao Zhou | -4/+11
2024-12-10 | drm/amdgpu: remove is_mca_add for ras_add_bad_pages | Tao Zhou | -11/+5
2024-12-10 | drm/amdgpu: parse legacy RAS bad page mixed with new data in various NPS modes | Tao Zhou | -15/+81
2024-12-10 | drm/amdgpu: support to find RAS bad pages via old TA | Tao Zhou | -3/+25
2024-12-10 | drm/amdgpu: store only one RAS bad page record for all pages in one row | Tao Zhou | -8/+27
2024-12-10 | drm/amdgpu: Prefer RAS recovery for scheduler hang | Lijo Lazar | -2/+53
2024-12-10 | drm/amdgpu: do RAS MCA2PA conversion in device init phase | Tao Zhou | -12/+82
2024-12-10 | drm/amdgpu: add flag to indicate the type of RAS eeprom record | Tao Zhou | -7/+26
2024-11-20 | drm/amdgpu: Use reset recovery state checks | Lijo Lazar | -5/+5
2024-11-11 | drm/amdgpu: Implement virt req_ras_err_count | Victor Skvortsov | -7/+65
2024-11-11 | drm/amdgpu: VF Query RAS Caps from Host if supported | Victor Skvortsov | -0/+5
2024-11-04 | drm/amdgpu: Skip IP coredump for RAS errors | Lijo Lazar | -0/+1
2024-09-26 | drm/amdgpu: Refactor XGMI reset on init handling | Lijo Lazar | -6/+0
2024-09-26 | drm/amdgpu: Add helper to initialize badpage info | Lijo Lazar | -18/+38
2024-09-26 | drm/amdgpu: Use init level for pending_reset flag | Lijo Lazar | -1/+1
2024-09-26 | amd/amdgpu: Reduce unnecessary repetitive GPU resets | YiPeng Chai | -1/+20
2024-09-18 | drm/amdgpu: fix typo in the comment | Yan Zhen | -1/+1
2024-09-17 | drm/amdgpu: disable GPU RAS bad page feature for specific ASIC | Tao Zhou | -0/+5
2024-08-06 | drm/amdgpu: remove RAS unused paramter 'err_addr' | Yang Wang | -9/+9
2024-08-06 | drm/amdgpu: create function to check RAS RMA status | Tao Zhou | -6/+16
2024-08-06 | drm/amdgpu: Add more types for boot time error reporting | Hawking Zhang | -0/+10
2024-07-23 | drm/amdgpu: Remove unused code | YiPeng Chai | -23/+0
2024-07-10 | drm/amdgpu: timely save bad pages to eeprom after gpu ras reset is completed | YiPeng Chai | -1/+5
2024-07-10 | drm/amdgpu: flush all cached ras bad pages to eeprom | YiPeng Chai | -6/+29
2024-07-08 | drm/amdgpu: add ras event state device attribute support | Yang Wang | -4/+52
2024-07-08 | drm/amdgpu: add ras POSION_CONSUMPTION event id support | Yang Wang | -3/+13
2024-07-08 | drm/amdgpu: add ras POSION_CREATION event id support | Yang Wang | -3/+14
2024-07-08 | drm/amdgpu: refine amdgpu ras event id core code | Yang Wang | -18/+84
2024-07-08 | drm/amdgpu: sysfs node disable query error count during gpu reset | YiPeng Chai | -0/+3
2024-07-01 | drm/amdgpu: Fix hbm stack id in boot error report | Hawking Zhang | -1/+1
2024-06-27 | drm/amdgpu: add gpu reset check and exception handling | YiPeng Chai | -0/+53
2024-06-27 | drm/amdgpu: refine poison consumption interrupt handler | YiPeng Chai | -18/+37
2024-06-27 | drm/amdgpu: refine poison creation interrupt handler | YiPeng Chai | -22/+17
2024-06-27 | drm/amdgpu: add variable to record the deferred error number read by driver | YiPeng Chai | -18/+44
2024-06-14 | drm/amdgpu: set RAS fed status for more cases | Tao Zhou | -0/+1
2024-06-14 | drm/amdgpu: create amdgpu_ras_in_recovery to simplify code | Tao Zhou | -12/+19
2024-06-14 | drm/amdgpu: trigger mode1 reset for RAS RMA status | Tao Zhou | -6/+22
2024-06-14 | drm/amdgpu: move aca/mca init functions into ras_init() stage | Yang Wang | -23/+50
2024-06-14 | drm/amdgpu: add reset source in various cases | Eric Huang | -0/+1
2024-06-05 | drm/amdgpu: add RAS is_rma flag | Tao Zhou | -5/+4
2024-06-05 | drm/amdgpu: Update programming for boot error reporting | Hawking Zhang | -54/+45
2024-06-05 | drm/amdgpu: Estimate RAS reservation when report capacity v2 | Hawking Zhang | -0/+20
2024-05-29 | drm/amdgpu: fix typo in amdgpu_ras_aca_sysfs_read() function | Yang Wang | -1/+1
2024-05-23 | drm/amdgpu: skip to create ras xxx_err_count node when ACA is enabled | Yang Wang | -0/+6