aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/edac (follow)
AgeCommit message (Collapse)AuthorFilesLines
2025-10-13EDAC/versalnet: Fix off by one in handle_error()Dan Carpenter1-1/+1
The priv->mci[] array has NUM_CONTROLLERS so this > comparison needs to be >= to prevent an out of bounds access. Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com>
2025-09-30Merge tag 'edac_updates_for_v6.18' of ↵Linus Torvalds16-57/+1342
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - Add support for new AMD family 0x1a models to amd64_edac - Add an EDAC driver for the AMD VersalNET memory controller which reports hw errors from different IP blocks in the fabric using an IPC-type transport - Drop the silly static number of memory controllers in the Intel EDAC drivers (skx, i10nm) in favor of a flexible array so that former doesn't need to be increased with every new generation which adds more memory controllers; along with a proper refactoring - Add support for two Alder Lake-S SOCs to ie31200_edac - Add an EDAC driver for ADM Cortex A72 cores, and specifically for reporting L1 and L2 cache errors - Last but not least, the usual fixes, cleanups and improvements all over the subsystem * tag 'edac_updates_for_v6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: (23 commits) EDAC/versalnet: Return the correct error in mc_probe() EDAC/mc_sysfs: Increase legacy channel support to 16 EDAC/amd64: Add support for AMD family 1Ah-based newer models EDAC: Add a driver for the AMD Versal NET DDR controller dt-bindings: memory-controllers: Add support for Versal NET EDAC RAS: Export log_non_standard_event() to drivers cdx: Export Symbols for MCDI RPC and Initialization cdx: Split mcdi.h and reorganize headers EDAC/skx_common: Use topology_physical_package_id() instead of open coding EDAC: Fix wrong executable file modes for C source files EDAC/altera: Use dev_fwnode() EDAC/skx_common: Remove unused *NUM*_IMC macros EDAC/i10nm: Reallocate skx_dev list if preconfigured cnt != runtime cnt EDAC/skx_common: Remove redundant upper bound check for res->imc EDAC/skx_common: Make skx_dev->imc[] a flexible array EDAC/skx_common: Swap memory controller index mapping EDAC/skx_common: Move mc_mapping to be a field inside struct skx_imc EDAC/{skx_common,skx}: Use configuration data, not global macros EDAC/i10nm: Skip DIMM enumeration on a disabled memory controller EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support ...
2025-09-26Merge branches 'edac-drivers' and 'edac-misc' into edac-updatesBorislav Petkov (AMD)3-0/+0
* edac-drivers: EDAC/versalnet: Return the correct error in mc_probe() EDAC/mc_sysfs: Increase legacy channel support to 16 EDAC/amd64: Add support for AMD family 1Ah-based newer models EDAC: Add a driver for the AMD Versal NET DDR controller dt-bindings: memory-controllers: Add support for Versal NET EDAC RAS: Export log_non_standard_event() to drivers cdx: Export Symbols for MCDI RPC and Initialization cdx: Split mcdi.h and reorganize headers EDAC/skx_common: Use topology_physical_package_id() instead of open coding EDAC/altera: Use dev_fwnode() EDAC/skx_common: Remove unused *NUM*_IMC macros EDAC/i10nm: Reallocate skx_dev list if preconfigured cnt != runtime cnt EDAC/skx_common: Remove redundant upper bound check for res->imc EDAC/skx_common: Make skx_dev->imc[] a flexible array EDAC/skx_common: Swap memory controller index mapping EDAC/skx_common: Move mc_mapping to be a field inside struct skx_imc EDAC/{skx_common,skx}: Use configuration data, not global macros EDAC/i10nm: Skip DIMM enumeration on a disabled memory controller EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC support dt-bindings: arm: cpus: Add edac-enabled property EDAC: Add EDAC driver for ARM Cortex A72 cores * edac-misc: EDAC: Fix wrong executable file modes for C source files MAINTAINERS: EDAC: Drop inactive reviewers Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-18EDAC/versalnet: Return the correct error in mc_probe()Dan Carpenter1-1/+3
Return -ENOMEM if memory allocation in mc_probe() fails. [ bp: Massage commit message. ] Fixes: d5fe2fec6c40 ("EDAC: Add a driver for the AMD Versal NET DDR controller") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
2025-09-17EDAC/mc_sysfs: Increase legacy channel support to 16Avadhut Naik1-0/+24
Newer AMD systems can support up to 16 channels per EDAC "mc" device. These are detected by the EDAC module running on the device, and the current EDAC interface is appropriately enumerated. The legacy EDAC sysfs interface however, provides device attributes for channels 0 through 11 only. Consequently, the last four channels, 12 through 15, will not be enumerated and will not be visible through the legacy sysfs interface. Add additional device attributes to ensure that all 16 channels, if present, are enumerated by and visible through the legacy EDAC sysfs interface. Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
2025-09-17EDAC/amd64: Add support for AMD family 1Ah-based newer modelsAvadhut Naik2-1/+21
Add support for family 1Ah-based models 50h-57h, 90h-9Fh, A0h-AFh, and C0h-C7h. Also, raise the maximum memory controllers number as those machines support that many. Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250916203242.1281036-1-avadhut.naik@amd.com
2025-09-15EDAC: Add a driver for the AMD Versal NET DDR controllerShubhrajyoti Datta3-0/+967
Add a driver for the AMD Versal NET DDR memory controller which supports single bit error correction, double bit error detection and other system errors from various IP subsystems (e.g., RPU, NOCs, HNICX, PL). The driver listens for notifications from the NMC (Network management controller) using RPMsg (Remote Processor Messaging). The channel used for communicating to RPMsg is named "error_edac". Upon receipt of a notification, the driver sends a RAS event trace. [ bp: - Fixup title - Rewrite commit message - Fixup Kconfig text - Zap unused defines and align them - Simplify rpmsg_cb() considerably - Drop silly double-brackets in conditionals - Use proper void * type in mcdi_request() - Do not clear chinfo in rpmsg_probe() unnecessarily - Fix indentation - Do a proper err unwind path in init_versalnet() - Redo the error unwind path in mc_probe() properly - Fix the ordering in mc_remove() ] Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250908115649.22903-1-shubhrajyoti.datta@amd.com Link: https://lore.kernel.org/r/20250703173105.GLaGa-WQCESDNsqygm@fat_crate.local
2025-09-03EDAC/skx_common: Use topology_physical_package_id() instead of open codingQiuxu Zhuo1-1/+2
Use topology_physical_package_id() to get the CPU package ID instead of open coding. Suggested-by: Zhang Rui <rui.zhang@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250903030648.3285935-1-qiuxu.zhuo@intel.com
2025-08-30EDAC: Fix wrong executable file modes for C source filesKuan-Wei Chiu3-0/+0
Three EDAC source files were mistakenly marked as executable when adding the EDAC scrub controls. These are plain C source files and should not carry the executable bit. Correcting their modes follows the principle of least privilege and avoids unnecessary execute permissions in the repository. [ bp: Massage commit message. ] Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250828191954.903125-1-visitorckw@gmail.com
2025-08-25EDAC/altera: Delete an inappropriate dma_free_coherent() callSalah Triki1-1/+0
dma_free_coherent() must only be called if the corresponding dma_alloc_coherent() call has succeeded. Calling it when the allocation fails leads to undefined behavior. Delete the wrong call. [ bp: Massage commit message. ] Fixes: 71bcada88b0f3 ("edac: altera: Add Altera SDRAM EDAC support") Signed-off-by: Salah Triki <salah.triki@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/aIrfzzqh4IzYtDVC@pc
2025-08-20EDAC/altera: Use dev_fwnode()Jiri Slaby (SUSE)1-2/+2
irq_domain_create_simple() takes fwnode as the first argument. It can be extracted from the struct device using dev_fwnode() helper instead of using of_node with of_fwnode_handle(). So use the dev_fwnode() helper. Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Link: https://lore.kernel.org/20250723062631.1830757-1-jirislaby@kernel.org
2025-08-19EDAC/skx_common: Remove unused *NUM*_IMC macrosQiuxu Zhuo1-5/+0
There are no references to the *NUM*_IMC macros, so remove them. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-8-qiuxu.zhuo@intel.com
2025-08-19EDAC/i10nm: Reallocate skx_dev list if preconfigured cnt != runtime cntQiuxu Zhuo1-6/+7
Ideally, read the present DDR memory controller count first and then allocate the skx_dev list using this count. However, this approach requires adding a significant amount of code similar to skx_get_all_bus_mappings() to obtain the PCI bus mappings for the first socket and use these mappings along with the related PCI register offset to read the memory controller count. Given that the Granite Rapids CPU is the only one that can detect the count of memory controllers at runtime (other CPUs use the count in the configuration data), to reduce code complexity, reallocate the skx_dev list only if the preconfigured count of DDR memory controllers differs from the count read at runtime for Granite Rapids CPU. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-7-qiuxu.zhuo@intel.com
2025-08-19EDAC/skx_common: Remove redundant upper bound check for res->imcQiuxu Zhuo1-1/+1
The following upper bound check for the memory controller physical index decoded by ADXL is the only place where use the macro 'NUM_IMC' is used: res->imc > NUM_IMC - 1 Since this check is already covered by skx_get_mc_mapping(), meaning no memory controller logical index exists for an invalid memory controller physical index decoded by ADXL, remove the redundant upper bound check so that the definition for 'NUM_IMC' can be cleaned up (in another patch). Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-6-qiuxu.zhuo@intel.com
2025-08-19EDAC/skx_common: Make skx_dev->imc[] a flexible arrayQiuxu Zhuo2-2/+3
The current skx->imc[NUM_IMC] array of memory controller instances is sized using the macro NUM_IMC. Each time EDAC support is added for a new CPU, NUM_IMC needs to be updated to ensure it is greater than or equal to the number of memory controllers for the new CPU. This approach is inconvenient and results in memory waste for older CPUs with fewer memory controllers. To address this, make skx->imc[] a flexible array and determine its size from configuration data or at runtime. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-5-qiuxu.zhuo@intel.com
2025-08-19EDAC/skx_common: Swap memory controller index mappingQiuxu Zhuo1-8/+20
The current mapping of memory controller indices is from physical index [1] to logical index [2], as show below: skx_dev->imc[pmc].mc_mapping = lmc Since skx_dev->imc[] is an array of present memory controller instances, mapping memory controller indices from logical index to physical index, as show below, is more reasonable. This is also a preparatory step for making skx_dev->imc[] a flexible array. skx_dev->imc[lmc].mc_mapping = pmc Both mappings are equivalent. No functional changes intended. [1] Indices for memory controllers include both those present to the OS and those disabled by BIOS. [2] Indices for memory controllers present to the OS. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-4-qiuxu.zhuo@intel.com
2025-08-19EDAC/skx_common: Move mc_mapping to be a field inside struct skx_imcQiuxu Zhuo2-14/+14
The mc_mapping and imc fields of struct skx_dev have the same size, NUM_IMC. Move mc_mapping to be a field inside struct skx_imc to prepare for making the imc array of memory controller instances a flexible array. No functional changes intended. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-3-qiuxu.zhuo@intel.com
2025-08-19EDAC/{skx_common,skx}: Use configuration data, not global macrosQiuxu Zhuo3-20/+30
Use model-specific configuration data for the number of memory controllers per socket, channels per memory controller, and DIMMs per channel as intended, instead of relying on global macros for maximum values. No functional changes intended. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250731145534.2759334-2-qiuxu.zhuo@intel.com
2025-08-19EDAC/i10nm: Skip DIMM enumeration on a disabled memory controllerQiuxu Zhuo1-0/+14
When loading the i10nm_edac driver on some Intel Granite Rapids servers, a call trace may appear as follows: UBSAN: shift-out-of-bounds in drivers/edac/skx_common.c:453:16 shift exponent -66 is negative ... __ubsan_handle_shift_out_of_bounds+0x1e3/0x390 skx_get_dimm_info.cold+0x47/0xd40 [skx_edac_common] i10nm_get_dimm_config+0x23e/0x390 [i10nm_edac] skx_register_mci+0x159/0x220 [skx_edac_common] i10nm_init+0xcb0/0x1ff0 [i10nm_edac] ... This occurs because some BIOS may disable a memory controller if there aren't any memory DIMMs populated on this memory controller. The DIMMMTR register of this disabled memory controller contains the invalid value ~0, resulting in the call trace above. Fix this call trace by skipping DIMM enumeration on a disabled memory controller. Fixes: ba987eaaabf9 ("EDAC/i10nm: Add Intel Granite Rapids server support") Reported-by: Jose Jesus Ambriz Meza <jose.jesus.ambriz.meza@intel.com> Reported-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Closes: https://lore.kernel.org/all/20250730063155.2612379-1-acelan.kao@canonical.com/ Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Chia-Lin Kao (AceLan) <acelan.kao@canonical.com> Link: https://lore.kernel.org/r/20250806065707.3533345-1-qiuxu.zhuo@intel.com
2025-08-19EDAC/ie31200: Add two more Intel Alder Lake-S SoCs for EDAC supportKyle Manna1-0/+4
Host Device IDs (DID0) correspond to: * Intel Core i7-12700K * Intel Core i5-12600K See documentation: * 12th Generation Intel® Core™ Processors Datasheet * Volume 1 of 2, Doc. No.: 655258, Rev.: 011 * https://edc.intel.com/output/DownloadPdfDocument?id=8297 (PDF) Signed-off-by: Kyle Manna <kyle@kylemanna.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20250819161739.3241152-1-kyle@kylemanna.com
2025-08-15EDAC: Add EDAC driver for ARM Cortex A72 coresSascha Hauer3-0/+234
The driver is designed to support error detection and reporting for Cortex A72 cores, specifically within their L1 and L2 cache systems. The errors are detected by reading CPU/L2 memory error syndrome registers. Unfortunately there is no robust way to inject errors into the caches, so this driver doesn't contain any code to actually test it. It has been tested though with code taken from an older version [1] of this driver. For reasons stated in thread [1], the error injection code is not suitable for mainline, so it is removed from the driver. [1] https://lore.kernel.org/all/1521073067-24348-1-git-send-email-york.sun@nxp.com/#t [ bp: minor touchups. ] Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Co-developed-by: Vijay Balakrishna <vijayb@linux.microsoft.com> Signed-off-by: Vijay Balakrishna <vijayb@linux.microsoft.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/1752714390-27389-2-git-send-email-vijayb@linux.microsoft.com
2025-07-29Merge tag 'edac_updates_for_v6.17_rc1' of ↵Linus Torvalds6-102/+140
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - i10nm: - switch to using scnprintf() - Add Granite Rapids-D support - synopsys: Make sure ECC error and counter registers are cleared during init/probing to avoid reporting stale errors - igen6: Add Wildcat Lake SoCs support - Make sure scrub features sysfs attributes are initialized properly - Allocate memory repair sysfs attributes statically to reduce stack usage - Fix DIMM module size computation for DIMMs with total capacity which is a non power-of-two number, in amd64_edac - Do not be too dramatic when reporting disabled memory controllers in igen6_edac - Add support to ie31200_edac for the following SoCs: - Core i5-14[67]00 - Bartless Lake-S SoCs - Raptor Lake-HX * tag 'edac_updates_for_v6.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/{skx_common,i10nm}: Use scnprintf() for safer buffer handling EDAC/synopsys: Clear the ECC counters on init EDAC/ie31200: Add Intel Raptor Lake-HX SoCs support EDAC/igen6: Add Intel Wildcat Lake SoCs support EDAC/i10nm: Add Intel Granite Rapids-D support EDAC/mem_repair: Reduce stack usage in edac_mem_repair_get_desc() EDAC/igen6: Reduce log level to debug for absent memory controllers EDAC/ie31200: Document which CPUs correspond to each Raptor Lake-S device ID EDAC/ie31200: Enable support for Core i5-14600 and i7-14700 ie31200/EDAC: Add Intel Bartlett Lake-S SoCs support
2025-07-15EDAC/{skx_common,i10nm}: Use scnprintf() for safer buffer handlingWang Haoran2-11/+11
snprintf() is fragile when its return value will be used to append additional data to a buffer. Use scnprintf() instead. Signed-off-by: Wang Haoran <haoranwangsec@gmail.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Link: https://lore.kernel.org/r/20250715131700.1092720-1-haoranwangsec@gmail.com
2025-07-14EDAC/synopsys: Clear the ECC counters on initShubhrajyoti Datta1-51/+46
Clear the ECC error and counter registers during initialization/probe to avoid reporting stale errors that may have occurred before EDAC registration. For that, unify the Zynq and ZynqMP ECC state reading paths and simplify the code. [ bp: Massage commit message. Fix an -Wsometimes-uninitialized warning as reported by Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202507141048.obUv3ZUm-lkp@intel.com ] Signed-off-by: Shubhrajyoti Datta <shubhrajyoti.datta@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250713050753.7042-1-shubhrajyoti.datta@amd.com
2025-07-07EDAC/ie31200: Add Intel Raptor Lake-HX SoCs supportQiuxu Zhuo1-0/+4
Intel Raptor Lake-HX SoC shares the same memory controller registers as Raptor Lake-S SoC. Add a compute die ID for Raptor Lake-HX SoCs with Out-of-Band ECC capability for EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: Laurens SEGHERS <laurens@rale.com> Link: https://lore.kernel.org/r/20250704151609.7833-4-qiuxu.zhuo@intel.com
2025-07-07EDAC/igen6: Add Intel Wildcat Lake SoCs supportLili Li1-0/+15
Intel Wildcat Lake is a mobile derivative of Panther Lake with one memory controller. Wildcat Lake SoCs share the same IBECC registers with Meteor Lake-P SoCs. Add a compute die ID and a new configuration structure for Wildcat Lake SoCs with In-Band ECC capability for EDAC support. Signed-off-by: Lili Li <lili.li@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250704151609.7833-3-qiuxu.zhuo@intel.com
2025-07-07EDAC/i10nm: Add Intel Granite Rapids-D supportQiuxu Zhuo1-1/+11
The Granite Rapids-D CPU model uses memory controller registers similar to those of the Granite Rapids server CPU but with a different memory controller MMIO base. Add the Granite Rapids-D CPU model ID and use the new memory controller MMIO base for EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Tested-by: VikasX Chougule <vikasx.chougule@intel.com> Link: https://lore.kernel.org/r/20250704151609.7833-2-qiuxu.zhuo@intel.com
2025-06-30EDAC: Initialize EDAC features sysfs attributesShiju Jose3-1/+5
Fix the lockdep splat caused by missing sysfs_attr_init() calls for the recently added EDAC feature's sysfs attributes. In lockdep_init_map_type(), the check for the lock-class key if (!static_obj(key) && !is_dynamic_key(key)) causes the splat. Backtrace: RIP: 0010:lockdep_init_map_type Call Trace: __kernfs_create_file sysfs_add_file_mode_ns internal_create_group internal_create_groups device_add ? __init_waitqueue_head edac_dev_register devm_cxl_memdev_edac_register ? lock_acquire ? find_held_lock ? cxl_mem_probe ? cxl_mem_probe ? lockdep_hardirqs_on ? cxl_mem_probe cxl_mem_probe [ bp: Massage. ] Fixes: f90b738166fe ("EDAC: Add scrub control feature") Fixes: bcbd069b11b0 ("EDAC: Add a Error Check Scrub control feature") Fixes: 699ea5219c4b ("EDAC: Add a memory repair control feature") Reported-by: Dave Jiang <dave.jiang@intel.com> Suggested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com> Link: https://lore.kernel.org/20250626101344.1726-1-shiju.jose@huawei.com
2025-06-26EDAC/mem_repair: Reduce stack usage in edac_mem_repair_get_desc()Arnd Bergmann1-34/+22
Constructing an array on the stack adds complexity and can exceed the warning limit for per-function stack usage: drivers/edac/mem_repair.c:361:5: error: stack frame size (1296) exceeds limit (1280) in 'edac_mem_repair_get_desc' [-Werror,-Wframe-larger-than] Change this to have the actual attribute array allocated statically and then just add the instance number on the per-instance copy. Fixes: 699ea5219c4b ("EDAC: Add a memory repair control feature") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/20250620114135.4017183-1-arnd@kernel.org
2025-06-25EDAC/amd64: Fix size calculation for Non-Power-of-Two DIMMsAvadhut Naik1-21/+36
Each Chip-Select (CS) of a Unified Memory Controller (UMC) on AMD Zen-based SOCs has an Address Mask and a Secondary Address Mask register associated with it. The amd64_edac module logs DIMM sizes on a per-UMC per-CS granularity during init using these two registers. Currently, the module primarily considers only the Address Mask register for computing DIMM sizes. The Secondary Address Mask register is only considered for odd CS. Additionally, if it has been considered, the Address Mask register is ignored altogether for that CS. For power-of-two DIMMs i.e. DIMMs whose total capacity is a power of two (32GB, 64GB, etc), this is not an issue since only the Address Mask register is used. For non-power-of-two DIMMs i.e., DIMMs whose total capacity is not a power of two (48GB, 96GB, etc), however, the Secondary Address Mask register is used in conjunction with the Address Mask register. However, since the module only considers either of the two registers for a CS, the size computed by the module is incorrect. The Secondary Address Mask register is not considered for even CS, and the Address Mask register is not considered for odd CS. Introduce a new helper function so that both Address Mask and Secondary Address Mask registers are considered, when valid, for computing DIMM sizes. Furthermore, also rename some variables for greater clarity. Fixes: 81f5090db843 ("EDAC/amd64: Support asymmetric dual-rank DIMMs") Closes: https://lore.kernel.org/dbec22b6-00f2-498b-b70d-ab6f8a5ec87e@natrix.lt Reported-by: Žilvinas Žaltiena <zilvinas@natrix.lt> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Yazen Ghannam <yazen.ghannam@amd.com> Tested-by: Žilvinas Žaltiena <zilvinas@natrix.lt> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/20250529205013.403450-1-avadhut.naik@amd.com
2025-06-23EDAC/igen6: Reduce log level to debug for absent memory controllersQiuxu Zhuo1-1/+1
The current KERN_WARNING level message for detecting absent memory controllers is overly dramatic. The BIOS likely had valid reasons to disable the memory controller (e.g. it isn't connected to any DIMM slots on the motherboard for this system). So there's nothing actually wrong that needs to be fixed. Reduce the log level to KERN_DEBUG to eliminate the false warning. Suggested-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250618162307.1523736-2-qiuxu.zhuo@intel.com
2025-06-23EDAC/ie31200: Document which CPUs correspond to each Raptor Lake-S device IDGeorge Gaidarov1-6/+6
Based on table 103 ("Host Device ID (DID0)") in [1], document which CPUs correspond to each Raptor Lake-S device ID for better readability. [1] https://www.intel.com/content/www/us/en/content-details/743844/13th-generation-intel-core-intel-core-14th-generation-intel-core-processor-series-1-and-series-2-and-intel-xeon-e-2400-processor-datasheet-volume-1-of-2.html Signed-off-by: George Gaidarov <gdgaidarov+lkml@gmail.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250529162933.1228735-2-gdgaidarov+lkml@gmail.com
2025-06-23EDAC/ie31200: Enable support for Core i5-14600 and i7-14700George Gaidarov1-0/+4
Device ID '0xa740' is shared by i7-14700, i7-14700K, and i7-14700T. Device ID '0xa704' is shared by i5-14600, i5-14600K, and i5-14600T. Tested locally on my i7-14700K. Signed-off-by: George Gaidarov <gdgaidarov+lkml@gmail.com> Reviewed-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250529162933.1228735-1-gdgaidarov+lkml@gmail.com
2025-06-23ie31200/EDAC: Add Intel Bartlett Lake-S SoCs supportQiuxu Zhuo1-0/+22
Bartlett Lake-S is a derivative of Raptor Lake-S and is optimized for IoT/Edge applications. It shares the same memory controller registers as Raptor Lake-S. Add compute die IDs of Bartlett Lake-S and reuse the configuration data of Raptor Lake-S for Bartlett Lake-S EDAC support. Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://lore.kernel.org/r/20250502013900.343430-1-qiuxu.zhuo@intel.com
2025-06-18EDAC/igen6: Fix NULL pointer dereferenceQiuxu Zhuo1-11/+13
A kernel panic was reported with the following kernel log: EDAC igen6: Expected 2 mcs, but only 1 detected. BUG: unable to handle page fault for address: 000000000000d570 ... Hardware name: Notebook V54x_6x_TU/V54x_6x_TU, BIOS Dasharo (coreboot+UEFI) v0.9.0 07/17/2024 RIP: e030:ecclog_handler+0x7e/0xf0 [igen6_edac] ... igen6_probe+0x2a0/0x343 [igen6_edac] ... igen6_init+0xc5/0xff0 [igen6_edac] ... This issue occurred because one memory controller was disabled by the BIOS but the igen6_edac driver still checked all the memory controllers, including this absent one, to identify the source of the error. Accessing the null MMIO for the absent memory controller resulted in the oops above. Fix this issue by reverting the configuration structure to non-const and updating the field 'res_cfg->num_imc' to reflect the number of detected memory controllers. Fixes: 20e190b1c1fd ("EDAC/igen6: Skip absent memory controllers") Reported-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Closes: https://lore.kernel.org/all/aFFN7RlXkaK_loQb@mail-itl/ Suggested-by: Borislav Petkov <bp@alien8.de> Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Tested-by: Marek Marczykowski-Górecki <marmarek@invisiblethingslab.com> Link: https://lore.kernel.org/r/20250618162307.1523736-1-qiuxu.zhuo@intel.com
2025-06-16EDAC/amd64: Correct number of UMCs for family 19h models 70h-7fhAvadhut Naik1-0/+1
AMD's Family 19h-based Models 70h-7fh support 4 unified memory controllers (UMC) per processor die. The amd64_edac driver, however, assumes only 2 UMCs are supported since max_mcs variable for the models has not been explicitly set to 4. The same results in incomplete or incorrect memory information being logged to dmesg by the module during initialization in some instances. Fixes: 6c79e42169fe ("EDAC/amd64: Add support for ECC on family 19h model 60h-7Fh") Closes: https://lore.kernel.org/all/27dc093f-ce27-4c71-9e81-786150a040b6@reox.at/ Reported-by: reox <mailinglist@reox.at> Signed-off-by: Avadhut Naik <avadhut.naik@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Cc: stable@kernel.org Link: https://lore.kernel.org/20250613005233.2330627-1-avadhut.naik@amd.com
2025-06-03Merge tag 'cxl-for-6.16' of ↵Linus Torvalds1-0/+9
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull Compute Express Link (CXL) updates from Dave Jiang: - Remove always true condition in cxl features code - Add verification of CHBS length for CXL 2.0 - Ignore interleave granularity when interleave ways is 1 - Add update addressing mising MODULE_DESCRIPTION for cxl_test - A series of cleanups/refactor to prep for AMD Zen5 translate code - Clean %pa debug printk in core/hdm.c - Documentation updates: - Update to CXL Maturity Map - Fixes to source linking in CXL documentation - CXL documentation fixes, spelling corrections - A large collection of CXL documentation for the entire CXL subsystem, including documentation on CXL related platform and firmware notes - Remove redundant code of cxlctl_get_supported_features() - Series to support CXL RAS Features - Including "Patrol Scrub Control", "Error Check Scrub", "Performance Maitenance" and "Memory Sparing". The series connects CXL to EDAC. * tag 'cxl-for-6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (53 commits) cxl/edac: Add CXL memory device soft PPR control feature cxl/edac: Add CXL memory device memory sparing control feature cxl/edac: Support for finding memory operation attributes from the current boot cxl/edac: Add support for PERFORM_MAINTENANCE command cxl/edac: Add CXL memory device ECS control feature cxl/edac: Add CXL memory device patrol scrub control feature cxl: Update prototype of function get_support_feature_info() EDAC: Update documentation for the CXL memory patrol scrub control feature cxl/features: Remove the inline specifier from to_cxlfs() cxl/feature: Remove redundant code of get supported features docs: ABI: Fix "firwmare" to "firmware" cxl/Documentation: Fix typo in sysfs write_bandwidth attribute path cxl: doc/linux/access-coordinates Update access coordinates calculation methods cxl: docs/platform/acpi/srat Add generic target documentation cxl: docs/platform/cdat reference documentation Documentation: Update the CXL Maturity Map cxl: Sync up the driver-api/cxl documentation cxl: docs - add self-referencing cross-links cxl: docs/allocation/hugepages cxl: docs/allocation/reclaim ...
2025-05-29EDAC/altera: Use correct write width with the INTTEST registerNiravkumar L Rabara1-3/+3
On the SoCFPGA platform, the INTTEST register supports only 16-bit writes. A 32-bit write triggers an SError to the CPU so do 16-bit accesses only. [ bp: AI-massage the commit message. ] Fixes: c7b4be8db8bc ("EDAC, altera: Add Arria10 OCRAM ECC support") Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@intel.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@kernel.org Link: https://lore.kernel.org/20250527145707.25458-1-matthew.gerlach@altera.com
2025-05-27Merge tag 'edac_updates_for_v6.16' of ↵Linus Torvalds6-231/+422
git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras Pull EDAC updates from Borislav Petkov: - ie31200: Add support for Raptor Lake-S and Alder Lake-S compute dies - Rework how RRL registers per channel tracking is done in order to support newer hardware with different RRL configurations and refactor that code. Add support for Granite Rapids server - i10nm: explicitly set RRL modes to fix any wrong BIOS programming - Properly save and restore Retry Read error Log channel configuration info on Intel drivers - igen6: Handle correctly the case of fused off memory controllers on Arizona Beach and Amston Lake SoCs before adding support for them - the usual set of fixes and cleanups * tag 'edac_updates_for_v6.16' of git://git.kernel.org/pub/scm/linux/kernel/git/ras/ras: EDAC/bluefield: Don't use bluefield_edac_readl() result on error EDAC/i10nm: Fix the bitwise operation between variables of different sizes EDAC/ie31200: Add two Intel SoCs for EDAC support EDAC/{skx_common,i10nm}: Add RRL support for Intel Granite Rapids server EDAC/{skx_common,i10nm}: Refactor show_retry_rd_err_log() EDAC/{skx_common,i10nm}: Refactor enable_retry_rd_err_log() EDAC/{skx_common,i10nm}: Structure the per-channel RRL registers EDAC/i10nm: Explicitly set the modes of the RRL register sets EDAC/{skx_common,i10nm}: Fix the loss of saved RRL for HBM pseudo channel 0 EDAC/skx_common: Fix general protection fault EDAC/igen6: Add Intel Amston Lake SoCs support EDAC/igen6: Add Intel Arizona Beach SoCs support EDAC/igen6: Skip absent memory controllers
2025-05-27Merge tag 'irq-cleanups-2025-05-25' of ↵Linus Torvalds1-2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull irq cleanups from Thomas Gleixner: "A set of cleanups for the generic interrupt subsystem: - Consolidate on one set of functions for the interrupt domain code to get rid of pointlessly duplicated code with only marginal different semantics. - Update the documentation accordingly and consolidate the coding style of the irqdomain header" * tag 'irq-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (46 commits) irqdomain: Consolidate coding style irqdomain: Fix kernel-doc and add it to Documentation Documentation: irqdomain: Update it Documentation: irq-domain.rst: Simple improvements Documentation: irq/concepts: Minor improvements Documentation: irq/concepts: Add commas and reflow irqdomain: Improve kernel-docs of functions irqdomain: Make struct irq_domain_info variables const irqdomain: Use irq_domain_instantiate()'s return value as initializers irqdomain: Drop irq_linear_revmap() pinctrl: keembay: Switch to irq_find_mapping() irqchip/armada-370-xp: Switch to irq_find_mapping() gpu: ipu-v3: Switch to irq_find_mapping() gpio: idt3243x: Switch to irq_find_mapping() sh: Switch to irq_find_mapping() powerpc: Switch to irq_find_mapping() irqdomain: Drop irq_domain_add_*() functions powerpc: Switch irq_domain_add_nomap() to use fwnode thermal: Switch to irq_domain_create_linear() soc: Switch to irq_domain_create_*() ...
2025-05-23cxl/edac: Add CXL memory device memory sparing control featureShiju Jose1-0/+9
Memory sparing is defined as a repair function that replaces a portion of memory with a portion of functional memory at that same DPA. The subclasses for this operation vary in terms of the scope of the sparing being performed. The cacheline sparing subclass refers to a sparing action that can replace a full cacheline. Row sparing is provided as an alternative to PPR sparing functions and its scope is that of a single DDR row. As per CXL r3.2 Table 8-125 foot note 1. Memory sparing is preferred over PPR when possible. Bank sparing allows an entire bank to be replaced. Rank sparing is defined as an operation in which an entire DDR rank is replaced. Memory sparing maintenance operations may be supported by CXL devices that implement CXL.mem protocol. A sparing maintenance operation requests the CXL device to perform a repair operation on its media. For example, a CXL device with DRAM components that support memory sparing features may implement sparing maintenance operations. The host may issue a query command by setting query resources flag in the input payload (CXL spec 3.2 Table 8-120) to determine availability of sparing resources for a given address. In response to a query request, the device shall report the resource availability by producing the memory sparing event record (CXL spec 3.2 Table 8-60) in which the Channel, Rank, Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields are a copy of the values specified in the request. During the execution of a sparing maintenance operation, a CXL memory device: - may not retain data - may not be able to process CXL.mem requests correctly. These CXL memory device capabilities are specified by restriction flags in the memory sparing feature readable attributes. When a CXL device identifies error on a memory component, the device may inform the host about the need for a memory sparing maintenance operation by using DRAM event record, where the 'maintenance needed' flag may set. The event record contains some of the DPA, Channel, Rank, Nibble Mask, Bank Group, Bank, Row, Column, Sub-Channel fields that should be repaired. The userspace tool requests for maintenance operation if the 'maintenance needed' flag set in the CXL DRAM error record. CXL spec 3.2 section 8.2.10.7.1.4 describes the device's memory sparing maintenance operation feature. CXL spec 3.2 section 8.2.10.7.2.3 describes the memory sparing feature discovery and configuration. Add support for controlling CXL memory device memory sparing feature. Register with EDAC driver, which gets the memory repair attr descriptors from the EDAC memory repair driver and exposes sysfs repair control attributes for memory sparing to the userspace. For example CXL memory sparing control for the CXL mem0 device is exposed in /sys/bus/edac/devices/cxl_mem0/mem_repairX/ Use case ======== 1. CXL device identifies a failure in a memory component, report to userspace in a CXL DRAM trace event with DPA and other attributes of memory to repair such as channel, rank, nibble mask, bank Group, bank, row, column, sub-channel. 2. Rasdaemon process the trace event and may issue query request in sysfs check resources available for memory sparing if either of the following conditions met. - 'maintenance needed' flag set in the event record. - 'threshold event' flag set for CVME threshold feature. - When the number of corrected error reported on a CXL.mem media to the userspace exceeds the threshold value for corrected error count defined by the userspace policy. 3. Rasdaemon process the memory sparing trace event and issue repair request for memory sparing. Kernel CXL driver shall report memory sparing event record to the userspace with the resource availability in order rasdaemon to process the event record and issue a repair request in sysfs for the memory sparing operation in the CXL device. Note: Based on the feedbacks from the community 'query' sysfs attribute is removed and reporting memory sparing error record to the userspace are not supported. Instead userspace issues sparing operation and kernel does the same to the CXL memory device, when 'maintenance needed' flag set in the DRAM event record. Add checks to ensure the memory to be repaired is offline and if online, then originates from a CXL DRAM error record reported in the current boot before requesting a memory sparing operation on the device. Note: Tested memory sparing feature control with QEMU patch "hw/cxl: Add emulation for memory sparing control feature" https://lore.kernel.org/linux-cxl/20250509172229.726-1-shiju.jose@huawei.com/T/#m5f38512a95670d75739f9dad3ee91b95c7f5c8d6 Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> Reviewed-by: Dave Jiang <dave.jiang@intel.com> Signed-off-by: Shiju Jose <shiju.jose@huawei.com> Reviewed-by: Alison Schofield <alison.schofield@intel.com> Acked-by: Dan Williams <dan.j.williams@intel.com> Link: https://patch.msgid.link/20250521124749.817-8-shiju.jose@huawei.com Signed-off-by: Dave Jiang <dave.jiang@intel.com>
2025-05-22EDAC/bluefield: Don't use bluefield_edac_readl() result on errorDavid Thompson1-5/+15
The bluefield_edac_readl() routine returns an uninitialized result on error paths. In those cases the calling routine should not use the uninitialized result. The driver should simply log the error, and then return early. Fixes: e41967575474 ("EDAC/bluefield: Use Arm SMC for EMI access on BlueField-2") Signed-off-by: David Thompson <davthompson@nvidia.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shravan Kumar Ramani <shravankr@nvidia.com> Link: https://lore.kernel.org/20250318214747.12271-1-davthompson@nvidia.com
2025-05-16EDAC/altera: Switch to irq_domain_create_linear()Jiri Slaby (SUSE)1-2/+2
irq_domain_add_linear() is going away as being obsolete now. Switch to the preferred irq_domain_create_linear(). That differs in the first parameter: It takes more generic struct fwnode_handle instead of struct device_node. Therefore, of_fwnode_handle() is added around the parameter. Note some of the users can likely use dev->fwnode directly instead of indirect of_fwnode_handle(dev->of_node). But dev->fwnode is not guaranteed to be set for all, so this has to be investigated on case to case basis (by people who can actually test with the HW). [ tglx: Fix up subject prefix ] Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250319092951.37667-17-jirislaby@kernel.org
2025-05-13Merge branch 'x86/msr' into x86/core, to resolve conflictsIngo Molnar3-3/+5
Conflicts: arch/x86/boot/startup/sme.c arch/x86/coco/sev/core.c arch/x86/kernel/fpu/core.c arch/x86/kernel/fpu/xstate.c Semantic conflict: arch/x86/include/asm/sev-internal.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-06Merge tag 'v6.15-rc5' into x86/msr, to pick up fixes and to resolve conflictsIngo Molnar2-4/+7
Conflicts: drivers/cpufreq/intel_pstate.c Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-06Merge tag 'v6.15-rc5' into x86/cpu, to resolve conflictsIngo Molnar2-4/+7
Conflicts: tools/arch/x86/include/asm/cpufeatures.h Signed-off-by: Ingo Molnar <mingo@kernel.org>
2025-05-02x86/msr: Add explicit includes of <asm/msr.h>Xin Li (Intel)2-0/+2
For historic reasons there are some TSC-related functions in the <asm/msr.h> header, even though there's an <asm/tsc.h> header. To facilitate the relocation of rdtsc{,_ordered}() from <asm/msr.h> to <asm/tsc.h> and to eventually eliminate the inclusion of <asm/msr.h> in <asm/tsc.h>, add an explicit <asm/msr.h> dependency to the source files that reference definitions from <asm/msr.h>. [ mingo: Clarified the changelog. ] Signed-off-by: Xin Li (Intel) <xin@zytor.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Acked-by: Dave Hansen <dave.hansen@linux.intel.com> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Brian Gerst <brgerst@gmail.com> Cc: Juergen Gross <jgross@suse.com> Cc: H. Peter Anvin <hpa@zytor.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Kees Cook <keescook@chromium.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Borislav Petkov <bp@alien8.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Uros Bizjak <ubizjak@gmail.com> Link: https://lore.kernel.org/r/20250501054241.1245648-1-xin@zytor.com
2025-04-28EDAC/altera: Set DDR and SDMMC interrupt mask before registrationNiravkumar L Rabara2-3/+6
Mask DDR and SDMMC in probe function to avoid spurious interrupts before registration. Removed invalid register write to system manager. Fixes: 1166fde93d5b ("EDAC, altera: Add Arria10 ECC memory init functions") Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@altera.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-3-matthew.gerlach@altera.com
2025-04-28EDAC/altera: Test the correct error reg offsetNiravkumar L Rabara1-1/+1
Test correct structure member, ecc_cecnt_offset, before using it. [ bp: Massage commit message. ] Fixes: 73bcc942f427 ("EDAC, altera: Add Arria10 EDAC support") Signed-off-by: Niravkumar L Rabara <niravkumar.l.rabara@altera.com> Signed-off-by: Matthew Gerlach <matthew.gerlach@altera.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Dinh Nguyen <dinguyen@kernel.org> Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-2-matthew.gerlach@altera.com
2025-04-24EDAC/i10nm: Fix the bitwise operation between variables of different sizesQiuxu Zhuo1-2/+2
The tool of Smatch static checker reported the following warning: drivers/edac/i10nm_base.c:364 show_retry_rd_err_log() warn: should bitwise negate be 'ullong'? This warning was due to the bitwise NOT/AND operations between 'status_mask' (a u32 type) and 'log' (a u64 type), which resulted in the high 32 bits of 'log' were cleared. This was a false positive warning, as only the low 32 bits of 'log' was written to the first RRL memory controller register (a u32 type). To improve code sanity, fix this warning by changing 'status_mask' to a u64 type, ensuring it matches the size of 'log' for bitwise operations. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/aAih0KmEVq7ch6v2@stanley.mountain/ Signed-off-by: Qiuxu Zhuo <qiuxu.zhuo@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Link: https://lore.kernel.org/r/20250424081454.2952632-1-qiuxu.zhuo@intel.com