linux/drivers/acpi/apei, branch master

ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler

2026-04-06T14:48:58Z

Add support for decoding NVIDIA-specific CPER sections delivered via the APEI GHES vendor record notifier chain. NVIDIA hardware generates vendor-specific CPER sections containing error signatures and diagnostic register dumps. This implementation registers a notifier_block with the GHES vendor record notifier and decodes these sections, printing error details via dev_info(). The driver binds to ACPI device NVDA2012, present on NVIDIA server platforms. The NVIDIA CPER section contains a fixed header with error metadata (signature, error type, severity, socket) followed by variable-length register address-value pairs for hardware diagnostics. This work is based on libcper [1]. Example output: nvidia-ghes NVDA2012:00: NVIDIA CPER section, error_data_length: 544 nvidia-ghes NVDA2012:00: signature: CMET-INFO nvidia-ghes NVDA2012:00: error_type: 0 nvidia-ghes NVDA2012:00: error_instance: 0 nvidia-ghes NVDA2012:00: severity: 3 nvidia-ghes NVDA2012:00: socket: 0 nvidia-ghes NVDA2012:00: number_regs: 32 nvidia-ghes NVDA2012:00: instance_base: 0x0000000000000000 nvidia-ghes NVDA2012:00: register[0]: address=0x8000000100000000 value=0x0000000100000000 https://github.com/openbmc/libcper/commit/683e055061ce [1] Reviewed-by: Jonathan Cameron Signed-off-by: Kai-Heng Feng [ rjw: Changelog edits ] Link: https://patch.msgid.link/20260330094203.38022-4-kaihengf@nvidia.com Signed-off-by: Rafael J. Wysocki

ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()

2026-04-06T14:48:58Z

Add a device-managed wrapper around ghes_register_vendor_record_notifier() so drivers can avoid manual cleanup on device removal or probe failure. Signed-off-by: Kai-Heng Feng Reviewed-by: Breno Leitao Reviewed-by: Shiju Jose Reviewed-by: Shuai Xue Link: https://patch.msgid.link/20260330094203.38022-2-kaihengf@nvidia.com Signed-off-by: Rafael J. Wysocki

Convert more 'alloc_obj' cases to default GFP_KERNEL arguments

2026-02-22T04:03:00Z

This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that far a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

2026-02-22T01:09:51Z

This was done entirely with mindless brute force, using git grep -l '\

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

2026-02-21T09:02:28Z

This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook

ACPI: APEI: GHES: Add ghes_edac support for ZX and _BYO_ systems

2026-01-28T20:39:26Z

Let ghes_edac be the preferred driver to load on __ZX__ and _BYO_ systems by extending the platform detection list in ghes.c Signed-off-by: Tony W Wang-oc Tested-by: Lyle Li Acked-by: Borislav Petkov (AMD) [ rjw: Subject and changelog edits ] Link: https://patch.msgid.link/20260128025216.12564-1-TonyWWang-oc@zhaoxin.com Signed-off-by: Rafael J. Wysocki

ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18

2026-01-28T20:35:30Z

After a recent innocuous change to drivers/acpi/apei/ghes.c, building ARCH=arm64 allmodconfig with clang-17 or older (which has both CONFIG_KASAN=y and CONFIG_WERROR=y) fails with: drivers/acpi/apei/ghes.c:902:13: error: stack frame size (2768) exceeds limit (2048) in 'ghes_do_proc' [-Werror,-Wframe-larger-than] 902 | static void ghes_do_proc(struct ghes *ghes, | ^ A KASAN pass that removes unneeded stack instrumentation, enabled by default in clang-18 [1], drastically improves stack usage in this case. To avoid the warning in the common allmodconfig case when it can break the build, disable KASAN for ghes.o when compile testing with clang-17 and older. Disabling KASAN outright may hide legitimate runtime issues, so live with the warning in that case; the user can either increase the frame warning limit or disable -Werror, which they should probably do when debugging with KASAN anyways. Closes: https://github.com/ClangBuiltLinux/linux/issues/2148 Link: https://github.com/llvm/llvm-project/commit/51fbab134560ece663517bf1e8c2a30300d08f1a [1] Signed-off-by: Nathan Chancellor Cc: All applicable Link: https://patch.msgid.link/20260114-ghes-avoid-wflt-clang-older-than-18-v1-1-9c8248bfe4f4@kernel.org Signed-off-by: Rafael J. Wysocki

ACPI: APEI: GHES: Add helper to copy CPER CXL protocol error info to work struct

2026-01-14T16:09:34Z

Make a helper out of cxl_cper_post_prot_err() that checks the CXL agent type and copy the CPER CXL protocol errors information to a work data structure. Export the new symbol for reuse by ELOG. Reviewed-by: Dave Jiang Reviewed-by: Jonathan Cameron Reviewed-by: Hanjun Guo Signed-off-by: Fabio M. De Francesco [ rjw: Subject tweak ] Link: https://patch.msgid.link/20260114101543.85926-5-fabio.m.de.francesco@linux.intel.com Signed-off-by: Rafael J. Wysocki

ACPI: APEI: GHES: Add helper for CPER CXL protocol errors checks

2026-01-14T16:09:34Z

Move the CPER CXL protocol errors validity check out of cxl_cper_post_prot_err() to new cxl_cper_sec_prot_err_valid() and limit the serial number check only to CXL agents that are CXL devices (UEFI v2.10, Appendix N.2.13). Export the new symbol for reuse by ELOG. Reviewed-by: Dave Jiang Reviewed-by: Hanjun Guo Reviewed-by: Jonathan Cameron Signed-off-by: Fabio M. De Francesco [ rjw: Subject tweak ] Link: https://patch.msgid.link/20260114101543.85926-4-fabio.m.de.francesco@linux.intel.com Signed-off-by: Rafael J. Wysocki

ACPI: APEI: GHES: Improve ghes_notify_sea() status check

2026-01-14T16:05:05Z

Performance testing on ARMv8 systems shows significant overhead in error status handling in SEA error handling. - ghes_peek_estatus(): 8,138.3 ns (21,160 cycles). - ghes_clear_estatus(): 2,038.3 ns (5,300 cycles). Apply the same optimization used in ghes_notify_nmi() to ghes_notify_sea() by checking for active errors before processing, Tested-by: Tony Luck Reviewed-by: Tony Luck Signed-off-by: Shuai Xue Reviewed-by: Hanjun Guo Link: https://patch.msgid.link/20260112032239.30023-4-xueshuai@linux.alibaba.com Signed-off-by: Rafael J. Wysocki

linux/drivers/acpi/apei, branch master

ACPI: APEI: GHES: Add NVIDIA vendor CPER record handler

ACPI: APEI: GHES: Add devm_ghes_register_vendor_record_notifier()

Convert more 'alloc_obj' cases to default GFP_KERNEL arguments

Convert 'alloc_obj' family to use the new default GFP_KERNEL argument

treewide: Replace kmalloc with kmalloc_obj for non-scalar types

ACPI: APEI: GHES: Add ghes_edac support for __ZX__ and _BYO_ systems

ACPI: APEI: GHES: Disable KASAN instrumentation when compile testing with clang < 18

ACPI: APEI: GHES: Add helper to copy CPER CXL protocol error info to work struct

ACPI: APEI: GHES: Add helper for CPER CXL protocol errors checks

ACPI: APEI: GHES: Improve ghes_notify_sea() status check

ACPI: APEI: GHES: Add ghes_edac support for ZX and _BYO_ systems