aboutsummaryrefslogtreecommitdiffstats
path: root/tools/perf/scripts/python/stackcollapse.py (unfollow)
AgeCommit message (Collapse)AuthorFilesLines
2025-09-26dt-bindings: PCI: qcom,pcie-x1e80100: Set clocks minItems for the fifth ↵Qiang Yu1-1/+2
Glymur PCIe Controller On the Qualcomm Glymur platform, the fifth PCIe host is compatible with the DWC controller present on the X1E80100 platform, but does not have cnoc_sf_axi clock. Hence, set minItems of clocks and clock-names to six. Signed-off-by: Qiang Yu <qiang.yu@oss.qualcomm.com> Signed-off-by: Pankaj Patil <pankaj.patil@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Rob Herring (Arm) <robh@kernel.org> Link: https://patch.msgid.link/20250919142325.1090059-1-pankaj.patil@oss.qualcomm.com
2025-09-26PCI: dwc: Support 16-lane operationKonrad Dybcio2-0/+4
Some hosts support 16 lanes of PCIe. Make num-lanes accept that number. Signed-off-by: Konrad Dybcio <konrad.dybcio@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250926-topic-pcie_16ln-v1-1-c249acc18790@oss.qualcomm.com
2025-09-26PCI: Add lockdep assertion in pci_stop_and_remove_bus_device()Niklas Schnelle3-1/+4
Removing a PCI devices requires holding pci_rescan_remove_lock. Prompted by this being missed in sriov_disable() and going unnoticed since its inception, add a lockdep assert so this doesn't get missed again in the future. Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Julian Ruess <julianr@linux.ibm.com> Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-2-2d0bc938f2a3@linux.ibm.com
2025-09-26PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOVNiklas Schnelle1-0/+5
Before disabling SR-IOV via config space accesses to the parent PF, sriov_disable() first removes the PCI devices representing the VFs. Since commit 9d16947b7583 ("PCI: Add global pci_lock_rescan_remove()") such removal operations are serialized against concurrent remove and rescan using the pci_rescan_remove_lock. No such locking was ever added in sriov_disable() however. In particular when commit 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device removal into sriov_del_vfs() there was still no locking around the pci_iov_remove_virtfn() calls. On s390 the lack of serialization in sriov_disable() may cause double remove and list corruption with the below (amended) trace being observed: PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56) GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001 00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480 0000000000000001 0000000000000000 0000000000000000 0000000180692828 00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8 #0 [3800313fb20] device_del at c9158ad5c #1 [3800313fb88] pci_remove_bus_device at c915105ba #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198 #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0 #4 [3800313fc60] zpci_bus_remove_device at c90fb6104 #5 [3800313fca0] __zpci_event_availability at c90fb3dca #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2 #7 [3800313fd60] crw_collect_info at c91905822 #8 [3800313fe10] kthread at c90feb390 #9 [3800313fe68] __ret_from_fork at c90f6aa64 #10 [3800313fe98] ret_from_fork at c9194f3f2. This is because in addition to sriov_disable() removing the VFs, the platform also generates hot-unplug events for the VFs. This being the reverse operation to the hotplug events generated by sriov_enable() and handled via pdev->no_vf_scan. And while the event processing takes pci_rescan_remove_lock and checks whether the struct pci_dev still exists, the lack of synchronization makes this checking racy. Other races may also be possible of course though given that this lack of locking persisted so long observable races seem very rare. Even on s390 the list corruption was only observed with certain devices since the platform events are only triggered by config accesses after the removal, so as long as the removal finished synchronously they would not race. Either way the locking is missing so fix this by adding it to the sriov_del_vfs() helper. Just like PCI rescan-remove, locking is also missing in sriov_add_vfs() including for the error case where pci_stop_and_remove_bus_device() is called without the PCI rescan-remove lock being held. Even in the non-error case, adding new PCI devices and buses should be serialized via the PCI rescan-remove lock. Add the necessary locking. Fixes: 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()") Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Benjamin Block <bblock@linux.ibm.com> Reviewed-by: Farhan Ali <alifm@linux.ibm.com> Reviewed-by: Julian Ruess <julianr@linux.ibm.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@linux.ibm.com
2025-09-25PCI: Set up bridge resources earlierIlpo Järvinen1-3/+10
Bridge windows are read twice from PCI Config Space, the first time from pci_read_bridge_windows(), which does not set up the device's resources. This causes problems down the road as child resources of the bridge cannot check whether they reside within the bridge window or not. Set up the bridge windows already in pci_read_bridge_windows(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250924134228.1663-2-ilpo.jarvinen@linux.intel.com
2025-09-24PCI: Don't print stale information about resourceIlpo Järvinen1-5/+10
pbus_size_mem() logs the bridge window resource using pci_info() before the start and end fields of the resource have been updated which then prints stale information. Set resource addresses earlier to make understanding logs easier. Regrettably, this results in setting the addresses multiple times but that seems unavoidable. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250924135641.3399-1-ilpo.jarvinen@linux.intel.com
2025-09-24PCI: tegra194: Handle errors in BPMP responseVidya Sagar1-2/+16
The return value from tegra_bpmp_transfer() indicates the success or failure of the IPC transaction with BPMP. If the transaction succeeded, we also need to check the actual command's result code. If we don't have error handling for tegra_bpmp_transfer(), we will set the pcie->ep_state to EP_STATE_ENABLED even when the tegra_bpmp_transfer() command fails. Thus, the pcie->ep_state will get out of sync with reality, and any further PERST# assert + deassert will be a no-op and will not trigger the hardware initialization sequence. This is because pex_ep_event_pex_rst_deassert() checks the current pcie->ep_state, and does nothing if the current state is already EP_STATE_ENABLED. Thus, it is important to have error handling for tegra_bpmp_transfer(), such that the pcie->ep_state can not get out of sync with reality, so that we will try to initialize the hardware not only during the first PERST# assert + deassert, but also during any succeeding PERST# assert + deassert. One example where this fix is needed is when using a rock5b as host. During the initial PERST# assert + deassert (triggered by the bootloader on the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for some unknown reason, the tegra_bpmp_transfer() call to initialize the PHY fails. Once Linux has been loaded on the rock5b, the PCIe driver will once again assert + deassert PERST#. However, without tegra_bpmp_transfer() error handling, this second PERST# assert + deassert will not trigger the hardware initialization sequence. With tegra_bpmp_transfer() error handling, the second PERST# assert + deassert will once again trigger the hardware to be initialized and this time the tegra_bpmp_transfer() succeeds. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar <vidyas@nvidia.com> [cassel: improve commit log] Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Jon Hunter <jonathanh@nvidia.com> Acked-by: Thierry Reding <treding@nvidia.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-8-cassel@kernel.org
2025-09-24PCI: tegra194: Reset BARs when running in PCIe endpoint modeNiklas Cassel1-0/+10
Tegra already defines all BARs except BAR0 as BAR_RESERVED. This is sufficient for pci-epf-test to not allocate backing memory and to not call set_bar() for those BARs. However, marking a BAR as BAR_RESERVED does not mean that the BAR gets disabled. The host side driver, pci_endpoint_test, simply does an ioremap for all enabled BARs and will run tests against all enabled BARs, so it will run tests against the BARs marked as BAR_RESERVED. After running the BAR tests (which will write to all enabled BARs), the inbound address translation is broken. This is because the tegra controller exposes the ATU Port Logic Structure in BAR4, so when BAR4 is written, the inbound address translation settings get overwritten. To avoid this, implement the dw_pcie_ep_ops .init() callback and start off by disabling all BARs (pci-epf-test will later enable/configure BARs that are not defined as BAR_RESERVED). This matches the behavior of other PCIe endpoint drivers: dra7xx, imx6, layerscape-ep, artpec6, dw-rockchip, qcom-ep, rcar-gen4, and uniphier-ep. With this, the PCI endpoint kselftest test case CONSECUTIVE_BAR_TEST (which was specifically made to detect address translation issues) passes. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-7-cassel@kernel.org
2025-09-24PCI/sysfs: Ensure devices are powered for config readsBrian Norris1-1/+19
The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details. Wrap these access in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes. Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds"). Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc") Signed-off-by: Brian Norris <briannorris@google.com> Signed-off-by: Brian Norris <briannorris@chromium.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a44f2684c668@changeid
2025-09-24PCI: tegra194: Set pci_epc_features::msi_capable to trueNiklas Cassel1-0/+1
Since the driver supports MSI, set the flag to true. This helps pci_endpoint_test to use the optimal IRQ type when using PCITEST_IRQ_TYPE_AUTO. Signed-off-by: Niklas Cassel <cassel@kernel.org> [mani: splitted this change from the bug fix] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org
2025-09-24PCI: tegra194: Fix broken tegra_pcie_ep_raise_msi_irq()Niklas Cassel1-2/+2
The pci_epc_raise_irq() supplies a MSI or MSI-X interrupt number in range (1-N), as per the pci_epc_raise_irq() kdoc, where N is 32 for MSI. But tegra_pcie_ep_raise_msi_irq() incorrectly uses the interrupt number as the MSI vector. This causes wrong MSI vector to be triggered, leading to the failure of PCI endpoint Kselftest MSI_TEST test case. To fix this issue, convert the interrupt number to MSI vector. Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org
2025-09-23PCI: qcom: Remove custom ASPM enablement codeManivannan Sadhasivam1-32/+0
Since the PCI subsystem has started enabling all ASPM states for all devicetree based platforms, the ASPM enablement code from this driver can now be dropped. Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250922-pci-dt-aspm-v2-2-2a65cf84e326@oss.qualcomm.com
2025-09-23PCI/ASPM: Enable all ClockPM and ASPM states for devicetree platformsManivannan Sadhasivam1-2/+43
So far, the PCI subsystem has honored the ASPM and Clock PM states set by the BIOS (through LNKCTL) during device initialization, if it relies on the default state selected using: * Kconfig: CONFIG_PCIEASPM_DEFAULT=y, or * cmdline: "pcie_aspm=off", or * FADT: ACPI_FADT_NO_ASPM This was done conservatively to avoid issues with the buggy devices that advertise ASPM capabilities, but behave erratically if the ASPM states are enabled. So the PCI subsystem ended up trusting the BIOS to enable only the ASPM states that were known to work for the devices. But this turned out to be a problem for devicetree platforms, especially the ARM based devicetree platforms powering Embedded and *some* Compute devices as they tend to run without any standard BIOS. So the ASPM states on these platforms were left disabled during boot and the PCI subsystem never bothered to enable them, unless the user has forcefully enabled the ASPM states through Kconfig, cmdline, and sysfs or the device drivers themselves, enabling the ASPM states through pci_enable_link_state() APIs. This caused runtime power issues on those platforms. So a couple of approaches were tried to mitigate this BIOS dependency without user intervention by enabling the ASPM states in the PCI controller drivers after device enumeration, and overriding the ASPM/Clock PM states by the PCI controller drivers through an API before enumeration. But it has been concluded that none of these mitigations should really be required and the PCI subsystem should enable the ASPM states advertised by the devices without relying on BIOS or the PCI controller drivers. If any device is found to be misbehaving after enabling ASPM states that they advertised, then those devices should be quirked to disable the problematic ASPM/Clock PM states. In an effort to do so, start by overriding the ASPM and Clock PM states set by the BIOS for devicetree platforms first. Separate helper functions are introduced to override the BIOS set states by enabling all of them if of_have_populated_dt() returns true. To aid debugging, print the overridden ASPM and Clock PM states as well. In the future, these helpers could be extended to allow other platforms like VMD, newer ACPI systems with a cutoff year etc... to follow the path. Link: https://lore.kernel.org/linux-pci/20250828204345.GA958461@bhelgaas Suggested-by: Bjorn Helgaas <helgaas@kernel.org> Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam@oss.qualcomm.com> [bhelgaas: tweak comments and dmesg logs] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250922-pci-dt-aspm-v2-1-2a65cf84e326@oss.qualcomm.com
2025-09-22PCI/PM: Skip resuming to D0 if device is disconnectedMario Limonciello1-0/+5
When a device is surprise-removed (e.g., due to a dock unplug), the PCI core unconfigures all downstream devices and sets their error state to pci_channel_io_perm_failure. This marks them as disconnected via pci_dev_is_disconnected(). During device removal, the runtime PM framework may attempt to resume the device to D0 via pm_runtime_get_sync(), which calls into pci_power_up(). Since the device is already disconnected, this resume attempt is unnecessary and results in a predictable errors like this, typically when undocking from a TBT3 or USB4 dock with PCIe tunneling: pci 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible Avoid powering up disconnected devices by checking their status early in pci_power_up() and returning -EIO. Suggested-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> [bhelgaas: add typical message] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lukas Wunner <lukas@wunner.de> Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Acked-by: Rafael J. Wysocki <rafael@kernel.org> Link: https://patch.msgid.link/20250909031916.4143121-1-superm1@kernel.org
2025-09-18Documentation: PCI: Fix typosEmilio Perez1-1/+1
The PCIe spec uses "Requester ID", not "requestor ID". Follow the spec to avoid confusion. Signed-off-by: Emilio Perez <emiliopeju@gmail.com> [bhelgaas: capitalize as a hint that the spec defines this] Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250818023121.33427-1-emiliopeju@gmail.com
2025-09-16PCI: Alter misleading recursion to pci_bus_release_bridge_resources()Ilpo Järvinen1-1/+1
Recursing into pci_bus_release_bridge_resources() should not alter rel_type because it makes no sense to change the release type within the recursion call chain. A literal "whole_subtree" is passed into the recursion instead of "rel_type" parameter which is misleading as the release type should remain the same throughout the entire operation. This is not a correctness issue because of the preceding if () that only allows the recursion to happen if rel_type is "whole_subtree". Still, replace the non-intuitive parameter with direct passing of "rel_type". Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-25-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Pass bridge window to pci_bus_release_bridge_resources()Ilpo Järvinen1-42/+27
pci_bus_release_bridge_resources() takes type, which is converted into a bridge window resource in pci_bridge_release_resources(). Find out the correct bridge window for resource whose assignment failed. Pass that bridge window to pci_bus_release_bridge_resources() instead of passing the type. When recursing to subordinate, check which bridge windows have to be released and recurse for each. For now, use pbus_select_window_for_type() instead of pbus_select_window() because non-bridge window resources still have their flags reset which destroys the type information from the struct resource. The struct pci_dev_resource holds a copy of the flags which are used instead. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-24-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Add pci_setup_one_bridge_window()Ilpo Järvinen1-18/+19
pci_bridge_release_resources() contains a resource type hack to work around the unsuitable __pci_setup_bridge() interface. Extract the switch statement that picks the correct bridge window setup function from pci_claim_bridge_resource() into pci_setup_one_bridge_window() and use it also in pci_bridge_release_resources(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-23-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Refactor remove_dev_resources() to use pbus_select_window()Ilpo Järvinen1-25/+9
Convert remove_dev_resources() to use pbus_select_window(). As 'available' is not the real resources, the index has to be adjusted as only bridge resource counterparts are present in the 'available' array. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-22-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Refactor distributing available memory to use loopsIlpo Järvinen2-91/+74
pci_bus_distribute_available_resources() and pci_bridge_distribute_available_resources() retain bridge window resources and related data needed for distributing the available window in independent variables for io, memory, and prefetchable memory windows. The code is essentially the same for all of them and therefore repeated three times with different variable names. Refactor pci_bus_distribute_available_resources() to take an array. This is complicated slightly by the function taking advantage of passing the struct as value, which cannot be done for arrays in C. Therefore, copy the data into a local array in the stack in the first loop. Variable names are (hopefully) improved slightly as well. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-21-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Use pbus_select_window_for_type() during mem window sizingIlpo Järvinen1-87/+24
__pci_bus_size_bridges() goes to great lengths of helping pbus_size_mem() in which types it should put into a particular bridge window, requiring passing up to three resource type into pbus_size_mem(). Instead of having complex logic in __pci_bus_size_bridges() and a non-straightforward interface between those functions, use pbus_select_window_for_type() and pbus_select_window() to find the correct bridge window and compare if the resources belong to that window. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-20-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Use pbus_select_window() in space available checkerIlpo Järvinen1-33/+32
pbus_upstream_space_available() figures out the upstream bridge window resources on its own. Migrate it to use pbus_select_window(). Note: pbus_select_window() -> pbus_select_window_for_type() calls find_bus_resource_of_type() for root bus, which does not do parent check similar to what pbus_upstream_space_available() did earlier, but the difference does not matter because pbus_upstream_space_available() itself stops when it encounters the root bus. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-19-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Rename resource variable from r to resIlpo Järvinen1-7/+7
Resource is going to be passed in as argument aften an upcoming change. Rename the struct resource variable from "r" to "res" to avoid using one letter variable name in a function argument. This rename is made separately to reduce churn in the upcoming change. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-18-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Use pbus_select_window_for_type() during IO window sizingIlpo Järvinen1-2/+1
Convert pbus_size_io() to use pbus_select_window_for_type(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-17-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Use pbus_select_window() during BAR resizeIlpo Järvinen1-7/+13
Prior to a BAR resize, __resource_resize_store() loops through the normal resources of the PCI device and releases those that match to the flags of the BAR to be resized. This is necessary to allow resizing also the upstream bridge window as only childless bridge windows can be resized. While the flags check (mostly) works (if corner cases are ignored), the more straightforward way is to check if the resources share the bridge window. Change __resource_resize_store() to do the check using pbus_select_window(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-16-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Warn if bridge window cannot be released when resizing BARIlpo Järvinen1-0/+6
BAR resizing calls to pci_reassign_bridge_resources(), which attempts to release any upstream bridge window to allow them to accommodate the new BAR size. The release can only be performed if there are no other child resources for the bridge window. Previously the code continued silently when other child resources were detected. Add pci_warn() to inform user that a bridge window could not be released because of child resources. As a small bridge window is often the reason why BAR resize fails, this warning will help to pinpoint to the cause. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-15-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Fix finding bridge window in pci_reassign_bridge_resources()Ilpo Järvinen3-22/+20
pci_reassign_bridge_resources() walks upwards in the PCI bus hierarchy, locates the relevant bridge window on each level using flags check, and attempts to release the bridge window. The flags-based check is fragile due to various fallbacks in the bridge window selection logic. As such, the algorithm might not locate the correct bridge window. Refactor pci_reassign_bridge_resources() to determine the correct bridge window using pbus_select_window(), which contains logic to handle all fallback cases correctly. Change function prefix to pbus as it now inputs struct bus and resource for which to locate the bridge window. The main purpose is to make bridge window selection logic consistent across the entire PCI core (one step at a time). While this technically also fixes the commit 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs") making the bridge window walk algorithm more robust, the normal setup having a 64-bit resizable BAR underneath bridge(s) with 64-bit prefetchable windows does not need to use any fallbacks. As such, the practical impact is low (requiring BAR resize use case and a non-typical bridge device). The way to detect if unrelated resource failed again is left to use the type based approximation which should not behave worse than before. Fixes: 8bb705e3e79d ("PCI: Add pci_resize_resource() for resizing BARs") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-14-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Add bridge window selection functionsIlpo Järvinen2-0/+103
Various places in the PCI core code independently decide into which bridge window a child resource should be placed. It is hard to see whether these decisions always end up in agreement, especially in the corner cases, and in some places it requires complex logic to pass multiple resource types and/or bridge windows around. Add pbus_select_window() and pbus_select_window_for_type() for cases where the former cannot be used so that eventually the same helper can be used to select the bridge window everywhere. Using the same function ensures the selected bridge window remains always the same and it can be easily recalculated in-situ allowing simplifying the interfaces between internal functions in upcoming changes. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-13-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Add defines for bridge window indexingIlpo Järvinen2-3/+11
include/linux/pci.h provides PCI_BRIDGE_{IO,MEM,PREF_MEM}_WINDOW defines, however, they're based on the resource array indexing in the pci_dev struct. The struct pci_bus also has pointers to those same resources but they start from zeroth index. Add PCI_BUS_BRIDGE_{IO,MEM,PREF_MEM}_WINDOW defines to get rid of literal indexing. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-12-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Preserve bridge window resource type flagsIlpo Järvinen5-37/+90
When a bridge window is found unused or fails to assign, the flags of the associated resource are cleared. Clearing flags is problematic as it also removes the type information of the resource which is needed later. Thus, always preserve the bridge window type flags and use IORESOURCE_UNSET and IORESOURCE_DISABLED to indicate the status of the bridge window. Also, when initializing resources, make sure all valid bridge windows do get their type flags set. Change various places that relied on resource flags being cleared to check for IORESOURCE_UNSET and IORESOURCE_DISABLED to allow bridge window resource to retain their type flags. Add pdev_resource_assignable() and pdev_resource_should_fit() helpers to filter out disabled bridge windows during resource fitting; the latter combines more common checks into the helper. When reading the bridge windows from the registers, instead of leaving the resource flags cleared for bridge windows that are not enabled, always set up the flags and set IORESOURCE_UNSET | IORESOURCE_DISABLED as needed. When resource fitting or assignment fails for a bridge window resource, or the bridge window is not needed, mark the resource with IORESOURCE_UNSET or IORESOURCE_DISABLED, respectively. Use dummy zero resource in resource_show() for backwards compatibility as lspci will otherwise misrepresent disabled bridge windows. This change fixes an issue which highlights the importance of keeping the resource type flags intact: At the end of __assign_resources_sorted(), reset_resource() is called, previously clearing the flags. Later, pci_prepare_next_assign_round() attempted to release bridge resources using pci_bus_release_bridge_resources() that calls into pci_bridge_release_resources() that assumes type flags are still present. As type flags were cleared, IORESOURCE_MEM_64 was not set leading to resources under an incorrect bridge window to be released (idx = 1 instead of idx = 2). While the assignments performed later covered this problem so that the wrongly released resources got assigned in the end, it was still causing extra release+assign pairs. There are other reasons why the resource flags should be retained in upcoming changes too. Removing the flag reset for non-bridge window resource is left as future work, in part because it has a much higher regression potential due to pci_enable_resources() that will start to work also for those resources then and due to what endpoint drivers might assume about resources. Despite the Fixes tag, backporting this (at least any time soon) is highly discouraged. The issue fixed is borderline cosmetic as the later assignments normally cover the problem entirely. Also there might be non-obvious dependencies. Fixes: 5b28541552ef ("PCI: Restrict 64-bit prefetchable bridge windows to 64-bit resources") Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-11-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Enable bridge even if bridge window fails to assignIlpo Järvinen1-13/+17
A normal PCI bridge has multiple bridge windows and not all of them are always required by devices underneath the bridge. If a Root Port or bridge does not have a device underneath, no bridge windows get assigned. Yet, pci_enable_resources() is set to fail indiscriminantly on any resource assignment failure if the resource is not known to be optional. In practice, the code in pci_enable_resources() is currently largely dormant. The kernel sets resource flags to zero for any unused bridge window and resets flags to zero in case of an resource assignment failure, which short-circuits pci_enable_resources() because of this check: if (!(r->flags & (IORESOURCE_IO | IORESOURCE_MEM))) continue; However, an upcoming change to resource flags will alter how bridge window resource flags behave activating these long dormants checks in pci_enable_resources(). While complex logic could be built to selectively enable a bridge only under some conditions, a few versions of such logic were tried during development of this change and none of them worked satisfactorily. Thus, I just gave up and decided to enable any bridge regardless of the bridge windows as there seems to be no clear benefit from not enabling it, but a major downside as pcieport will not be probed for the bridge if it's not enabled. Therefore, change pci_enable_resources() to not check if bridge window resources remain unassigned. Resource assignment failures are pretty noisy already so there is no need to log that for bridge windows in pci_enable_resources(). Ignoring bridge window failures hopefully prevents an obvious source of regressions when the upcoming change that no longer clears resource flags for bridge windows is enacted. I've hit this problem even during my own testing on multiple occasions so I expect it to be a quite common problem. This can always be revisited later if somebody thinks the enable check for bridges is not strict enough, but expect a mind-boggling number of regressions from such a change. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-10-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Use pci_release_resource() instead of release_resource()Ilpo Järvinen3-36/+23
A few places in setup-bus.c call release_resource() directly and end up duplicating functionality from pci_release_resource() such as parent check, logging, and clearing the resource. Worse yet, the way the resource is cleared is inconsistent between different sites. Convert release_resource() calls into pci_release_resource() to remove code duplication. This will also make the resource start, end, and flags behavior consistent, i.e., start address is cleared, and only IORESOURCE_UNSET is asserted for the resource. While at it, eliminate the unnecessary initialization of idx variable in pci_bridge_release_resources(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-9-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Disable non-claimed bridge windowIlpo Järvinen1-13/+12
If clipping or claiming the bridge window fails, the bridge window is left in a state that does not match the kernel's view on what the bridge window is. Disable the bridge window by writing the magic disable value into the Base and Limit Registers if clipping or claiming failed. To detect if claiming the resource was successful, add res->parent checks into the bridge setup functions. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-8-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Always claim bridge window before its setupIlpo Järvinen1-4/+8
When the claim of a resource fails for the full range in pci_claim_bridge_resource(), clipping the resource to a smaller size is attempted. If clipping is successful, the new bridge window is programmed and only as the last step the code attempts to claim the resource again. The order of the last two steps is slightly illogical and inconsistent with the assignment call chains. If claiming the bridge window after clipping fails, the bridge window that was set up is left in place. Rework the logic such that the bridge window is claimed before calling the relevant bridge setup function. This make the behavior consistent with resource fitting call chains that always assign the bridge window before programming it. If claiming the bridge window fails, the clipped bridge window is no longer set up but pci_claim_bridge_resource() returns without writing the bridge window at all. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-7-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Refactor find_bus_resource_of_type() logic checksIlpo Järvinen1-3/+7
Reorder the logic checks in find_bus_resource_of_type() to simplify them. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-6-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Move find_bus_resource_of_type() earlierIlpo Järvinen1-28/+28
Move find_bus_resource_of_type() earlier in setup-bus.c to be able to call it in upcoming changes. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-5-ilpo.jarvinen@linux.intel.com
2025-09-16MIPS: PCI: Use pci_enable_resources()Ilpo Järvinen1-36/+2
pci-legacy.c under MIPS has a copy of pci_enable_resources() named as pcibios_enable_resources(). Having own copy of same functionality could lead to inconsistencies in behavior, especially now as pci_enable_resources() and the bridge window resource flags behavior are going to be altered by upcoming changes. The check for !r->start && r->end is already covered by the more generic checks done in pci_enable_resources(). Call pci_enable_resources() from MIPS's pcibios_enable_device() and remove pcibios_enable_resources(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Acked-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Link: https://patch.msgid.link/20250829131113.36754-4-ilpo.jarvinen@linux.intel.com
2025-09-16sparc/PCI: Remove pcibios_enable_device() as they do nothing extraIlpo Järvinen3-81/+0
Under arch/sparc/ there are multiple copies of pcibios_enable_device() but none of those seem to do anything extra beyond what pci_enable_resources() is supposed to do. These functions could lead to inconsistencies in behavior, especially now as pci_enable_resources() and the bridge window resource flags behavior are going to be altered by upcoming changes. Remove all pcibios_enable_device() from arch/sparc/ so that PCI core can simply call into pci_enable_resources() instead using its __weak version of pcibios_enable_device(). Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-3-ilpo.jarvinen@linux.intel.com
2025-09-16m68k/PCI: Use pci_enable_resources() in pcibios_enable_device()Ilpo Järvinen1-28/+11
m68k has a resource enable (check) loop in its pcibios_enable_device() which for some reason differs from pci_enable_resources(). This could lead to inconsistencies in behavior, especially now as pci_enable_resources() and the bridge window resource flags behavior are going to be altered by upcoming changes. The check for !r->start && r->end is already covered by the more generic checks done in pci_enable_resources(). The entire pcibios_enable_device() suspiciously looks copy-paste from some other arch as also indicated by the preceding comment. However, it also enables PCI_COMMAND_IO | PCI_COMMAND_MEMORY always for bridges. It is not clear why that is being done as the commit e93a6bbeb5a5 ("m68k: common PCI support definitions and code") introducing this code states "Nothing specific to any PCI implementation in any m68k class CPU hardware yet". Replace the resource enable loop with a call to pci_enable_resources() and adjust the Command Register afterwards as it's unclear if that is necessary or not so keep it for now. Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250829131113.36754-2-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Fix failure detection during resource resizeIlpo Järvinen1-8/+18
Since 96336ec70264 ("PCI: Perform reset_resource() and build fail list in sync") the failed list is always built and returned to let the caller decide what to do with the failures. The caller may want to retry resource fitting and assignment and before that can happen, the resources should be restored to their original state (a reset effectively clears the struct resource), which requires returning them to the failed list so the original state remains stored in the associated struct pci_dev_resource. Resource resizing is different from the ordinary resource fitting and assignment in that it only considers part of the resources. This means failures for other resource types are not relevant at all and should be ignored. As resize doesn't unassign such unrelated resources, those resources ending up in the failed list implies assignment of that resource must have failed before resize too. The check in pci_reassign_bridge_resources() to decide if the whole assignment is successful, however, is based on list emptiness which will cause false negatives when the failed list has resources with an unrelated type. If the failed list is not empty, call pci_required_resource_failed() and extend it to be able to filter on specific resource types too (if provided). Calling pci_required_resource_failed() at this point is slightly problematic because the resource itself is reset when the failed list is constructed in __assign_resources_sorted(). As a result, pci_resource_is_optional() does not have access to the original resource flags. This could be worked around by restoring and re-resetting the resource around the call to pci_resource_is_optional(), however, it shouldn't cause issue as resource resizing is meant for 64-bit prefetchable resources according to Christian König (see the Link which unfortunately doesn't point directly to Christian's reply because lore didn't store that email at all). Fixes: 96336ec70264 ("PCI: Perform reset_resource() and build fail list in sync") Link: https://lore.kernel.org/all/c5d1b5d8-8669-5572-75a7-0b480f581ac1@linux.intel.com/ Reported-by: D Scott Phillips <scott@os.amperecomputing.com> Closes: https://lore.kernel.org/all/86plf0lgit.fsf@scott-ph-mail.amperecomputing.com/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: D Scott Phillips <scott@os.amperecomputing.com> Reviewed-by: D Scott Phillips <scott@os.amperecomputing.com> Cc: Christian König <christian.koenig@amd.com> Cc: stable@vger.kernel.org # v6.15+ Link: https://patch.msgid.link/20250822123359.16305-4-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Fix pdev_resources_assignable() disparityIlpo Järvinen1-0/+1
pdev_sort_resources() uses pdev_resources_assignable() helper to decide if device's resources cannot be assigned, so it ignores class 0 (PCI_CLASS_NOT_DEFINED) devices. pbus_size_mem(), on the other hand, does not do the same check. This could lead into a situation where a resource ends up on realloc_head list but is not on the head list, which in turn prevents emptying the resource from the realloc_head list in __assign_resources_sorted(). A non-empty realloc_head is unacceptable because it triggers an internal sanity check as shown in this log with a device that has class 0 (PCI_CLASS_NOT_DEFINED): pci 0001:01:00.0: [144d:a5a5] type 00 class 0x000000 PCIe Endpoint pci 0001:01:00.0: BAR 0 [mem 0x00000000-0x000fffff 64bit] pci 0001:01:00.0: ROM [mem 0x00000000-0x0000ffff pref] pcieport 0001:00:00.0: bridge window [mem 0x00100000-0x001fffff] to [bus 01-ff] add_size 100000 add_align 100000 pcieport 0001:00:00.0: bridge window [mem 0x40000000-0x401fffff]: assigned ------------[ cut here ]------------ kernel BUG at drivers/pci/setup-bus.c:2532! Internal error: Oops - BUG: 00000000f2000800 [#1] SMP ... Call trace: pci_assign_unassigned_bus_resources+0x110/0x114 (P) pci_rescan_bus+0x28/0x48 Use pdev_resources_assignable() also within pbus_size_mem() to skip processing of non-assignable resources which removes the disparity in between what resources pdev_sort_resources() and pbus_size_mem() consider. As non-assignable resources are no longer processed, they are not added to the realloc_head list, thus the sanity check no longer triggers. This disparity problem is very old but only now became apparent after 2499f5348431 ("PCI: Rework optional resource handling") that made the ROM resources optional when calculating bridge window sizes which required adding the resource to the realloc_head list. Previously, bridge windows were just sized larger than necessary. Fixes: 2499f5348431 ("PCI: Rework optional resource handling") Reported-by: Tudor Ambarus <tudor.ambarus@linaro.org> Closes: https://lore.kernel.org/all/5f103643-5e1c-43c6-b8fe-9617d3b5447c@linaro.org/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Cc: stable@vger.kernel.org # v6.15+ Link: https://patch.msgid.link/20250822123359.16305-3-ilpo.jarvinen@linux.intel.com
2025-09-16PCI: Ensure relaxed tail alignment does not increase min_alignIlpo Järvinen1-4/+7
When using relaxed tail alignment for the bridge window, pbus_size_mem() also tries to minimize min_align, which can under certain scenarios end up increasing min_align from that found by calculate_mem_align(). Ensure min_align is not increased by the relaxed tail alignment. Eventually, it would be better to add calculate_relaxed_head_align() similar to calculate_mem_align() which finds out what alignment can be used for the head without introducing any gaps into the bridge window to give flexibility on head address too. But that looks relatively complex so it requires much more testing than fixing the immediate problem causing a regression. Fixes: 67f9085596ee ("PCI: Allow relaxed bridge window tail sizing for optional resources") Reported-by: Rio Liu <rio@r26.me> Closes: https://lore.kernel.org/all/o2bL8MtD_40-lf8GlslTw-AZpUPzm8nmfCnJKvS8RQ3NOzOW1uq1dVCEfRpUjJ2i7G2WjfQhk2IWZ7oGp-7G-jXN4qOdtnyOcjRR0PZWK5I=@r26.me/ Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Tested-by: Rio Liu <rio@r26.me> Cc: stable@vger.kernel.org # v6.15+ Link: https://patch.msgid.link/20250822123359.16305-2-ilpo.jarvinen@linux.intel.com
2025-09-16Documentation: PCI: Tidy error recovery doc's PCIe nomenclatureLukas Wunner1-7/+7
Commit 11502feab423 ("Documentation: PCI: Tidy AER documentation") replaced the terms "PCI-E", "PCI-Express" and "PCI Express" with "PCIe" in the AER documentation. Do the same in the documentation on PCI error recovery. While at it, add a missing period and a missing blank. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/db56b7ef12043f709a04ce67c1d1e102ab5f4e19.1757942121.git.lukas@wunner.de
2025-09-16Documentation: PCI: Amend error recovery doc with DPC/AER specificsLukas Wunner1-0/+22
Amend the documentation on PCI error recovery with specifics about Downstream Port Containment and Advanced Error Reporting: * Explain that with DPC, devices are inaccessible upon an error (similar to EEH on powerpc) and do not become accessible until the link is re-enabled. * Explain that with AER, although devices may already be accessible in the ->error_detected() callback, accesses should be deferred to the ->mmio_enabled() callback for compatibility with EEH on powerpc and with s390. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/61d8eeadb20ee71c3a852f44c863bfe0209c454d.1757942121.git.lukas@wunner.de
2025-09-16Documentation: PCI: Sync error recovery doc with codeLukas Wunner1-2/+5
Amend the documentation on PCI error recovery to fix minor inaccuracies vis-à-vis the actual code: * The documentation claims that a missing ->resume() or ->mmio_enabled() callback always leads to recovery through reset. But none of the implementations do this (pcie_do_recovery(), eeh_handle_normal_event(), zpci_event_do_error_state_clear()). Drop the claim to align the documentation with the code. * The documentation does not list PCI_ERS_RESULT_RECOVERED as a valid return value from ->error_detected(). But none of the implementations forbid this and some drivers are returning it, e.g.: drivers/bus/mhi/host/pci_generic.c drivers/infiniband/hw/hfi1/pcie.c Further down in the documentation it is implied that the return value is in fact allowed: "The platform will call the resume() callback on all affected device drivers if all drivers on the segment have returned PCI_ERS_RESULT_RECOVERED from one of the 3 previous callbacks." The "3 previous callbacks" being ->error_detected(), ->mmio_enabled() and ->slot_reset(). Add it to the valid return values for consistency. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/ed3c3385499775fcc25f1ee66f395e212919f94a.1757942121.git.lukas@wunner.de
2025-09-16Documentation: PCI: Sync AER doc with codeLukas Wunner1-44/+39
The PCIe Advanced Error Reporting driver has evolved over the years but its documentation hasn't. Catch up with past code changes: * The documentation claims that Correctable Errors are logged with KERN_INFO severity, but the code uses KERN_WARN. It had used KERN_WARN from the beginning with commit 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver"). In 2013, commit 2cced2d95961 ("aerdrv: Cleanup log output for AER") switched to KERN_ERR, until 2020 when it was reverted back to KERN_WARN by commit e83e2ca3c395 ("PCI/AER: Log correctable errors as warning, not error"). * An example log message in the documentation uses the term "Uncorrected", but the code uses "Uncorrectable" since commit 02a06f5f1a6a ("PCI/AER: Use 'Correctable' and 'Uncorrectable' spec terms for errors"). * The example contains the Requester ID "id=0500", which is omitted since commit 010caed4ccb6 ("PCI/AER: Decode Error Source Requester ID"). * The example contains the error name "Unsupported Request", which is instead reported as "UnsupReq" since commit bd237801fef2 ("PCI/AER: Adopt lspci names for AER error decoding"). * The example doesn't prepend "0x" to hex values from the TLP Header Log, as introduced by commit f68ea779d98a ("PCI: Add pcie_print_tlp_log() to print TLP Header and Prefix Log"). * The documentation refers to a reset_link callback which was removed by commit b6cf1a42f916 ("PCI/ERR: Remove service dependency in pcie_do_recovery()"). * Commit 579086225502 ("PCI/ERR: Recover from RCiEP AER errors") added support to recover Root Complex Integrated Endpoints by applying a Function Level Reset, alternatively to the Secondary Bus Reset which is applied otherwise. * On non-fatal errors, a reset was previously never performed. But the AER driver has just been amended to allow drivers to opt in to a reset. * The documentation claims that a warning message is logged if a driver lacks pci_error_handlers. But the message has been informational (logged with KERN_INFO severity) since its introduction with commit 01daacfb9035 ("PCI/AER: Log which device prevents error recovery"). The documentation claims that the message is only logged for fatal errors, which is incorrect. Moreover it refers to "section 3", even though the documentation no longer contains section numbers since commit 4e37f055a92e ("Documentation: PCI: convert pcieaer-howto.txt to reST"). Section 3 is titled "Developer Guide". That's the same section where the reference is located, so it is self-referential and can be dropped. Signed-off-by: Lukas Wunner <lukas@wunner.de> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Brian Norris <briannorris@chromium.org> Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com> Link: https://patch.msgid.link/7501bfc5b9920193a25998a3cbcf72c47674ec63.1757942121.git.lukas@wunner.de
2025-09-16PCI: endpoint: pci-epf-test: Add NULL check for DMA channels before releaseShin'ichiro Kawasaki1-6/+11
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be NULL even after EPF initialization. Then it is prudent to check that they have non-NULL values before releasing the channels. Add the checks in pci_epf_test_clean_dma_chan(). Without the checks, NULL pointer dereferences happen and they can lead to a kernel panic in some cases: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 Call trace: dma_release_channel+0x2c/0x120 (P) pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test] pci_epc_deinit_notify+0x74/0xc0 tegra_pcie_ep_pex_rst_irq+0x250/0x5d8 irq_thread_fn+0x34/0xb8 irq_thread+0x18c/0x2e8 kthread+0x14c/0x210 ret_from_fork+0x10/0x20 Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data") Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com> [mani: trimmed the stack trace] Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Krzysztof Wilczyński <kwilczynski@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com
2025-09-12PCI: endpoint: pci-epf-test: Fix doorbell test supportNiklas Cassel1-1/+13
The doorbell feature temporarily overrides the inbound translation to point to the address stored in epf_test->db_bar.phys_addr, i.e., it calls set_bar() twice without ever calling clear_bar(), as calling clear_bar() would clear the BAR's PCI address assigned by the host. Thus, when disabling the doorbell, restore the inbound translation to point to the memory allocated for the BAR. Without this, running the PCI endpoint kselftest doorbell test case more than once would fail. Fixes: eff0c286aa91 ("PCI: endpoint: pci-epf-test: Add doorbell test support") Signed-off-by: Niklas Cassel <cassel@kernel.org> Signed-off-by: Manivannan Sadhasivam <mani@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Frank Li <Frank.Li@nxp.com> Link: https://patch.msgid.link/20250908161942.534799-2-cassel@kernel.org
2025-09-12PCI: of: Update parent unit address generation in of_pci_prop_intr_map()Lorenzo Pieralisi1-7/+15
Some interrupt controllers require an #address-cells property in their bindings without requiring a "reg" property to be present. The current logic used to craft an interrupt-map property in of_pci_prop_intr_map() is based on reading the #address-cells property in the interrupt-parent and, if != 0, read the interrupt parent "reg" property to determine the parent unit address to be used to create the parent unit interrupt specifier. First of all, it is not correct to read the "reg" property of the interrupt-parent with an #address-cells value taken from the interrupt-parent node, because the #address-cells value define the number of address cells required by child nodes. More importantly, for all modern interrupt controllers, the parent unit address is irrelevant in hardware in relation to the device <-> interrupt-controller connection and the kernel actually ignores the parent unit address value when hierarchically parsing the interrupt-map property (i.e., of_irq_parse_raw()). For the reasons above, remove the code parsing the interrupt parent "reg" property in of_pci_prop_intr_map() -- it is not needed and prevents interrupt-map property generation on systems with an interrupt-controller that has no "reg" property in its interrupt-controller node -- and leave the parent unit address always initialized to 0 since it is simply ignored by the kernel. Signed-off-by: Lorenzo Pieralisi <lpieralisi@kernel.org> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Reviewed-by: Lizhi Hou <lizhi.hou@amd.com> Reviewed-by: Rob Herring <robh@kernel.org> Link: https://lore.kernel.org/lkml/aJms+YT8TnpzpCY8@lpieralisi/ Link: https://patch.msgid.link/20250818093504.80651-1-lpieralisi@kernel.org
2025-09-11PCI/AER: Fix NULL pointer access by aer_infoVernon Yang1-0/+4
The kzalloc(GFP_KERNEL) may return NULL, so all accesses to aer_info->xxx will result in kernel panic. Fix it. Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com> Link: https://patch.msgid.link/20250904182527.67371-1-vernon2gm@gmail.com