<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/drivers/pci, branch for-next</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=for-next</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=for-next'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-01-14T19:32:14Z</updated>
<entry>
<title>Merge tag 'pci-v6.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci</title>
<updated>2025-01-14T19:32:14Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-01-14T19:32:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7f5b6a8ec18e3add4c74682f60b90c31bdf849f2'/>
<id>urn:sha1:7f5b6a8ec18e3add4c74682f60b90c31bdf849f2</id>
<content type='text'>
Pull pci fix from Bjorn Helgaas:

 - Prevent bwctrl NULL pointer dereference that caused hangs on shutdown
   on ASUS ROG Strix SCAR 17 G733PYV (Lukas Wunner)

* tag 'pci-v6.13-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/bwctrl: Fix NULL pointer deref on unbind and bind
</content>
</entry>
<entry>
<title>PCI/bwctrl: Fix NULL pointer deref on unbind and bind</title>
<updated>2025-01-07T20:24:06Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2025-01-06T11:26:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=15b8968dcb90f194d44501468b230e6e0d816d4a'/>
<id>urn:sha1:15b8968dcb90f194d44501468b230e6e0d816d4a</id>
<content type='text'>
The interrupt handler for bandwidth notifications, pcie_bwnotif_irq(),
dereferences a "data" pointer.

On unbind, that pointer is set to NULL by pcie_bwnotif_remove().  However
the interrupt handler may still be invoked afterwards and will dereference
that NULL pointer.

That's because the interrupt is requested using a devm_*() helper and the
driver core releases devm_*() resources *after* calling -&gt;remove().

pcie_bwnotif_remove() does clear the Link Bandwidth Management Interrupt
Enable and Link Autonomous Bandwidth Interrupt Enable bits in the Link
Control Register, but that won't prevent execution of pcie_bwnotif_irq():
The interrupt for bandwidth notifications may be shared with AER, DPC,
PME, and hotplug.  So pcie_bwnotif_irq() may be executed as long as the
interrupt is requested.

There's a similar race on bind:  pcie_bwnotif_probe() requests the
interrupt when the "data" pointer still points to NULL.  A NULL pointer
deref may thus likewise occur if AER, DPC, PME or hotplug raise an
interrupt in-between the bandwidth controller's call to devm_request_irq()
and assignment of the "data" pointer.

Drop the devm_*() usage and reorder requesting of the interrupt to fix the
issue.
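
The ordering fix can be illustrated with a small user-space model.  This
is a sketch in Python rather than kernel code: the SharedIrq and Port
classes and their method names are hypothetical stand-ins for the shared
interrupt line and for the port's "data" pointer, and the model ignores
locking and concurrency.

```python
class SharedIrq:
    """Hypothetical model of an interrupt line shared by several handlers."""

    def __init__(self):
        self.handlers = []

    def request(self, handler):
        self.handlers.append(handler)

    def free(self, handler):
        self.handlers.remove(handler)

    def raise_irq(self):
        # Any source sharing the line (AER, DPC, PME or hotplug in the
        # text above) causes every registered handler to run.
        for handler in list(self.handlers):
            handler()


class Port:
    """Hypothetical stand-in for a port with a bwctrl "data" pointer."""

    def __init__(self, irq):
        self.irq = irq
        self.data = None
        self.events = 0

    def bwnotif_irq(self):
        # Mirrors the handler dereferencing "data"; fails if it is None.
        self.data["seen"] += 1
        self.events += 1

    def probe(self):
        self.data = {"seen": 0}             # assign "data" first
        self.irq.request(self.bwnotif_irq)  # only then request the interrupt

    def remove(self):
        self.irq.free(self.bwnotif_irq)     # free the interrupt first
        self.data = None                    # only then clear "data"


irq = SharedIrq()
port = Port(irq)
port.probe()
irq.raise_irq()   # handler runs with valid data
port.remove()
irq.raise_irq()   # handler is no longer registered, nothing is dereferenced
```

Swapping the two statements in probe() or remove() opens a window in which
raise_irq() would reach the handler while "data" is None, which is the
failure mode described above.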

While at it, drop a stray but harmless no_free_ptr() invocation when
assigning the "data" pointer in pcie_bwnotif_probe().

Ilpo points out that the locking on unbind and bind needs to be symmetric,
so move the call to pcie_bwnotif_disable() inside the critical section
protected by pcie_bwctrl_setspeed_rwsem and pcie_bwctrl_lbms_rwsem.

Evert reports a hang on shutdown of an ASUS ROG Strix SCAR 17 G733PYV.
The issue is no longer reproducible with the present commit.

Evert found that attaching a USB-C monitor prevented the hang.  The
machine contains an ASMedia USB 3.2 controller below a hotplug-capable
Root Port.  So one possible explanation is that the controller gets
hot-removed on shutdown unless something is connected.  And the ensuing
hotplug interrupt occurs exactly when the bandwidth controller is
unregistering.  The precise cause could not be determined because the
screen had already turned black when the hang occurred.

Link: https://lore.kernel.org/r/ae2b02c9cfbefff475b6e132b0aa962aaccbd7b2.1736162539.git.lukas@wunner.de
Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Reported-by: Evert Vorster &lt;evorster@gmail.com&gt;
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219629
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Tested-by: Evert Vorster &lt;evorster@gmail.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2024-12-29T18:03:01Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-12-29T18:03:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=feffd35a03445ed2e9ea65f47af5fc9a0d1ede80'/>
<id>urn:sha1:feffd35a03445ed2e9ea65f47af5fc9a0d1ede80</id>
<content type='text'>
Pull irq fix from Ingo Molnar:
 "Fix bogus MSI IRQ setup warning on RISC-V"

* tag 'irq-urgent-2024-12-29' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  PCI/MSI: Handle lack of irqdomain gracefully
</content>
</entry>
<entry>
<title>Merge tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci</title>
<updated>2024-12-21T18:51:04Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-12-21T18:51:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a99b4a369a5495dbb625e1dfb5cd7a5ff6ba4bd5'/>
<id>urn:sha1:a99b4a369a5495dbb625e1dfb5cd7a5ff6ba4bd5</id>
<content type='text'>
Pull PCI fixes from Krzysztof Wilczyński:
 "Two small patches that are important for fixing boot time hang on
  Intel JHL7540 'Titan Ridge' platforms equipped with a Thunderbolt
  controller.

  The boot time issue manifests itself when PCI Express bandwidth
  control is unnecessarily enabled on the Thunderbolt controller's
  downstream ports, which only support a link speed of 2.5 GT/s in
  accordance with the USB4 v2 specification (p. 671, sec. 11.2.1, "PCIe
  Physical Layer Logical Sub-block").

  As such, there is no need to enable bandwidth control on such
  downstream port links, which also works around the issue.

  Both patches were tested by the original reporter on the hardware on
  which the failure originally manifested itself. Both fixes were
  proven to resolve the reported boot hang issue, and both patches have
  been in linux-next this week with no reported problems"

* tag 'pci-v6.13-fixes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/bwctrl: Enable only if more than one speed is supported
  PCI: Honor Max Link Speed when determining supported speeds
</content>
</entry>
<entry>
<title>PCI/bwctrl: Enable only if more than one speed is supported</title>
<updated>2024-12-19T16:36:36Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2024-12-17T09:51:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=774c71c52aa487001c7da9f93b10cedc9985c371'/>
<id>urn:sha1:774c71c52aa487001c7da9f93b10cedc9985c371</id>
<content type='text'>
If a PCIe port only supports a single speed, enabling bandwidth control
is pointless:  There's no need to monitor autonomous speed changes, nor
can the speed be changed.

Not enabling it saves a small amount of memory and compute resources,
but also fixes a boot hang reported by Niklas:  It occurs when enabling
bandwidth control on Downstream Ports of Intel JHL7540 "Titan Ridge 2018"
Thunderbolt controllers.  The ports only support 2.5 GT/s in accordance
with USB4 v2 sec 11.2.1, so the present commit works around the issue.

PCIe r6.2 sec 8.2.1 prescribes that:

   "A device must support 2.5 GT/s and is not permitted to skip support
    for any data rates between 2.5 GT/s and the highest supported rate."

Consequently, bandwidth control is currently only disabled if a port
doesn't support higher speeds than 2.5 GT/s.  However the Implementation
Note in PCIe r6.2 sec 7.5.3.18 cautions:

   "It is strongly encouraged that software primarily utilize the
    Supported Link Speeds Vector instead of the Max Link Speed field,
    so that software can determine the exact set of supported speeds on
    current and future hardware.  This can avoid software being confused
    if a future specification defines Links that do not require support
    for all slower speeds."

In other words, future revisions of the PCIe Base Spec may allow gaps
in the Supported Link Speeds Vector.  To be future-proof, don't just
check whether speeds above 2.5 GT/s are supported, but rather check
whether *more than one* speed is supported.
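
One way to express "more than one speed" is a population count of the
supported-speeds bitmask.  The following Python sketch is only an
illustration: the function name and the example bit patterns are
assumptions, not taken from the kernel source.

```python
def supports_multiple_speeds(supported_speeds):
    """Return True if more than one bit is set in the speeds bitmask."""
    return bin(supported_speeds).count("1") > 1


# Hypothetical bit patterns, one bit per supported speed.
single_speed = 0b0000010   # only the lowest speed
contiguous = 0b0001110     # three consecutive speeds
only_higher = 0b0001000    # a single higher speed, as a future spec might allow
```

A check for "any speed above the lowest one" would treat only_higher as
worth enabling, whereas the population count reports a single speed and
leaves bandwidth control disabled.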

Fixes: 665745f27487 ("PCI/bwctrl: Re-add BW notification portdrv as PCIe BW controller")
Closes: https://lore.kernel.org/r/db8e457fcd155436449b035e8791a8241b0df400.camel@kernel.org
Link: https://lore.kernel.org/r/3564908a9c99fc0d2a292473af7a94ebfc8f5820.1734428762.git.lukas@wunner.de
Reported-by: Niklas Schnelle &lt;niks@kernel.org&gt;
Tested-by: Niklas Schnelle &lt;niks@kernel.org&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Krzysztof Wilczyński &lt;kwilczynski@kernel.org&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: Mario Limonciello &lt;mario.limonciello@amd.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>PCI: Honor Max Link Speed when determining supported speeds</title>
<updated>2024-12-19T16:35:59Z</updated>
<author>
<name>Lukas Wunner</name>
<email>lukas@wunner.de</email>
</author>
<published>2024-12-17T09:51:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3202ca221578850f34e0fea39dc6cfa745ed7aac'/>
<id>urn:sha1:3202ca221578850f34e0fea39dc6cfa745ed7aac</id>
<content type='text'>
The Supported Link Speeds Vector in the Link Capabilities 2 Register
indicates the *supported* link speeds.  The Max Link Speed field in the
Link Capabilities Register indicates the *maximum* of those speeds.

pcie_get_supported_speeds() neglects to honor the Max Link Speed field and
will thus incorrectly deem higher speeds as supported.  Fix it.

One user-visible issue addressed here is an incorrect value in the sysfs
attribute "max_link_speed".

But the main motivation is a boot hang reported by Niklas:  Intel JHL7540
"Titan Ridge 2018" Thunderbolt controllers support 2.5-8 GT/s speeds,
but indicate 2.5 GT/s as maximum.  Ilpo recalls seeing this on more
devices.  It can be explained by the controller's Downstream Ports
supporting 8 GT/s if an Endpoint is attached, but limiting to 2.5 GT/s
if the port interfaces to a PCIe Adapter, in accordance with USB4 v2
sec 11.2.1:

   "This section defines the functionality of an Internal PCIe Port that
    interfaces to a PCIe Adapter. [...]
    The Logical sub-block shall update the PCIe configuration registers
    with the following characteristics: [...]
    Max Link Speed field in the Link Capabilities Register set to 0001b
    (data rate of 2.5 GT/s only).
    Note: These settings do not represent actual throughput. Throughput
    is implementation specific and based on the USB4 Fabric performance."

The present commit is not sufficient on its own to fix Niklas' boot hang,
but it is a prerequisite:  A subsequent commit will fix the boot hang by
enabling bandwidth control only if more than one speed is supported.

The GENMASK() macro used herein specifies 0 as lowest bit, even though
the Supported Link Speeds Vector ends at bit 1.  This is done on purpose
to avoid a GENMASK(0, 1) macro if Max Link Speed is zero.  That macro
would be invalid as the lowest bit is greater than the highest bit.
Ilpo has witnessed a zero Max Link Speed on Root Complex Integrated
Endpoints in particular, so it does occur in practice.  (The Link
Capabilities Register is optional on RCiEPs per PCIe r6.2 sec 7.5.3.)
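
<![CDATA[
The masking can be sketched in Python as follows.  This is a model, not
the kernel implementation: genmask() mimics the kernel's GENMASK(high, low)
macro, honor_max_link_speed() is a hypothetical helper, and the sketch
assumes that a Max Link Speed value of n corresponds to bit n of the Link
Capabilities 2 Register, with the Supported Link Speeds Vector starting at
bit 1.

```python
def genmask(high, low):
    """Return an integer with bits low..high (inclusive) set."""
    if low > high:
        raise ValueError("lowest bit is greater than highest bit")
    return ((1 << (high - low + 1)) - 1) << low


def honor_max_link_speed(speeds_vector, max_link_speed):
    """Drop speeds above max_link_speed from the supported speeds."""
    # Using 0 as the lowest bit keeps the mask valid even when
    # max_link_speed is 0; genmask(0, 1) would raise instead.
    return speeds_vector & genmask(max_link_speed, 0)
```

For the Titan Ridge case above, a vector advertising 2.5-8 GT/s (0b1110)
with a Max Link Speed of 1 reduces to 0b0010, i.e. 2.5 GT/s only.  With a
Max Link Speed of 0 the result is 0, because bit 0 is not part of the
vector.
]]>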

Fixes: d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds")
Closes: https://lore.kernel.org/r/70829798889c6d779ca0f6cd3260a765780d1369.camel@kernel.org
Link: https://lore.kernel.org/r/fe03941e3e1cc42fb9bf4395e302bff53ee2198b.1734428762.git.lukas@wunner.de
Reported-by: Niklas Schnelle &lt;niks@kernel.org&gt;
Tested-by: Niklas Schnelle &lt;niks@kernel.org&gt;
Signed-off-by: Lukas Wunner &lt;lukas@wunner.de&gt;
Signed-off-by: Krzysztof Wilczyński &lt;kwilczynski@kernel.org&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Reviewed-by: Ilpo Järvinen &lt;ilpo.jarvinen@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>PCI/MSI: Handle lack of irqdomain gracefully</title>
<updated>2024-12-16T09:59:47Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2024-12-14T11:50:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a60b990798eb17433d0283788280422b1bd94b18'/>
<id>urn:sha1:a60b990798eb17433d0283788280422b1bd94b18</id>
<content type='text'>
Alexandre observed a warning emitted from pci_msi_setup_msi_irqs() on a
RISCV platform which does not provide PCI/MSI support:

 WARNING: CPU: 1 PID: 1 at drivers/pci/msi/msi.h:121 pci_msi_setup_msi_irqs+0x2c/0x32
 __pci_enable_msix_range+0x30c/0x596
 pci_msi_setup_msi_irqs+0x2c/0x32
 pci_alloc_irq_vectors_affinity+0xb8/0xe2

RISCV uses hierarchical interrupt domains and correctly does not implement
the legacy fallback. The warning triggers from the legacy fallback stub.

That warning is bogus as the PCI/MSI layer knows whether a PCI/MSI parent
domain is associated with the device or not. There is a check for MSI-X,
which has a legacy assumption. That legacy fallback assumption is only
valid when legacy support is enabled; otherwise the check should simply
return -ENOTSUPP.

Loongarch tripped over the same problem and blindly enabled legacy support
without implementing the legacy fallbacks. There are weak implementations
which return an error, so the problem was papered over.

Correct pci_msi_domain_supports() to evaluate the legacy mode and add
the missing supported check into the MSI enable path to complete it.
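
The intended decision can be modeled roughly as below.  This Python sketch
is an assumption-laden illustration of the reasoning above, not the kernel
function: the parameter names are hypothetical, and features are modeled
as sets of strings rather than as the kernel's flag bits.

```python
def msi_domain_supports(has_parent_domain, domain_features, wanted_features,
                        legacy_enabled, allow_legacy):
    """Model of the decision described above; all parameters are hypothetical."""
    if not has_parent_domain:
        # Without a PCI/MSI parent domain only the legacy fallback remains,
        # and that is valid only when legacy support is enabled.
        return legacy_enabled and allow_legacy
    # With a parent domain, every wanted feature must be offered by it.
    return set(wanted_features).issubset(domain_features)
```

In this model a platform without a parent domain and without legacy
support (the RISCV situation described above) gets a plain "not
supported" answer instead of reaching the legacy fallback stub.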

Fixes: d2a463b29741 ("PCI/MSI: Reject multi-MSI early")
Reported-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/87ed2a8ow5.ffs@tglx

</content>
</entry>
<entry>
<title>module: Convert symbol namespace to string literal</title>
<updated>2024-12-02T19:34:44Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2024-12-02T14:59:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cdd30ebb1b9f36159d66f088b61aee264e649d7a'/>
<id>urn:sha1:cdd30ebb1b9f36159d66f088b61aee264e649d7a</id>
<content type='text'>
Clean up the existing export namespace code along the same lines as
commit 33def8498fdd ("treewide: Convert macro and uses of __section(foo)
to __section("foo")") and for the same reason: it is not desirable for the
namespace argument to be a macro expansion itself.

Scripted using

  git grep -l -e MODULE_IMPORT_NS -e EXPORT_SYMBOL_NS | while read file;
  do
    awk -i inplace '
      /^#define EXPORT_SYMBOL_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /^#define MODULE_IMPORT_NS/ {
        gsub(/__stringify\(ns\)/, "ns");
        print;
        next;
      }
      /MODULE_IMPORT_NS/ {
        $0 = gensub(/MODULE_IMPORT_NS\(([^)]*)\)/, "MODULE_IMPORT_NS(\"\\1\")", "g");
      }
      /EXPORT_SYMBOL_NS/ {
        if ($0 ~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+),/) {
  	if ($0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/ &amp;&amp;
  	    $0 !~ /(EXPORT_SYMBOL_NS[^(]*)\(\)/ &amp;&amp;
  	    $0 !~ /^my/) {
  	  getline line;
  	  gsub(/[[:space:]]*\\$/, "");
  	  gsub(/[[:space:]]/, "", line);
  	  $0 = $0 " " line;
  	}

  	$0 = gensub(/(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)/,
  		    "\\1(\\2, \"\\3\")", "g");
        }
      }
      { print }' $file;
  done
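
The two gensub() substitutions in the script can be reproduced with
Python's re.sub() to see their effect on a single line.  This is only an
illustration of the regular expressions, not a replacement for the script:
it skips the multi-line joining logic, and MY_NS and my_func are made-up
names.

```python
import re


def quote_import_ns(line):
    """Quote the namespace argument of MODULE_IMPORT_NS()."""
    return re.sub(r'MODULE_IMPORT_NS\(([^)]*)\)',
                  r'MODULE_IMPORT_NS("\1")', line)


def quote_export_ns(line):
    """Quote the namespace argument of the EXPORT_SYMBOL_NS*() macros."""
    return re.sub(r'(EXPORT_SYMBOL_NS[^(]*)\(([^,]+), ([^)]+)\)',
                  r'\1(\2, "\3")', line)


imported = quote_import_ns('MODULE_IMPORT_NS(MY_NS);')
exported = quote_export_ns('EXPORT_SYMBOL_NS_GPL(my_func, MY_NS);')
```

Like the awk substitutions, these are not idempotent: running them on a
line that is already quoted would add a second pair of quotes.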

Requested-by: Masahiro Yamada &lt;masahiroy@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://mail.google.com/mail/u/2/#inbox/FMfcgzQXKWgMmjdFwwdsfgxzKpVHWPlc
Acked-by: Greg KH &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pci-v6.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci</title>
<updated>2024-12-01T02:23:05Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-12-01T02:23:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0cb71708c5816569f8addd5c6f33cb9679e73b5b'/>
<id>urn:sha1:0cb71708c5816569f8addd5c6f33cb9679e73b5b</id>
<content type='text'>
Pull PCI fix from Bjorn Helgaas:

 - When removing a PCI device, only look up and remove a platform device
   if there is an associated device node for which there could be a
   platform device, to fix a merge window regression (Brian Norris)

* tag 'pci-v6.13-fixes-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pci/pci:
  PCI/pwrctrl: Unregister platform device only if one actually exists
</content>
</entry>
<entry>
<title>PCI/pwrctrl: Unregister platform device only if one actually exists</title>
<updated>2024-11-30T17:41:25Z</updated>
<author>
<name>Brian Norris</name>
<email>briannorris@chromium.org</email>
</author>
<published>2024-11-26T21:04:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5c8418cf4025388bedd4d65ada993f7d3786cc3a'/>
<id>urn:sha1:5c8418cf4025388bedd4d65ada993f7d3786cc3a</id>
<content type='text'>
If a PCI device has an associated device_node with power supplies,
pci_bus_add_device() creates platform devices for use by pwrctrl.  When the
PCI device is removed, pci_stop_dev() uses of_find_device_by_node() to
locate the related platform device, then unregisters it.

But when we remove a PCI device with no associated device node,
dev_of_node(dev) is NULL, and of_find_device_by_node(NULL) returns the
first device with "dev-&gt;of_node == NULL".  The result is that we (a)
mistakenly unregister a completely unrelated platform device, leading to
issues like the first trace below, and (b) dereference the NULL pointer
from dev_of_node() when clearing OF_POPULATED, as in the second trace.

Unregister a platform device only if there is one associated with this PCI
device.  This resolves issues seen when doing:

  # echo 1 &gt; /sys/bus/pci/devices/.../remove
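
The lookup problem and the guard can be modeled in a few lines of Python.
This is a sketch under stated assumptions, not the kernel code: devices
are plain dictionaries, find_device_by_node() is a hypothetical stand-in
for a lookup that matches on the device node, and None plays the role of
NULL.

```python
def find_device_by_node(devices, node):
    """Return the first device whose of_node matches the given node."""
    for dev in devices:
        if dev["of_node"] is node:
            return dev
    return None


def platform_device_to_unregister(devices, pci_of_node):
    """Return the platform device related to a PCI device, if any."""
    # Guard: without a device node there is no related platform device.
    if pci_of_node is None:
        return None
    return find_device_by_node(devices, pci_of_node)


node = object()  # hypothetical device node
devices = [
    {"name": "unrelated", "of_node": None},
    {"name": "pwrctrl", "of_node": node},
]
```

Calling find_device_by_node(devices, None) directly returns the
"unrelated" entry, which corresponds to the mistaken unregistration
described above; the guarded helper returns None in that case.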

Sample issue from unregistering the wrong platform device:

  WARNING: CPU: 0 PID: 5095 at drivers/regulator/core.c:5885 regulator_unregister+0x140/0x160
  Call trace:
   regulator_unregister+0x140/0x160
   devm_rdev_release+0x1c/0x30
   release_nodes+0x68/0x100
   devres_release_all+0x98/0xf8
   device_unbind_cleanup+0x20/0x70
   device_release_driver_internal+0x1f4/0x240
   device_release_driver+0x20/0x40
   bus_remove_device+0xd8/0x170
   device_del+0x154/0x380
   device_unregister+0x28/0x88
   of_device_unregister+0x1c/0x30
   pci_stop_bus_device+0x154/0x1b0
   pci_stop_and_remove_bus_device_locked+0x28/0x48
   remove_store+0xa0/0xb8
   dev_attr_store+0x20/0x40
   sysfs_kf_write+0x4c/0x68

Later NULL pointer dereference for of_node_clear_flag(NULL, OF_POPULATED):

  Unable to handle kernel NULL pointer dereference at virtual address 00000000000000c0
  Call trace:
   pci_stop_bus_device+0x190/0x1b0
   pci_stop_and_remove_bus_device_locked+0x28/0x48
   remove_store+0xa0/0xb8
   dev_attr_store+0x20/0x40
   sysfs_kf_write+0x4c/0x68

Link: https://lore.kernel.org/r/20241126210443.4052876-1-briannorris@chromium.org
Fixes: 681725afb6b9 ("PCI/pwrctl: Remove pwrctl device without iterating over all children of pwrctl parent")
Reported-by: Saurabh Sengar &lt;ssengar@linux.microsoft.com&gt;
Closes: https://lore.kernel.org/r/1732890621-19656-1-git-send-email-ssengar@linux.microsoft.com
Signed-off-by: Brian Norris &lt;briannorris@chromium.org&gt;
[bhelgaas: commit log]
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
</content>
</entry>
</feed>
