<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/rdma, branch v6.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-04-07T18:14:34Z</updated>
<entry>
<title>RDMA/mlx5: Fix compilation warning when USER_ACCESS isn't set</title>
<updated>2025-04-07T18:14:34Z</updated>
<author>
<name>Mark Bloch</name>
<email>mbloch@nvidia.com</email>
</author>
<published>2025-04-02T07:09:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d247667ecd6411ec5bec9a38db7feaa599ce3ee2'/>
<id>urn:sha1:d247667ecd6411ec5bec9a38db7feaa599ce3ee2</id>
<content type='text'>
The cited commit made fs.c always compile, even when
INFINIBAND_USER_ACCESS isn't set. This results in a compilation
warning about an unused object when compiling with W=1 and
USER_ACCESS is unset.

Fix this by defining uverbs_destroy_def_handler() even when
USER_ACCESS isn't set.

Fixes: 36e0d433672f ("RDMA/mlx5: Compile fs.c regardless of INFINIBAND_USER_ACCESS config")
Link: https://patch.msgid.link/r/20250402070944.1022093-1-mbloch@nvidia.com
Signed-off-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Tested-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Pass port to counter bind/unbind operations</title>
<updated>2025-03-18T10:18:46Z</updated>
<author>
<name>Patrisious Haddad</name>
<email>phaddad@nvidia.com</email>
</author>
<published>2025-03-13T14:18:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=88ae02feda84f0f78ff7243ef437e79624fdcd80'/>
<id>urn:sha1:88ae02feda84f0f78ff7243ef437e79624fdcd80</id>
<content type='text'>
This will be useful for the next patches in the series since port number
is needed for optional counters binding and unbinding.

Note that this change is needed since when the operation is done qp-&gt;port
isn't necessarily initialized yet and can't be used.

Signed-off-by: Patrisious Haddad &lt;phaddad@nvidia.com&gt;
Reviewed-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Link: https://patch.msgid.link/b6f6797844acbd517358e8d2a270ea9b3e6ecba1.1741875070.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Add support to optional-counters binding configuration</title>
<updated>2025-03-18T10:18:42Z</updated>
<author>
<name>Patrisious Haddad</name>
<email>phaddad@nvidia.com</email>
</author>
<published>2025-03-13T14:18:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=da3711074f5252ee35d6348ca286351013f1d7fe'/>
<id>urn:sha1:da3711074f5252ee35d6348ca286351013f1d7fe</id>
<content type='text'>
Whenever a new counter is created, save inside it the user requested
configuration for optional-counters binding, for manual configuration it
is requested directly by the user and for the automatic configuration it
depends on if the automatic binding was enabled with or without
optional-counters binding.

This argument will later be used by the driver to determine if to bind the
optional-counters as well or not when trying to bind this counter to a QP.

It indicates that when binding counters to a QP we also want the
currently enabled link optional-counters to be bound as well.

Signed-off-by: Patrisious Haddad &lt;phaddad@nvidia.com&gt;
Reviewed-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Link: https://patch.msgid.link/82f1c357606a16932979ef9a5910122675c74a3a.1741875070.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Create and destroy rdma_counter using rdma_zalloc_drv_obj()</title>
<updated>2025-03-18T10:18:37Z</updated>
<author>
<name>Patrisious Haddad</name>
<email>phaddad@nvidia.com</email>
</author>
<published>2025-03-13T14:18:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7e53b31acc7f976d01a0718724a4c36f1fddf739'/>
<id>urn:sha1:7e53b31acc7f976d01a0718724a4c36f1fddf739</id>
<content type='text'>
Change rdma_counter allocation to use rdma_zalloc_drv_obj() instead of,
explicitly allocating at core, in order to be contained inside driver
specific structures.

Adjust all drivers that use it to have their containing structure, and
add driver specific initialization operation.

This change is needed to allow upcoming patches to implement
optional-counters binding whereas inside each driver specific counter
struct his bound optional-counters will be maintained.

Signed-off-by: Patrisious Haddad &lt;phaddad@nvidia.com&gt;
Reviewed-by: Mark Bloch &lt;mbloch@nvidia.com&gt;
Link: https://patch.msgid.link/a5a484f421fc2e5595158e61a354fba43272b02d.1741875070.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/uverbs: Propagate errors from rdma_lookup_get_uobject()</title>
<updated>2025-03-13T12:26:37Z</updated>
<author>
<name>Maher Sanalla</name>
<email>msanalla@nvidia.com</email>
</author>
<published>2025-02-26T13:54:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=81f8f7454ad9e0bf95efdec6542afdc9a6ab1e24'/>
<id>urn:sha1:81f8f7454ad9e0bf95efdec6542afdc9a6ab1e24</id>
<content type='text'>
Currently, the IB uverbs API calls uobj_get_uobj_read(), which in turn
uses the rdma_lookup_get_uobject() helper to retrieve user objects.
In case of failure, uobj_get_uobj_read() returns NULL, overriding the
error code from rdma_lookup_get_uobject(). The IB uverbs API then
translates this NULL to -EINVAL, masking the actual error and
complicating debugging. For example, applications calling ibv_modify_qp
that fails with EBUSY when retrieving the QP uobject will see the
overridden error code EINVAL instead, masking the actual error.

Furthermore, based on rdma-core commit:
"2a22f1ced5f3 ("Merge pull request #1568 from jakemoroni/master")"
Kernel's IB uverbs return values are either ignored and passed on as is
to application or overridden with other errnos in a few cases.

Thus, to improve error reporting and debuggability, propagate the
original error from rdma_lookup_get_uobject() instead of replacing it
with EINVAL.

Signed-off-by: Maher Sanalla &lt;msanalla@nvidia.com&gt;
Link: https://patch.msgid.link/64f9d3711b183984e939962c2f83383904f97dfb.1740577869.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/uverbs: Add support for UCAPs in context creation</title>
<updated>2025-03-09T17:13:02Z</updated>
<author>
<name>Chiara Meiohas</name>
<email>cmeiohas@nvidia.com</email>
</author>
<published>2025-03-06T11:51:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe9d7822baee65d8e77542391799b2777f5216f5'/>
<id>urn:sha1:fe9d7822baee65d8e77542391799b2777f5216f5</id>
<content type='text'>
Add support for file descriptor array attribute for GET_CONTEXT
commands.

Check that the file descriptor (fd) array represents fds for valid UCAPs.
Store the enabled UCAPs from the fd array as a bitmask in ib_ucontext.

Signed-off-by: Chiara Meiohas &lt;cmeiohas@nvidia.com&gt;
Link: https://patch.msgid.link/ebfb30bc947e2259b193c96a319c80e82599045b.1741261611.git.leon@kernel.org
Reviewed-by: Yishai Hadas &lt;yishaih@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/uverbs: Introduce UCAP (User CAPabilities) API</title>
<updated>2025-03-09T17:13:02Z</updated>
<author>
<name>Chiara Meiohas</name>
<email>cmeiohas@nvidia.com</email>
</author>
<published>2025-03-06T11:51:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=61e51682816d395307f78ae06d640089054c28ab'/>
<id>urn:sha1:61e51682816d395307f78ae06d640089054c28ab</id>
<content type='text'>
Implement a new User CAPabilities (UCAP) API to provide fine-grained
control over specific firmware features.

This approach offers more granular capabilities than the existing Linux
capabilities, which may be too generic for certain FW features.

This mechanism represents each capability as a character device with
root read-write access. Root processes can grant users special
privileges by allowing access to these character devices (e.g., using
chown).

UCAP character devices are located in /dev/infiniband and the class path
is /sys/class/infiniband_ucaps.

Signed-off-by: Chiara Meiohas &lt;cmeiohas@nvidia.com&gt;
Link: https://patch.msgid.link/5a1379187cd21178e8554afc81a3c941f21af22f.1741261611.git.leon@kernel.org
Reviewed-by: Yishai Hadas &lt;yishaih@nvidia.com&gt;
Reviewed-by: Zhu Yanjun &lt;yanjun.zhu@linux.dev&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Don't expose hw_counters outside of init net namespace</title>
<updated>2025-03-03T12:14:48Z</updated>
<author>
<name>Roman Gushchin</name>
<email>roman.gushchin@linux.dev</email>
</author>
<published>2025-02-27T16:54:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a1ecb30f90856b0be4168ad51b8875148e285c1f'/>
<id>urn:sha1:a1ecb30f90856b0be4168ad51b8875148e285c1f</id>
<content type='text'>
Commit 467f432a521a ("RDMA/core: Split port and device counter sysfs
attributes") accidentally almost exposed hw counters to non-init net
namespaces. It didn't expose them fully, as an attempt to read any of
those counters leads to a crash like this one:

[42021.807566] BUG: kernel NULL pointer dereference, address: 0000000000000028
[42021.814463] #PF: supervisor read access in kernel mode
[42021.819549] #PF: error_code(0x0000) - not-present page
[42021.824636] PGD 0 P4D 0
[42021.827145] Oops: 0000 [#1] SMP PTI
[42021.830598] CPU: 82 PID: 2843922 Comm: switchto-defaul Kdump: loaded Tainted: G S      W I        XXX
[42021.841697] Hardware name: XXX
[42021.849619] RIP: 0010:hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.855362] Code: 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 0f 1f 44 00 00 49 89 d0 4c 8b 5e 20 48 8b 8f b8 04 00 00 48 81 c7 f0 fa ff ff &lt;48&gt; 8b 41 28 48 29 ce 48 83 c6 d0 48 c1 ee 04 69 d6 ab aa aa aa 48
[42021.873931] RSP: 0018:ffff97fe90f03da0 EFLAGS: 00010287
[42021.879108] RAX: ffff9406988a8c60 RBX: ffff940e1072d438 RCX: 0000000000000000
[42021.886169] RDX: ffff94085f1aa000 RSI: ffff93c6cbbdbcb0 RDI: ffff940c7517aef0
[42021.893230] RBP: ffff97fe90f03e70 R08: ffff94085f1aa000 R09: 0000000000000000
[42021.900294] R10: ffff94085f1aa000 R11: ffffffffc0775680 R12: ffffffff87ca2530
[42021.907355] R13: ffff940651602840 R14: ffff93c6cbbdbcb0 R15: ffff94085f1aa000
[42021.914418] FS:  00007fda1a3b9700(0000) GS:ffff94453fb80000(0000) knlGS:0000000000000000
[42021.922423] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[42021.928130] CR2: 0000000000000028 CR3: 00000042dcfb8003 CR4: 00000000003726f0
[42021.935194] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[42021.942257] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[42021.949324] Call Trace:
[42021.951756]  &lt;TASK&gt;
[42021.953842]  [&lt;ffffffff86c58674&gt;] ? show_regs+0x64/0x70
[42021.959030]  [&lt;ffffffff86c58468&gt;] ? __die+0x78/0xc0
[42021.963874]  [&lt;ffffffff86c9ef75&gt;] ? page_fault_oops+0x2b5/0x3b0
[42021.969749]  [&lt;ffffffff87674b92&gt;] ? exc_page_fault+0x1a2/0x3c0
[42021.975549]  [&lt;ffffffff87801326&gt;] ? asm_exc_page_fault+0x26/0x30
[42021.981517]  [&lt;ffffffffc0775680&gt;] ? __pfx_show_hw_stats+0x10/0x10 [ib_core]
[42021.988482]  [&lt;ffffffffc077564e&gt;] ? hw_stat_device_show+0x1e/0x40 [ib_core]
[42021.995438]  [&lt;ffffffff86ac7f8e&gt;] dev_attr_show+0x1e/0x50
[42022.000803]  [&lt;ffffffff86a3eeb1&gt;] sysfs_kf_seq_show+0x81/0xe0
[42022.006508]  [&lt;ffffffff86a11134&gt;] seq_read_iter+0xf4/0x410
[42022.011954]  [&lt;ffffffff869f4b2e&gt;] vfs_read+0x16e/0x2f0
[42022.017058]  [&lt;ffffffff869f50ee&gt;] ksys_read+0x6e/0xe0
[42022.022073]  [&lt;ffffffff8766f1ca&gt;] do_syscall_64+0x6a/0xa0
[42022.027441]  [&lt;ffffffff8780013b&gt;] entry_SYSCALL_64_after_hwframe+0x78/0xe2

The problem can be reproduced using the following steps:
  ip netns add foo
  ip netns exec foo bash
  cat /sys/class/infiniband/mlx4_0/hw_counters/*

The panic occurs because of casting the device pointer into an
ib_device pointer using container_of() in hw_stat_device_show() is
wrong and leads to a memory corruption.

However the real problem is that hw counters should never been exposed
outside of the non-init net namespace.

Fix this by saving the index of the corresponding attribute group
(it might be 1 or 2 depending on the presence of driver-specific
attributes) and zeroing the pointer to hw_counters group for compat
devices during the initialization.

With this fix applied hw_counters are not available in a non-init
net namespace:
  find /sys/class/infiniband/mlx4_0/ -name hw_counters
    /sys/class/infiniband/mlx4_0/ports/1/hw_counters
    /sys/class/infiniband/mlx4_0/ports/2/hw_counters
    /sys/class/infiniband/mlx4_0/hw_counters

  ip netns add foo
  ip netns exec foo bash
  find /sys/class/infiniband/mlx4_0/ -name hw_counters

Fixes: 467f432a521a ("RDMA/core: Split port and device counter sysfs attributes")
Signed-off-by: Roman Gushchin &lt;roman.gushchin@linux.dev&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Leon Romanovsky &lt;leon@kernel.org&gt;
Cc: Maher Sanalla &lt;msanalla@nvidia.com&gt;
Cc: linux-rdma@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Link: https://patch.msgid.link/20250227165420.3430301-1-roman.gushchin@linux.dev
Reviewed-by: Parav Pandit &lt;parav@nvidia.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>IB/cache: Add log messages for IB device state changes</title>
<updated>2025-02-06T08:40:50Z</updated>
<author>
<name>Maher Sanalla</name>
<email>msanalla@nvidia.com</email>
</author>
<published>2025-02-03T12:48:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5459f6523c1f1a053cdc6411a810061ee717219c'/>
<id>urn:sha1:5459f6523c1f1a053cdc6411a810061ee717219c</id>
<content type='text'>
Enhance visibility into IB device state transitions by adding log messages
to the kernel log (dmesg). Whenever an IB device changes state, a relevant
print will be printed, such as:

"mlx5_core 0000:08:00.0 mlx5_0: Port: 1 Link DOWN"
"mlx5_core 0000:08:00.0 rdmap8s0f0: Port: 2 Link ACTIVE"

Signed-off-by: Maher Sanalla &lt;msanalla@nvidia.com&gt;
Link: https://patch.msgid.link/2d26ccbd669bad99089fa2aebb5cba4014fc4999.1738586601.git.leon@kernel.org
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Support link status events dispatching</title>
<updated>2024-12-24T10:22:18Z</updated>
<author>
<name>Yuyu Li</name>
<email>liyuyu6@huawei.com</email>
</author>
<published>2024-11-22T10:52:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1fb0644c3899b2f857b11037b19ed362b67bfe91'/>
<id>urn:sha1:1fb0644c3899b2f857b11037b19ed362b67bfe91</id>
<content type='text'>
Currently the dispatching of link status events is implemented by
each RDMA driver independently, and most of them have very similar
patterns. Add support for this in ib_core so that we can get rid
of duplicate codes in each driver.

A new last_port_state is added in ib_port_cache to cache the port
state of the last link status events dispatching. The original
port_state in ib_port_cache is not used here because it will be
updated when ib_dispatch_event() is called, which means it may
be changed between two link status events, and may lead to a loss
of event dispatching.

Some drivers currently have some private stuff in their link status
events handler in addition to event dispatching, and cannot be
perfectly integrated into the ib_core handling process. For these
drivers, add a new ops report_port_event() so that they can keep
their current processing.

Finally, events of LAG devices are not supported yet in this patch
as currently there is no way to obtain ibdev from upper netdev in
ib_core. This can be a TODO work after the core have more support
for LAG.

Signed-off-by: Yuyu Li &lt;liyuyu6@huawei.com&gt;
Signed-off-by: Junxian Huang &lt;huangjunxian6@hisilicon.com&gt;
Signed-off-by: Leon Romanovsky &lt;leon@kernel.org&gt;
</content>
</entry>
</feed>
