<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/rdma, branch v5.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-08-20T17:44:44Z</updated>
<entry>
<title>RDMA/restrack: Rewrite PID namespace check to be reliable</title>
<updated>2019-08-20T17:44:44Z</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@mellanox.com</email>
</author>
<published>2019-08-15T08:38:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=60c78668ae50d6b815ead4a62216822a92097125'/>
<id>urn:sha1:60c78668ae50d6b815ead4a62216822a92097125</id>
<content type='text'>
task_active_pid_ns() is wrong API to check PID namespace because it
posses some restrictions and return PID namespace where the process
was allocated. It created mismatches with current namespace, which
can be different.

Rewrite whole rdma_is_visible_in_pid_ns() logic to provide reliable
results without any relation to allocated PID namespace.

Fixes: 8be565e65fa9 ("RDMA/nldev: Factor out the PID namespace check")
Fixes: 6a6c306a09b5 ("RDMA/restrack: Make is_visible_in_pid_ns() as an API")
Reviewed-by: Mark Zhang &lt;markz@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Link: https://lore.kernel.org/r/20190815083834.9245-4-leon@kernel.org
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>RDMA/devices: Remove the lock around remove_client_context</title>
<updated>2019-08-01T15:44:48Z</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2019-07-31T08:18:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9cd5881719e9555cae300ec8b389eda3c8101339'/>
<id>urn:sha1:9cd5881719e9555cae300ec8b389eda3c8101339</id>
<content type='text'>
Due to the complexity of client-&gt;remove() callbacks it is desirable to not
hold any locks while calling them. Remove the last one by tracking only
the highest client ID and running backwards from there over the xarray.

Since the only purpose of that lock was to protect the linked list, we can
drop the lock.

Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Link: https://lore.kernel.org/r/20190731081841.32345-3-leon@kernel.org
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>RDMA/devices: Do not deadlock during client removal</title>
<updated>2019-08-01T15:44:47Z</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2019-07-31T08:18:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=621e55ff5b8e0ab5d1063f0eae0ef3960bef8f6e'/>
<id>urn:sha1:621e55ff5b8e0ab5d1063f0eae0ef3960bef8f6e</id>
<content type='text'>
lockdep reports:

   WARNING: possible circular locking dependency detected

   modprobe/302 is trying to acquire lock:
   0000000007c8919c ((wq_completion)ib_cm){+.+.}, at: flush_workqueue+0xdf/0x990

   but task is already holding lock:
   000000002d3d2ca9 (&amp;device-&gt;client_data_rwsem){++++}, at: remove_client_context+0x79/0xd0 [ib_core]

   which lock already depends on the new lock.

   the existing dependency chain (in reverse order) is:

   -&gt; #2 (&amp;device-&gt;client_data_rwsem){++++}:
          down_read+0x3f/0x160
          ib_get_net_dev_by_params+0xd5/0x200 [ib_core]
          cma_ib_req_handler+0x5f6/0x2090 [rdma_cm]
          cm_process_work+0x29/0x110 [ib_cm]
          cm_req_handler+0x10f5/0x1c00 [ib_cm]
          cm_work_handler+0x54c/0x311d [ib_cm]
          process_one_work+0x4aa/0xa30
          worker_thread+0x62/0x5b0
          kthread+0x1ca/0x1f0
          ret_from_fork+0x24/0x30

   -&gt; #1 ((work_completion)(&amp;(&amp;work-&gt;work)-&gt;work)){+.+.}:
          process_one_work+0x45f/0xa30
          worker_thread+0x62/0x5b0
          kthread+0x1ca/0x1f0
          ret_from_fork+0x24/0x30

   -&gt; #0 ((wq_completion)ib_cm){+.+.}:
          lock_acquire+0xc8/0x1d0
          flush_workqueue+0x102/0x990
          cm_remove_one+0x30e/0x3c0 [ib_cm]
          remove_client_context+0x94/0xd0 [ib_core]
          disable_device+0x10a/0x1f0 [ib_core]
          __ib_unregister_device+0x5a/0xe0 [ib_core]
          ib_unregister_device+0x21/0x30 [ib_core]
          mlx5_ib_stage_ib_reg_cleanup+0x9/0x10 [mlx5_ib]
          __mlx5_ib_remove+0x3d/0x70 [mlx5_ib]
          mlx5_ib_remove+0x12e/0x140 [mlx5_ib]
          mlx5_remove_device+0x144/0x150 [mlx5_core]
          mlx5_unregister_interface+0x3f/0xf0 [mlx5_core]
          mlx5_ib_cleanup+0x10/0x3a [mlx5_ib]
          __x64_sys_delete_module+0x227/0x350
          do_syscall_64+0xc3/0x6a4
          entry_SYSCALL_64_after_hwframe+0x49/0xbe

Which is due to the read side of the client_data_rwsem being obtained
recursively through a work queue flush during cm client removal.

The lock is being held across the remove in remove_client_context() so
that the function is a fence, once it returns the client is removed. This
is required so that the two callers do not proceed with destruction until
the client completes removal.

Instead of using client_data_rwsem use the existing device unregistration
refcount and add a similar client unregistration (client-&gt;uses) refcount.

This will fence the two unregistration paths without holding any locks.

Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 921eab1143aa ("RDMA/devices: Re-organize device.c locking")
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Link: https://lore.kernel.org/r/20190731081841.32345-2-leon@kernel.org
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
</content>
</entry>
<entry>
<title>IB/hfi1: Unreserve a flushed OPFN request</title>
<updated>2019-07-22T17:57:55Z</updated>
<author>
<name>Kaike Wan</name>
<email>kaike.wan@intel.com</email>
</author>
<published>2019-07-15T16:45:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2b74c878b0eae4c32629c2d5ba69a29f69048313'/>
<id>urn:sha1:2b74c878b0eae4c32629c2d5ba69a29f69048313</id>
<content type='text'>
When an OPFN request is flushed, the request is completed without
unreserving itself from the send queue. Subsequently, when a new
request is post sent, the following warning will be triggered:

WARNING: CPU: 4 PID: 8130 at rdmavt/qp.c:1761 rvt_post_send+0x72a/0x880 [rdmavt]
Call Trace:
[&lt;ffffffffbbb61e41&gt;] dump_stack+0x19/0x1b
[&lt;ffffffffbb497688&gt;] __warn+0xd8/0x100
[&lt;ffffffffbb4977cd&gt;] warn_slowpath_null+0x1d/0x20
[&lt;ffffffffc01c941a&gt;] rvt_post_send+0x72a/0x880 [rdmavt]
[&lt;ffffffffbb4dcabe&gt;] ? account_entity_dequeue+0xae/0xd0
[&lt;ffffffffbb61d645&gt;] ? __kmalloc+0x55/0x230
[&lt;ffffffffc04e1a4c&gt;] ib_uverbs_post_send+0x37c/0x5d0 [ib_uverbs]
[&lt;ffffffffc04e5e36&gt;] ? rdma_lookup_put_uobject+0x26/0x60 [ib_uverbs]
[&lt;ffffffffc04dbce6&gt;] ib_uverbs_write+0x286/0x460 [ib_uverbs]
[&lt;ffffffffbb6f9457&gt;] ? security_file_permission+0x27/0xa0
[&lt;ffffffffbb641650&gt;] vfs_write+0xc0/0x1f0
[&lt;ffffffffbb64246f&gt;] SyS_write+0x7f/0xf0
[&lt;ffffffffbbb74ddb&gt;] system_call_fastpath+0x22/0x27

This patch fixes the problem by moving rvt_qp_wqe_unreserve() into
rvt_qp_complete_swqe() to simplify the code and make it less
error-prone.

Fixes: ca95f802ef51 ("IB/hfi1: Unreserve a reserved request when it is completed")
Link: https://lore.kernel.org/r/20190715164528.74174.31364.stgit@awfm-01.aw.intel.com
Cc: &lt;stable@vger.kernel.org&gt;
Reviewed-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Reviewed-by: Dennis Dalessandro &lt;dennis.dalessandro@intel.com&gt;
Signed-off-by: Kaike Wan &lt;kaike.wan@intel.com&gt;
Signed-off-by: Mike Marciniszyn &lt;mike.marciniszyn@intel.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Make rdma_counter.h compile stand alone</title>
<updated>2019-07-09T12:44:47Z</updated>
<author>
<name>Jason Gunthorpe</name>
<email>jgg@mellanox.com</email>
</author>
<published>2019-07-09T12:44:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=390d57728d8e6f7283030cb20d3b5459771a32f1'/>
<id>urn:sha1:390d57728d8e6f7283030cb20d3b5459771a32f1</id>
<content type='text'>
5.4-rc1 will have new compile time debugging to test that headers can be
compiled stand alone. Many rdma headers are already broken and excluded
from the mechanism, however to avoid compile failures during the merge
window fix enough so that the newly added header compiles clean.

Fixes: 413d3347503b ("RDMA/counter: Add set/clear per-port auto mode support")
Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
Signed-off-by: Mark Zhang &lt;markz@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA/core: Provide RDMA DIM support for ULPs</title>
<updated>2019-07-08T19:37:22Z</updated>
<author>
<name>Yamin Friedman</name>
<email>yaminf@mellanox.com</email>
</author>
<published>2019-07-08T10:59:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=da6629793aa6944db6c8a908ca1a52d87f1489aa'/>
<id>urn:sha1:da6629793aa6944db6c8a908ca1a52d87f1489aa</id>
<content type='text'>
Added the interface in the infiniband driver that applies the rdma_dim
adaptive moderation. There is now a special function for allocating an
ib_cq that uses rdma_dim.

Performance improvement (ConnectX-5 100GbE, x86) running FIO benchmark over
NVMf between two equal end-hosts with 56 cores across a Mellanox switch
using null_blk device:

READS without DIM:
blk size | BW       | IOPS | 99th percentile latency  | 99.99th latency
512B     | 3.8GiB/s | 7.7M | 1401  usec               | 2442  usec
4k       | 7.0GiB/s | 1.8M | 4817  usec               | 6587  usec
64k      | 10.7GiB/s| 175k | 9896  usec               | 10028 usec

IO WRITES without DIM:
blk size | BW       | IOPS | 99th percentile latency  | 99.99th latency
512B     | 3.6GiB/s | 7.5M | 1434  usec               | 2474  usec
4k       | 6.3GiB/s | 1.6M | 938   usec               | 1221  usec
64k      | 10.7GiB/s| 175k | 8979  usec               | 12780 usec

IO READS with DIM:
blk size | BW       | IOPS | 99th percentile latency  | 99.99th latency
512B     | 4GiB/s   | 8.2M | 816    usec              | 889   usec
4k       | 10.1GiB/s| 2.65M| 3359   usec              | 5080  usec
64k      | 10.7GiB/s| 175k | 9896   usec              | 10028 usec

IO WRITES with DIM:
blk size | BW       | IOPS  | 99th percentile latency | 99.99th latency
512B     | 3.9GiB/s | 8.1M  | 799   usec              | 922   usec
4k       | 9.6GiB/s | 2.5M  | 717   usec              | 1004  usec
64k      | 10.7GiB/s| 176k  | 8586  usec              | 12256 usec

The rdma_dim algorithm was designed to measure the effectiveness of
moderation on the flow in a general way and thus should be appropriate
for all RDMA storage protocols.

rdma_dim is configured to be the default option based on performance
improvement seen after extensive tests.

Signed-off-by: Yamin Friedman &lt;yaminf@mellanox.com&gt;
Reviewed-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Reviewed-by: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>IB/mlx5: Report correctly tag matching rendezvous capability</title>
<updated>2019-07-08T17:26:37Z</updated>
<author>
<name>Danit Goldberg</name>
<email>danitg@mellanox.com</email>
</author>
<published>2019-07-05T16:21:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=89705e92700170888236555fe91b45e4c1bb0985'/>
<id>urn:sha1:89705e92700170888236555fe91b45e4c1bb0985</id>
<content type='text'>
Userspace expects the IB_TM_CAP_RC bit to indicate that the device
supports RC transport tag matching with rendezvous offload. However the
firmware splits this into two capabilities for eager and rendezvous tag
matching.

Only if the FW supports both modes should userspace be told the tag
matching capability is available.

Cc: &lt;stable@vger.kernel.org&gt; # 4.13
Fixes: eb761894351d ("IB/mlx5: Fill XRQ capabilities")
Signed-off-by: Danit Goldberg &lt;danitg@mellanox.com&gt;
Reviewed-by: Yishai Hadas &lt;yishaih@mellanox.com&gt;
Reviewed-by: Artemy Kovalyov &lt;artemyko@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA/nldev: Allow get default counter statistics through RDMA netlink</title>
<updated>2019-07-05T13:22:55Z</updated>
<author>
<name>Mark Zhang</name>
<email>markz@mellanox.com</email>
</author>
<published>2019-07-02T10:02:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6e7be47a53459ba3d288c3240ccd948fc699c377'/>
<id>urn:sha1:6e7be47a53459ba3d288c3240ccd948fc699c377</id>
<content type='text'>
This patch adds the ability to return the hwstats of per-port default
counters (which can also be queried through sysfs nodes).

Signed-off-by: Mark Zhang &lt;markz@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA/nldev: Allow counter manual mode configration through RDMA netlink</title>
<updated>2019-07-05T13:22:55Z</updated>
<author>
<name>Mark Zhang</name>
<email>markz@mellanox.com</email>
</author>
<published>2019-07-02T10:02:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b389327df90530d47931d0f5616b5cd6abb96c96'/>
<id>urn:sha1:b389327df90530d47931d0f5616b5cd6abb96c96</id>
<content type='text'>
Provide an option to allow users to manually bind a qp with a counter
through RDMA netlink. Limit it to users with ADMIN capability only.

Signed-off-by: Mark Zhang &lt;markz@mellanox.com&gt;
Reviewed-by: Majd Dibbiny &lt;majd@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
<entry>
<title>RDMA/counter: Allow manual mode configuration support</title>
<updated>2019-07-05T13:22:55Z</updated>
<author>
<name>Mark Zhang</name>
<email>markz@mellanox.com</email>
</author>
<published>2019-07-02T10:02:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1bd8e0a9d0fd1be03d2833a0c15ac676bdf275d8'/>
<id>urn:sha1:1bd8e0a9d0fd1be03d2833a0c15ac676bdf275d8</id>
<content type='text'>
In manual mode a QP is bound to a counter manually. If counter is not
specified then a new one will be allocated.

Manual mode is enabled when user binds a QP, and disabled when the last
manually bound QP is unbound.

When auto-mode is turned off and there are counters left, manual mode is
enabled so that the user is able to access these counters.

Signed-off-by: Mark Zhang &lt;markz@mellanox.com&gt;
Reviewed-by: Majd Dibbiny &lt;majd@mellanox.com&gt;
Signed-off-by: Leon Romanovsky &lt;leonro@mellanox.com&gt;
Signed-off-by: Jason Gunthorpe &lt;jgg@mellanox.com&gt;
</content>
</entry>
</feed>
