<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/tipc/socket.c, branch v5.8</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.8</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.8'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-06-11T19:47:23Z</updated>
<entry>
<title>tipc: fix kernel WARNING in tipc_msg_append()</title>
<updated>2020-06-11T19:47:23Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-06-11T10:07:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c9aa81faf19115fc2e732e7f210b37bb316987ff'/>
<id>urn:sha1:c9aa81faf19115fc2e732e7f210b37bb316987ff</id>
<content type='text'>
syzbot found the following issue:

WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 check_copy_size include/linux/thread_info.h:150 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 copy_from_iter include/linux/uio.h:144 [inline]
WARNING: CPU: 0 PID: 6808 at include/linux/thread_info.h:150 tipc_msg_append+0x49a/0x5e0 net/tipc/msg.c:242
Kernel panic - not syncing: panic_on_warn set ...

This happens after commit 5e9eeccc58f3 ("tipc: fix NULL pointer
dereference in streaming") that tried to build at least one buffer even
when the message data length is zero... However, it now exposes another
bug that the 'mss' can be zero and the 'cpy' will be negative, thus the
above kernel WARNING will appear!
The zero value of 'mss' is never expected because it means Nagle is not
enabled for the socket (actually the socket type was 'SOCK_SEQPACKET'),
so the function 'tipc_msg_append()' must not be called at all. But that
was in this particular case since the message data length was zero, and
the 'send &lt;= maxnagle' check became true.

We resolve the issue by explicitly checking if Nagle is enabled for the
socket, i.e. 'maxnagle != 0' before calling the 'tipc_msg_append()'. We
also reinforce the function to against such a negative values if any.

Reported-by: syzbot+75139a7d2605236b0b7f@syzkaller.appspotmail.com
Fixes: c0bceb97db9e ("tipc: add smart nagle feature")
Acked-by: Jon Maloy &lt;jmaloy@redhat.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: Fix NULL pointer dereference in __tipc_sendstream()</title>
<updated>2020-06-01T22:33:24Z</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2020-05-28T14:34:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4c21daae3dbc9f8536cc18e6e53627821fa2c90c'/>
<id>urn:sha1:4c21daae3dbc9f8536cc18e6e53627821fa2c90c</id>
<content type='text'>
tipc_sendstream() may send zero length packet, then tipc_msg_append()
do not alloc skb, skb_peek_tail() will get NULL, msg_set_ack_required
will trigger NULL pointer dereference.

Reported-by: syzbot+8eac6d030e7807c21d32@syzkaller.appspotmail.com
Fixes: 0a3e060f340d ("tipc: add test for Nagle algorithm effectiveness")
Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: call tsk_set_importance from tipc_topsrv_create_listener</title>
<updated>2020-05-28T18:11:46Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-05-28T05:12:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=095ae612530c9465df6d372d688cb30c6abfc5f5'/>
<id>urn:sha1:095ae612530c9465df6d372d688cb30c6abfc5f5</id>
<content type='text'>
Avoid using kernel_setsockopt for the TIPC_IMPORTANCE option when we can
just use the internal helper.  The only change needed is to pass a struct
sock instead of tipc_sock, which is private to socket.c

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: add test for Nagle algorithm effectiveness</title>
<updated>2020-05-26T22:16:52Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-05-26T09:38:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0a3e060f340dbe232ffa290c40f879b7f7db595b'/>
<id>urn:sha1:0a3e060f340dbe232ffa290c40f879b7f7db595b</id>
<content type='text'>
When streaming in Nagle mode, we try to bundle small messages from user
as many as possible if there is one outstanding buffer, i.e. not ACK-ed
by the receiving side, which helps boost up the overall throughput. So,
the algorithm's effectiveness really depends on when Nagle ACK comes or
what the specific network latency (RTT) is, compared to the user's
message sending rate.

In a bad case, the user's sending rate is low or the network latency is
small, there will not be many bundles, so making a Nagle ACK or waiting
for it is not meaningful.
For example: a user sends its messages every 100ms and the RTT is 50ms,
then for each messages, we require one Nagle ACK but then there is only
one user message sent without any bundles.

In a better case, even if we have a few bundles (e.g. the RTT = 300ms),
but now the user sends messages in medium size, then there will not be
any difference at all, that says 3 x 1000-byte data messages if bundled
will still result in 3 bundles with MTU = 1500.

When Nagle is ineffective, the delay in user message sending is clearly
wasted instead of sending directly.

Besides, adding Nagle ACKs will consume some processor load on both the
sending and receiving sides.

This commit adds a test on the effectiveness of the Nagle algorithm for
an individual connection in the network on which it actually runs.
Particularly, upon receipt of a Nagle ACK we will compare the number of
bundles in the backlog queue to the number of user messages which would
be sent directly without Nagle. If the ratio is good (e.g. &gt;= 2), Nagle
mode will be kept for further message sending. Otherwise, we will leave
Nagle and put a 'penalty' on the connection, so it will have to spend
more 'one-way' messages before being able to re-enter Nagle.

In addition, the 'ack-required' bit is only set when really needed that
the number of Nagle ACKs will be reduced during Nagle mode.

Testing with benchmark showed that with the patch, there was not much
difference in throughput for small messages since the tool continuously
sends messages without a break, so Nagle would still take in effect.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jmaloy@redhat.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix large latency in smart Nagle streaming</title>
<updated>2020-05-13T19:33:18Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-05-13T12:33:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c72685894506a3fec5978b9b11b94069a7d1c995'/>
<id>urn:sha1:c72685894506a3fec5978b9b11b94069a7d1c995</id>
<content type='text'>
Currently when a connection is in Nagle mode, we set the 'ack_required'
bit in the last sending buffer and wait for the corresponding ACK prior
to pushing more data. However, on the receiving side, the ACK is issued
only when application really  reads the whole data. Even if part of the
last buffer is received, we will not do the ACK as required. This might
cause an unnecessary delay since the receiver does not always fetch the
message as fast as the sender, resulting in a large latency in the user
message sending, which is: [one RTT + the receiver processing time].

The commit makes Nagle ACK as soon as possible i.e. when a message with
the 'ack_required' arrives in the receiving side's stack even before it
is processed or put in the socket receive queue...
This way, we can limit the streaming latency to one RTT as committed in
Nagle mode.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jmaloy@redhat.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: Add a missing case of TIPC_DIRECT_MSG type</title>
<updated>2020-03-26T18:21:02Z</updated>
<author>
<name>Hoang Le</name>
<email>hoang.h.le@dektech.com.au</email>
</author>
<published>2020-03-26T02:50:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8b1e5b0a99f04bda2d6c85ecfe5e68a356c10914'/>
<id>urn:sha1:8b1e5b0a99f04bda2d6c85ecfe5e68a356c10914</id>
<content type='text'>
In the commit f73b12812a3d
("tipc: improve throughput between nodes in netns"), we're missing a check
to handle TIPC_DIRECT_MSG type, it's still using old sending mechanism for
this message type. So, throughput improvement is not significant as
expected.

Besides that, when sending a large message with that type, we're also
handle wrong receiving queue, it should be enqueued in socket receiving
instead of multicast messages.

Fix this by adding the missing case for TIPC_DIRECT_MSG.

Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: Hoang Le &lt;hoang.h.le@dektech.com.au&gt;
Acked-by: Jon Maloy &lt;jmaloy@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix successful connect() but timed out</title>
<updated>2020-02-10T09:23:00Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-02-10T08:35:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5391a87751a164b3194864126f3b016038abc9fe'/>
<id>urn:sha1:5391a87751a164b3194864126f3b016038abc9fe</id>
<content type='text'>
In commit 9546a0b7ce00 ("tipc: fix wrong connect() return code"), we
fixed the issue with the 'connect()' that returns zero even though the
connecting has failed by waiting for the connection to be 'ESTABLISHED'
really. However, the approach has one drawback in conjunction with our
'lightweight' connection setup mechanism that the following scenario
can happen:

          (server)                        (client)

   +- accept()|                      |             wait_for_conn()
   |          |                      |connect() -------+
   |          |&lt;-------[SYN]---------|                 &gt; sleeping
   |          |                      *CONNECTING       |
   |---------&gt;*ESTABLISHED           |                 |
              |--------[ACK]--------&gt;*ESTABLISHED      &gt; wakeup()
        send()|--------[DATA]-------&gt;|\                &gt; wakeup()
        send()|--------[DATA]-------&gt;| |               &gt; wakeup()
          .   .          .           . |-&gt; recvq       .
          .   .          .           . |               .
        send()|--------[DATA]-------&gt;|/                &gt; wakeup()
       close()|--------[FIN]--------&gt;*DISCONNECTING    |
              *DISCONNECTING         |                 |
              |                      ~~~~~~~~~~~~~~~~~~&gt; schedule()
                                                       | wait again
                                                       .
                                                       .
                                                       | ETIMEDOUT

Upon the receipt of the server 'ACK', the client becomes 'ESTABLISHED'
and the 'wait_for_conn()' process is woken up but not run. Meanwhile,
the server starts to send a number of data following by a 'close()'
shortly without waiting any response from the client, which then forces
the client socket to be 'DISCONNECTING' immediately. When the wait
process is switched to be running, it continues to wait until the timer
expires because of the unexpected socket state. The client 'connect()'
will finally get ‘-ETIMEDOUT’ and force to release the socket whereas
there remains the messages in its receive queue.

Obviously the issue would not happen if the server had some delay prior
to its 'close()' (or the number of 'DATA' messages is large enough),
but any kind of delay would make the connection setup/shutdown "heavy".
We solve this by simply allowing the 'connect()' returns zero in this
particular case. The socket is already 'DISCONNECTING', so any further
write will get '-EPIPE' but the socket is still able to read the
messages existing in its receive queue.

Note: This solution doesn't break the previous one as it deals with a
different situation that the socket state is 'DISCONNECTING' but has no
error (i.e. sk-&gt;sk_err = 0).

Fixes: 9546a0b7ce00 ("tipc: fix wrong connect() return code")
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix wrong connect() return code</title>
<updated>2020-01-08T23:57:35Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-01-08T02:19:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9546a0b7ce0077d827470f603f2522b845ce5954'/>
<id>urn:sha1:9546a0b7ce0077d827470f603f2522b845ce5954</id>
<content type='text'>
The current 'tipc_wait_for_connect()' function does a wait-loop for the
condition 'sk-&gt;sk_state != TIPC_CONNECTING' to conclude if the socket
connecting has done. However, when the condition is met, it returns '0'
even in the case the connecting is actually failed, the socket state is
set to 'TIPC_DISCONNECTING' (e.g. when the server socket has closed..).
This results in a wrong return code for the 'connect()' call from user,
making it believe that the connection is established and go ahead with
building, sending a message, etc. but finally failed e.g. '-EPIPE'.

This commit fixes the issue by changing the wait condition to the
'tipc_sk_connected(sk)', so the function will return '0' only when the
connection is really established. Otherwise, either the socket 'sk_err'
if any or '-ETIMEDOUT'/'-EINTR' will be returned correspondingly.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix link overflow issue at socket shutdown</title>
<updated>2020-01-08T23:57:07Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-01-08T02:18:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=49afb806cb650dd1f06f191994f3aa657d264009'/>
<id>urn:sha1:49afb806cb650dd1f06f191994f3aa657d264009</id>
<content type='text'>
When a socket is suddenly shutdown or released, it will reject all the
unreceived messages in its receive queue. This applies to a connected
socket too, whereas there is only one 'FIN' message required to be sent
back to its peer in this case.

In case there are many messages in the queue and/or some connections
with such messages are shutdown at the same time, the link layer will
easily get overflowed at the 'TIPC_SYSTEM_IMPORTANCE' backlog level
because of the message rejections. As a result, the link will be taken
down. Moreover, immediately when the link is re-established, the socket
layer can continue to reject the messages and the same issue happens...

The commit refactors the '__tipc_shutdown()' function to only send one
'FIN' in the situation mentioned above. For the connectionless case, it
is unavoidable but usually there is no rejections for such socket
messages because they are 'dest-droppable' by default.

In addition, the new code makes the other socket states clear
(e.g.'TIPC_LISTEN') and treats as a separate case to avoid misbehaving.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix retrans failure due to wrong destination</title>
<updated>2019-12-11T01:45:04Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2019-12-10T08:21:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=abc9b4e0549b93fdaff56e9532bc49a2d7b04955'/>
<id>urn:sha1:abc9b4e0549b93fdaff56e9532bc49a2d7b04955</id>
<content type='text'>
When a user message is sent, TIPC will check if the socket has faced a
congestion at link layer. If that happens, it will make a sleep to wait
for the congestion to disappear. This leaves a gap for other users to
take over the socket (e.g. multi threads) since the socket is released
as well. Also, in case of connectionless (e.g. SOCK_RDM), user is free
to send messages to various destinations (e.g. via 'sendto()'), then
the socket's preformatted header has to be updated correspondingly
prior to the actual payload message building.

Unfortunately, the latter action is done before the first action which
causes a condition issue that the destination of a certain message can
be modified incorrectly in the middle, leading to wrong destination
when that message is built. Consequently, when the message is sent to
the link layer, it gets stuck there forever because the peer node will
simply reject it. After a number of retransmission attempts, the link
is eventually taken down and the retransmission failure is reported.

This commit fixes the problem by rearranging the order of actions to
prevent the race condition from occurring, so the message building is
'atomic' and its header will not be modified by anyone.

Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
