<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/tipc/socket.c, branch v5.7</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.7</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.7'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-05-13T19:33:18Z</updated>
<entry>
<title>tipc: fix large latency in smart Nagle streaming</title>
<updated>2020-05-13T19:33:18Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-05-13T12:33:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c72685894506a3fec5978b9b11b94069a7d1c995'/>
<id>urn:sha1:c72685894506a3fec5978b9b11b94069a7d1c995</id>
<content type='text'>
Currently when a connection is in Nagle mode, we set the 'ack_required'
bit in the last sending buffer and wait for the corresponding ACK prior
to pushing more data. However, on the receiving side, the ACK is issued
only when application really  reads the whole data. Even if part of the
last buffer is received, we will not do the ACK as required. This might
cause an unnecessary delay since the receiver does not always fetch the
message as fast as the sender, resulting in a large latency in the user
message sending, which is: [one RTT + the receiver processing time].

The commit makes Nagle ACK as soon as possible i.e. when a message with
the 'ack_required' arrives in the receiving side's stack even before it
is processed or put in the socket receive queue...
This way, we can limit the streaming latency to one RTT as committed in
Nagle mode.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jmaloy@redhat.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: Add a missing case of TIPC_DIRECT_MSG type</title>
<updated>2020-03-26T18:21:02Z</updated>
<author>
<name>Hoang Le</name>
<email>hoang.h.le@dektech.com.au</email>
</author>
<published>2020-03-26T02:50:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8b1e5b0a99f04bda2d6c85ecfe5e68a356c10914'/>
<id>urn:sha1:8b1e5b0a99f04bda2d6c85ecfe5e68a356c10914</id>
<content type='text'>
In the commit f73b12812a3d
("tipc: improve throughput between nodes in netns"), we're missing a check
to handle TIPC_DIRECT_MSG type, it's still using old sending mechanism for
this message type. So, throughput improvement is not significant as
expected.

Besides that, when sending a large message with that type, we're also
handle wrong receiving queue, it should be enqueued in socket receiving
instead of multicast messages.

Fix this by adding the missing case for TIPC_DIRECT_MSG.

Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: Hoang Le &lt;hoang.h.le@dektech.com.au&gt;
Acked-by: Jon Maloy &lt;jmaloy@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix successful connect() but timed out</title>
<updated>2020-02-10T09:23:00Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-02-10T08:35:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5391a87751a164b3194864126f3b016038abc9fe'/>
<id>urn:sha1:5391a87751a164b3194864126f3b016038abc9fe</id>
<content type='text'>
In commit 9546a0b7ce00 ("tipc: fix wrong connect() return code"), we
fixed the issue with the 'connect()' that returns zero even though the
connecting has failed by waiting for the connection to be 'ESTABLISHED'
really. However, the approach has one drawback in conjunction with our
'lightweight' connection setup mechanism that the following scenario
can happen:

          (server)                        (client)

   +- accept()|                      |             wait_for_conn()
   |          |                      |connect() -------+
   |          |&lt;-------[SYN]---------|                 &gt; sleeping
   |          |                      *CONNECTING       |
   |---------&gt;*ESTABLISHED           |                 |
              |--------[ACK]--------&gt;*ESTABLISHED      &gt; wakeup()
        send()|--------[DATA]-------&gt;|\                &gt; wakeup()
        send()|--------[DATA]-------&gt;| |               &gt; wakeup()
          .   .          .           . |-&gt; recvq       .
          .   .          .           . |               .
        send()|--------[DATA]-------&gt;|/                &gt; wakeup()
       close()|--------[FIN]--------&gt;*DISCONNECTING    |
              *DISCONNECTING         |                 |
              |                      ~~~~~~~~~~~~~~~~~~&gt; schedule()
                                                       | wait again
                                                       .
                                                       .
                                                       | ETIMEDOUT

Upon the receipt of the server 'ACK', the client becomes 'ESTABLISHED'
and the 'wait_for_conn()' process is woken up but not run. Meanwhile,
the server starts to send a number of data following by a 'close()'
shortly without waiting any response from the client, which then forces
the client socket to be 'DISCONNECTING' immediately. When the wait
process is switched to be running, it continues to wait until the timer
expires because of the unexpected socket state. The client 'connect()'
will finally get ‘-ETIMEDOUT’ and force to release the socket whereas
there remains the messages in its receive queue.

Obviously the issue would not happen if the server had some delay prior
to its 'close()' (or the number of 'DATA' messages is large enough),
but any kind of delay would make the connection setup/shutdown "heavy".
We solve this by simply allowing the 'connect()' returns zero in this
particular case. The socket is already 'DISCONNECTING', so any further
write will get '-EPIPE' but the socket is still able to read the
messages existing in its receive queue.

Note: This solution doesn't break the previous one as it deals with a
different situation that the socket state is 'DISCONNECTING' but has no
error (i.e. sk-&gt;sk_err = 0).

Fixes: 9546a0b7ce00 ("tipc: fix wrong connect() return code")
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix wrong connect() return code</title>
<updated>2020-01-08T23:57:35Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-01-08T02:19:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9546a0b7ce0077d827470f603f2522b845ce5954'/>
<id>urn:sha1:9546a0b7ce0077d827470f603f2522b845ce5954</id>
<content type='text'>
The current 'tipc_wait_for_connect()' function does a wait-loop for the
condition 'sk-&gt;sk_state != TIPC_CONNECTING' to conclude if the socket
connecting has done. However, when the condition is met, it returns '0'
even in the case the connecting is actually failed, the socket state is
set to 'TIPC_DISCONNECTING' (e.g. when the server socket has closed..).
This results in a wrong return code for the 'connect()' call from user,
making it believe that the connection is established and go ahead with
building, sending a message, etc. but finally failed e.g. '-EPIPE'.

This commit fixes the issue by changing the wait condition to the
'tipc_sk_connected(sk)', so the function will return '0' only when the
connection is really established. Otherwise, either the socket 'sk_err'
if any or '-ETIMEDOUT'/'-EINTR' will be returned correspondingly.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix link overflow issue at socket shutdown</title>
<updated>2020-01-08T23:57:07Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2020-01-08T02:18:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=49afb806cb650dd1f06f191994f3aa657d264009'/>
<id>urn:sha1:49afb806cb650dd1f06f191994f3aa657d264009</id>
<content type='text'>
When a socket is suddenly shutdown or released, it will reject all the
unreceived messages in its receive queue. This applies to a connected
socket too, whereas there is only one 'FIN' message required to be sent
back to its peer in this case.

In case there are many messages in the queue and/or some connections
with such messages are shutdown at the same time, the link layer will
easily get overflowed at the 'TIPC_SYSTEM_IMPORTANCE' backlog level
because of the message rejections. As a result, the link will be taken
down. Moreover, immediately when the link is re-established, the socket
layer can continue to reject the messages and the same issue happens...

The commit refactors the '__tipc_shutdown()' function to only send one
'FIN' in the situation mentioned above. For the connectionless case, it
is unavoidable but usually there is no rejections for such socket
messages because they are 'dest-droppable' by default.

In addition, the new code makes the other socket states clear
(e.g.'TIPC_LISTEN') and treats as a separate case to avoid misbehaving.

Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix retrans failure due to wrong destination</title>
<updated>2019-12-11T01:45:04Z</updated>
<author>
<name>Tuong Lien</name>
<email>tuong.t.lien@dektech.com.au</email>
</author>
<published>2019-12-10T08:21:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=abc9b4e0549b93fdaff56e9532bc49a2d7b04955'/>
<id>urn:sha1:abc9b4e0549b93fdaff56e9532bc49a2d7b04955</id>
<content type='text'>
When a user message is sent, TIPC will check if the socket has faced a
congestion at link layer. If that happens, it will make a sleep to wait
for the congestion to disappear. This leaves a gap for other users to
take over the socket (e.g. multi threads) since the socket is released
as well. Also, in case of connectionless (e.g. SOCK_RDM), user is free
to send messages to various destinations (e.g. via 'sendto()'), then
the socket's preformatted header has to be updated correspondingly
prior to the actual payload message building.

Unfortunately, the latter action is done before the first action which
causes a condition issue that the destination of a certain message can
be modified incorrectly in the middle, leading to wrong destination
when that message is built. Consequently, when the message is sent to
the link layer, it gets stuck there forever because the peer node will
simply reject it. After a number of retransmission attempts, the link
is eventually taken down and the retransmission failure is reported.

This commit fixes the problem by rearranging the order of actions to
prevent the race condition from occurring, so the message building is
'atomic' and its header will not be modified by anyone.

Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Tuong Lien &lt;tuong.t.lien@dektech.com.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix duplicate SYN messages under link congestion</title>
<updated>2019-11-29T07:09:15Z</updated>
<author>
<name>Tung Nguyen</name>
<email>tung.q.nguyen@dektech.com.au</email>
</author>
<published>2019-11-28T03:10:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d34910e1751be79672db9cc61ca8892fbb4763f2'/>
<id>urn:sha1:d34910e1751be79672db9cc61ca8892fbb4763f2</id>
<content type='text'>
Scenario:
1. A client socket initiates a SYN message to a listening socket.
2. The send link is congested, the SYN message is put in the
send link and a wakeup message is put in wakeup queue.
3. The congestion situation is abated, the wakeup message is
pulled out of the wakeup queue. Function tipc_sk_push_backlog()
is called to send out delayed messages by Nagle. However,
the client socket is still in CONNECTING state. So, it sends
the SYN message in the socket write queue to the listening socket
again.
4. The listening socket receives the first SYN message and creates
first server socket. The client socket receives ACK- and establishes
a connection to the first server socket. The client socket closes
its connection with the first server socket.
5. The listening socket receives the second SYN message and creates
second server socket. The second server socket sends ACK- to the
client socket, but it has been closed. It results in connection
reset error when reading from the server socket in user space.

Solution: return from function tipc_sk_push_backlog() immediately
if there is pending SYN message in the socket write queue.

Fixes: c0bceb97db9e ("tipc: add smart nagle feature")
Signed-off-by: Tung Nguyen &lt;tung.q.nguyen@dektech.com.au&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix wrong timeout input for tipc_wait_for_cond()</title>
<updated>2019-11-29T07:09:14Z</updated>
<author>
<name>Tung Nguyen</name>
<email>tung.q.nguyen@dektech.com.au</email>
</author>
<published>2019-11-28T03:10:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=12db3c8083fcab4270866a88191933f2d9f24f89'/>
<id>urn:sha1:12db3c8083fcab4270866a88191933f2d9f24f89</id>
<content type='text'>
In function __tipc_shutdown(), the timeout value passed to
tipc_wait_for_cond() is not jiffies.

This commit fixes it by converting that value from milliseconds
to jiffies.

Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Signed-off-by: Tung Nguyen &lt;tung.q.nguyen@dektech.com.au&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix wrong socket reference counter after tipc_sk_timeout() returns</title>
<updated>2019-11-29T07:09:14Z</updated>
<author>
<name>Tung Nguyen</name>
<email>tung.q.nguyen@dektech.com.au</email>
</author>
<published>2019-11-28T03:10:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=91a4a3eb433e4d786420c41f3c08d1d16c605962'/>
<id>urn:sha1:91a4a3eb433e4d786420c41f3c08d1d16c605962</id>
<content type='text'>
When tipc_sk_timeout() is executed but user space is grabbing
ownership, this function rearms itself and returns. However, the
socket reference counter is not reduced. This causes potential
unexpected behavior.

This commit fixes it by calling sock_put() before tipc_sk_timeout()
returns in the above-mentioned case.

Fixes: afe8792fec69 ("tipc: refactor function tipc_sk_timeout()")
Signed-off-by: Tung Nguyen &lt;tung.q.nguyen@dektech.com.au&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix potential memory leak in __tipc_sendmsg()</title>
<updated>2019-11-29T07:09:14Z</updated>
<author>
<name>Tung Nguyen</name>
<email>tung.q.nguyen@dektech.com.au</email>
</author>
<published>2019-11-28T03:10:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2fe97a578d7bad3116a89dc8a6692a51e6fc1d9c'/>
<id>urn:sha1:2fe97a578d7bad3116a89dc8a6692a51e6fc1d9c</id>
<content type='text'>
When initiating a connection message to a server side, the connection
message is cloned and added to the socket write queue. However, if the
cloning is failed, only the socket write queue is purged. It causes
memory leak because the original connection message is not freed.

This commit fixes it by purging the list of connection message when
it cannot be cloned.

Fixes: 6787927475e5 ("tipc: buffer overflow handling in listener socket")
Reported-by: Hoang Le &lt;hoang.h.le@dektech.com.au&gt;
Signed-off-by: Tung Nguyen &lt;tung.q.nguyen@dektech.com.au&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
