<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/tipc/socket.c, branch v4.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-03-03T21:30:29Z</updated>
<entry>
<title>tipc: Revert "tipc: use existing sk_write_queue for outgoing packet chain"</title>
<updated>2016-03-03T21:30:29Z</updated>
<author>
<name>Parthasarathy Bhuvaragan</name>
<email>parthasarathy.bhuvaragan@ericsson.com</email>
</author>
<published>2016-03-01T10:07:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f214fc402967e1bc94ad7f39faa03db5813d6849'/>
<id>urn:sha1:f214fc402967e1bc94ad7f39faa03db5813d6849</id>
<content type='text'>
This reverts commit 94153e36e709e ("tipc: use existing sk_write_queue for
outgoing packet chain").

In commit 94153e36e709e, we assumed that we fill and empty the socket's
sk_write_queue within the same lock_sock() session.

This is not true if the link is congested. During congestion, the
socket lock is released while we wait for the congestion to cease.
This implementation causes a NULL pointer dereference if the user space
program has several threads accessing the same socket descriptor.

Consider two threads of the same program performing the following:
     Thread1                                  Thread2
--------------------                    ----------------------
Enter tipc_sendmsg()                    Enter tipc_sendmsg()
lock_sock()                             lock_sock()
Enter tipc_link_xmit(), ret=ELINKCONG   spin on socket lock..
sk_wait_event()                             :
release_sock()                          grab socket lock
    :                                   Enter tipc_link_xmit(), ret=0
    :                                   release_sock()
Wakeup after congestion
lock_sock()
skb = skb_peek(pktchain);
!! TIPC_SKB_CB(skb)-&gt;wakeup_pending = tsk-&gt;link_cong;

In this case, the second thread successfully transmits the buffers
belonging to both thread1 and thread2. When the first thread wakes up
after the congestion, it assumes that the pktchain is intact and
operates on the skbs in it, which leads to the following exception:

[2102.439969] BUG: unable to handle kernel NULL pointer dereference at 00000000000000d0
[2102.440074] IP: [&lt;ffffffffa005f330&gt;] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
[2102.440074] PGD 3fa3f067 PUD 3fa6b067 PMD 0
[2102.440074] Oops: 0000 [#1] SMP
[2102.440074] CPU: 2 PID: 244 Comm: sender Not tainted 3.12.28 #1
[2102.440074] RIP: 0010:[&lt;ffffffffa005f330&gt;]  [&lt;ffffffffa005f330&gt;] __tipc_link_xmit+0x2b0/0x4d0 [tipc]
[...]
[2102.440074] Call Trace:
[2102.440074]  [&lt;ffffffff8163f0b9&gt;] ? schedule+0x29/0x70
[2102.440074]  [&lt;ffffffffa006a756&gt;] ? tipc_node_unlock+0x46/0x170 [tipc]
[2102.440074]  [&lt;ffffffffa005f761&gt;] tipc_link_xmit+0x51/0xf0 [tipc]
[2102.440074]  [&lt;ffffffffa006d8ae&gt;] tipc_send_stream+0x11e/0x4f0 [tipc]
[2102.440074]  [&lt;ffffffff8106b150&gt;] ? __wake_up_sync+0x20/0x20
[2102.440074]  [&lt;ffffffffa006dc9c&gt;] tipc_send_packet+0x1c/0x20 [tipc]
[2102.440074]  [&lt;ffffffff81502478&gt;] sock_sendmsg+0xa8/0xd0
[2102.440074]  [&lt;ffffffff81507895&gt;] ? release_sock+0x145/0x170
[2102.440074]  [&lt;ffffffff815030d8&gt;] ___sys_sendmsg+0x3d8/0x3e0
[2102.440074]  [&lt;ffffffff816426ae&gt;] ? _raw_spin_unlock+0xe/0x10
[2102.440074]  [&lt;ffffffff81115c2a&gt;] ? handle_mm_fault+0x6ca/0x9d0
[2102.440074]  [&lt;ffffffff8107dd65&gt;] ? set_next_entity+0x85/0xa0
[2102.440074]  [&lt;ffffffff816426de&gt;] ? _raw_spin_unlock_irq+0xe/0x20
[2102.440074]  [&lt;ffffffff8107463c&gt;] ? finish_task_switch+0x5c/0xc0
[2102.440074]  [&lt;ffffffff8163ea8c&gt;] ? __schedule+0x34c/0x950
[2102.440074]  [&lt;ffffffff81504e12&gt;] __sys_sendmsg+0x42/0x80
[2102.440074]  [&lt;ffffffff81504e62&gt;] SyS_sendmsg+0x12/0x20
[2102.440074]  [&lt;ffffffff8164aed2&gt;] system_call_fastpath+0x16/0x1b

In this commit, we always maintain the skb list on the stack.
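
As a rough illustration of the difference (a Python sketch, not kernel
code; every name below is hypothetical), a chain kept in a shared
per-socket queue can be drained by another sender while the first one
waits for congestion to cease, whereas a chain local to the call stays
intact:

```python
from collections import deque

class Sock:
    """Hypothetical stand-in for a socket with a shared write queue."""
    def __init__(self):
        self.write_queue = deque()

def peek(chain):
    # Like skb_peek(): return the first buffer, or None if the chain is empty.
    return chain[0] if chain else None

def shared_queue_scenario():
    sk = Sock()
    sk.write_queue.append("thread1-buf")   # thread 1 fills the shared queue
    # thread 1 hits congestion and releases the socket lock here
    sk.write_queue.append("thread2-buf")   # thread 2 adds its own buffer
    sk.write_queue.clear()                 # thread 2 transmits everything queued
    # thread 1 wakes up and peeks at what it believes is its chain
    return peek(sk.write_queue)

def stack_chain_scenario():
    chain1 = deque(["thread1-buf"])        # thread 1 keeps its chain in its own frame
    chain2 = deque(["thread2-buf"])        # thread 2 has a separate chain
    chain2.clear()                         # thread 2 transmits only its own chain
    return peek(chain1)                    # thread 1's chain is untouched
```

In the shared case the peek yields nothing, which corresponds to the
NULL skb that the first thread dereferences in the trace above.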

Signed-off-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2015-12-04T02:09:12Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2015-12-04T02:03:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f188b951f33a0464338f94f928338f84fc0e4392'/>
<id>urn:sha1:f188b951f33a0464338f94f928338f84fc0e4392</id>
<content type='text'>
Conflicts:
	drivers/net/ethernet/renesas/ravb_main.c
	kernel/bpf/syscall.c
	net/ipv4/ipmr.c

All three conflicts were cases of overlapping changes.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Generalise wq_has_sleeper helper</title>
<updated>2015-11-30T19:47:33Z</updated>
<author>
<name>Herbert Xu</name>
<email>herbert@gondor.apana.org.au</email>
</author>
<published>2015-11-26T05:55:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1ce0bf50ae2233c7115a18c0c623662d177b434c'/>
<id>urn:sha1:1ce0bf50ae2233c7115a18c0c623662d177b434c</id>
<content type='text'>
The memory barrier in the helper wq_has_sleeper is needed by just
about every user of waitqueue_active.  This patch generalises it
by making it take a wait_queue_head_t directly.  The existing
helper is renamed to skwq_has_sleeper.

Signed-off-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: avoid packets leaking on socket receive queue</title>
<updated>2015-11-24T04:45:15Z</updated>
<author>
<name>Ying Xue</name>
<email>ying.xue@windriver.com</email>
</author>
<published>2015-11-22T07:46:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f4195d1eac954a67adf112dd53404560cc55b942'/>
<id>urn:sha1:f4195d1eac954a67adf112dd53404560cc55b942</id>
<content type='text'>
Even if we drain the receive queue thoroughly in tipc_release() after the
tipc socket is removed from the rhashtable, some packets may still be in
flight, because a CPU running the receiver may have done its rhashtable
lookup before we removed the socket. Those packets will reach the receive
queue, but nobody will ever delete them. To avoid this leak, we register a
private socket destructor to purge the receive queue, which means that
releasing packets pending on the receive queue is delayed until the last
reference to the tipc socket is released.
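
The idea can be sketched with a toy reference count (Python, not kernel
code; the class and method names are hypothetical and only mirror the
description above):

```python
class Sock:
    """Toy socket: the receive queue is purged by a destructor on last put."""
    def __init__(self):
        self.refcnt = 1            # reference owned by the socket's creator
        self.receive_queue = []

    def hold(self):
        self.refcnt += 1

    def put(self):
        self.refcnt -= 1
        if self.refcnt == 0:
            self.destruct()

    def destruct(self):
        # Runs only when nobody can enqueue any more, so nothing leaks.
        self.receive_queue.clear()

sk = Sock()
sk.hold()                          # receiver did its lookup and holds a reference
sk.put()                           # release path drops the creator's reference
sk.receive_queue.append("late")    # in-flight packet still reaches the queue
sk.put()                           # last reference dropped: destructor purges it
```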

Signed-off-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: introduce jumbo frame support for broadcast</title>
<updated>2015-10-24T13:56:40Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2015-10-22T12:51:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=959e1781aa230aecc90e4deb80117fd9a53dede7'/>
<id>urn:sha1:959e1781aa230aecc90e4deb80117fd9a53dede7</id>
<content type='text'>
Until now, we have only supported a fixed MTU size of 1500 bytes
for all broadcast media, irrespective of their actual capability.

We now make the broadcast MTU adaptable to the carrying media, i.e.,
we use the smallest MTU supported by any of the interfaces attached
to TIPC.
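
A minimal sketch of that selection rule (Python, illustrative only; the
function name and the interface values are made up for the example):

```python
def broadcast_mtu(interface_mtus):
    """Return the smallest MTU among the attached interfaces."""
    if not interface_mtus:
        raise ValueError("no interfaces attached")
    return min(interface_mtus.values())

# Illustrative values only, not taken from the commit.
mtus = {"eth0": 1500, "eth1": 9000}
print(broadcast_mtu(mtus))
```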

Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: move bcast definitions to bcast.c</title>
<updated>2015-10-24T13:56:24Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2015-10-22T12:51:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6beb19a62a87ef6f7107fcd43c2cc1ebad3edfb5'/>
<id>urn:sha1:6beb19a62a87ef6f7107fcd43c2cc1ebad3edfb5</id>
<content type='text'>
Currently, a number of structure and function definitions related
to the broadcast functionality are unnecessarily exposed in the file
bcast.h. This obscures the fact that the external interface towards
the broadcast link in fact is very narrow, and causes unnecessary
recompilations of other files when anything changes in those
definitions.

In this commit, we move as many of those definitions as is currently
possible to the file bcast.c.

We also rename the structure 'tipc_bclink' to 'tipc_bc_base', both
since the name does not correctly describe the contents of this
struct, and will do so even less in the future, and because we want
to use the term 'link' more appropriately in the functionality
introduced later in this series.

Finally, we rename a couple of functions, such as tipc_bclink_xmit()
and others that will be kept in the future, to include the term 'bcast'
instead.

There are no functional changes in this commit.

Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: clean up socket layer message reception</title>
<updated>2015-07-26T23:31:50Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2015-07-22T14:11:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cda3696d3d26eb798c94de0dab5bd66ddb5627cb'/>
<id>urn:sha1:cda3696d3d26eb798c94de0dab5bd66ddb5627cb</id>
<content type='text'>
When a message is received in a socket, one of the call chains
tipc_sk_rcv()-&gt;tipc_sk_enqueue()-&gt;filter_rcv()(-&gt;tipc_sk_proto_rcv())
or
tipc_sk_backlog_rcv()-&gt;filter_rcv()(-&gt;tipc_sk_proto_rcv())
is followed. At each of these levels we may encounter situations
where the message may need to be rejected, or a new message
produced for transfer back to the sender. Despite recent
improvements, the current code for doing this is perceived
as awkward and hard to follow.

Leveraging the two previous commits in this series, we now
introduce a more uniform handling of such situations. We
let each of the functions in the chain itself produce/reverse
the message to be returned to the sender, but also perform the
actual forwarding. This simplifies the necessary logic within
each function.

Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: introduce new tipc_sk_respond() function</title>
<updated>2015-07-26T23:31:50Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2015-07-22T14:11:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bcd3ffd4f6d7c994c93be2ab8598fdfb2952a1f1'/>
<id>urn:sha1:bcd3ffd4f6d7c994c93be2ab8598fdfb2952a1f1</id>
<content type='text'>
Currently, we use the code sequence

if (msg_reverse())
   tipc_link_xmit_skb()

at numerous locations in socket.c. The preparation of arguments
for these calls, as well as the sequence itself, makes the code
unnecessarily complex.

In this commit, we introduce a new function, tipc_sk_respond(),
that performs this call combination. We also replace some, but not
yet all, of these explicit call sequences with calls to the new
function. Notably, we let the function tipc_sk_proto_rcv() use
the new function to directly send out PROBE_REPLY messages,
instead of deferring this to the calling tipc_sk_rcv() function,
as we do now.
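
The shape of such a helper might look as follows (a Python sketch rather
than the kernel implementation; the message layout and all three
function names are hypothetical stand-ins):

```python
def msg_reverse(msg):
    """Hypothetical: swap sender and receiver, or return None if not returnable."""
    if not msg.get("returnable", True):
        return None
    return {**msg, "src": msg["dst"], "dst": msg["src"]}

def link_xmit(msg, sent):
    """Hypothetical transmit step: record the message as sent."""
    sent.append(msg)

def sk_respond(msg, sent):
    """Reverse msg and send it back in one call; report whether it was sent."""
    reply = msg_reverse(msg)
    if reply is None:
        return False
    link_xmit(reply, sent)
    return True
```

Callers then need a single call instead of repeating the
reverse-then-transmit pair and its argument preparation.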

Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: let function tipc_msg_reverse() expand header when needed</title>
<updated>2015-07-26T23:31:50Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2015-07-22T14:11:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=29042e19f2c602fabe4705b5b719550b4627639c'/>
<id>urn:sha1:29042e19f2c602fabe4705b5b719550b4627639c</id>
<content type='text'>
The shortest TIPC message header, for cluster local CONNECTED messages,
is 24 bytes long. With this format, the fields "dest_node" and
"orig_node" are optimized away, since they in reality are redundant
in this particular case.

However, the absence of these fields leads to code inconsistencies
that are difficult to handle in some cases, especially when we need
to reverse or reject messages at the socket layer.

In this commit, we concentrate the handling of the absent fields
to one place, by letting the function tipc_msg_reverse() reallocate
the buffer and expand the header to 32 bytes when necessary. This
means that the socket code now can assume that the two previously
absent fields are present in the header when a message needs to be
rejected. This opens the way for some further simplifications of the
socket code.
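
The expansion step can be sketched like this (Python, illustrative only).
It assumes that the two node fields are 32-bit big-endian words placed
directly after the 24-byte short header, and it leaves out updating the
header-size field inside the header itself; the function name is
hypothetical:

```python
import struct

SHORT_HDR = 24   # bytes, from the text above
FULL_HDR = 32    # bytes, from the text above

def expand_header(msg, hdr_size, orig_node, dest_node):
    """Return (message, header size) with the node fields present."""
    if hdr_size == FULL_HDR:
        return msg, hdr_size               # nothing to do
    if hdr_size != SHORT_HDR:
        raise ValueError("unexpected header size")
    header, payload = msg[:SHORT_HDR], msg[SHORT_HDR:]
    # Assumed layout: orig_node then dest_node, each one 32-bit word.
    nodes = struct.pack("!II", orig_node, dest_node)
    return header + nodes + payload, FULL_HDR
```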

Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: make media xmit call outside node spinlock context</title>
<updated>2015-07-21T03:41:15Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2015-07-16T20:54:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=af9b028e270fda6fb812d70d17d902297df1ceb5'/>
<id>urn:sha1:af9b028e270fda6fb812d70d17d902297df1ceb5</id>
<content type='text'>
Currently, message sending is performed through a deep call chain,
where the node spinlock is grabbed and held during a significant
part of the transmission time. This is clearly detrimental to
overall throughput performance; it would be better if we could send
the message after the spinlock has been released.

In this commit, we instead let the call return up the stack after
the buffer chain has been added to the transmission queue, whereafter
clones of the buffers are transmitted to the device layer outside the
spinlock scope.
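
The pattern can be sketched as follows (Python, not kernel code; the
class, the queue name and the helper functions are hypothetical):

```python
import threading

class Node:
    """Toy node with a lock protecting its transmission queue."""
    def __init__(self):
        self.lock = threading.Lock()
        self.transmq = []

def device_send(buf, wire):
    """Hypothetical device layer: record what was put on the wire."""
    wire.append(buf)

def node_xmit(node, chain, wire):
    # Under the lock: only queue the chain and take clones of the buffers.
    with node.lock:
        node.transmq.extend(chain)
        clones = [bytes(buf) for buf in chain]
    # Outside the lock: the slow hand-off to the device layer.
    for clone in clones:
        device_send(clone, wire)
```

The lock is held only for the queue update, so other senders are not
blocked while the device layer is busy.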

As a further step in our effort to separate the roles of the node
and link entities we also move the function tipc_link_xmit() to
node.c, and rename it to tipc_node_xmit().

Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
