<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/net/tipc/node.c, branch v4.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2017-02-24T16:42:54Z</updated>
<entry>
<title>tipc: move premature initilalization of stack variables</title>
<updated>2017-02-24T16:42:54Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-02-23T16:10:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=681a55d71799b575f46fe94121728cf67460d1c3'/>
<id>urn:sha1:681a55d71799b575f46fe94121728cf67460d1c3</id>
<content type='text'>
In the function tipc_rcv() we initialize a couple of stack variables
from the message header before that same header has been validated.
In rare cases when the arriving header is non-linar, the validation
function itself may linearize the buffer by calling skb_may_pull(),
while the wrongly initialized stack fields are not updated accordingly.

We fix this in this commit.

Reported-by: Matthew Wong &lt;mwong@sonusnet.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net</title>
<updated>2017-01-28T15:33:06Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-01-28T15:33:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e8f2fc1a55d543717efb70e170b09e773d0542b'/>
<id>urn:sha1:4e8f2fc1a55d543717efb70e170b09e773d0542b</id>
<content type='text'>
Two trivial overlapping changes conflicts in MPLS and mlx5.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix nametbl_lock soft lockup at node/link events</title>
<updated>2017-01-24T21:14:57Z</updated>
<author>
<name>Parthasarathy Bhuvaragan</name>
<email>parthasarathy.bhuvaragan@ericsson.com</email>
</author>
<published>2017-01-24T12:00:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=93f955aad4bacee5acebad141d1a03cd51f27b4e'/>
<id>urn:sha1:93f955aad4bacee5acebad141d1a03cd51f27b4e</id>
<content type='text'>
We trigger a soft lockup as we grab nametbl_lock twice if the node
has a pending node up/down or link up/down event while:
- we process an incoming named message in tipc_named_rcv() and
  perform an tipc_update_nametbl().
- we have pending backlog items in the name distributor queue
  during a nametable update using tipc_nametbl_publish() or
  tipc_nametbl_withdraw().

The following are the call chain associated:
tipc_named_rcv() Grabs nametbl_lock
   tipc_update_nametbl() (publish/withdraw)
     tipc_node_subscribe()/unsubscribe()
       tipc_node_write_unlock()
          &lt;&lt; lockup occurs if an outstanding node/link event
             exits, as we grabs nametbl_lock again &gt;&gt;

tipc_nametbl_withdraw() Grab nametbl_lock
  tipc_named_process_backlog()
    tipc_update_nametbl()
      &lt;&lt; rest as above &gt;&gt;

The function tipc_node_write_unlock(), in addition to releasing the
lock processes the outstanding node/link up/down events. To do this,
we need to grab the nametbl_lock again leading to the lockup.

In this commit we fix the soft lockup by introducing a fast variant of
node_unlock(), where we just release the lock. We adapt the
node_subscribe()/node_unsubscribe() to use the fast variants.

Reported-and-Tested-by: John Thompson &lt;thompa.atl@gmail.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Acked-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: introduce replicast as transport option for multicast</title>
<updated>2017-01-20T17:10:17Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-18T18:50:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a853e4c6d0843729e1f25a7a7beff168e1dd7420'/>
<id>urn:sha1:a853e4c6d0843729e1f25a7a7beff168e1dd7420</id>
<content type='text'>
TIPC multicast messages are currently carried over a reliable
'broadcast link', making use of the underlying media's ability to
transport packets as L2 broadcast or IP multicast to all nodes in
the cluster.

When the used bearer is lacking that ability, we can instead emulate
the broadcast service by replicating and sending the packets over as
many unicast links as needed to reach all identified destinations.
We now introduce a new TIPC link-level 'replicast' service that does
this.

Reviewed-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: reduce risk of user starvation during link congestion</title>
<updated>2017-01-03T16:13:05Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2017-01-03T15:55:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=365ad353c2564bba8835290061308ba825166b3a'/>
<id>urn:sha1:365ad353c2564bba8835290061308ba825166b3a</id>
<content type='text'>
The socket code currently handles link congestion by either blocking
and trying to send again when the congestion has abated, or just
returning to the user with -EAGAIN and let him re-try later.

This mechanism is prone to starvation, because the wakeup algorithm is
non-atomic. During the time the link issues a wakeup signal, until the
socket wakes up and re-attempts sending, other senders may have come
in between and occupied the free buffer space in the link. This in turn
may lead to a socket having to make many send attempts before it is
successful. In extremely loaded systems we have observed latency times
of several seconds before a low-priority socket is able to send out a
message.

In this commit, we simplify this mechanism and reduce the risk of the
described scenario happening. When a message is attempted sent via a
congested link, we now let it be added to the link's backlog queue
anyway, thus permitting an oversubscription of one message per source
socket. We still create a wakeup item and return an error code, hence
instructing the sender to block or stop sending. Only when enough space
has been freed up in the link's backlog queue do we issue a wakeup event
that allows the sender to continue with the next message, if any.

The fact that a socket now can consider a message sent even when the
link returns a congestion code means that the sending socket code can
be simplified. Also, since this is a good opportunity to get rid of the
obsolete 'mtu change' condition in the three socket send functions, we
now choose to refactor those functions completely.

Signed-off-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: fix broadcast link synchronization problem</title>
<updated>2016-10-29T21:21:09Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2016-10-27T22:51:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=06bd2b1ed04ca9fdbc767859885944a1e8b86b40'/>
<id>urn:sha1:06bd2b1ed04ca9fdbc767859885944a1e8b86b40</id>
<content type='text'>
In commit 2d18ac4ba745 ("tipc: extend broadcast link initialization
criteria") we tried to fix a problem with the initial synchronization
of broadcast link acknowledge values. Unfortunately that solution is
not sufficient to solve the issue.

We have seen it happen that LINK_PROTOCOL/STATE packets with a valid
non-zero unicast acknowledge number may bypass BCAST_PROTOCOL
initialization, NAME_DISTRIBUTOR and other STATE packets with invalid
broadcast acknowledge numbers, leading to premature opening of the
broadcast link. When the bypassed packets finally arrive, they are
inadvertently accepted, and the already correctly initialized
acknowledge number in the broadcast receive link is overwritten by
the invalid (zero) value of the said packets. After this the broadcast
link goes stale.

We now fix this by marking the packets where we know the acknowledge
value is or may be invalid, and then ignoring the acks from those.

To this purpose, we claim an unused bit in the header to indicate that
the value is invalid. We set the bit to 1 in the initial BCAST_PROTOCOL
synchronization packet and all initial ("bulk") NAME_DISTRIBUTOR
packets, plus those LINK_PROTOCOL packets sent out before the broadcast
links are fully synchronized.

This minor protocol update is fully backwards compatible.

Reported-by: John Thompson &lt;thompa.atl@gmail.com&gt;
Tested-by: John Thompson &lt;thompa.atl@gmail.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: transfer broadcast nacks in link state messages</title>
<updated>2016-09-03T00:10:24Z</updated>
<author>
<name>Jon Paul Maloy</name>
<email>jon.maloy@ericsson.com</email>
</author>
<published>2016-09-01T17:52:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=02d11ca20091fcef904f05defda80c53e5b4e793'/>
<id>urn:sha1:02d11ca20091fcef904f05defda80c53e5b4e793</id>
<content type='text'>
When we send broadcasts in clusters of more 70-80 nodes, we sometimes
see the broadcast link resetting because of an excessive number of
retransmissions. This is caused by a combination of two factors:

1) A 'NACK crunch", where loss of broadcast packets is discovered
   and NACK'ed by several nodes simultaneously, leading to multiple
   redundant broadcast retransmissions.

2) The fact that the NACKS as such also are sent as broadcast, leading
   to excessive load and packet loss on the transmitting switch/bridge.

This commit deals with the latter problem, by moving sending of
broadcast nacks from the dedicated BCAST_PROTOCOL/NACK message type
to regular unicast LINK_PROTOCOL/STATE messages. We allocate 10 unused
bits in word 8 of the said message for this purpose, and introduce a
new capability bit, TIPC_BCAST_STATE_NACK in order to keep the change
backwards compatible.

Reviewed-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: add peer removal functionality</title>
<updated>2016-08-19T06:36:07Z</updated>
<author>
<name>Richard Alpe</name>
<email>richard.alpe@ericsson.com</email>
</author>
<published>2016-08-18T08:33:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b34040227be7da760cc72ef3c807e0985e7f0f16'/>
<id>urn:sha1:b34040227be7da760cc72ef3c807e0985e7f0f16</id>
<content type='text'>
Add TIPC_NL_PEER_REMOVE netlink command. This command can remove
an offline peer node from the internal data structures.

This will be supported by the tipc user space tool in iproute2.

Signed-off-by: Richard Alpe &lt;richard.alpe@ericsson.com&gt;
Reviewed-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Acked-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: dump monitor attributes</title>
<updated>2016-07-26T21:26:42Z</updated>
<author>
<name>Parthasarathy Bhuvaragan</name>
<email>parthasarathy.bhuvaragan@ericsson.com</email>
</author>
<published>2016-07-26T06:47:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=cf6f7e1d51090772d5ff7355aaf0fcff17f20d1a'/>
<id>urn:sha1:cf6f7e1d51090772d5ff7355aaf0fcff17f20d1a</id>
<content type='text'>
In this commit, we dump the monitor attributes when queried.
The link monitor attributes are separated into two kinds:
1. general attributes per bearer
2. specific attributes per node/peer
This style resembles the socket attributes and the nametable
publications per socket.

Reviewed-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>tipc: get monitor threshold for the cluster</title>
<updated>2016-07-26T21:26:42Z</updated>
<author>
<name>Parthasarathy Bhuvaragan</name>
<email>parthasarathy.bhuvaragan@ericsson.com</email>
</author>
<published>2016-07-26T06:47:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bf1035b2ff5296c7c49e262152253ce29d87e82d'/>
<id>urn:sha1:bf1035b2ff5296c7c49e262152253ce29d87e82d</id>
<content type='text'>
In this commit, we add support to fetch the configured
cluster monitoring threshold.

Reviewed-by: Jon Maloy &lt;jon.maloy@ericsson.com&gt;
Signed-off-by: Parthasarathy Bhuvaragan &lt;parthasarathy.bhuvaragan@ericsson.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
