<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net/ip_vs.h, branch v3.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2013-09-18T19:39:03Z</updated>
<entry>
<title>ipvs: make the service replacement more robust</title>
<updated>2013-09-18T19:39:03Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-09-12T08:21:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bcbde4c0a7556cca72874c5e1efa4dccb5198a2b'/>
<id>urn:sha1:bcbde4c0a7556cca72874c5e1efa4dccb5198a2b</id>
<content type='text'>
commit 578bc3ef1e473a ("ipvs: reorganize dest trash") added
the IP_VS_DEST_STATE_REMOVING flag and an RCU callback named
ip_vs_dest_wait_readers() to keep dests and services after
removal for at least an RCU grace period. But we have the
following corner cases:

- we cannot reuse the same dest if its service is removed
while IP_VS_DEST_STATE_REMOVING is still set, because another dest
removal in the first grace period cannot extend this period.
This can happen when ipvsadm -C &amp;&amp; ipvsadm -R is used.

- dest-&gt;svc can be replaced but ip_vs_in_stats() and
ip_vs_out_stats() have no explicit read memory barriers
when accessing dest-&gt;svc. It can happen that dest-&gt;svc
was just freed (replaced) while we use it to update
the stats.

We solve the problems as follows:

- IP_VS_DEST_STATE_REMOVING is removed and we ensure a fixed
idle period for the dest (IP_VS_DEST_TRASH_PERIOD). idle_start
records the first time after deletion that we noticed
dest-&gt;refcnt=0. Later, connections can grab a reference
while in an RCU grace period, but if refcnt becomes 0 we can
safely free the dest and its svc.

- dest-&gt;svc becomes an RCU pointer. As a result, we add explicit
RCU locking in ip_vs_in_stats() and ip_vs_out_stats().

- __ip_vs_unbind_svc is renamed to __ip_vs_svc_put(); it
can now free the service immediately or after an RCU grace
period. dest-&gt;svc is no longer set to NULL.

As a result, unlinked dests and their services are always
freed after IP_VS_DEST_TRASH_PERIOD, and unused
services are freed after an RCU grace period.
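
The idle-period rule described above can be sketched roughly as follows.
This is an illustrative Python model, not the kernel code: the Dest class,
the can_free() helper and the 120-second TRASH_PERIOD value are assumptions
made for the example; only the refcnt and idle_start names come from the
text above.

```python
TRASH_PERIOD = 120.0  # assumed idle period in seconds, for illustration only


class Dest:
    """Hypothetical stand-in for an unlinked dest sitting in the trash."""

    def __init__(self):
        self.refcnt = 0         # references held by connections
        self.idle_start = None  # first time refcnt was seen at 0 after deletion


def can_free(dest, now):
    """Return True once the dest has been idle for the whole trash period."""
    if dest.refcnt > 0:
        # Connections still hold the dest, so it must stay in the trash.
        return False
    if dest.idle_start is None:
        # First time we notice refcnt == 0: start the idle period.
        dest.idle_start = now
        return False
    # Free only after a fixed idle period has elapsed since that first notice.
    return now - dest.idle_start >= TRASH_PERIOD
```

A periodic expiry pass would call can_free() on each trashed dest and
release the dest together with its svc when it returns True.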

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: fix overflow on dest weight multiply</title>
<updated>2013-09-18T19:38:53Z</updated>
<author>
<name>Simon Kirby</name>
<email>sim@hostway.ca</email>
</author>
<published>2013-08-10T08:26:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c16526a7b99c1c28e9670a8c8e3dbcf741bb32be'/>
<id>urn:sha1:c16526a7b99c1c28e9670a8c8e3dbcf741bb32be</id>
<content type='text'>
Schedulers such as lblc and lblcr require the weight to be as high as the
maximum number of active connections. In commit b552f7e3a9524abcbcdf
("ipvs: unify the formula to estimate the overhead of processing
connections"), the consideration of inactconns and activeconns was cleaned
up to always count activeconns as 256 times more important than inactconns.
In cases where 3000 or more connections are expected, a weight of 3000 *
256 * 3000 connections overflows the 32-bit signed result used to determine
if rescheduling is required.

On amd64, this merely changes the multiply and comparison instructions to
64-bit. On x86, a 64-bit result is already present from imull, so only
a few more comparison instructions are emitted.
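
The arithmetic behind the overflow can be checked with a short sketch.
It assumes a weight of 3000 and an overhead of 256 per active connection
with 3000 active connections, as in the example above; wrap_int32() is a
hypothetical helper that models a wrapping 32-bit signed result and is not
a kernel function.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed integer can hold


def wrap_int32(value):
    """Model two's-complement wraparound of a 32-bit signed result."""
    return (value + 2**31) % 2**32 - 2**31


weight = 3000             # assumed dest weight
overhead = 256 * 3000     # 3000 active connections, each counted 256 times
product = weight * overhead

print(product)              # 2304000000
print(product > INT32_MAX)  # True: does not fit in 32 signed bits
print(wrap_int32(product))  # -1990967296: wrapped value a 32-bit compare would see
```

With a 64-bit product the comparison sees 2304000000 instead of the
wrapped negative value.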

Signed-off-by: Simon Kirby &lt;sim@hostway.ca&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: add sync_persist_mode flag</title>
<updated>2013-06-26T09:01:46Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-06-24T19:44:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4d0c875dcc4923476f364e83912d134da2df224c'/>
<id>urn:sha1:4d0c875dcc4923476f364e83912d134da2df224c</id>
<content type='text'>
Add sync_persist_mode flag to reduce sync traffic
by syncing only persistent templates.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Tested-by: Aleksey Chudov &lt;aleksey.chudov@gmail.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: replace the SCTP state machine</title>
<updated>2013-06-26T09:01:46Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-06-18T07:08:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=61e7c420b4b2a797ac209106ba743ab6ebe984d8'/>
<id>urn:sha1:61e7c420b4b2a797ac209106ba743ab6ebe984d8</id>
<content type='text'>
Convert the SCTP state table so that it is more readable.
Change the states to follow the diagram in RFC 2960
and add more states suitable for a middlebox. Note that this
change in states introduces an incompatibility if some systems
in a sync setup include this change and others do not.

With this change we also have proper transitions in INPUT-ONLY
mode (DR/TUN), where we see packets only from the client. We
should no longer switch to the 10-second CLOSED state at a time
when we should stay in the ESTABLISHED state.

The state names are short because we have 16 characters of space
in ipvsadm and an 11-character limit in the connection list format.
This is a consequence of the TCP implementation, where the longest
state name is ESTABLISHED.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: sloppy TCP and SCTP</title>
<updated>2013-06-26T09:01:46Z</updated>
<author>
<name>Alexander Frolkin</name>
<email>avf@eldamar.org.uk</email>
</author>
<published>2013-06-13T07:56:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c6c96c188336b2b95d5f14facd101f1e4165a9d3'/>
<id>urn:sha1:c6c96c188336b2b95d5f14facd101f1e4165a9d3</id>
<content type='text'>
This adds support for sloppy TCP and SCTP modes to IPVS.

When enabled (sysctls net.ipv4.vs.sloppy_tcp and
net.ipv4.vs.sloppy_sctp), these modes allow IPVS to create connection
state on any packet, not just a TCP SYN (or SCTP INIT).

This allows connections to fail over from one IPVS director to another
mid-flight.

Signed-off-by: Alexander Frolkin &lt;avf@eldamar.org.uk&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: provide iph to schedulers</title>
<updated>2013-06-26T09:01:45Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-06-16T06:09:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bba54de5bdd107d3841b560f1a9cb0ed06e79533'/>
<id>urn:sha1:bba54de5bdd107d3841b560f1a9cb0ed06e79533</id>
<content type='text'>
Until now, the schedulers needed access only to IP
addresses, which were easy to get from the skb by
using ip_vs_fill_iph_addr_only.

Upcoming changes to the SH scheduler will need the protocol
and ports, which are difficult to get from the skb in the
IPv6 case. As we already have all the data in the iph structure,
provide the iph to schedulers to avoid repeating the same slow lookups.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Hans Schillstrom &lt;hans@schillstrom.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: change type of netns_ipvs-&gt;sysctl_sync_qlen_max</title>
<updated>2013-05-25T23:17:33Z</updated>
<author>
<name>Zhang Yanfei</name>
<email>zhangyanfei@cn.fujitsu.com</email>
</author>
<published>2013-04-29T18:55:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=079956742452494326081349a66942654498cafa'/>
<id>urn:sha1:079956742452494326081349a66942654498cafa</id>
<content type='text'>
This member of struct netns_ipvs is calculated from nr_free_buffer_pages,
so change its type to unsigned long to guard against overflow.  The type
of its related proc var sync_qlen_max and the return type of function
sysctl_sync_qlen_max() should be changed to unsigned long, too.

In addition, the type of ipvs_master_sync_state-&gt;sync_queue_len should be
changed to unsigned long accordingly.

Signed-off-by: Zhang Yanfei &lt;zhangyanfei@cn.fujitsu.com&gt;
Cc: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: fix sparse warnings for some parameters</title>
<updated>2013-04-23T02:43:05Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-04-17T20:50:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0a925864c1038a78fd1cc9b048d9a2b1ae04b63e'/>
<id>urn:sha1:0a925864c1038a78fd1cc9b048d9a2b1ae04b63e</id>
<content type='text'>
Some service fields are in network order:

- netmask: used once in network order and also as prefix len for IPv6
- port

Other parameters are in host order:

- struct ip_vs_flags: flags and mask moved between user and kernel only
- sync state: moved between user and kernel only
- syncid: sent over network as single octet

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: convert services to rcu</title>
<updated>2013-04-01T22:23:58Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-03-22T09:46:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ceec4c3816818459d90c92152e61371ff5b1d5a1'/>
<id>urn:sha1:ceec4c3816818459d90c92152e61371ff5b1d5a1</id>
<content type='text'>
This is the final step in RCU conversion.

Things that are removed:

- svc-&gt;usecnt: now svc is accessed under RCU read lock
- svc-&gt;inc: and some unused code
- ip_vs_bind_pe and ip_vs_unbind_pe: no ability to replace PE
- __ip_vs_svc_lock: replaced with RCU
- IP_VS_WAIT_WHILE: now readers look up svcs and dests under
	RCU and work in parallel with configuration

Other changes:

- previously, an RCU read-side critical section covered the
call to the schedule method; now it is extended to include
service lookup
- ip_vs_svc_table and ip_vs_svc_fwm_table are now using hlist
- svc-&gt;pe and svc-&gt;scheduler remain until the end of the grace period;
	the schedulers are prepared for such RCU readers
	even after done_service is called, but they need
	to use synchronize_rcu because the last ip_vs_scheduler_put
	can happen while RCU read-side critical sections
	use an outdated svc-&gt;scheduler pointer
- as planned, update_service is removed
- empty services can be freed immediately after grace period.
	If dests were present, the services are freed from
	the dest trash code

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: convert dests to rcu</title>
<updated>2013-04-01T22:23:57Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2013-03-22T09:46:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=413c2d04e9494ca38629d8a7ffeff1e4398a9fe3'/>
<id>urn:sha1:413c2d04e9494ca38629d8a7ffeff1e4398a9fe3</id>
<content type='text'>
In previous commits the schedulers started to access
svc-&gt;destinations with _rcu list traversal primitives
because the IP_VS_WAIT_WHILE macro still plays the role of
a grace period. Now it is time to finish the updating part,
i.e. adding and deleting dests with the _rcu suffix, before
removing IP_VS_WAIT_WHILE in the next commit.

We use the same rule for conns as for the
schedulers: dests can be searched in an RCU read-side critical
section where ip_vs_dest_hold can be called by ip_vs_bind_dest.

Some things are not perfect, for example, calling
functions like ip_vs_lookup_dest from updating code under
RCU, just because the same function is used by both readers
and updaters.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
</feed>
