<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/net/ip.h, branch v6.17</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.17</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.17'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-07-02T21:32:30Z</updated>
<entry>
<title>ipv4: adopt dst_dev, skb_dst_dev and skb_dst_dev_net[_rcu]</title>
<updated>2025-07-02T21:32:30Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-06-30T12:19:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a74fc62eec155ca5a6da8ff3856f3dc87fe24558'/>
<id>urn:sha1:a74fc62eec155ca5a6da8ff3856f3dc87fe24558</id>
<content type='text'>
Use the new helpers as a first step to deal with
potential dst-&gt;dev races.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20250630121934.3399505-8-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: dst: annotate data-races around dst-&gt;expires</title>
<updated>2025-07-02T21:32:29Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-06-30T12:19:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=36229b2caca2228b834c03fb83867022485a0563'/>
<id>urn:sha1:36229b2caca2228b834c03fb83867022485a0563</id>
<content type='text'>
(dst_entry)-&gt;expires is read and written locklessly,
add corresponding annotations.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@google.com&gt;
Link: https://patch.msgid.link/20250630121934.3399505-3-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: ipv4: Add ip_mr_output()</title>
<updated>2025-06-18T01:18:45Z</updated>
<author>
<name>Petr Machata</name>
<email>petrm@nvidia.com</email>
</author>
<published>2025-06-16T22:44:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35bec72a24ace52a7f57642ff2813f22733b08fd'/>
<id>urn:sha1:35bec72a24ace52a7f57642ff2813f22733b08fd</id>
<content type='text'>
Multicast routing is today handled in the input path. Locally generated MC
packets don't hit the IPMR code today. Thus if a VXLAN remote address is
multicast, the driver needs to set an OIF during route lookup. Thus MC
routing configuration needs to be kept in sync with the VXLAN FDB and MDB.
Ideally, the VXLAN packets would be routed by the MC routing code instead.

To that end, this patch adds support to route locally generated multicast
packets. The newly-added routines do largely what ip_mr_input() and
ip_mr_forward() do: make an MR cache lookup to find where to send the
packets, and use ip_mc_output() to send each of them. When no cache entry
is found, the packet is punted to the daemon for resolution.

However, an installation that uses a VXLAN underlay netdevice for which it
also has matching MC routes, would get a different routing with this patch.
Previously, the MC packets would be delivered directly to the underlay
port, whereas now they would be MC-routed. In order to avoid this change in
behavior, introduce an IPCB flag. Only if the flag is set will
ip_mr_output() actually engage, otherwise it reverts to ip_mc_output().

This code is based on work by Roopa Prabhu and Nikolay Aleksandrov.

Signed-off-by: Roopa Prabhu &lt;roopa@nvidia.com&gt;
Signed-off-by: Nikolay Aleksandrov &lt;razor@blackwall.org&gt;
Signed-off-by: Benjamin Poirier &lt;bpoirier@nvidia.com&gt;
Signed-off-by: Petr Machata &lt;petrm@nvidia.com&gt;
Reviewed-by: Ido Schimmel &lt;idosch@nvidia.com&gt;
Link: https://patch.msgid.link/0aadbd49330471c0f758d54afb05eb3b6e3a6b65.1750113335.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: use netif_disable_lro in ipv6_add_dev</title>
<updated>2025-04-03T22:32:08Z</updated>
<author>
<name>Stanislav Fomichev</name>
<email>sdf@fomichev.me</email>
</author>
<published>2025-04-01T16:34:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8965c160b8f7333df895321c8aa6bad4a7175f2b'/>
<id>urn:sha1:8965c160b8f7333df895321c8aa6bad4a7175f2b</id>
<content type='text'>
ipv6_add_dev might call dev_disable_lro which unconditionally grabs
instance lock, so it will deadlock during NETDEV_REGISTER. Switch
to netif_disable_lro.

Make sure all callers hold the instance lock as well.

Cc: Cosmin Ratiu &lt;cratiu@nvidia.com&gt;
Fixes: ad7c7b2172c3 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev &lt;sdf@fomichev.me&gt;
Link: https://patch.msgid.link/20250401163452.622454-4-sdf@fomichev.me
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>inet: call inet6_ehashfn() once from inet6_hash_connect()</title>
<updated>2025-03-06T23:26:02Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-03-05T03:45:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d4438ce68bf145aa1d7ed03ebf3b8ece337e3f64'/>
<id>urn:sha1:d4438ce68bf145aa1d7ed03ebf3b8ece337e3f64</id>
<content type='text'>
inet6_ehashfn() being called from __inet6_check_established()
has a big impact on performance, as shown in the Tested section.

After prior patch, we can compute the hash for port 0
from inet6_hash_connect(), and derive each hash in
__inet_hash_connect() from this initial hash:

hash(saddr, lport, daddr, dport) == hash(saddr, 0, daddr, dport) + lport

Apply the same principle for __inet_check_established(),
although inet_ehashfn() has a smaller cost.

Tested:

Server: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog
Client: ulimit -n 40000; neper/tcp_crr -T 200 -F 30000 -6 --nolog -c -H server

Before this patch:

  utime_start=0.286131
  utime_end=4.378886
  stime_start=11.952556
  stime_end=1991.655533
  num_transactions=1446830
  latency_min=0.001061085
  latency_max=12.075275028
  latency_mean=0.376375302
  latency_stddev=1.361969596
  num_samples=306383
  throughput=151866.56

perf top:

 50.01%  [kernel]       [k] __inet6_check_established
 20.65%  [kernel]       [k] __inet_hash_connect
 15.81%  [kernel]       [k] inet6_ehashfn
  2.92%  [kernel]       [k] rcu_all_qs
  2.34%  [kernel]       [k] __cond_resched
  0.50%  [kernel]       [k] _raw_spin_lock
  0.34%  [kernel]       [k] sched_balance_trigger
  0.24%  [kernel]       [k] queued_spin_lock_slowpath

After this patch:

  utime_start=0.315047
  utime_end=9.257617
  stime_start=7.041489
  stime_end=1923.688387
  num_transactions=3057968
  latency_min=0.003041375
  latency_max=7.056589232
  latency_mean=0.141075048    # Better latency metrics
  latency_stddev=0.526900516
  num_samples=312996
  throughput=320677.21        # 111 % increase, and 229 % for the series

perf top: inet6_ehashfn is no longer seen.

 39.67%  [kernel]       [k] __inet_hash_connect
 37.06%  [kernel]       [k] __inet6_check_established
  4.79%  [kernel]       [k] rcu_all_qs
  3.82%  [kernel]       [k] __cond_resched
  1.76%  [kernel]       [k] sched_balance_domains
  0.82%  [kernel]       [k] _raw_spin_lock
  0.81%  [kernel]       [k] sched_balance_rq
  0.81%  [kernel]       [k] sched_balance_trigger
  0.76%  [kernel]       [k] queued_spin_lock_slowpath

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Tested-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Reviewed-by: Jason Xing &lt;kerneljasonxing@gmail.com&gt;
Link: https://patch.msgid.link/20250305034550.879255-3-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv4: remove get_rttos</title>
<updated>2025-02-19T02:27:19Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2025-02-14T22:27:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9329b58395e51bba9c847419cc4ba176df3dd2b7'/>
<id>urn:sha1:9329b58395e51bba9c847419cc4ba176df3dd2b7</id>
<content type='text'>
Initialize the ip cookie tos field when initializing the cookie, in
ipcm_init_sk.

The existing code inverts the standard pattern for initializing cookie
fields. Default is to initialize the field from the sk, then possibly
overwrite that when parsing cmsgs (the unlikely case).

This field inverts that, setting the field to an illegal value and
after cmsg parsing checking whether the value is still illegal and
thus should be overridden.

Be careful to always apply mask INET_DSCP_MASK, as before.

Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://patch.msgid.link/20250214222720.3205500-5-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv4: initialize inet socket cookies with sockcm_init</title>
<updated>2025-02-19T02:27:19Z</updated>
<author>
<name>Willem de Bruijn</name>
<email>willemb@google.com</email>
</author>
<published>2025-02-14T22:27:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94788792f37902f1f4d417f6f9663831cf7e91fc'/>
<id>urn:sha1:94788792f37902f1f4d417f6f9663831cf7e91fc</id>
<content type='text'>
Avoid open coding the same logic.

Signed-off-by: Willem de Bruijn &lt;willemb@google.com&gt;
Reviewed-by: David Ahern &lt;dsahern@kernel.org&gt;
Link: https://patch.msgid.link/20250214222720.3205500-4-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL()</title>
<updated>2025-02-14T21:09:38Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-02-12T13:24:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=54568a84c95bdea20227cf48d41f198d083e78dd'/>
<id>urn:sha1:54568a84c95bdea20227cf48d41f198d083e78dd</id>
<content type='text'>
We have many EXPORT_SYMBOL(x) in networking tree because IPv6
can be built as a module.

CONFIG_IPV6=y is becoming the norm.

Define a EXPORT_IPV6_MOD(x) which only exports x
for modular IPv6.

Same principle applies to EXPORT_IPV6_MOD_GPL()

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Reviewed-by: Mateusz Polchlopek &lt;mateusz.polchlopek@intel.com&gt;
Link: https://patch.msgid.link/20250212132418.1524422-2-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipv4: use RCU protection in ip_dst_mtu_maybe_forward()</title>
<updated>2025-02-07T00:14:14Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2025-02-05T15:51:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=071d8012869b6af352acca346ade13e7be90a49f'/>
<id>urn:sha1:071d8012869b6af352acca346ade13e7be90a49f</id>
<content type='text'>
ip_dst_mtu_maybe_forward() must use RCU protection to make
sure the net structure it reads does not disappear.

Fixes: f87c10a8aa1e8 ("ipv4: introduce ip_dst_mtu_maybe_forward and protect forwarding path against pmtu spoofing")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Kuniyuki Iwashima &lt;kuniyu@amazon.com&gt;
Link: https://patch.msgid.link/20250205155120.1676781-4-edumazet@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>sock: support SO_PRIORITY cmsg</title>
<updated>2024-12-17T02:13:44Z</updated>
<author>
<name>Anna Emese Nyiri</name>
<email>annaemesenyiri@gmail.com</email>
</author>
<published>2024-12-13T08:44:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a32f3e9d1ed146f81162702605d65447a319eb76'/>
<id>urn:sha1:a32f3e9d1ed146f81162702605d65447a319eb76</id>
<content type='text'>
The Linux socket API currently allows setting SO_PRIORITY at the
socket level, applying a uniform priority to all packets sent through
that socket. The exception to this is IP_TOS, when the priority value
is calculated during the handling of
ancillary data, as implemented in commit f02db315b8d8 ("ipv4: IP_TOS
and IP_TTL can be specified as ancillary data").
However, this is a computed
value, and there is currently no mechanism to set a custom priority
via control messages prior to this patch.

According to this patch, if SO_PRIORITY is specified as ancillary data,
the packet is sent with the priority value set through
sockc-&gt;priority, overriding the socket-level values
set via the traditional setsockopt() method. This is analogous to
the existing support for SO_MARK, as implemented in
commit c6af0c227a22 ("ip: support SO_MARK cmsg").

If both cmsg SO_PRIORITY and IP_TOS are passed, then the one that
takes precedence is the last one in the cmsg list.

This patch has the side effect that raw_send_hdrinc now interprets cmsg
IP_TOS.

Reviewed-by: Willem de Bruijn &lt;willemb@google.com&gt;
Suggested-by: Ferenc Fejes &lt;fejes@inf.elte.hu&gt;
Signed-off-by: Anna Emese Nyiri &lt;annaemesenyiri@gmail.com&gt;
Link: https://patch.msgid.link/20241213084457.45120-3-annaemesenyiri@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
</feed>
