linux - Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Age	Commit message (Collapse)	Author	Lines
2026-04-24	Merge tag 'net-deletions' of ↵	Linus Torvalds	-1989/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking deletions from Jakub Kicinski: "Delete some obsolete networking code Old code like amateur radio and NFC have long been a burden to core networking developers. syzbot loves to find bugs in BKL-era code, and noobs try to fix them. If we want to have a fighting chance of surviving the LLM-pocalypse this code needs to find a dedicated owner or get deleted. We've talked about these deletions multiple times in the past and every time someone wanted the code to stay. It is never very clear to me how many of those people actually use the code vs are just nostalgic to see it go. Amateur radio did have occasional users (or so I think) but most users switched to user space implementations since its all super slow stuff. Nobody stepped up to maintain the kernel code. We were lucky enough to find someone who wants to help with NFC so we're giving that a chance. Let's try to put the rest of this code behind us" * tag 'net-deletions' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: drivers: net: 8390: wd80x3: Remove this driver drivers: net: 8390: ultra: Remove this driver drivers: net: 8390: AX88190: Remove this driver drivers: net: fujitsu: fmvj18x: Remove this driver drivers: net: smsc: smc91c92: Remove this driver drivers: net: smsc: smc9194: Remove this driver drivers: net: amd: nmclan: Remove this driver drivers: net: amd: lance: Remove this driver drivers: net: 3com: 3c589: Remove this driver drivers: net: 3com: 3c574: Remove this driver drivers: net: 3com: 3c515: Remove this driver drivers: net: 3com: 3c509: Remove this driver net: packetengines: remove obsolete yellowfin driver and vendor dir net: packetengines: remove obsolete hamachi driver net: remove unused ATM protocols and legacy ATM device drivers net: remove ax25 and amateur radio (hamradio) subsystem net: remove ISDN subsystem and Bluetooth CMTP caif: remove CAIF NETWORK LAYER
2026-04-23	drivers: net: smsc: smc91c92: Remove this driver	Andrew Lunn	-49/+0
	The smc91c92 was written by David A Hinds in 1999. It is an PCMCIA device, so unlikely to be used with modern kernels. Remove the Documentation as well, since it refers to kernel versions 1.2.13 until 1.3.71 and FTP sites which no longer exist. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260422-v7-0-0-net-next-driver-removal-v1-v2-8-08a5b59784d5@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23	drivers: net: 3com: 3c509: Remove this driver	Andrew Lunn	-250/+0
	The 3c509 was written by Donald Becker between 1993-2000. It is an ISA device, so unlikely to be used with modern kernels. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20260422-v7-0-0-net-next-driver-removal-v1-v2-1-08a5b59784d5@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23	net: remove unused ATM protocols and legacy ATM device drivers	Jakub Kicinski	-261/+0
	Remove the ATM protocol modules and PCI/SBUS ATM device drivers that are no longer in active use. The ATM core protocol stack, PPPoATM, BR2684, and USB DSL modem drivers (drivers/usb/atm/) are retained in-tree to maintain PPP over ATM (PPPoA) and PPPoE-over-BR2684 support for DSL connections. The Solos ADSL2+ PCI driver is also retained. Removed ATM protocol modules: - net/atm/clip.c - Classical IP over ATM (RFC 2225) - net/atm/lec.c - LAN Emulation Client (LANE) - net/atm/mpc.c, mpoa_caches.c, mpoa_proc.c - Multi-Protocol Over ATM Removed PCI/SBUS ATM device drivers (drivers/atm/): - adummy, atmtcp - software/testing ATM devices - eni - Efficient Networks ENI155P (OC-3, ~1995) - fore200e - FORE Systems 200E PCI/SBUS (OC-3, ~1999) - he - ForeRunner HE (OC-3/OC-12, ~2000) - idt77105 - IDT 77105 25 Mbps ATM PHY - idt77252 - IDT 77252 NICStAR II (OC-3, ~2000) - iphase - Interphase ATM PCI (OC-3/DS3/E3) - lanai - Efficient Networks Speedstream 3010 - nicstar - IDT 77201 NICStAR (155/25 Mbps, ~1999) - suni - PMC S/UNI SONET PHY library Also clean up references in: - net/bridge/ - remove ATM LANE hook (br_fdb_test_addr_hook, br_fdb_test_addr) - net/core/dev.c - remove br_fdb_test_addr_hook export - defconfig files - remove ATM driver config options The removed code is moved to an out-of-tree module package (mod-orphan). Acked-by: Andy Shevchenko <andriy.shevchenko@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260422041846.2035118-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-23	net: remove ax25 and amateur radio (hamradio) subsystem	Jakub Kicinski	-1083/+0
	Remove the amateur radio (AX.25, NET/ROM, ROSE) protocol implementation and all associated hamradio device drivers from the kernel tree. This set of protocols has long been a huge bug/syzbot magnet, and since nobody stepped up to help us deal with the influx of the AI-generated bug reports we need to move it out of tree to protect our sanity. The code is moved to an out-of-tree repo: https://github.com/linux-netdev/mod-orphan if it's cleaned up and reworked there we can accept it back. Minimal stub headers are kept for include/net/ax25.h (AX25_P_IP, AX25_ADDR_LEN, ax25_address) and include/net/rose.h (ROSE_ADDR_LEN) so that the conditional integration code in arp.c and tun.c continues to compile and work when the out-of-tree modules are loaded. Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Acked-by: Carlos Bilbao <carlos.bilbao@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Acked-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://patch.msgid.link/20260421021824.1293976-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-23	caif: remove CAIF NETWORK LAYER	Jakub Kicinski	-346/+0
	Remove CAIF (Communication CPU to Application CPU Interface), the ST-Ericsson modem protocol. The subsystem has been orphaned since 2013. The last meaningful changes from the maintainers were in March 2013: a8c7687bf216 ("caif_virtio: Check that vringh_config is not null") b2273be8d2df ("caif_virtio: Use vringh_notify_enable correctly") 0d2e1a2926b1 ("caif_virtio: Introduce caif over virtio") Not-so-coincidentally, according to "the Internet" ST-Ericsson officially shut down its modem joint venture in Aug 2013. If anyone is using this code please yell! In the 13 years since, the code has accumulated 200 non-merge commits, of which 71 were cross-tree API changes, 21 carried Fixes: tags, and the remaining ~110 were cleanups, doc conversions, treewide refactors, and one partial removal (caif_hsi, ca75bcf0a83b). We are still getting fixes to this code, in the last 10 days there were 3 reports on security@ about CAIF that I have been CCed on. UAPI constants (AF_CAIF, ARPHRD_CAIF, N_CAIF, VIRTIO_ID_CAIF) and the SELinux classmap entry are intentionally kept for ABI stability. Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Linus Walleij <linusw@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260416182829.1440262-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-21	net: move promiscuity handling into netdev_rx_mode_work	Stanislav Fomichev	-0/+4
	Move unicast promiscuity tracking into netdev_rx_mode_work so it runs under netdev_ops_lock instead of under the addr_lock spinlock. This is required because __dev_set_promiscuity calls dev_change_rx_flags and __dev_notify_flags, both of which may need to sleep. Change ASSERT_RTNL() to netdev_ops_assert_locked() in __dev_set_promiscuity, netif_set_allmulti and __dev_change_flags since these are now called from the work queue under the ops lock. Link: https://lore.kernel.org/netdev/20260214033859.43857-1-jiayuan.chen@linux.dev/ Fixes: 78cd408356fe ("net: add missing instance lock to dev_set_promiscuity") Reported-by: syzbot+2b3391f44313b3983e91@syzkaller.appspotmail.com Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-5-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-21	net: introduce ndo_set_rx_mode_async and netdev_rx_mode_work	Stanislav Fomichev	-0/+9
	Add ndo_set_rx_mode_async callback that drivers can implement instead of the legacy ndo_set_rx_mode. The legacy callback runs under the netif_addr_lock spinlock with BHs disabled, preventing drivers from sleeping. The async variant runs from a work queue with rtnl_lock and netdev_lock_ops held, in fully sleepable context. When __dev_set_rx_mode() sees ndo_set_rx_mode_async, it schedules netdev_rx_mode_work instead of calling the driver inline. The work function takes two snapshots of each address list (uc/mc) under the addr_lock, then drops the lock and calls the driver with the work copies. After the driver returns, it reconciles the snapshots back to the real lists under the lock. Add netif_rx_mode_sync() to opportunistically execute the pending workqueue update inline, so that rx mode changes are committed before returning to userspace: - dev_change_flags (SIOCSIFFLAGS / RTM_NEWLINK) - dev_set_promiscuity - dev_set_allmulti - dev_ifsioc SIOCADDMULTI / SIOCDELMULTI - do_setlink (RTM_SETLINK) Note that some deep hierarchies still do skip the lower updates via: - dev_uc_sync - dev_mc_sync If we do end up hitting user-visible issues, we can add more calls to netif_rx_mode_sync in specific places. But hopefully we should not, the actual user-visible lists are still synced, it's that just HW state that might be lagging. Signed-off-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20260416185712.2155425-3-sdf@fomichev.me Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-04-12	Merge tag 'nf-next-26-04-10' of ↵	Jakub Kicinski	-0/+37
	https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Florian Westphal says: ==================== netfilter: updates for net-next 1-3) IPVS updates from Julian Anastasov to enhance visibility into IPVS internal state by exposing hash size, load factor etc and allows userspace to tune the load factor used for resizing hash tables. 4) reject empty/not nul terminated device names from xt_physdev. This isn't a bug fix; existing code doesn't require a c-string. But clean this up anyway because conceptually the interface name definitely should be a c-string. 5) Switch nfnetlink to skb_mac_header helpers that didn't exist back when this code was written. This gives us additional debug checks but is not intended to change functionality. 6) Let the xt ttl/hoplimit match reject unknown operator modes. This is a cleanup, the evaluation function simply returns false when the mode is out of range. From Marino Dzalto. 7) xt_socket match should enable defrag after all other checks. This bug is harmless, historically defrag could not be disabled either except by rmmod. 8) remove UDP-Lite conntrack support, from Fernando Fernandez Mancera. 9) Avoid a couple -Wflex-array-member-not-at-end warnings in the old xtables 32bit compat code, from Gustavo A. R. Silva. 10) nftables fwd expression should drop packets when their ttl/hl has expired. This is a bug fix deferred, its not deemed important enough for -rc8. 11) Add additional checks before assuming the mac header is an ethernet header, from Zhengchuan Liang. * tag 'nf-next-26-04-10' of https://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: require Ethernet MAC header before using eth_hdr() netfilter: nft_fwd_netdev: check ttl/hl before forwarding netfilter: x_tables: Avoid a couple -Wflex-array-member-not-at-end warnings netfilter: conntrack: remove UDP-Lite conntrack support netfilter: xt_socket: enable defrag after all other checks netfilter: xt_HL: add pr_fmt and checkentry validation netfilter: nfnetlink: prefer skb_mac_header helpers netfilter: x_physdev: reject empty or not-nul terminated device names ipvs: add conn_lfactor and svc_lfactor sysctl vars ipvs: add ip_vs_status info ipvs: show the current conn_tab size to users ==================== Link: https://patch.msgid.link/20260410112352.23599-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10	docs: net: bridge: document stp_mode attribute	Andy Roulin	-0/+22
	Add documentation for the IFLA_BR_STP_MODE bridge attribute in the "User space STP helper" section of the bridge documentation. Reference the BR_STP_MODE_* values via kernel-doc and describe the use case for network namespace environments. Reviewed-by: Ido Schimmel <idosch@nvidia.com> Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com> Signed-off-by: Andy Roulin <aroulin@nvidia.com> Link: https://patch.msgid.link/20260405205224.3163000-3-aroulin@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-10	ipvs: add conn_lfactor and svc_lfactor sysctl vars	Julian Anastasov	-0/+37
	Allow the default load factor for the connection and service tables to be configured. Signed-off-by: Julian Anastasov <ja@ssi.bg> Signed-off-by: Florian Westphal <fw@strlen.de>
2026-04-09	net: Implement netdev_nl_queue_create_doit	Daniel Borkmann	-0/+6
	Implement netdev_nl_queue_create_doit which creates a new rx queue in a virtual netdev and then leases it to a rx queue in a physical netdev. Example with ynl client: # ynl --family netdev --output-json --do queue-create \ --json '{"ifindex": 8, "type": "rx", "lease": {"ifindex": 4, "queue": {"type": "rx", "id": 15}}}' {'id': 1} Note that the netdevice locking order is always from the virtual to the physical device. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: David Wei <dw@davidwei.uk> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org> Link: https://patch.msgid.link/20260402231031.447597-3-daniel@iogearbox.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08	devlink: Document resource scope filtering	Or Har-Toov	-0/+35
	Document the scope parameter for devlink resource show, which allows filtering the dump to device-level or port-level resources only. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-13-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08	devlink: Document port-level resources and full dump	Or Har-Toov	-0/+35
	Document the port-level resource support and the option to dump all resources, including both device-level and port-level entries. Signed-off-by: Or Har-Toov <ohartoov@nvidia.com> Reviewed-by: Shay Drori <shayd@nvidia.com> Reviewed-by: Moshe Shemesh <moshe@nvidia.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Link: https://patch.msgid.link/20260407194107.148063-10-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-04-08	net: dsa: remove struct platform_data	Vladimir Oltean	-5/+0
	This is not used anywhere in the kernel. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20260406212158.721806-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-26	docs/mlx5: Fix typo subfuction	Ryohei Kinugawa	-2/+2
	Fix two typos: - 'Subfunctons' -> 'Subfunctions' - 'subfuction' -> 'subfunction' Reviewed-by: Joe Damato <joe@dama.to> Signed-off-by: Ryohei Kinugawa <ryohei.kinugawa@gmail.com> Reviewed-by: Tariq Toukan <tariqt@nvidia.com> Acked-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20260324053416.70166-1-ryohei.kinugawa@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-18	net: ethtool: add ethtool COALESCE_RX_CQE_FRAMES/NSECS	Haiyang Zhang	-0/+11
	Add two parameters for drivers supporting Rx CQE coalescing / descriptor writeback. ETHTOOL_A_COALESCE_RX_CQE_FRAMES: Maximum number of frames that can be coalesced into a CQE or writeback. ETHTOOL_A_COALESCE_RX_CQE_NSECS: Max time in nanoseconds after the first packet arrival in a coalesced CQE or writeback to be sent. Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com> Link: https://patch.msgid.link/20260317191826.1346111-2-haiyangz@linux.microsoft.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14	documentation: networking: add shared devlink documentation	Jiri Pirko	-0/+98
	Document shared devlink instances for multiple PFs on the same chip. Signed-off-by: Jiri Pirko <jiri@nvidia.com> Link: https://patch.msgid.link/20260312100407.551173-13-jiri@resnulli.us Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-14	tcp: implement RFC 7323 window retraction receiver requirements	Simon Baatz	-0/+1
	By default, the Linux TCP implementation does not shrink the advertised window (RFC 7323 calls this "window retraction") with the following exceptions: - When an incoming segment cannot be added due to the receive buffer running out of memory. Since commit 8c670bdfa58e ("tcp: correct handling of extreme memory squeeze") a zero window will be advertised in this case. It turns out that reaching the required memory pressure is easy when window scaling is in use. In the simplest case, sending a sufficient number of segments smaller than the scale factor to a receiver that does not read data is enough. - Commit b650d953cd39 ("tcp: enforce receive buffer memory limits by allowing the tcp window to shrink") addressed the "eating memory" problem by introducing a sysctl knob that allows shrinking the window before running out of memory. However, RFC 7323 does not only state that shrinking the window is necessary in some cases, it also formulates requirements for TCP implementations when doing so (Section 2.4). This commit addresses the receiver-side requirements: After retracting the window, the peer may have a snd_nxt that lies within a previously advertised window but is now beyond the retracted window. This means that all incoming segments (including pure ACKs) will be rejected until the application happens to read enough data to let the peer's snd_nxt be in window again (which may be never). To comply with RFC 7323, the receiver MUST honor any segment that would have been in window for any ACK sent by the receiver and, when window scaling is in effect, SHOULD track the maximum window sequence number it has advertised. This patch tracks that maximum window sequence number rcv_mwnd_seq throughout the connection and uses it in tcp_sequence() when deciding whether a segment is acceptable. rcv_mwnd_seq is updated together with rcv_wup and rcv_wnd in tcp_select_window(). If we count tcp_sequence() as fast path, it is read in the fast path. Therefore, rcv_mwnd_seq is put into rcv_wnd's cacheline group. The logic for handling received data in tcp_data_queue() is already sufficient and does not need to be updated. Signed-off-by: Simon Baatz <gmbnomis@gmail.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260309-tcp_rfc7323_retract_wnd_rfc-v3-1-4c7f96b1ec69@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-12	docs: octeontx2: fix typo in documentation	ShravyaPanchagiri	-1/+1
	Fix spelling mistake "Crate" to "Create" in the documentation. Signed-off-by: ShravyaPanchagiri <shravy112@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20260311030450.8461-1-shravy112@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-11	tcp: add sysctl_tcp_shrink_window to netns_ipv4_sysctl.rst	Eric Dumazet	-0/+1
	Add missing entry for sysctl_tcp_shrink_window. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260310073855.564927-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10	inet: add ip_local_port_step_width sysctl to improve port usage distribution	Fernando Fernandez Mancera	-0/+17
	With the current port selection algorithm, ports after a reserved port range or long time used port are used more often than others [1]. This causes an uneven port usage distribution. This combines with cloud environments blocking connections between the application server and the database server if there was a previous connection with the same source port, leading to connectivity problems between applications on cloud environments. The real issue here is that these firewalls cannot cope with standards-compliant port reuse. This is a workaround for such situations and an improvement on the distribution of ports selected. The proposed solution is to implement a variant of RFC 6056 Algorithm 5. The step size is selected randomly on every connect() call ensuring it is a coprime with respect to the size of the range of ports we want to scan. This way, we can ensure that all ports within the range are scanned before returning an error. To enable this algorithm, the user must configure the new sysctl option "net.ipv4.ip_local_port_step_width". In addition, on graphs generated we can observe that the distribution of source ports is more even with the proposed approach. [2] [1] https://0xffsoftware.com/port_graph_current_alg.html [2] https://0xffsoftware.com/port_graph_random_step_alg.html Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Link: https://patch.msgid.link/20260309023946.5473-2-fmancera@suse.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10	net/smc: Add documentation for limit_smc_hs and hs_ctrl	Kyoji Ogasawara	-0/+27
	Document missing SMC sysctl parameters limit_smc_hs and hs_ctrl Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: D. Wythe<alibuda@linux.alibaba.com> Link: https://patch.msgid.link/20260309124541.22723-3-sawara04.o@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-10	net/smc: fix indentation in smcr_buf_type section	Kyoji Ogasawara	-8/+8
	smcr_buf_type section used inconsistent indentation compared with the rest of this document. Signed-off-by: Kyoji Ogasawara <sawara04.o@gmail.com> Reviewed-by: D. Wythe<alibuda@linux.alibaba.com> Link: https://patch.msgid.link/20260309124541.22723-2-sawara04.o@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-03-04	net-sysfs: use rps_tag_ptr and remove metadata from rps_sock_flow_table	Eric Dumazet	-4/+9
	Instead of storing the @mask at the beginning of rps_sock_flow_table, use 5 low order bits of the rps_tag_ptr to store the log of the size. This removes a potential cache line miss to fetch @mask. More importantly, we can switch to vmalloc_huge() without wasting memory. Tested with: numactl --interleave=all bash -c "echo 4194304 >/proc/sys/net/core/rps_sock_flow_entries" Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260302181432.1836150-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-27	net/handshake: Fixed grammar mistake	Leon Kral	-1/+1
	The word "a" was used instead of "an" which is grammatically incorrect. Fixed by changing from "a" to "an". This improves readability of the documentation. Signed-off-by: Leon Kral <leon.j.kral@protonmail.com> Reviewed-by: Alistair Francis <alistair.francis@wdc.com> Link: https://patch.msgid.link/20260227001151.41610-1-leon.j.kral@protonmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-26	docs: ethtool: clarify the bit-by-bit bitset format description	Yohei Kojima	-4/+8
	Clarify the bit-by-bit bitset format's behavior around mandatory attributes and bit identification. More specifically, the following changes are made: * Rephrase a misleading sentence which implies name and index are mutually exclusive * Describe that ETHTOOL_A_BITSET_BITS nest is mandatory * Describe that a request fails if inconsistent identifiers are given Signed-off-by: Yohei Kojima <yk@y-koj.net> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/ef90a56965ca66e57aa177929ce3e10c5ca815fa.1772031974.git.yk@y-koj.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-26	docs: net: document neigh gc_interval sysctl	Gabriel Goller	-0/+7
	Add entry for the neigh/default/gc_interval sysctl. This sysctl is unused since kernel v2.6.8. Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Gabriel Goller <g.goller@proxmox.com> Link: https://patch.msgid.link/20260225095822.44050-1-g.goller@proxmox.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-24	icmp: increase net.ipv4.icmp_msgs_{per_sec,burst}	Eric Dumazet	-3/+3
	These sysctls were added in 4cdf507d5452 ("icmp: add a global rate limitation") and their default values might be too small. Some network tools send probes to closed UDP ports from many hosts to estimate proportion of packet drops on a particular target. This patch sets both sysctls to 10000. Note the per-peer rate-limit (as described in RFC 4443 2.4 (f)) intent is still enforced. This also increases security, see b38e7819cae9 ("icmp: randomize the global rate limiter") for reference. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20260223161742.929830-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-24	docs: net: document neigh gc_stale_time sysctl	Gabriel Goller	-0/+11
	Add missing documentation for a neighbor table garbage collector sysctl parameter in ip-sysctl.rst: neigh/default/gc_stale_time: controls how long an unused neighbor entry is kept before becoming eligible for garbage collection (default: 60 seconds) Signed-off-by: Gabriel Goller <g.goller@proxmox.com> Link: https://patch.msgid.link/20260223101257.47563-1-g.goller@proxmox.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-18	ipv6: icmp: remove obsolete code in icmpv6_xrlim_allow()	Eric Dumazet	-3/+4
	Following part was needed before the blamed commit, because inet_getpeer_v6() second argument was the prefix. /* Give more bandwidth to wider prefixes. */ if (rt->rt6i_dst.plen < 128) tmo >>= ((128 - rt->rt6i_dst.plen)>>5); Now inet_getpeer_v6() retrieves hosts, we need to remove @tmo adjustement or wider prefixes likes /24 allow 8x more ICMP to be sent for a given ratelimit. As we had this issue for a while, this patch changes net.ipv6.icmp.ratelimit default value from 1000ms to 100ms to avoid potential regressions. Also add a READ_ONCE() when reading net->ipv6.sysctl.icmpv6_time. Fixes: fd0273d7939f ("ipv6: Remove external dependency on rt6i_dst and rt6i_src") Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Cc: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20260216142832.3834174-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-02-11	Merge tag 'net-next-7.0' of ↵	Linus Torvalds	-229/+180
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Paolo Abeni: "Core & protocols: - A significant effort all around the stack to guide the compiler to make the right choice when inlining code, to avoid unneeded calls for small helper and stack canary overhead in the fast-path. This generates better and faster code with very small or no text size increases, as in many cases the call generated more code than the actual inlined helper. - Extend AccECN implementation so that is now functionally complete, also allow the user-space enabling it on a per network namespace basis. - Add support for memory providers with large (above 4K) rx buffer. Paired with hw-gro, larger rx buffer sizes reduce the number of buffers traversing the stack, dincreasing single stream CPU usage by up to ~30%. - Do not add HBH header to Big TCP GSO packets. This simplifies the RX path, the TX path and the NIC drivers, and is possible because user-space taps can now interpret correctly such packets without the HBH hint. - Allow IPv6 routes to be configured with a gateway address that is resolved out of a different interface than the one specified, aligning IPv6 to IPv4 behavior. - Multi-queue aware sch_cake. This makes it possible to scale the rate shaper of sch_cake across multiple CPUs, while still enforcing a single global rate on the interface. - Add support for the nbcon (new buffer console) infrastructure to netconsole, enabling lock-free, priority-based console operations that are safer in crash scenarios. - Improve the TCP ipv6 output path to cache the flow information, saving cpu cycles, reducing cache line misses and stack use. - Improve netfilter packet tracker to resolve clashes for most protocols, avoiding unneeded drops on rare occasions. - Add IP6IP6 tunneling acceleration to the flowtable infrastructure. - Reduce tcp socket size by one cache line. - Notify neighbour changes atomically, avoiding inconsistencies between the notification sequence and the actual states sequence. - Add vsock namespace support, allowing complete isolation of vsocks across different network namespaces. - Improve xsk generic performances with cache-alignment-oriented optimizations. - Support netconsole automatic target recovery, allowing netconsole to reestablish targets when underlying low-level interface comes back online. Driver API: - Support for switching the working mode (automatic vs manual) of a DPLL device via netlink. - Introduce PHY ports representation to expose multiple front-facing media ports over a single MAC. - Introduce "rx-polarity" and "tx-polarity" device tree properties, to generalize polarity inversion requirements for differential signaling. - Add helper to create, prepare and enable managed clocks. Device drivers: - Add Huawei hinic3 PF etherner driver. - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet controller. - Add ethernet driver for MaxLinear MxL862xx switches - Remove parallel-port Ethernet driver. - Convert existing driver timestamp configuration reporting to hwtstamp_get and remove legacy ioctl(). - Convert existing drivers to .get_rx_ring_count(), simplifing the RX ring count retrieval. Also remove the legacy fallback path. - Ethernet high-speed NICs: - Broadcom (bnxt, bng): - bnxt: add FW interface update to support FEC stats histogram and NVRAM defragmentation - bng: add TSO and H/W GRO support - nVidia/Mellanox (mlx5): - improve latency of channel restart operations, reducing the used H/W resources - add TSO support for UDP over GRE over VLAN - add flow counters support for hardware steering (HWS) rules - use a static memory area to store headers for H/W GRO, leading to 12% RX tput improvement - Intel (100G, ice, idpf): - ice: reorganizes layout of Tx and Rx rings for cacheline locality and utilizes __cacheline_group* macros on the new layouts - ice: introduces Synchronous Ethernet (SyncE) support - Meta (fbnic): - adds debugfs for firmware mailbox and tx/rx rings vectors - Ethernet virtual: - geneve: introduce GRO/GSO support for double UDP encapsulation - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - some code refactoring and cleanups - RealTek (r8169): - add support for RTL8127ATF (10G Fiber SFP) - add dash and LTR support - Airoha: - AN8811HB 2.5 Gbps phy support - Freescale (fec): - add XDP zero-copy support - Thunderbolt: - add get link setting support to allow bonding - Renesas: - add support for RZ/G3L GBETH SoC - Ethernet switches: - Maxlinear: - support R(G)MII slow rate configuration - add support for Intel GSW150 - Motorcomm (yt921x): - add DCB/QoS support - TI: - icssm-prueth: support bridging (STP/RSTP) via the switchdev framework - Ethernet PHYs: - Realtek: - enable SGMII and 2500Base-X in-band auto-negotiation - simplify and reunify C22/C45 drivers - Micrel: convert bindings to DT schema - CAN: - move skb headroom content into skb extensions, making CAN metadata access more robust - CAN drivers: - rcar_canfd: - add support for FD-only mode - add support for the RZ/T2H SoC - sja1000: cleanup the CAN state handling - WiFi: - implement EPPKE/802.1X over auth frames support - split up drop reasons better, removing generic RX_DROP - additional FTM capabilities: 6 GHz support, supported number of spatial streams and supported number of LTF repetitions - better mac80211 iterators to enumerate resources - initial UHR (Wi-Fi 8) support for cfg80211/mac80211 - WiFi drivers: - Qualcomm/Atheros: - ath11k: support for Channel Frequency Response measurement - ath12k: a significant driver refactor to support multi-wiphy devices and and pave the way for future device support in the same driver (rather than splitting to ath13k) - ath12k: support for the QCC2072 chipset - Intel: - iwlwifi: partial Neighbor Awareness Networking (NAN) support - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn - RealTek (rtw89): - preparations for RTL8922DE support - Bluetooth: - implement setsockopt(BT_PHY) to set the connection packet type/PHY - set link_policy on incoming ACL connections - Bluetooth drivers: - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE - btqca: add WCN6855 firmware priority selection feature" * tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits) bnge/bng_re: Add a new HSI net: macb: Fix tx/rx malfunction after phy link down and up af_unix: Fix memleak of newsk in unix_stream_connect(). net: ti: icssg-prueth: Add optional dependency on HSR net: dsa: add basic initial driver for MxL862xx switches net: mdio: add unlocked mdiodev C45 bus accessors net: dsa: add tag format for MxL862xx switches dt-bindings: net: dsa: add MaxLinear MxL862xx selftests: drivers: net: hw: Modify toeplitz.c to poll for packets octeontx2-pf: Unregister devlink on probe failure net: renesas: rswitch: fix forwarding offload statemachine ionic: Rate limit unknown xcvr type messages tcp: inet6_csk_xmit() optimization tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock() tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect() ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6 ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update() ipv6: use np->final in inet6_sk_rebuild_header() ipv6: add daddr/final storage in struct ipv6_pinfo net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup() ...
2026-02-03	tcp: accecn: detect loss ACK w/ AccECN option and add TCP_ACCECN_OPTION_PERSIST	Chia-Yu Chang	-1/+3
	Detect spurious retransmission of a previously sent ACK carrying the AccECN option after the second retransmission. Since this might be caused by the middlebox dropping ACK with options it does not recognize, disable the sending of the AccECN option in all subsequent ACKs. This patch follows Section 3.2.3.2.2 of AccECN spec (RFC9768), and a new field (accecn_opt_sent_w_dsack) is added to indicate that an AccECN option was sent with duplicate SACK info. Also, a new AccECN option sending mode is added to tcp_ecn_option sysctl: (TCP_ECN_OPTION_PERSIST), which ignores the AccECN fallback policy and persistently sends AccECN option once it fits into TCP option space. Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260131222515.8485-13-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-03	tcp: try to avoid safer when ACKs are thinned	Ilpo Järvinen	-0/+1
	Add newly acked pkts EWMA. When ACK thinning occurs, select between safer and unsafe cep delta in AccECN processing based on it. If the packets ACKed per ACK tends to be large, don't conservatively assume ACE field overflow. This patch uses the existing 2-byte holes in the rx group for new u16 variables withtout creating more holes. Below are the pahole outcomes before and after this patch: [BEFORE THIS PATCH] struct tcp_sock { [...] u32 delivered_ecn_bytes[3]; /* 2744 12 / / XXX 4 bytes hole, try to pack / [...] __cacheline_group_end__tcp_sock_write_rx[0]; / 2816 0 / [...] / size: 3264, cachelines: 51, members: 177 / } [AFTER THIS PATCH] struct tcp_sock { [...] u32 delivered_ecn_bytes[3]; / 2744 12 / u16 pkts_acked_ewma; / 2756 2 / / XXX 2 bytes hole, try to pack / [...] __cacheline_group_end__tcp_sock_write_rx[0]; / 2816 0 / [...] / size: 3264, cachelines: 51, members: 178 */ } Signed-off-by: Ilpo Järvinen <ij@kernel.org> Co-developed-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260131222515.8485-2-chia-yu.chang@nokia-bell-labs.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2026-02-02	docs: networking: mention that RSS table should be 4x the queue count	Jakub Kicinski	-4/+8
	Spell out the recommendation that the RSS table should be 4x the queue count to avoid traffic imbalance. Include minor rephrasing and removal of the explicit 128 entry example since a 128 entry table is inadequate on modern machines. Reviewed-by: Eric Dumazet <edumazet@google.com> Link: https://patch.msgid.link/20260131225454.1225151-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-28	net: ethernet: neterion: s2io: remove unused driver	Ethan Nelson-Moore	-197/+0
	The s2io driver supports Exar (formerly Neterion and S2io) PCI-X 10 Gigabit Ethernet cards. Hardware supporting PCI-X has not been manufactured in years. On x86, it was quickly replaced by PCIe. While it stuck around longer on POWER hardware, the last POWER hardware to support it was POWER7, which is not supported by ppc64le Linux distributions. The last supported mainstream ppc64 Linux distribution was RHEL 7; while it is still supported under ELS, ELS is only available for x86 and IBM Z. It is possible to use many PCI-X cards in standard PCI slots (which are still available on new motherboards), but it does not make sense to do so for 10 Gigabit Ethernet because the maximum bandwidth of standard PCI is only 1067 Mbps. It is therefore highly unlikely that this driver is still being used. Remove the driver, and move the former maintainer to the CREDITS file (restoring credit for the vxge driver, which was removed in commit f05643a0f60b ("eth: remove neterion/vxge"). Signed-off-by: Ethan Nelson-Moore <enelsonmoore@gmail.com> Link: https://patch.msgid.link/20260126031352.22997-1-enelsonmoore@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-25	Documentation: net: Fix typos in netdevices.rst	Dimitri Daskalakis	-2/+2
	Fixes two minor typos. Specifically, on -> or and Devices -> Device. Signed-off-by: Dimitri Daskalakis <dimitri.daskalakis1@gmail.com> Link: https://patch.msgid.link/20260122225723.2368698-1-dimitri.daskalakis1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-23	Documentation: use a source-read extension for the index link boilerplate	Jani Nikula	-84/+0
	The root document usually has a special :ref:`genindex` link to the generated index. This is also the case for Documentation/index.rst. The other index.rst files deeper in the directory hierarchy usually don't. For SPHINXDIRS builds, the root document isn't Documentation/index.rst, but some other index.rst in the hierarchy. Currently they have a ".. only::" block to add the index link when doing SPHINXDIRS html builds. This is obviously very tedious and repetitive. The link is also added to all index.rst files in the hierarchy for SPHINXDIRS builds, not just the root document. Put the boilerplate in a sphinx-includes/subproject-index.rst file, and include it at the end of the root document for subproject builds in an ad-hoc source-read extension defined in conf.py. For now, keep having the boilerplate in translations, because this approach currently doesn't cover translated index link headers. Cc: Jonathan Corbet <corbet@lwn.net> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jani Nikula <jani.nikula@intel.com> Tested-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> [jc: did s/doctree/kern_doc_dir/ ] Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20260123143149.2024303-1-jani.nikula@intel.com>
2026-01-20	net: remove legacy way to get/set HW timestamp config	Vadim Fedorenko	-4/+3
	With all drivers converted to use ndo_hwstamp callbacks the legacy way can be removed, marking ioctl interface as deprecated. Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20260116062121.1230184-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-20	Merge tag 'net-queue-rx-buf-len-v9' of https://github.com/isilence/linux	Jakub Kicinski	-0/+20
	Pavel Begunkov says: ==================== Add support for providers with large rx buffer Many modern NICs support configurable receive buffer lengths, and zcrx and memory providers can use buffers larger than 4K to improve performance. When paired with hw-gro larger rx buffer sizes can drastically reduce the number of buffers traversing the stack and save a lot of processing time. It also allows to give to users larger contiguous chunks of data. Single stream benchmarks showed up to ~30% CPU util improvement. E.g. comparison for 4K vs 32K buffers using a 200Gbit NIC: packets=23987040 (MB=2745098), rps=199559 (MB/s=22837) CPU %usr %nice %sys %iowait %irq %soft %idle 0 1.53 0.00 27.78 2.72 1.31 66.45 0.22 packets=24078368 (MB=2755550), rps=200319 (MB/s=22924) CPU %usr %nice %sys %iowait %irq %soft %idle 0 0.69 0.00 8.26 31.65 1.83 57.00 0.57 This series adds net infrastructure for memory providers configuring the size and implements it for bnxt. It's an opt-in feature for drivers, they should advertise support for the parameter in the qops and must check if the hardware supports the given size. It's limited to memory providers as it drastically simplifies implementation. It doesn't affect the fast path zcrx uAPI, and the user exposed parameter is defined in zcrx terms, which allows it to be flexible and adjusted in the future. A liburing example can be found at [2] full branch: [1] https://github.com/isilence/linux.git zcrx/large-buffers-v8 Liburing example: [2] https://github.com/isilence/liburing.git zcrx/rx-buf-len * tag 'net-queue-rx-buf-len-v9' of https://github.com/isilence/linux: io_uring/zcrx: document area chunking parameter selftests: iou-zcrx: test large chunk sizes eth: bnxt: support qcfg provided rx page size eth: bnxt: adjust the fill level of agg queues with larger buffers eth: bnxt: store rx buffer size per queue net: pass queue rx page size from memory provider net: add bare bone queue configs net: reduce indent of struct netdev_queue_mgmt_ops members net: memzero mp params when closing a queue ==================== Link: https://patch.msgid.link/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-17	docs: tls: Enhance TLS resync async process documentation	Shahar Shitrit	-0/+30
	Expand the tls-offload.rst documentation to provide a more detailed explanation of the asynchronous resync process, including the role of struct tls_offload_resync_async in managing resync requests on the kernel side. Also, add documentation for helper functions tls_offload_rx_resync_async_request_start/ _end/ _cancel. Signed-off-by: Shahar Shitrit <shshitrit@nvidia.com> Signed-off-by: Tariq Toukan <tariqt@nvidia.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/1768298883-1602599-1-git-send-email-tariqt@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-15	net: phy: remove unused fixup unregistering functions	Heiner Kallweit	-21/+1
	No user of PHY fixups unregisters these. IOW: The fixup unregistering functions are unused and can be removed. Remove also documentation for these functions. Whilst at it, remove also mentioning of phy_register_fixup() from the Documentation, as this function has been static since ea47e70e476f ("net: phy: remove fixup-related definitions from phy.h which are not used outside phylib"). Fixup unregistering functions were added with f38e7a32ee4f ("phy: add phy fixup unregister functions") in 2016, and last user was removed with 6782d06a47ad ("net: usb: lan78xx: Remove KSZ9031 PHY fixup") in 2024. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/ff8ac321-435c-48d0-b376-fbca80c0c22e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-13	Documentation: networking: Document the phy_port infrastructure	Maxime Chevallier	-0/+112
	This documentation aims at describing the main goal of the phy_port infrastructure. Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Link: https://patch.msgid.link/20260108080041.553250-15-maxime.chevallier@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2026-01-14	io_uring/zcrx: document area chunking parameter	Pavel Begunkov	-0/+20
	struct io_uring_zcrx_ifq_reg::rx_buf_len is used as a hint specifying the kernel what buffer size it should use. Document the API and limitations. Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
2025-12-01	Documentation: net: dsa: mention simple HSR offload helpers	Vladimir Oltean	-0/+8
	Keep the documentation up to date. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-16-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-12-01	Documentation: net: dsa: mention availability of RedBox	Vladimir Oltean	-5/+4
	Since commit 5055cccfc2d1 ("net: hsr: Provide RedBox support (HSR-SAN)"), RedBox is available (including for offload in DSA). Update the DSA documentation that states it isn't. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251130131657.65080-15-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	tcp: remove icsk->icsk_retransmit_timer	Eric Dumazet	-1/+0
	Now sk->sk_timer is no longer used by TCP keepalive, we can use its storage for TCP and MPTCP retransmit timers for better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	tcp: introduce icsk->icsk_keepalive_timer	Eric Dumazet	-0/+1
	sk->sk_timer has been used for TCP keepalives. Keepalive timers are not in fast path, we want to use sk->sk_timer storage for retransmit timers, for better cache locality. Create icsk->icsk_keepalive_timer and change keepalive code to no longer use sk->sk_timer. Added space is reclaimed in the following patch. This includes changes to MPTCP, which was also using sk_timer. Alias icsk->mptcp_tout_timer and icsk->icsk_keepalive_timer for inet_sk_diag_fill() sake. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	net/mlx5: implement swp_l4_csum_mode via devlink params	Daniel Zahka	-0/+14
	swp_l4_csum_mode controls how L4 transmit checksums are computed when using Software Parser (SWP) hints for header locations. Supported values: 1. default: device will choose between full_csum or l4_only. Driver will discover the device's choice during initialization. 2. full_csum: calculate L4 checksum with the pseudo-header. 3. l4_only: calculate L4 checksum without the pseudo-header. Only available when swp_l4_csum_mode_l4_only is set in mlx5_ifc_nv_sw_offload_cap_bits. Note that 'default' might be returned from the device and passed to userspace, and it might also be set during a devlink_param::reset_default() call, but attempts to set a value of default directly with param-set will be rejected. The l4_only setting is a dependency for PSP initialization in mlx5e_psp_init(). Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251119025038.651131-5-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	devlink: support default values for param-get and param-set	Daniel Zahka	-0/+10
	Support querying and resetting to default param values. Introduce two new devlink netlink attrs: DEVLINK_ATTR_PARAM_VALUE_DEFAULT and DEVLINK_ATTR_PARAM_RESET_DEFAULT. The former is used to contain an optional parameter value inside of the param_value nested attribute. The latter is used in param-set requests from userspace to indicate that the driver should reset the param to its default value. To implement this, two new functions are added to the devlink driver api: devlink_param::get_default() and devlink_param::reset_default(). These callbacks allow drivers to implement default param actions for runtime and permanent cmodes. For driverinit params, the core latches the last value set by a driver via devl_param_driverinit_value_set(), and uses that as the default value for a param. Because default parameter values are optional, it would be impossible to discern whether or not a param of type bool has default value of false or not provided if the default value is encoded using a netlink flag type. For this reason, when a DEVLINK_PARAM_TYPE_BOOL has an associated default value, the default value is encoded using a u8 type. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251119025038.651131-4-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>