linux - Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/

Age	Commit message (Collapse)	Author	Files	Lines
2025-10-15	net: usb: lan78xx: fix use of improperly initialized dev->chipid in ↵	I Viswanath	1	-4/+4
	lan78xx_reset dev->chipid is used in lan78xx_init_mac_address before it's initialized: lan78xx_reset() { lan78xx_init_mac_address() lan78xx_read_eeprom() lan78xx_read_raw_eeprom() <- dev->chipid is used here dev->chipid = ... <- dev->chipid is initialized correctly here } Reorder initialization so that dev->chipid is set before calling lan78xx_init_mac_address(). Fixes: a0db7d10b76e ("lan78xx: Add to handle mux control per chip id") Signed-off-by: I Viswanath <viswanathiyyappan@gmail.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Khalid Aziz <khalid@kernel.org> Link: https://patch.msgid.link/20251013181648.35153-1-viswanathiyyappan@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	netdevsim: set the carrier when the device goes up	Breno Leitao	1	-0/+7
	Bringing a linked netdevsim device down and then up causes communication failure because both interfaces lack carrier. Basically a ifdown/ifup on the interface make the link broken. Commit 3762ec05a9fbda ("netdevsim: add NAPI support") added supported for NAPI, calling netif_carrier_off() in nsim_stop(). This patch re-enables the carrier symmetrically on nsim_open(), in case the device is linked and the peer is up. Signed-off-by: Breno Leitao <leitao@debian.org> Fixes: 3762ec05a9fbda ("netdevsim: add NAPI support") Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20251014-netdevsim_fix-v2-1-53b40590dae1@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	selftests: tls: add test for short splice due to full skmsg	Sabrina Dubroca	1	-0/+31
	We don't have a test triggering a partial splice caused by a full skmsg. Add one, based on a program by Jann Horn. Use MAX_FRAGS=48 to make sure the skmsg will be full for any allowed value of CONFIG_MAX_SKB_FRAGS (17..45). Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/1d129a15f526ea3602f3a2b368aa0b6f7e0d35d5.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	selftests: net: tls: add tests for cmsg vs MSG_MORE	Sabrina Dubroca	1	-0/+34
	We don't have a test to check that MSG_MORE won't let us merge records of different types across sendmsg calls. Add new tests that check: - MSG_MORE is only allowed for DATA records - a pending DATA record gets closed and pushed before a non-DATA record is processed Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/b34feeadefe8a997f068d5ed5617afd0072df3c0.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	tls: don't rely on tx_work during send()	Sabrina Dubroca	1	-0/+13
	With async crypto, we rely on tx_work to actually transmit records once encryption completes. But while send() is running, both the tx_lock and socket lock are held, so tx_work_handler cannot process the queue of encrypted records, and simply reschedules itself. During a large send(), this could last a long time, and use a lot of memory. Transmit any pending encrypted records before restarting the main loop of tls_sw_sendmsg_locked. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/8396631478f70454b44afb98352237d33f48d34d.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	tls: wait for pending async decryptions if tls_strp_msg_hold fails	Sabrina Dubroca	1	-2/+4
	Async decryption calls tls_strp_msg_hold to create a clone of the input skb to hold references to the memory it uses. If we fail to allocate that clone, proceeding with async decryption can lead to various issues (UAF on the skb, writing into userspace memory after the recv() call has returned). In this case, wait for all pending decryption requests. Fixes: 84c61fe1a75b ("tls: rx: do not use the standard strparser") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/b9fe61dcc07dab15da9b35cf4c7d86382a98caf2.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	tls: always set record_type in tls_process_cmsg	Sabrina Dubroca	1	-5/+2
	When userspace wants to send a non-DATA record (via the TLS_SET_RECORD_TYPE cmsg), we need to send any pending data from a previous MSG_MORE send() as a separate DATA record. If that DATA record is encrypted asynchronously, tls_handle_open_record will return -EINPROGRESS. This is currently treated as an error by tls_process_cmsg, and it will skip setting record_type to the correct value, but the caller (tls_sw_sendmsg_locked) handles that return value correctly and proceeds with sending the new message with an incorrect record_type (DATA instead of whatever was requested in the cmsg). Always set record_type before handling the open record. If tls_handle_open_record returns an error, record_type will be ignored. If it succeeds, whether with synchronous crypto (returning 0) or asynchronous (returning -EINPROGRESS), the caller will proceed correctly. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/0457252e578a10a94e40c72ba6288b3a64f31662.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	tls: wait for async encrypt in case of error during latter iterations of sendmsg	Sabrina Dubroca	1	-3/+4
	If we hit an error during the main loop of tls_sw_sendmsg_locked (eg failed allocation), we jump to send_end and immediately return. Previous iterations may have queued async encryption requests that are still pending. We should wait for those before returning, as we could otherwise be reading from memory that userspace believes we're not using anymore, which would be a sort of use-after-free. This is similar to what tls_sw_recvmsg already does: failures during the main loop jump to the "wait for async" code, not straight to the unlock/return. Fixes: a42055e8d2c3 ("net/tls: Add support for async encryption of records for performance") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/c793efe9673b87f808d84fdefc0f732217030c52.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	tls: trim encrypted message to match the plaintext on short splice	Sabrina Dubroca	1	-1/+4
	During tls_sw_sendmsg_locked, we pre-allocate the encrypted message for the size we're expecting to send during the current iteration, but we may end up sending less, for example when splicing: if we're getting the data from small fragments of memory, we may fill up all the slots in the skmsg with less data than expected. In this case, we need to trim the encrypted message to only the length we actually need, to avoid pushing uninitialized bytes down the underlying TCP socket. Fixes: fe1e81d4f73b ("tls/sw: Support MSG_SPLICE_PAGES") Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Link: https://patch.msgid.link/66a0ae99c9efc15f88e9e56c1f58f902f442ce86.1760432043.git.sd@queasysnail.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	tg3: prevent use of uninitialized remote_adv and local_adv variables	Alexey Simakov	1	-4/+1
	Some execution paths that jump to the fiber_setup_done label could leave the remote_adv and local_adv variables uninitialized and then use it. Initialize this variables at the point of definition to avoid this. Fixes: 85730a631f0c ("tg3: Add SGMII phy support for 5719/5718 serdes") Co-developed-by: Alexandr Sapozhnikov <alsp705@gmail.com> Signed-off-by: Alexandr Sapozhnikov <alsp705@gmail.com> Signed-off-by: Alexey Simakov <bigalex934@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20251014164736.5890-1-bigalex934@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	MAINTAINERS: new entry for IPv6 IOAM	Justin Iurman	1	-0/+10
	Create a maintainer entry for IPv6 IOAM. Add myself as I authored most if not all of the IPv6 IOAM code in the kernel and actively participate in the related IETF groups. Signed-off-by: Justin Iurman <justin.iurman@uliege.be> Link: https://patch.msgid.link/20251014170650.27679-1-justin.iurman@uliege.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	gve: Check valid ts bit on RX descriptor before hw timestamping	Tim Hostetler	3	-7/+16
	The device returns a valid bit in the LSB of the low timestamp byte in the completion descriptor that the driver should check before setting the SKB's hardware timestamp. If the timestamp is not valid, do not hardware timestamp the SKB. Cc: stable@vger.kernel.org Fixes: b2c7aeb49056 ("gve: Implement ndo_hwtstamp_get/set for RX timestamping") Reviewed-by: Joshua Washington <joshwash@google.com> Signed-off-by: Tim Hostetler <thostet@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251014004740.2775957-1-hramamurthy@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-15	Remove long-stale ext3 defconfig option	Linus Torvalds	31	-31/+0
	Inspired by commit c065b6046b34 ("Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs") I looked around for any other left-over EXT3 config options, and found some old defconfig files still mentioned CONFIG_EXT3_DEFAULTS_TO_ORDERED. That config option was removed a decade ago in commit c290ea01abb7 ("fs: Remove ext3 filesystem driver"). It had a good run, but let's remove it for good. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-10-15	PM / devfreq: rockchip-dfi: switch to FIELD_PREP_WM16 macro	Nicolas Frattaroli	1	-23/+22
	The era of hand-rolled HIWORD_UPDATE macros is over, at least for those drivers that use constant masks. Like many other Rockchip drivers, rockchip-dfi brings with it its own HIWORD_UPDATE macro. This variant doesn't shift the value (and like the others, doesn't do any checking). Remove it, and replace instances of it with hw_bitfield.h's FIELD_PREP_WM16. Since FIELD_PREP_WM16 requires contiguous masks and shifts the value for us, some reshuffling of definitions needs to happen. This gives us better compile-time error checking, and in my opinion, nicer code. Tested on an RK3568 ODROID-M1 board (LPDDR4X at 1560 MHz, an RK3588 Radxa ROCK 5B board (LPDDR4X at 2112 MHz) and an RK3588 Radxa ROCK 5T board (LPDDR5 at 2400 MHz). perf measurements were consistent with the measurements of stress-ng --stream in all cases. Signed-off-by: Nicolas Frattaroli <nicolas.frattaroli@collabora.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
2025-10-15	rust: bitmap: clean Rust 1.92.0 `unused_unsafe` warning	Miguel Ojeda	1	-0/+2
	Starting with Rust 1.92.0 (expected 2025-12-11), Rust allows to safely take the address of a union field [1][2]: CLIPPY L rust/kernel.o error: unnecessary `unsafe` block --> rust/kernel/bitmap.rs:169:13 \| 169 \| unsafe { core::ptr::addr_of!(self.repr.bitmap) } \| ^^^^^^ unnecessary `unsafe` block \| = note: `-D unused-unsafe` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(unused_unsafe)]` error: unnecessary `unsafe` block --> rust/kernel/bitmap.rs:185:13 \| 185 \| unsafe { core::ptr::addr_of_mut!(self.repr.bitmap) } \| ^^^^^^ unnecessary `unsafe` block Thus allow both instances to clean the warning in newer compilers. Link: https://github.com/rust-lang/rust/issues/141264 [1] Link: https://github.com/rust-lang/rust/pull/141469 [2] Signed-off-by: Miguel Ojeda <ojeda@kernel.org> Reviewed-by: Alice Ryhl <aliceryhl@google.com> Signed-off-by: Yury Norov (NVIDIA) <yury.norov@gmail.com>
2025-10-15	ksmbd: fix recursive locking in RPC handle list access	Marios Makassikis	3	-6/+22
	Since commit 305853cce3794 ("ksmbd: Fix race condition in RPC handle list access"), ksmbd_session_rpc_method() attempts to lock sess->rpc_lock. This causes hung connections / tasks when a client attempts to open a named pipe. Using Samba's rpcclient tool: $ rpcclient //192.168.1.254 -U user%password $ rpcclient $> srvinfo <connection hung here> Kernel side: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/0:0 state:D stack:0 pid:5021 tgid:5021 ppid:2 flags:0x00200000 Workqueue: ksmbd-io handle_ksmbd_work Call trace: __schedule from schedule+0x3c/0x58 schedule from schedule_preempt_disabled+0xc/0x10 schedule_preempt_disabled from rwsem_down_read_slowpath+0x1b0/0x1d8 rwsem_down_read_slowpath from down_read+0x28/0x30 down_read from ksmbd_session_rpc_method+0x18/0x3c ksmbd_session_rpc_method from ksmbd_rpc_open+0x34/0x68 ksmbd_rpc_open from ksmbd_session_rpc_open+0x194/0x228 ksmbd_session_rpc_open from create_smb2_pipe+0x8c/0x2c8 create_smb2_pipe from smb2_open+0x10c/0x27ac smb2_open from handle_ksmbd_work+0x238/0x3dc handle_ksmbd_work from process_scheduled_works+0x160/0x25c process_scheduled_works from worker_thread+0x16c/0x1e8 worker_thread from kthread+0xa8/0xb8 kthread from ret_from_fork+0x14/0x38 Exception stack(0x8529ffb0 to 0x8529fff8) The task deadlocks because the lock is already held: ksmbd_session_rpc_open down_write(&sess->rpc_lock) ksmbd_rpc_open ksmbd_session_rpc_method down_read(&sess->rpc_lock) <-- deadlock Adjust ksmbd_session_rpc_method() callers to take the lock when necessary. Fixes: 305853cce3794 ("ksmbd: Fix race condition in RPC handle list access") Signed-off-by: Marios Makassikis <mmakassikis@freebox.fr> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-15	smb/server: fix possible refcount leak in smb2_sess_setup()	ZhangGuoDong	1	-0/+1
	Reference count of ksmbd_session will leak when session need reconnect. Fix this by adding the missing ksmbd_user_session_put(). Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-15	smb/server: fix possible memory leak in smb2_read()	ZhangGuoDong	1	-0/+1
	Memory leak occurs when ksmbd_vfs_read() fails. Fix this by adding the missing kvfree(). Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-10-14	net: core: fix lockdep splat on device unregister	Florian Westphal	1	-5/+35
	Since blamed commit, unregister_netdevice_many_notify() takes the netdev mutex if the device needs it. If the device list is too long, this will lock more device mutexes than lockdep can handle: unshare -n \ bash -c 'for i in $(seq 1 100);do ip link add foo$i type dummy;done' BUG: MAX_LOCK_DEPTH too low! turning off the locking correctness validator. depth: 48 max: 48! 48 locks held by kworker/u16:1/69: #0: ..148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work #1: ..d40 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work #2: ..bd0 (pernet_ops_rwsem){++++}-{4:4}, at: cleanup_net #3: ..aa8 (rtnl_mutex){+.+.}-{4:4}, at: default_device_exit_batch #4: ..cb0 (&dev_instance_lock_key#3){+.+.}-{4:4}, at: unregister_netdevice_many_notify [..] Add a helper to close and then unlock a list of net_devices. Devices that are not up have to be skipped - netif_close_many always removes them from the list without any other actions taken, so they'd remain in locked state. Close devices whenever we've used up half of the tracking slots or we processed entire list without hitting the limit. Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Signed-off-by: Florian Westphal <fw@strlen.de> Link: https://patch.msgid.link/20251013185052.14021-1-fw@strlen.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-14	MAINTAINERS: add myself as maintainer for b53	Jonas Gorski	1	-0/+1
	I wrote the original OpenWrt driver that Florian used as the base for the dsa driver, I might as well take responsibility for it. Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Vladimir Oltean <olteanv@gmail.com> Link: https://patch.msgid.link/20251013180347.133246-1-jonas.gorski@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-14	selftests: net: check jq command is supported	Wang Liang	2	-0/+4
	The jq command is used in vlan_bridge_binding.sh, if it is not supported, the test will spam the following log. # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # ./vlan_bridge_binding.sh: line 51: jq: command not found # TEST: Test bridge_binding on->off when lower down [FAIL] # Got operstate of , expected 0 The rtnetlink.sh has the same problem. It makes sense to check if jq is installed before running these tests. After this patch, the vlan_bridge_binding.sh skipped if jq is not supported: # timeout set to 3600 # selftests: net: vlan_bridge_binding.sh # TEST: jq not installed [SKIP] Fixes: dca12e9ab760 ("selftests: net: Add a VLAN bridge binding selftest") Fixes: 6a414fd77f61 ("selftests: rtnetlink: Add an address proto test") Signed-off-by: Wang Liang <wangliang74@huawei.com> Reviewed-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251013080039.3035898-1-wangliang74@huawei.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14	net: airoha: Take into account out-of-order tx completions in airoha_dev_xmit()	Lorenzo Bianconi	1	-1/+15
	Completion napi can free out-of-order tx descriptors if hw QoS is enabled and packets with different priority are queued to same DMA ring. Take into account possible out-of-order reports checking if the tx queue is full using circular buffer head/tail pointer instead of the number of queued packets. Fixes: 23020f0493270 ("net: airoha: Introduce ethernet support for EN7581 SoC") Suggested-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251012-airoha-tx-busy-queue-v2-1-a600b08bab2d@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14	tcp: fix tcp_tso_should_defer() vs large RTT	Eric Dumazet	1	-4/+15
	Neal reported that using neper tcp_stream with TCP_TX_DELAY set to 50ms would often lead to flows stuck in a small cwnd mode, regardless of the congestion control. While tcp_stream sets TCP_TX_DELAY too late after the connect(), it highlighted two kernel bugs. The following heuristic in tcp_tso_should_defer() seems wrong for large RTT: delta = tp->tcp_clock_cache - head->tstamp; /* If next ACK is likely to come too late (half srtt), do not defer / if ((s64)(delta - (u64)NSEC_PER_USEC (tp->srtt_us >> 4)) < 0) goto send_now; If next ACK is expected to come in more than 1 ms, we should not defer because we prefer a smooth ACK clocking. While blamed commit was a step in the good direction, it was not generic enough. Another patch fixing TCP_TX_DELAY for established flows will be proposed when net-next reopens. Fixes: 50c8339e9299 ("tcp: tso: restore IW10 after TSO autosizing") Reported-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Neal Cardwell <ncardwell@google.com> Tested-by: Neal Cardwell <ncardwell@google.com> Link: https://patch.msgid.link/20251011115742.1245771-1-edumazet@google.com [pabeni@redhat.com: fixed whitespace issue] Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14	r8152: add error handling in rtl8152_driver_init	Yi Cong	1	-1/+6
	rtl8152_driver_init() is missing the error handling. When rtl8152_driver registration fails, rtl8152_cfgselector_driver should be deregistered. Fixes: ec51fbd1b8a2 ("r8152: add USB device driver for config selection") Cc: stable@vger.kernel.org Signed-off-by: Yi Cong <yicong@kylinos.cn> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251011082415.580740-1-yicongsrfy@163.com [pabeni@redhat.com: clarified the commit message] Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14	usbnet: Fix using smp_processor_id() in preemptible code warnings	Zqiang	1	-0/+2
	Syzbot reported the following warning: BUG: using smp_processor_id() in preemptible [00000000] code: dhcpcd/2879 caller is usbnet_skb_return+0x74/0x490 drivers/net/usb/usbnet.c:331 CPU: 1 UID: 0 PID: 2879 Comm: dhcpcd Not tainted 6.15.0-rc4-syzkaller-00098-g615dca38c2ea #0 PREEMPT(voluntary) Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x16c/0x1f0 lib/dump_stack.c:120 check_preemption_disabled+0xd0/0xe0 lib/smp_processor_id.c:49 usbnet_skb_return+0x74/0x490 drivers/net/usb/usbnet.c:331 usbnet_resume_rx+0x4b/0x170 drivers/net/usb/usbnet.c:708 usbnet_change_mtu+0x1be/0x220 drivers/net/usb/usbnet.c:417 __dev_set_mtu net/core/dev.c:9443 [inline] netif_set_mtu_ext+0x369/0x5c0 net/core/dev.c:9496 netif_set_mtu+0xb0/0x160 net/core/dev.c:9520 dev_set_mtu+0xae/0x170 net/core/dev_api.c:247 dev_ifsioc+0xa31/0x18d0 net/core/dev_ioctl.c:572 dev_ioctl+0x223/0x10e0 net/core/dev_ioctl.c:821 sock_do_ioctl+0x19d/0x280 net/socket.c:1204 sock_ioctl+0x42f/0x6a0 net/socket.c:1311 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:906 [inline] __se_sys_ioctl fs/ioctl.c:892 [inline] __x64_sys_ioctl+0x190/0x200 fs/ioctl.c:892 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xcd/0x260 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f For historical and portability reasons, the netif_rx() is usually run in the softirq or interrupt context, this commit therefore add local_bh_disable/enable() protection in the usbnet_resume_rx(). Fixes: 43daa96b166c ("usbnet: Stop RX Q on MTU change") Link: https://syzkaller.appspot.com/bug?id=81f55dfa587ee544baaaa5a359a060512228c1e1 Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Zqiang <qiang.zhang@linux.dev> Link: https://patch.msgid.link/20251011070518.7095-1-qiang.zhang@linux.dev Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14	Octeontx2-af: Fix missing error code in cgx_probe()	Harshit Mogalapalli	1	-0/+1
	When CGX fails mapping to NIX, set the error code to -ENODEV, currently err is zero and that is treated as success path. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Closes: https://lore.kernel.org/all/aLAdlCg2_Yv7Y-3h@stanley.mountain/ Fixes: d280233fc866 ("Octeontx2-af: Fix NIX X2P calibration failures") Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251010204239.94237-1-harshit.m.mogalapalli@oracle.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-14	amd-xgbe: Avoid spurious link down messages during interface toggle	Raju Rangoju	2	-1/+1
	During interface toggle operations (ifdown/ifup), the driver currently resets the local helper variable 'phy_link' to -1. This causes the link state machine to incorrectly interpret the state as a link change event, resulting in spurious "Link is down" messages being logged when the interface is brought back up. Preserve the phy_link state across interface toggles to avoid treating the -1 sentinel value as a legitimate link state transition. Fixes: 88131a812b16 ("amd-xgbe: Perform phy connect/disconnect at dev open/stop") Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Reviewed-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com> Link: https://patch.msgid.link/20251010065142.1189310-1-Raju.Rangoju@amd.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-10-13	Use CONFIG_EXT4_FS instead of CONFIG_EXT3_FS in all of the defconfigs	Theodore Ts'o	109	-182/+182
	Commit d6ace46c82fd ("ext4: remove obsolete EXT3 config options") removed the obsolete EXT3_CONFIG options, since it had been over a decade since fs/ext3 had been removed. Unfortunately, there were a number of defconfigs that still used CONFIG_EXT3_FS which the cleanup commit didn't fix up. This led to a large number of defconfig test builds to fail. Oops. Fixes: d6ace46c82fd ("ext4: remove obsolete EXT3 config options") Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-10-13	net: phy: realtek: Avoid PHYCR2 access if PHYCR2 not present	Marek Vasut	1	-12/+11
	The driver is currently checking for PHYCR2 register presence in rtl8211f_config_init(), but it does so after accessing PHYCR2 to disable EEE. This was introduced in commit bfc17c165835 ("net: phy: realtek: disable PHY-mode EEE"). Move the PHYCR2 presence test before the EEE disablement and simplify the code. Fixes: bfc17c165835 ("net: phy: realtek: disable PHY-mode EEE") Signed-off-by: Marek Vasut <marek.vasut@mailbox.org> Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251011110309.12664-1-marek.vasut@mailbox.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	ixgbe: fix too early devlink_free() in ixgbe_remove()	Koichiro Den	1	-1/+2
	Since ixgbe_adapter is embedded in devlink, calling devlink_free() prematurely in the ixgbe_remove() path can lead to UAF. Move devlink_free() to the end. KASAN report: BUG: KASAN: use-after-free in ixgbe_reset_interrupt_capability+0x140/0x180 [ixgbe] Read of size 8 at addr ffff0000adf813e0 by task bash/2095 CPU: 1 UID: 0 PID: 2095 Comm: bash Tainted: G S 6.17.0-rc2-tnguy.net-queue+ #1 PREEMPT(full) [...] Call trace: show_stack+0x30/0x90 (C) dump_stack_lvl+0x9c/0xd0 print_address_description.constprop.0+0x90/0x310 print_report+0x104/0x1f0 kasan_report+0x88/0x180 __asan_report_load8_noabort+0x20/0x30 ixgbe_reset_interrupt_capability+0x140/0x180 [ixgbe] ixgbe_clear_interrupt_scheme+0xf8/0x130 [ixgbe] ixgbe_remove+0x2d0/0x8c0 [ixgbe] pci_device_remove+0xa0/0x220 device_remove+0xb8/0x170 device_release_driver_internal+0x318/0x490 device_driver_detach+0x40/0x68 unbind_store+0xec/0x118 drv_attr_store+0x64/0xb8 sysfs_kf_write+0xcc/0x138 kernfs_fop_write_iter+0x294/0x440 new_sync_write+0x1fc/0x588 vfs_write+0x480/0x6a0 ksys_write+0xf0/0x1e0 __arm64_sys_write+0x70/0xc0 invoke_syscall.constprop.0+0xcc/0x280 el0_svc_common.constprop.0+0xa8/0x248 do_el0_svc+0x44/0x68 el0_svc+0x54/0x160 el0t_64_sync_handler+0xa0/0xe8 el0t_64_sync+0x1b0/0x1b8 Fixes: a0285236ab93 ("ixgbe: add initial devlink support") Signed-off-by: Koichiro Den <den@valinux.co.jp> Tested-by: Rinitha S <sx.rinitha@intel.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-6-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	ixgbe: handle IXGBE_VF_FEATURES_NEGOTIATE mbox cmd	Jedrzej Jagielski	2	-0/+47
	Send to VF information about features supported by the PF driver. Increase API version to 1.7. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-5-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	ixgbevf: fix mailbox API compatibility by negotiating supported features	Jedrzej Jagielski	6	-3/+96
	There was backward compatibility in the terms of mailbox API. Various drivers from various OSes supporting 10G adapters from Intel portfolio could easily negotiate mailbox API. This convention has been broken since introducing API 1.4. Commit 0062e7cc955e ("ixgbevf: add VF IPsec offload code") added support for IPSec which is specific only for the kernel ixgbe driver. None of the rest of the Intel 10G PF/VF drivers supports it. And actually lack of support was not included in the IPSec implementation - there were no such code paths. No possibility to negotiate support for the feature was introduced along with introduction of the feature itself. Commit 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") increasing API version to 1.5 did the same - it introduced code supported specifically by the PF ESX driver. It altered API version for the VF driver in the same time not touching the version defined for the PF ixgbe driver. It led to additional discrepancies, as the code provided within API 1.6 cannot be supported for Linux ixgbe driver as it causes crashes. The issue was noticed some time ago and mitigated by Jake within the commit d0725312adf5 ("ixgbevf: stop attempting IPSEC offload on Mailbox API 1.5"). As a result we have regression for IPsec support and after increasing API to version 1.6 ixgbevf driver stopped to support ESX MBX. To fix this mess add new mailbox op asking PF driver about supported features. Basing on a response determine whether to set support for IPSec and ESX-specific enhanced mailbox. New mailbox op, for compatibility purposes, must be added within new API revision, as API version of OOT PF & VF drivers is already increased to 1.6 and doesn't incorporate features negotiate op. Features negotiation mechanism gives possibility to be extended with new features when needed in the future. Reported-by: Jacob Keller <jacob.e.keller@intel.com> Closes: https://lore.kernel.org/intel-wired-lan/20241101-jk-ixgbevf-mailbox-v1-5-fixes-v1-0-f556dc9a66ed@intel.com/ Fixes: 0062e7cc955e ("ixgbevf: add VF IPsec offload code") Fixes: 339f28964147 ("ixgbevf: Add support for new mailbox communication between PF and VF") Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-4-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	ixgbe: handle IXGBE_VF_GET_PF_LINK_STATE mailbox operation	Jedrzej Jagielski	2	-0/+47
	Update supported API version and provide handler for IXGBE_VF_GET_PF_LINK_STATE cmd. Simply put stored values of link speed and link_up from adapter context. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Link: https://lore.kernel.org/stable/20250828095227.1857066-3-jedrzej.jagielski%40intel.com Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-3-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	ixgbevf: fix getting link speed data for E610 devices	Jedrzej Jagielski	4	-32/+116
	E610 adapters no longer use the VFLINKS register to read PF's link speed and linkup state. As a result VF driver cannot get actual link state and it incorrectly reports 10G which is the default option. It leads to a situation where even 1G adapters print 10G as actual link speed. The same happens when PF driver set speed different than 10G. Add new mailbox operation to let the VF driver request a PF driver to provide actual link data. Update the mailbox api to v1.6. Incorporate both ways of getting link status within the legacy ixgbe_check_mac_link_vf() function. Fixes: 4c44b450c69b ("ixgbevf: Add support for Intel(R) E610 device") Co-developed-by: Andrzej Wilczynski <andrzejx.wilczynski@intel.com> Signed-off-by: Andrzej Wilczynski <andrzejx.wilczynski@intel.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Cc: stable@vger.kernel.org Signed-off-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-2-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	idpf: cleanup remaining SKBs in PTP flows	Milena Olech	2	-0/+4
	When the driver requests Tx timestamp value, one of the first steps is to clone SKB using skb_get. It increases the reference counter for that SKB to prevent unexpected freeing by another component. However, there may be a case where the index is requested, SKB is assigned and never consumed by PTP flows - for example due to reset during running PTP apps. Add a check in release timestamping function to verify if the SKB assigned to Tx timestamp latch was freed, and release remaining SKBs. Fixes: 4901e83a94ef ("idpf: add Tx timestamp capabilities negotiation") Signed-off-by: Milena Olech <milena.olech@intel.com> Signed-off-by: Anton Nadezhdin <anton.nadezhdin@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20251009-jk-iwl-net-2025-10-01-v3-1-ef32a425b92a@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	net/ip6_tunnel: Prevent perpetual tunnel growth	Dmitry Safonov	3	-16/+16
	Similarly to ipv4 tunnel, ipv6 version updates dev->needed_headroom, too. While ipv4 tunnel headroom adjustment growth was limited in commit 5ae1e9922bbd ("net: ip_tunnel: prevent perpetual headroom growth"), ipv6 tunnel yet increases the headroom without any ceiling. Reflect ipv4 tunnel headroom adjustment limit on ipv6 version. Credits to Francesco Ruggeri, who was originally debugging this issue and wrote local Arista-specific patch and a reproducer. Fixes: 8eb30be0352d ("ipv6: Create ip6_tnl_xmit") Cc: Florian Westphal <fw@strlen.de> Cc: Francesco Ruggeri <fruggeri05@gmail.com> Signed-off-by: Dmitry Safonov <dima@arista.com> Link: https://patch.msgid.link/20251009-ip6_tunnel-headroom-v2-1-8e4dbd8f7e35@arista.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	net: phy: bcm54811: Fix GMII/MII/MII-Lite selection	Kamil Horák - 2N	2	-1/+20
	The Broadcom bcm54811 is hardware-strapped to select among RGMII and GMII/MII/MII-Lite modes. However, the corresponding bit, RGMII Enable in Miscellaneous Control Register must be also set to select desired RGMII or MII(-lite)/GMII mode. Fixes: 3117a11fff5af9e7 ("net: phy: bcm54811: PHY initialization") Signed-off-by: Kamil Horák - 2N <kamilh@axis.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251009130656.1308237-2-kamilh@axis.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	r8169: fix packet truncation after S4 resume on RTL8168H/RTL8111H	Linmao Li	1	-2/+3
	After resume from S4 (hibernate), RTL8168H/RTL8111H truncates incoming packets. Packet captures show messages like "IP truncated-ip - 146 bytes missing!". The issue is caused by RxConfig not being properly re-initialized after resume. Re-initializing the RxConfig register before the chip re-initialization sequence avoids the truncation and restores correct packet reception. This follows the same pattern as commit ef9da46ddef0 ("r8169: fix data corruption issue on RTL8402"). Fixes: 6e1d0b898818 ("r8169:add support for RTL8168H and RTL8107E") Signed-off-by: Linmao Li <lilinmao@kylinos.cn> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com> Link: https://patch.msgid.link/20251009122549.3955845-1-lilinmao@kylinos.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	net: gro_cells: Use nested-BH locking for gro_cell	Sebastian Andrzej Siewior	1	-0/+10
	The gro_cell data structure is per-CPU variable and relies on disabled BH for its locking. Without per-CPU locking in local_bh_disable() on PREEMPT_RT this data structure requires explicit locking. Add a local_lock_t to the data structure and use local_lock_nested_bh() for locking. This change adds only lockdep coverage and does not alter the functional behaviour for !PREEMPT_RT. Reported-by: syzbot+8715dd783e9b0bef43b1@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68c6c3b1.050a0220.2ff435.0382.GAE@google.com/ Fixes: 3253cb49cbad ("softirq: Allow to drop the softirq-BKL lock on PREEMPT_RT") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://patch.msgid.link/20251009094338.j1jyKfjR@linutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	dpll: zl3073x: Handle missing or corrupted flash configuration	Ivan Vecera	2	-0/+24
	If the internal flash contains missing or corrupted configuration, basic communication over the bus still functions, but the device is not capable of normal operation (for example, using mailboxes). This condition is indicated in the info register by the ready bit. If this bit is cleared, the probe procedure times out while fetching the device state. Handle this case by checking the ready bit value in zl3073x_dev_start() and skipping DPLL device and pin registration if it is cleared. Do not report this condition as an error, allowing the devlink device to be registered and enabling the user to flash the correct configuration. Prior this patch: [ 31.112299] zl3073x-i2c 1-0070: Failed to fetch input state: -ETIMEDOUT [ 31.116332] zl3073x-i2c 1-0070: error -ETIMEDOUT: Failed to start device [ 31.136881] zl3073x-i2c 1-0070: probe with driver zl3073x-i2c failed with error -110 After this patch: [ 41.011438] zl3073x-i2c 1-0070: FW not fully ready - missing or corrupted config Fixes: 75a71ecc24125 ("dpll: zl3073x: Register DPLL devices and pins") Signed-off-by: Ivan Vecera <ivecera@redhat.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251008141445.841113-1-ivecera@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-10-13	btrfs: send: fix -Wflex-array-member-not-at-end warning in struct send_ctx	Gustavo A. R. Silva	1	-1/+3
	The warning -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Fix the following warning: fs/btrfs/send.c:181:24: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] and move the declaration of send_ctx::cur_inode_path to the end. Notice that struct fs_path contains a flexible array member inline_buf, but also a padding array and a limit calculated for the usable space of inline_buf (FS_PATH_INLINE_SIZE). It is not the pattern where flexible array is in the middle of a structure and could potentially overwrite other members. Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: tree-checker: fix bounds check in check_inode_extref()	Dan Carpenter	1	-1/+1
	The parentheses for the unlikely() annotation were put in the wrong place so it means that the condition is basically never true and the bounds checking is skipped. Fixes: aab9458b9f00 ("btrfs: tree-checker: add inode extref checks") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: fix memory leaks when rejecting a non SINGLE data profile without an RST	Miquel Sabaté Solà	1	-1/+1
	At the end of btrfs_load_block_group_zone_info() the first thing we do is to ensure that if the mapping type is not a SINGLE one and there is no RAID stripe tree, then we return early with an error. Doing that, though, prevents the code from running the last calls from this function which are about freeing memory allocated during its run. Hence, in this case, instead of returning early, we set the ret value and fall through the rest of the cleanup code. Fixes: 5906333cc4af ("btrfs: zoned: don't skip block group profile checks on conventional zones") CC: stable@vger.kernel.org # 6.8+ Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: fix incorrect readahead expansion length	Boris Burkov	1	-1/+1
	The intent of btrfs_readahead_expand() was to expand to the length of the current compressed extent being read. However, "ram_bytes" is not that, in the case where a single physical compressed extent is used for multiple file extents. Consider this case with a large compressed extent C and then later two non-compressed extents N1 and N2 written over C, leaving C1 and C2 pointing to offset/len pairs of C: [ C ] [ N1 ][ C1 ][ N2 ][ C2 ] In such a case, ram_bytes for both C1 and C2 is the full uncompressed length of C. So starting readahead in C1 will expand the readahead past the end of C1, past N2, and into C2. This will then expand readahead again, to C2_start + ram_bytes, way past EOF. First of all, this is totally undesirable, we don't want to read the whole file in arbitrary chunks of the large underlying extent if it happens to exist. Secondly, it results in zeroing the range past the end of C2 up to ram_bytes. This is particularly unpleasant with fs-verity as it can zero and set uptodate pages in the verity virtual space past EOF. This incorrect readahead behavior can lead to verity verification errors, if we iterate in a way that happens to do the wrong readahead. Fix this by using em->len for readahead expansion, not em->ram_bytes, resulting in the expected behavior of stopping readahead at the extent boundary. Reported-by: Max Chernoff <git@maxchernoff.ca> Link: https://bugzilla.redhat.com/show_bug.cgi?id=2399898 Fixes: 9e9ff875e417 ("btrfs: use readahead_expand() on compressed extents") CC: stable@vger.kernel.org # 6.17 Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Boris Burkov <boris@bur.io> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: do not assert we found block group item when creating free space tree	Filipe Manana	1	-7/+8
	Currently, when building a free space tree at populate_free_space_tree(), if we are not using the block group tree feature, we always expect to find block group items (either extent items or a block group item with key type BTRFS_BLOCK_GROUP_ITEM_KEY) when we search the extent tree with btrfs_search_slot_for_read(), so we assert that we found an item. However this expectation is wrong since we can have a new block group created in the current transaction which is still empty and for which we still have not added the block group's item to the extent tree, in which case we do not have any items in the extent tree associated to the block group. The insertion of a new block group's block group item in the extent tree happens at btrfs_create_pending_block_groups() when it calls the helper insert_block_group_item(). This typically is done when a transaction handle is released, committed or when running delayed refs (either as part of a transaction commit or when serving tickets for space reservation if we are low on free space). So remove the assertion at populate_free_space_tree() even when the block group tree feature is not enabled and update the comment to mention this case. Syzbot reported this with the following stack trace: BTRFS info (device loop3 state M): rebuilding free space tree assertion failed: ret == 0 :: 0, in fs/btrfs/free-space-tree.c:1115 ------------[ cut here ]------------ kernel BUG at fs/btrfs/free-space-tree.c:1115! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 1 UID: 0 PID: 6352 Comm: syz.3.25 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025 RIP: 0010:populate_free_space_tree+0x700/0x710 fs/btrfs/free-space-tree.c:1115 Code: ff ff e8 d3 (...) RSP: 0018:ffffc9000430f780 EFLAGS: 00010246 RAX: 0000000000000043 RBX: ffff88805b709630 RCX: fea61d0e2e79d000 RDX: 0000000000000000 RSI: 0000000080000000 RDI: 0000000000000000 RBP: ffffc9000430f8b0 R08: ffffc9000430f4a7 R09: 1ffff92000861e94 R10: dffffc0000000000 R11: fffff52000861e95 R12: 0000000000000001 R13: 1ffff92000861f00 R14: dffffc0000000000 R15: 0000000000000000 FS: 00007f424d9fe6c0(0000) GS:ffff888125afc000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fd78ad212c0 CR3: 0000000076d68000 CR4: 00000000003526f0 Call Trace: <TASK> btrfs_rebuild_free_space_tree+0x1ba/0x6d0 fs/btrfs/free-space-tree.c:1364 btrfs_start_pre_rw_mount+0x128f/0x1bf0 fs/btrfs/disk-io.c:3062 btrfs_remount_rw fs/btrfs/super.c:1334 [inline] btrfs_reconfigure+0xaed/0x2160 fs/btrfs/super.c:1559 reconfigure_super+0x227/0x890 fs/super.c:1076 do_remount fs/namespace.c:3279 [inline] path_mount+0xd1a/0xfe0 fs/namespace.c:4027 do_mount fs/namespace.c:4048 [inline] __do_sys_mount fs/namespace.c:4236 [inline] __se_sys_mount+0x313/0x410 fs/namespace.c:4213 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f424e39066a Code: d8 64 89 02 (...) RSP: 002b:00007f424d9fde68 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 00007f424d9fdef0 RCX: 00007f424e39066a RDX: 0000200000000180 RSI: 0000200000000380 RDI: 0000000000000000 RBP: 0000200000000180 R08: 00007f424d9fdef0 R09: 0000000000000020 R10: 0000000000000020 R11: 0000000000000246 R12: 0000200000000380 R13: 00007f424d9fdeb0 R14: 0000000000000000 R15: 00002000000002c0 </TASK> Modules linked in: ---[ end trace 0000000000000000 ]--- Reported-by: syzbot+884dc4621377ba579a6f@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/68dc3dab.a00a0220.102ee.004e.GAE@google.com/ Fixes: a5ed91828518 ("Btrfs: implement the free space B-tree") CC: <stable@vger.kernel.org> # 6.1.x: 1961d20f6fa8: btrfs: fix assertion when building free space tree CC: <stable@vger.kernel.org> # 6.1.x Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: do not use folio_test_partial_kmap() in ASSERT()s	Qu Wenruo	1	-2/+2
	[BUG] Syzbot reported an ASSERT() triggered inside scrub: BTRFS info (device loop0): scrub: started on devid 1 assertion failed: !folio_test_partial_kmap(folio) :: 0, in fs/btrfs/scrub.c:697 ------------[ cut here ]------------ kernel BUG at fs/btrfs/scrub.c:697! Oops: invalid opcode: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 6077 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025 RIP: 0010:scrub_stripe_get_kaddr+0x1bb/0x1c0 fs/btrfs/scrub.c:697 Call Trace: <TASK> scrub_bio_add_sector fs/btrfs/scrub.c:932 [inline] scrub_submit_initial_read+0xf21/0x1120 fs/btrfs/scrub.c:1897 submit_initial_group_read+0x423/0x5b0 fs/btrfs/scrub.c:1952 flush_scrub_stripes+0x18f/0x1150 fs/btrfs/scrub.c:1973 scrub_stripe+0xbea/0x2a30 fs/btrfs/scrub.c:2516 scrub_chunk+0x2a3/0x430 fs/btrfs/scrub.c:2575 scrub_enumerate_chunks+0xa70/0x1350 fs/btrfs/scrub.c:2839 btrfs_scrub_dev+0x6e7/0x10e0 fs/btrfs/scrub.c:3153 btrfs_ioctl_scrub+0x249/0x4b0 fs/btrfs/ioctl.c:3163 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:597 [inline] __se_sys_ioctl+0xfc/0x170 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> ---[ end trace 0000000000000000 ]--- Which doesn't make much sense, as all the folios we allocated for scrub should not be highmem. [CAUSE] Thankfully syzbot has a detailed kernel config file, showing that CONFIG_DEBUG_KMAP_LOCAL_FORCE_MAP is set to y. And that debug option will force all folio_test_partial_kmap() to return true, to improve coverage on highmem tests. But in our case we really just want to make sure the folios we allocated are not highmem (and they are indeed not). Such incorrect result from folio_test_partial_kmap() is just screwing up everything. [FIX] Replace folio_test_partial_kmap() to folio_test_highmem() so that we won't bother those highmem specific debuging options. Fixes: 5fbaae4b8567 ("btrfs: prepare scrub to support bs > ps cases") Reported-by: syzbot+bde59221318c592e6346@syzkaller.appspotmail.com Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: only set the device specific options after devices are opened	Qu Wenruo	1	-2/+1
	[BUG] With v6.17-rc kernels, btrfs will always set 'ssd' mount option even if the block device is not a rotating one: # cat /sys/block/sdd/queue/rotational 1 # cat /etc/fstab: LABEL=DATA2 /data2 btrfs rw,relatime,space_cache=v2,subvolid=5,subvol=/,nofail,nosuid,nodev 0 0 # mount [...] /dev/sdd on /data2 type btrfs (rw,nosuid,nodev,relatime,ssd,space_cache=v2,subvolid=5,subvol=/) [CAUSE] The 'ssd' mount option is set by set_device_specific_options(), and it expects that if there is any rotating device in the btrfs, it will set fs_devices::rotating. However after commit bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created"), the device opening is delayed until the super block is created. But the timing of set_device_specific_options() is still left as is, this makes the function be called without any device opened. Since no device is opened, thus fs_devices::rotating will never be set, making btrfs incorrectly set 'ssd' mount option. [FIX] Only call set_device_specific_options() after btrfs_open_devices(). Also only call set_device_specific_options() after a new mount, if we're mounting a mounted btrfs, there is no need to set the device specific mount options again. Reported-by: HAN Yuwei <hrx@bupt.moe> Link: https://lore.kernel.org/linux-btrfs/C8FF75669DFFC3C5+5f93bf8a-80a0-48a6-81bf-4ec890abc99a@bupt.moe/ Fixes: bddf57a70781 ("btrfs: delay btrfs_open_devices() until super block is created") CC: stable@vger.kernel.org # 6.17 Signed-off-by: Qu Wenruo <wqu@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: fix memory leak on duplicated memory in the qgroup assign ioctl	Miquel Sabaté Solà	1	-1/+1
	On 'btrfs_ioctl_qgroup_assign' we first duplicate the argument as provided by the user, which is kfree'd in the end. But this was not the case when allocating memory for 'prealloc'. In this case, if it somehow failed, then the previous code would go directly into calling 'mnt_drop_write_file', without freeing the string duplicated from the user space. Fixes: 4addc1ffd67a ("btrfs: qgroup: preallocate memory before adding a relation") CC: stable@vger.kernel.org # 6.12+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Miquel Sabaté Solà <mssola@mssola.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	btrfs: fix clearing of BTRFS_FS_RELOC_RUNNING if relocation already running	Filipe Manana	1	-6/+7
	When starting relocation, at reloc_chunk_start(), if we happen to find the flag BTRFS_FS_RELOC_RUNNING is already set we return an error (-EINPROGRESS) to the callers, however the callers call reloc_chunk_end() which will clear the flag BTRFS_FS_RELOC_RUNNING, which is wrong since relocation was started by another task and still running. Finding the BTRFS_FS_RELOC_RUNNING flag already set is an unexpected scenario, but still our current behaviour is not correct. Fix this by never calling reloc_chunk_end() if reloc_chunk_start() has returned an error, which is what logically makes sense, since the general widespread pattern is to have end functions called only if the counterpart start functions succeeded. This requires changing reloc_chunk_start() to clear BTRFS_FS_RELOC_RUNNING if there's a pending cancel request. Fixes: 907d2710d727 ("btrfs: add cancellable chunk relocation support") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Boris Burkov <boris@bur.io> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-10-13	can: j1939: add missing calls in NETDEV_UNREGISTER notification handler	Tetsuo Handa	1	-0/+2
	Currently NETDEV_UNREGISTER event handler is not calling j1939_cancel_active_session() and j1939_sk_queue_drop_all(). This will result in these calls being skipped when j1939_sk_release() is called. And I guess that the reason syzbot is still reporting unregister_netdevice: waiting for vcan0 to become free. Usage count = 2 is caused by lack of these calls. Calling j1939_cancel_active_session(priv, sk) from j1939_sk_release() can be covered by calling j1939_cancel_active_session(priv, NULL) from j1939_netdev_notify(). Calling j1939_sk_queue_drop_all() from j1939_sk_release() can be covered by calling j1939_sk_netdev_event_netdown() from j1939_netdev_notify(). Therefore, we can reuse j1939_cancel_active_session(priv, NULL) and j1939_sk_netdev_event_netdown(priv) for NETDEV_UNREGISTER event handler. Fixes: 7fcbe5b2c6a4 ("can: j1939: implement NETDEV_UNREGISTER notification handler") Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> Acked-by: Oleksij Rempel <o.rempel@pengutronix.de> Link: https://patch.msgid.link/3ad3c7f8-5a74-4b07-a193-cb0725823558@I-love.SAKURA.ne.jp Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>