path: root/fs
Age  Commit message  Author  Lines
2026-05-15  Merge tag 'v7.1-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6  Linus Torvalds  -63/+130
Pull smb client fixes from Steve French:
 - Fix integer overflow in read
 - Fix smbdirect error cleanup
 - Multichannel reconnect fix
 - Add some missing defines and correct some references to protocol spec
 - Fix oob symlink read

* tag 'v7.1-rc4-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smbdirect: Fix error cleanup in smbdirect_map_sges_from_iter()
  smb: client: avoid integer overflow in SMB2 READ length check
  cifs: client: stage smb3_reconfigure() updates and restore ctx on failure
  smb/client: fix possible infinite loop and oob read in symlink_data()
  SMB3.1.1: add missing QUERY_DIR info levels
2026-05-15  Merge tag 'ceph-for-7.1-rc4' of https://github.com/ceph/ceph-client  Linus Torvalds  -10/+46
Pull ceph fixes from Ilya Dryomov:
 "An important patch from Hristo that squashes a folio reference leak that could lead to OOM kills in CephFS and a number of miscellaneous fixes from Raphael and Slava. All but two are marked for stable"

* tag 'ceph-for-7.1-rc4' of https://github.com/ceph/ceph-client:
  libceph: Fix potential null-ptr-deref in decode_choose_args()
  libceph: handle rbtree insertion error in decode_choose_args()
  libceph: Fix potential out-of-bounds access in osdmap_decode()
  ceph: put folios not suitable for writeback
  ceph: add ceph_has_realms_with_quotas() check to ceph_quota_update_statfs()
  libceph: Fix potential out-of-bounds access in __ceph_x_decrypt()
  ceph: fix BUG_ON in __ceph_build_xattrs_blob() due to stale blob size
  ceph: fix a buffer leak in __ceph_setxattr()
  libceph: Fix unnecessarily high ceph_decode_need() for uniform bucket
  libceph: Fix potential out-of-bounds access in crush_decode()
2026-05-15  Merge tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux  Linus Torvalds  -32/+55
Pull btrfs fixes from David Sterba:
 - fixup warning when allocating memory for readahead, __GFP_NOWARN was accidentally dropped when setting mapping constraints
 - in tracepoint of file sync, fix sleeping in atomic context when handling dentries
 - harden initial loading of block group on crafted/fuzzed images, iterate all chunk mapping entries unconditionally
 - fix freeing pages of submitted io after checking for errors
 - fix incorrect inode size after remount when using fallocate KEEP_SIZE mode (also requires disabled 'no-holes' feature)

* tag 'for-7.1-rc3-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap
  btrfs: only release the dirty pages io tree after successful writes
  btrfs: tracepoints: fix sleep while in atomic context in btrfs_sync_file()
  btrfs: always pass __GFP_NOWARN from add_ra_bio_pages()
  btrfs: fix check_chunk_block_group_mappings() to iterate all chunk maps
2026-05-15  Merge tag 'xfs-fixes-7.1-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux  Linus Torvalds  -21/+52
Pull xfs fixes from Carlos Maiolino:
 "A few bug fixes, nothing really special stands out"

* tag 'xfs-fixes-7.1-rc4' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: Fix typo in comment
  xfs: fix the "limiting open zones" message
  xfs: flush delalloc blocks on ENOSPC in xfs_trans_alloc_icreate
  xfs: check da node block pad field during scrub
  xfs: fix memory leak for data allocated by xfs_zone_gc_data_alloc()
  xfs: fix memory leak on error in xfs_alloc_zone_info()
  xfs: check directory data block header padding in scrub
  xfs: zero directory data block padding on write verification
  xfs: zero entire directory data block header region at init
  xfs: remove the meaningless XFS_ALLOC_FLAG_FREEING
2026-05-15  Merge tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux  Linus Torvalds  -25/+59
Pull nfsd fixes from Chuck Lever:
 "Fixes for this release:
   - Correctness fix for the new sunrpc cache netlink protocol

  Marked for stable:
   - Correctness fixes for delegated attributes
   - Prevent an infinite loop when revoking layouts"

* tag 'nfsd-7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  NFSD: Fix infinite loop in layout state revocation
  sunrpc: start cache request seqno at 1 to fix netlink GET_REQS
  nfsd: update mtime/ctime on COPY in presence of delegated attributes
  nfsd: update mtime/ctime on CLONE in presense of delegated attributes
  nfsd: fix file change detection in CB_GETATTR
  nfsd: fix GET_DIR_DELEGATION when VFS leases are disabled
2026-05-15  Merge tag 'block-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux  Linus Torvalds  -1/+1
Pull block fixes from Jens Axboe:
 - NVMe merge request via Keith:
     - Fix memory leak on a passthrough integrity mapping failure (Keith)
     - Hide secrets behind debug option (Hannes)
     - Fix pci use-after-free for host memory buffer (Chia-Lin Kao)
     - Fix tcp target use-after-free for data digest (Sagi)
     - Revert a mistaken quirk (Alan Cui)
     - Fix uevent and controller state race condition (Maurizio)
     - Fix apple submission queue re-initialization (Nick Chan)
 - Three fixes for blk-integrity, fixing an issue with the user data mapping and two problems with recomputing number of segments
 - Two fixes for the iov_iter bounce buffering
 - Fix for the handling of dead zoned write plugs
 - ublk max_sectors validation fix, with associated selftest addition

* tag 'block-7.1-20260515' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  nvme-apple: Reset q->sq_tail during queue init
  block: align down bounces bios
  block: pass a minsize argument to bio_iov_iter_bounce
  selftests: ublk: cap nthreads to kernel's actual nr_hw_queues
  block: fix handling of dead zone write plugs
  block: bio-integrity: Fix null-ptr-deref in bio_integrity_map_user()
  block: recompute nr_integrity_segments in blk_insert_cloned_request
  block: don't overwrite bip_vcnt in bio_integrity_copy_user()
  nvme: fix race condition between connected uevent and STARTED_ONCE flag
  Revert "nvme: add quirk NVME_QUIRK_IGNORE_DEV_SUBNQN for 144d:a808"
  nvmet-tcp: Fix potential UAF when ddgst mismatch
  nvme-pci: fix use-after-free in nvme_free_host_mem()
  nvmet-auth: Do not print DH-HMAC-CHAP secrets
  nvme: fix bio leak on mapping failure
  nvme: make prp passthrough usage less scary
  ublk: reject max_sectors smaller than PAGE_SECTORS in parameter validation
2026-05-14  smbdirect: Fix error cleanup in smbdirect_map_sges_from_iter()  David Howells  -1/+1
Fix smbdirect_map_sges_from_iter() to use pre-decrement, not post-decrement so that it cleans up the correct slots.

Fixes: e5fbdde43017 ("cifs: Add a function to build an RDMA SGE list from an iterator")
Closes: https://sashiko.dev/#/patchset/20260326104544.509518-1-dhowells%40redhat.com
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
cc: Paulo Alcantara <pc@manguebit.org>
cc: Tom Talpey <tom@talpey.com>
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
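The pre-decrement versus post-decrement distinction above can be illustrated with a small userspace sketch. This is not the smbdirect code: `struct slot`, `map_slots()` and `NSLOTS` are hypothetical names invented for the example. It only shows why an unwind loop that indexes with `--i` releases exactly the slots that were set up, whereas indexing with `i--` would touch the slot that failed and skip slot 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSLOTS 4	/* hypothetical table size for the example */

/* Hypothetical stand-in for a mapped SGE slot; not the kernel's type. */
struct slot {
	bool mapped;
};

/*
 * Map slots in order and fail when index fail_at is reached.
 * On failure, unwind with a pre-decrement so that only the slots
 * that were actually mapped (0 .. i-1) are released.
 */
static int map_slots(struct slot *slots, size_t n, size_t fail_at)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (i == fail_at)
			goto unwind;
		slots[i].mapped = true;
	}
	return 0;

unwind:
	/* i is the index that failed; it was never mapped. */
	while (i > 0)
		slots[--i].mapped = false;
	return -1;
}
```

Calling `map_slots(slots, NSLOTS, 2)` on a zeroed table returns -1 and leaves every slot unmapped.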
2026-05-14  smb: client: avoid integer overflow in SMB2 READ length check  Jeremy Erazo  -7/+12
SMB2 READ response validation in cifs_readv_receive() and handle_read_data() checks data_offset + data_len against the received buffer length. Both values are attacker-controlled fields from the server response and are stored as unsigned int, so the addition can wrap before the bounds check:

  fs/smb/client/transport.c:1259
    if (!use_rdma_mr && (data_offset + data_len > buflen))

  fs/smb/client/smb2ops.c:4839
    else if (buf_len >= data_offset + data_len)

A malicious SMB server can use this to bypass validation. In the non-encrypted receive path the client attempts an oversized socket read and stalls for the SMB response timeout (180 seconds) before reconnecting. In the SMB3 encrypted path, runtime testing shows the malformed length can reach copy_to_iter() in handle_read_data() with attacker-controlled size, where usercopy hardening stops the oversized copy before bytes reach userspace.

Guard both call sites with check_add_overflow(), which is already used elsewhere in this subsystem (smb2pdu.c). On overflow, treat the response as malformed and reject with -EIO.

Signed-off-by: Jeremy Erazo <mendozayt13@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
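The wrap described above can be reproduced in userspace. The sketch below is an illustration, not the patch: `read_range_ok()` and `read_range_ok_naive()` are hypothetical helpers, and `__builtin_add_overflow()` (a GCC/Clang builtin) is used here as a stand-in for the kernel's check_add_overflow().

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when data_offset + data_len fits inside
 * buf_len. The addition is checked first, so a wrapped sum is rejected.
 */
static bool read_range_ok(unsigned int data_offset, unsigned int data_len,
			  unsigned int buf_len)
{
	unsigned int end;

	if (__builtin_add_overflow(data_offset, data_len, &end))
		return false;	/* sum wrapped: treat as malformed */
	return end <= buf_len;
}

/* The unguarded comparison, kept only to show how the wrap slips through. */
static bool read_range_ok_naive(unsigned int data_offset,
				unsigned int data_len, unsigned int buf_len)
{
	return data_offset + data_len <= buf_len;
}
```

With `data_offset = 16` and `data_len = UINT_MAX - 8`, the unsigned sum wraps to 7, so the naive form accepts the response while the checked form rejects it.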
2026-05-14  cifs: client: stage smb3_reconfigure() updates and restore ctx on failure  DaeMyung Kang  -53/+108
smb3_reconfigure() moves strings out of cifs_sb->ctx before the multichannel update, so a later failure can leave the live context with NULL strings or options that do not match the session.

Stage the new ctx separately, commit it only on success, and restore the snapshot on failure. Also make smb3_sync_session_ctx_passwords() all-or-nothing.

Commit session passwords before channel updates so newly added channels authenticate with the staged credentials.

Fixes: ef529f655a2c ("cifs: client: allow changing multichannel mount options on remount")
Reported-by: RAJASI MANDAL <rajasimandalos@gmail.com>
Closes: https://lore.kernel.org/lkml/CAEY6_V1+dzW3OD5zqXhsWyXwrDTrg5tAMGZ1AJ7_GAuRE+aevA@mail.gmail.com/
Link: https://lore.kernel.org/lkml/xkr2dlvgibq5j6gkcxd3yhhnj4atgxw2uy4eug2pxm7wy7nbms@iq6cf5taa65v/
Reviewed-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: DaeMyung Kang <charsyam@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
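The stage-then-commit idea can be sketched in a few lines of userspace C. Everything here is hypothetical (`struct mount_ctx`, `update_channels()`, `reconfigure()`); the real patch also snapshots and restores state, which this simplified version avoids by never touching the live context until the risky step has succeeded.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, much reduced stand-in for a mount context. */
struct mount_ctx {
	char *username;
	int max_channels;
};

static char *dup_str(const char *s)
{
	size_t n = strlen(s) + 1;
	char *d = malloc(n);

	if (d)
		memcpy(d, s, n);
	return d;
}

/* Hypothetical channel update that fails for more than 4 channels. */
static int update_channels(int requested)
{
	return requested > 4 ? -1 : 0;
}

/* Build the new state off to the side; commit only after success. */
static int reconfigure(struct mount_ctx *live, const char *new_user,
		       int new_channels)
{
	struct mount_ctx staged;

	staged.username = dup_str(new_user);
	if (!staged.username)
		return -1;
	staged.max_channels = new_channels;

	if (update_channels(staged.max_channels) != 0) {
		free(staged.username);	/* live context is untouched */
		return -1;
	}

	free(live->username);		/* commit point */
	*live = staged;
	return 0;
}
```

A failed `reconfigure()` leaves `live` exactly as it was, which is the property the commit message is after.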
2026-05-14  smb/client: fix possible infinite loop and oob read in symlink_data()  Ye Bin  -0/+3
On 32-bit architectures, the infinite loop is as follows:

  len = p->ErrorDataLength == 0xfffffff8
  u8 *next = p->ErrorContextData + len
  next == p

On 32-bit architectures, the out-of-bounds read is as follows:

  len = p->ErrorDataLength == 0xfffffff0
  u8 *next = p->ErrorContextData + len
  next == (u8 *)p - 8

Reported-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+")
Cc: stable@vger.kernel.org
Signed-off-by: Ye Bin <yebin10@huawei.com>
Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn>
Signed-off-by: Steve French <stfrench@microsoft.com>
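The two cases above can be checked with 32-bit modular arithmetic. The sketch below models a 32-bit pointer as a `uint32_t` and assumes, from the arithmetic in the message, that ErrorContextData sits 8 bytes into the structure. `next_ctx_addr()` and `ctx_len_ok()` are hypothetical; the three-line fix itself is not shown in this log, so the guard is only one plausible shape for such a check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed offset of ErrorContextData within the context header. */
#define CTX_DATA_OFFSET 8u

/* Address of the next context on a machine with 32-bit pointers. */
static uint32_t next_ctx_addr(uint32_t p, uint32_t len)
{
	return (uint32_t)(p + CTX_DATA_OFFSET + len);	/* wraps mod 2^32 */
}

/*
 * Hypothetical guard: the payload must fit in the bytes that remain
 * in the buffer after the header, counted from p.
 */
static bool ctx_len_ok(uint32_t len, uint32_t remaining)
{
	return remaining >= CTX_DATA_OFFSET &&
	       len <= remaining - CTX_DATA_OFFSET;
}
```

With `len = 0xfffffff8` the computed address equals `p` again (the loop never advances), and with `len = 0xfffffff0` it lands 8 bytes before `p`.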
2026-05-13  block: pass a minsize argument to bio_iov_iter_bounce  Christoph Hellwig  -1/+1
When bouncing for block size > PAGE_SIZE file systems that require file system block size alignment (e.g. zoned XFS), the bio needs to be big enough to fit an entire block.

Fixes: 8dd5e7c75d7b ("block: add helpers to bounce buffer an iov_iter into bios")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@kernel.org>
Link: https://patch.msgid.link/20260507050153.1298375-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-05-12  SMB3.1.1: add missing QUERY_DIR info levels  Steve French  -2/+6
New Infolevels for QUERY_DIR (and QUERY_INFO) levels 78 through 81 are now being used by Windows clients and were added to the documentation. Add defines for them (and correct some typos in documentation).

See MS-SMB2 2.2.33 and MS-FSCC 2.4

Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-11  xfs: Fix typo in comment  Md Shofiqul Islam  -1/+1
Fix spelling mistake in comment:
 - occured -> occurred

Reviewed-by: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Md Shofiqul Islam <shofiqtest@gmail.com>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-05-11  xfs: fix the "limiting open zones" message  Christoph Hellwig  -1/+1
The xfs logging macros already append a newline, so remove the explicit \n, which adds an extra one.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
Reviewed-by: Andrey Albershteyn <aalbersh@kernel.org>
Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-05-11  ceph: put folios not suitable for writeback  Hristo Venev  -0/+2
The batch holds references to the folios (see `filemap_get_folios`, `folio_batch_release`), so we need to `folio_put` the folios we remove.

Tested on v6.18.

Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/74156
Signed-off-by: Hristo Venev <hristo@venev.name>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-05-11  ceph: add ceph_has_realms_with_quotas() check to ceph_quota_update_statfs()  Viacheslav Dubeyko  -10/+27
When MDS rejects a session, remove_session_caps() -> __ceph_remove_cap() -> ceph_change_snap_realm() clears i_snap_realm for every inode that loses its last cap. The realm is restored once caps are re-granted after reconnect. This is not a real error, so this patch replaces pr_err_ratelimited_client() with doutc().

Each of the quota methods ceph_quota_is_max_files_exceeded(), ceph_quota_is_max_bytes_exceeded(), and ceph_quota_is_max_bytes_approaching() calls the ceph_has_realms_with_quotas() check. This patch adds the missing ceph_has_realms_with_quotas() call to ceph_quota_update_statfs().

[ idryomov: add braces around both arms of multiline ifs ]

Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-05-11  ceph: fix BUG_ON in __ceph_build_xattrs_blob() due to stale blob size  Viacheslav Dubeyko  -0/+16
The generic/642 test-case can reproduce the kernel crash: [40243.605254] ------------[ cut here ]------------ [40243.605956] kernel BUG at fs/ceph/xattr.c:918! [40243.607142] Oops: invalid opcode: 0000 [#1] SMP PTI [40243.608067] CPU: 7 UID: 0 PID: 498762 Comm: kworker/7:1 Not tainted 7.0.0-rc7+ #3 PREEMPT(full) [40243.609700] Hardware name: QEMU Ubuntu 25.10 PC v2 (i440FX + PIIX, + 10.1 machine, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 [40243.611820] Workqueue: ceph-msgr ceph_con_workfn [40243.612715] RIP: 0010:__ceph_build_xattrs_blob+0x1b8/0x1e0 [40243.613731] Code: 0f 84 82 fe ff ff e9 cf 8e 56 ff 48 8d 65 e8 31 c0 5b 41 5c 41 5d 5d 31 d2 31 c9 31 f6 31 ff 45 31 c0 45 31 c9 c3 cc cc cc cc <0f> 0b 4c 8b 62 08 41 8b 85 24 07 00 00 49 83 c4 04 41 89 44 24 fc [40243.616888] RSP: 0018:ffffcc80c4d4b688 EFLAGS: 00010287 [40243.617773] RAX: 0000000000010026 RBX: 0000000000000001 RCX: 0000000000000000 [40243.618928] RDX: ffff8a773798dee0 RSI: 0000000000000000 RDI: 0000000000000000 [40243.620158] RBP: ffffcc80c4d4b6a0 R08: 0000000000000000 R09: 0000000000000000 [40243.621573] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8a75f3b58000 [40243.622907] R13: ffff8a75f3b58000 R14: 0000000000000080 R15: 000000000000bffd [40243.624054] FS: 0000000000000000(0000) GS:ffff8a787d1b4000(0000) knlGS:0000000000000000 [40243.625331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [40243.626269] CR2: 000072f390b623c0 CR3: 000000011c02a003 CR4: 0000000000372ef0 [40243.627408] Call Trace: [40243.627839] <TASK> [40243.628188] __prep_cap+0x3fd/0x4a0 [40243.628789] ? do_raw_spin_unlock+0x4e/0xe0 [40243.629474] ceph_check_caps+0x46a/0xc80 [40243.630094] ? __lock_acquire+0x4a2/0x2650 [40243.630773] ? find_held_lock+0x31/0x90 [40243.631347] ? handle_cap_grant+0x79f/0x1060 [40243.632068] ? lock_release+0xd9/0x300 [40243.632696] ? __mutex_unlock_slowpath+0x3e/0x340 [40243.633429] ? 
lock_release+0xd9/0x300 [40243.634052] handle_cap_grant+0xcf6/0x1060 [40243.634745] ceph_handle_caps+0x122b/0x2110 [40243.635415] mds_dispatch+0x5bd/0x2160 [40243.636034] ? ceph_con_process_message+0x65/0x190 [40243.636828] ? lock_release+0xd9/0x300 [40243.637431] ceph_con_process_message+0x7a/0x190 [40243.638184] ? kfree+0x311/0x4f0 [40243.638749] ? kfree+0x311/0x4f0 [40243.639268] process_message+0x16/0x1a0 [40243.639915] ? sg_free_table+0x39/0x90 [40243.640572] ceph_con_v2_try_read+0xf58/0x2120 [40243.641255] ? lock_acquire+0xc8/0x300 [40243.641863] ceph_con_workfn+0x151/0x820 [40243.642493] process_one_work+0x22f/0x630 [40243.643093] ? process_one_work+0x254/0x630 [40243.643770] worker_thread+0x1e2/0x400 [40243.644332] ? __pfx_worker_thread+0x10/0x10 [40243.645020] kthread+0x109/0x140 [40243.645560] ? __pfx_kthread+0x10/0x10 [40243.646125] ret_from_fork+0x3f8/0x480 [40243.646752] ? __pfx_kthread+0x10/0x10 [40243.647316] ? __pfx_kthread+0x10/0x10 [40243.647919] ret_from_fork_asm+0x1a/0x30 [40243.648556] </TASK> [40243.648902] Modules linked in: overlay hctr2 libpolyval chacha libchacha adiantum libnh libpoly1305 essiv intel_rapl_msr intel_rapl_common intel_uncore_frequency_common skx_edac_common nfit kvm_intel kvm irqbypass joydev ghash_clmulni_intel aesni_intel rapl input_leds mac_hid psmouse vga16fb serio_raw vgastate floppy i2c_piix4 pata_acpi bochs qemu_fw_cfg i2c_smbus sch_fq_codel rbd dm_crypt msr parport_pc ppdev lp parport efi_pstore [40243.654766] ---[ end trace 0000000000000000 ]--- Commit d93231a6bc8a ("ceph: prevent a client from exceeding the MDS maximum xattr size") moved the required_blob_size computation to before the __build_xattrs() call, introducing a race. __build_xattrs() releases and reacquires i_ceph_lock during execution. In that window, handle_cap_grant() may update i_xattrs.blob with a newer MDS-provided blob and bump i_xattrs.version. 
When __build_xattrs() detects that index_version < version, it destroys and rebuilds the entire xattr rb-tree from the new blob, potentially increasing count, names_size, and vals_size. The prealloc_blob size check that follows still uses the stale required_blob_size computed before the rebuild, so it passes even when prealloc_blob is too small for the now-larger tree. After __set_xattr() adds one more xattr on top, __ceph_build_xattrs_blob() is called from the cap flush path and hits: BUG_ON(need > ci->i_xattrs.prealloc_blob->alloc_len); Fix this by recomputing required_blob_size after __build_xattrs() returns, using the current tree state. Also re-validate against m_max_xattr_size to fall back to the sync path if the rebuilt tree now exceeds the MDS limit. Cc: stable@vger.kernel.org Fixes: d93231a6bc8a ("ceph: prevent a client from exceeding the MDS maximum xattr size") Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Alex Markuze <amarkuze@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-05-11  ceph: fix a buffer leak in __ceph_setxattr()  Viacheslav Dubeyko  -0/+1
During the retry, old_blob in __ceph_setxattr() can hold the ci->i_xattrs.prealloc_blob value. However, ceph_buffer_put() is never called for the old_blob object. This patch fixes the resulting buffer leak.

Cc: stable@vger.kernel.org
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Alex Markuze <amarkuze@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2026-05-10  NFSD: Fix infinite loop in layout state revocation  Chuck Lever  -0/+7
find_one_sb_stid() skips stids whose sc_status is non-zero, but the SC_TYPE_LAYOUT case in nfsd4_revoke_states() never sets sc_status before calling nfsd4_close_layout(). The retry loop therefore finds the same layout stid on every iteration, hanging the revoker indefinitely.

Fixes: 1e33e1414bec ("nfsd: allow layout state to be admin-revoked.")
Reported-by: Dai Ngo <dai.ngo@oracle.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Tested-by: Dai Ngo <dai.ngo@oracle.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
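The retry-loop hazard described above is a general one: a "find the next unhandled entry" scan only terminates if each handled entry is marked before the next scan. The following sketch is a simplified model with hypothetical types and helpers (`struct stid`, `find_one()`, `revoke_all()`); only the field name sc_status is borrowed from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in; only the status field matters here. */
struct stid {
	int sc_status;	/* non-zero means already revoked or closed */
};

/* Return the first entry that still needs handling, or NULL. */
static struct stid *find_one(struct stid *tab, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (tab[i].sc_status == 0)
			return &tab[i];
	return NULL;
}

/*
 * Revoke every unhandled entry. Setting sc_status before moving on
 * is what lets the next find_one() call skip the entry; without it
 * the loop would return the same entry forever.
 */
static size_t revoke_all(struct stid *tab, size_t n)
{
	size_t revoked = 0;
	struct stid *s;

	while ((s = find_one(tab, n)) != NULL) {
		s->sc_status = 1;
		revoked++;
	}
	return revoked;
}
```

In this model, dropping the `s->sc_status = 1` assignment reproduces the hang: `find_one()` keeps returning the same entry.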
2026-05-10  nfsd: update mtime/ctime on COPY in presence of delegated attributes  Olga Kornievskaia  -1/+11
When delegated attributes are given on open, the file is opened with NOCMTIME and modifying operations do not update mtime/ctime, so as not to get out of sync with the client's delegated view. However, for a COPY operation, the server should update its view of mtime/ctime and reflect that in any GETATTR queries.

Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation")
Cc: stable@vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-10  nfsd: update mtime/ctime on CLONE in presense of delegated attributes  Olga Kornievskaia  -15/+33
When delegated attributes are given on open, the file is opened with NOCMTIME and modifying operations do not update mtime/ctime, so as not to get out of sync with the client's delegated view. However, for a CLONE operation, the server should update its view of mtime/ctime and reflect that in any GETATTR queries.

Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation")
Cc: stable@vger.kernel.org
Signed-off-by: Olga Kornievskaia <okorniev@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-10  nfsd: fix file change detection in CB_GETATTR  Scott Mayhew  -5/+8
RFC 8881, section 10.4.3 doesn't say anything about caching the file size in the delegation record, nor does it say anything about comparing a cached file size with the size reported by the client in the CB_GETATTR reply for the purpose of determining if the client holds modified data for the file.

What section 10.4.3 of RFC 8881 does say is that the server should compare the *current* file size with the size reported by the client holding the delegation in the CB_GETATTR reply, and if they differ to treat it as a modification regardless of the change attribute retrieved via the CB_GETATTR.

Doing otherwise would cause the server to believe the client holding the delegation has a modified version of the file, even if the client flushed the modifications to the server prior to the CB_GETATTR. This would have the added side effect of subsequent CB_GETATTRs causing updates to the mtime, ctime, and change attribute even if the client holding the delegation makes no further updates to the file.

Modify nfsd4_deleg_getattr_conflict() to obtain the current file size via i_size_read(). Retain the ncf_cur_fsize field, since it's a convenient way to return the file size back to nfsd4_encode_fattr4(), but don't use it for the purpose of detecting file changes. Remove the unnecessary initialization of ncf_cur_fsize in nfs4_open_delegation().

Also, if we recall the delegation (because the client didn't respond to the CB_GETATTR), then skip the logic that checks the nfs4_cb_fattr fields.

Fixes: c5967721e106 ("NFSD: handle GETATTR conflict with write delegation")
Cc: stable@vger.kernel.org
Signed-off-by: Scott Mayhew <smayhew@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2026-05-09  Merge tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux  Linus Torvalds  -1/+1
Pull fsverity fix from Eric Biggers:
 "Fix a regression in overlayfs caused by an fsverity API change"

* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
  ovl: fix verity lazy-load guard broken by fsverity_active() semantic change
2026-05-08  Merge tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6  Linus Torvalds  -24/+96
Pull smb client fixes from Steve French:
 - Fix for two ACL issues (security fix to validate dacloffset better and chmod fix)
 - Fix out of bounds reads (in check_wsl_eas and smb2_check_msg for symlinks)
 - Two Kerberos fixes including an important one when AES-256 encryption chosen
 - Fix open_cached_dir problem when directory leases disabled

* tag 'v7.1-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb: client: validate dacloffset before building DACL pointers
  smb/client: fix out-of-bounds read in smb2_compound_op()
  smb/client: fix out-of-bounds read in symlink_data()
  smb: client: Zero-pad short GSS session keys per MS-SMB2
  smb: client: Use FullSessionKey for AES-256 encryption key derivation
  smb: client: use kzalloc to zero-initialize security descriptor buffer
  cifs: abort open_cached_dir if we don't request leases
2026-05-08  btrfs: fix incorrect i_size after remount caused by KEEP_SIZE prealloc gap  Robbie Ko  -0/+28
When fallocate() with FALLOC_FL_KEEP_SIZE preallocates an extent past the current i_size, the file_extent_tree of the inode is updated to cover that range. However, on the next mount, btrfs_read_locked_inode() only re-populates file_extent_tree with [0, round_up(i_size, sectorsize)), losing the marks that belonged to the KEEP_SIZE prealloc extent beyond i_size. Later, when a non-KEEP_SIZE fallocate() extends i_size into / past that old prealloc extent, the reservation loop in btrfs_fallocate() skips already-prealloc segments and does not call into the path that marks the file_extent_tree, so a gap remains inside the file_extent_tree across [old_aligned_i_size, start_of_new_alloc). Then __btrfs_prealloc_file_range() calls btrfs_inode_safe_disk_i_size_write(), which uses find_contiguous_extent_bit() starting at offset 0 to derive disk_i_size. The walk stops at the gap, so disk_i_size ends up smaller than i_size and gets persisted. After the next mount, the file shows the wrong (smaller) size. The following reproducer triggers the problem: $ cat test.sh MNT=/mnt/sdi DEV=/dev/sdi mkdir -p $MNT mkfs.btrfs -f -O ^no-holes $DEV mount $DEV $MNT touch $MNT/file1 # KEEP_SIZE prealloc beyond i_size (i_size stays 0) fallocate -n -o 4M -l 4M $MNT/file1 umount $MNT mount $DEV $MNT # non-KEEP_SIZE fallocate that overlaps the previous prealloc tail # and extends past it fallocate -o 7M -l 2M $MNT/file1 ls -lh $MNT/file1 umount $MNT mount $DEV $MNT ls -lh $MNT/file1 umount $MNT Running the reproducer gives the following result: $ ./test.sh (...) -rw-rw-r-- 1 root root 9.0M May 4 16:35 /mnt/sdi/file1 -rw-rw-r-- 1 root root 7.0M May 4 16:35 /mnt/sdi/file1 The size before the second mount is correct (9M), but after the remount it drops to 7M, i.e. the start of the gap inside file_extent_tree. 
Fix this in __btrfs_prealloc_file_range() by marking the entire range [round_down(old_i_size, sectorsize), round_up(new_i_size, sectorsize)) in file_extent_tree before updating i_size and calling btrfs_inode_safe_disk_i_size_write(). This ensures the contiguous bit search starting from 0 is not truncated by a stale gap left behind by a previous KEEP_SIZE prealloc that was not restored on inode load. The fix has no effect when the NO_HOLES feature is enabled because btrfs_inode_safe_disk_i_size_write() and btrfs_inode_set_file_extent_range() both take the fast path that directly tracks disk_i_size without consulting file_extent_tree. Fixes: 9ddc959e802b ("btrfs: use the file extent tree infrastructure") Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Robbie Ko <robbieko@synology.com> [ Minor updates to the change log ] Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
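The effect of a gap on a contiguous search from offset 0 can be modelled with a toy bitmap. This is a sketch under stated assumptions, not btrfs code: one array entry stands for 1 MiB, `contiguous_from_zero()` and `mark_range()` are hypothetical helpers, and the starting state (entries 0..6 and 8 marked, entry 7 unmarked) is inferred from the reproducer output rather than taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NSECT 9	/* one entry per 1 MiB, purely illustrative */

/* Length of the marked run that starts at index 0. */
static size_t contiguous_from_zero(const bool *marked, size_t n)
{
	size_t i = 0;

	while (i < n && marked[i])
		i++;
	return i;
}

/* Mark the half-open range [start, end). */
static void mark_range(bool *marked, size_t start, size_t end)
{
	for (size_t i = start; i < end; i++)
		marked[i] = true;
}
```

With the gap at entry 7 the search stops at 7 (the 7M size seen after remount); marking the whole range up to the new size first lets it reach 9.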
2026-05-08  btrfs: only release the dirty pages io tree after successful writes  Qu Wenruo  -5/+5
[WARNING] With extra warning on dirty extent buffers at umount (aka, the next patch in the series), test case generic/388 can trigger the following warning about dirty extent buffers at unmount time: BTRFS critical (device dm-2 state E): emergency shutdown BTRFS error (device dm-2 state E): error while writing out transaction: -30 BTRFS warning (device dm-2 state E): Skipping commit of aborted transaction. BTRFS error (device dm-2 state EA): Transaction 9 aborted (error -30) BTRFS: error (device dm-2 state EA) in cleanup_transaction:2068: errno=-30 Readonly filesystem BTRFS info (device dm-2 state EA): forced readonly BTRFS info (device dm-2 state EA): last unmount of filesystem 4fbf2e15-f941-49a0-bc7c-716315d2777c ------------[ cut here ]------------ WARNING: disk-io.c:3311 at invalidate_and_check_btree_folios+0xfd/0x1ca [btrfs], CPU#8: umount/914368 CPU: 8 UID: 0 PID: 914368 Comm: umount Tainted: G OE 7.1.0-rc1-custom+ #372 PREEMPT(full) 2de38db8d1deae71fde295430a0ff3ab98ccf596 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS unknown 02/02/2022 RIP: 0010:invalidate_and_check_btree_folios+0xfd/0x1ca [btrfs] Call Trace: <TASK> close_ctree+0x52e/0x574 [btrfs d2f0b1cd330d1287e7a9919d112eadfc0e914efd] generic_shutdown_super+0x89/0x1a0 kill_anon_super+0x16/0x40 btrfs_kill_super+0x16/0x20 [btrfs d2f0b1cd330d1287e7a9919d112eadfc0e914efd] deactivate_locked_super+0x2d/0xb0 cleanup_mnt+0xdc/0x140 task_work_run+0x5a/0xa0 exit_to_user_mode_loop+0x123/0x4b0 do_syscall_64+0x243/0x7c0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 </TASK> ---[ end trace 0000000000000000 ]--- BTRFS warning (device dm-2 state EA): unable to release extent buffer 30539776 owner 9 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 30621696 owner 257 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 30638080 owner 258 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 
30654464 owner 7 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 30703616 owner 2 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 30720000 owner 10 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 30736384 owner 4 gen 9 refs 2 flags 0x7 BTRFS warning (device dm-2 state EA): unable to release extent buffer 30752768 owner 11 gen 9 refs 2 flags 0x7 I'm using a stripped down version, which seems to trigger the warning more reliably: _fsstress_pid="" workload() { dmesg -C mkfs.btrfs -f -K $dev > /dev/null echo 1 > /sys/kernel/debug/clear_warn_once mount $dev $mnt $fsstress -w -n 1024 -p 4 -d $mnt & _fsstress_pid=$! sleep 0 $godown $mnt pkill --echo -PIPE fsstress > /dev/null wait $_fsstress_pid unset _fsstress_pid umount $mnt if dmesg | grep -q "WARNING"; then fail fi } for (( i = 0; i < $runtime; i++ )); do echo "=== $i/$runtime ===" workload done [CAUSE] Inside btrfs_write_and_wait_transaction(), we first try to write all dirty ebs, then wait for them to finish. After that we call btrfs_extent_io_tree_release() to free all extent states from dirty_pages io tree. However if we hit an error from btrfs_write_marked_extent(), then we still call btrfs_extent_io_tree_release() to clear that dirty_pages io tree, which may contain dirty records that we haven't yet submitted. Furthermore, the later transaction cleanup path will utilize that dirty_pages io tree to properly cleanup those dirty ebs, but since it's already empty, no dirty ebs are properly cleaned up, thus will later trigger the warnings inside invalidate_btree_folios(). [FIX] Normally such dirty ebs won't cause problems, as when the iput() is called on the btree inode, the dirty ebs will be forcibly written back, and since the fs is already in an error status, such writeback will not reach disk and finish immediately. 
But it's still better to get rid of such dirty ebs, if we ended up with dirty ebs but the fs is not in an error status, then such writeback at iput() time will be too late, as all workers are already stopped but writeback will utilize workers, which will lead to NULL pointer dereferences. Instead of unconditionally calling btrfs_extent_io_tree_release(), only call it if btrfs_write_and_wait_transaction() finished successfully, so that @dirty_pages extent io tree is kept untouched for transaction cleanup. CC: stable@vger.kernel.org # 6.1+ Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Qu Wenruo <wqu@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2026-05-08  btrfs: always pass __GFP_NOWARN from add_ra_bio_pages()  Calvin Owens  -12/+14
A build workload newly prints order-0 allocation failures on 7.1-rc1: sh: page allocation failure: order:0 mode:0x14084a(__GFP_HIGHMEM|__GFP_MOVABLE|__GFP_IO|__GFP_KSWAPD_RECLAIM| __GFP_COMP|__GFP_HARDWALL) CPU: 27 UID: 1000 PID: 855540 Comm: sh Not tainted 7.1.0-rc1-llvm-00058-gdca922e019dd #1 PREEMPTLAZY Call Trace: <TASK> dump_stack_lvl+0x50/0x70 warn_alloc+0xeb/0x100 __alloc_pages_slowpath+0x567/0x5a0 ? filemap_get_entry+0x11a/0x140 __alloc_frozen_pages_noprof+0x249/0x2d0 alloc_pages_mpol+0xe4/0x180 folio_alloc_noprof+0x80/0xa0 add_ra_bio_pages+0x13c/0x4b0 btrfs_submit_compressed_read+0x229/0x300 submit_one_bio+0x9e/0xe0 btrfs_readahead+0x185/0x1a0 [...] (lldb) source list -a add_ra_bio_pages+0x13c .../vmlinux.unstripped add_ra_bio_pages + 316 at .../fs/btrfs/compression.c:454:8 451 452 folio = filemap_alloc_folio(mapping_gfp_constraint(mapping, constraint_gfp), 453 0, NULL); -> 454 if (!folio) 455 break; I can reproduce this consistently by running a memory hog concurrently with a buffered writer on a machine with a very large amount of swap. Commit 7ae37b2c94ed ("btrfs: prevent direct reclaim during compressed readahead") clearly intended to suppress these warnings. But because the mask set in the address_space with mapping_set_gfp_mask() doesn't include __GFP_NOWARN, mapping_gfp_constraint() removes it from constraint_gfp before it is passed to filemap_alloc_folio(). Fix by refactoring the code to add __GFP_NOWARN after the call to mapping_gfp_constraint(). Fixes: 7ae37b2c94ed ("btrfs: prevent direct reclaim during compressed readahead") Signed-off-by: Calvin Owens <calvin@wbinvd.org> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
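The masking effect described above comes down to the order of an AND and an OR. In the sketch below the flag values and helper names (`F_IO`, `F_FS`, `F_NOWARN`, `constrain()`, `ra_gfp_before()`, `ra_gfp_after()`) are made up for illustration; `constrain()` assumes the constraint helper simply intersects the requested flags with the mapping's mask, which is how the message describes the flag being removed.

```c
#include <assert.h>

/* Made-up flag bits; the real GFP values differ. */
#define F_IO     0x1u
#define F_FS     0x2u
#define F_NOWARN 0x4u

/* Keep only the requested flags that the mapping's mask also allows. */
static unsigned int constrain(unsigned int mapping_mask, unsigned int wanted)
{
	return mapping_mask & wanted;
}

/* Before: NOWARN is part of the request, so the AND can strip it. */
static unsigned int ra_gfp_before(unsigned int mapping_mask)
{
	return constrain(mapping_mask, F_IO | F_NOWARN);
}

/* After: NOWARN is OR-ed in once the constraint has been applied. */
static unsigned int ra_gfp_after(unsigned int mapping_mask)
{
	return constrain(mapping_mask, F_IO) | F_NOWARN;
}
```

For a mapping mask of `F_IO | F_FS`, the first form loses `F_NOWARN` and the second keeps it.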
2026-05-08btrfs: fix check_chunk_block_group_mappings() to iterate all chunk mapsZhengYuan Huang-15/+8
[BUG]
A corrupted image with a chunk present in the chunk tree but whose corresponding block group item is missing from the extent tree can be mounted successfully, even though check_chunk_block_group_mappings() is supposed to catch exactly this corruption at mount time.

Once mounted, running btrfs balance with a usage filter (-dusage=N or -dusage=min..max) triggers a null-ptr-deref:

    KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077]
    RIP: 0010:chunk_usage_filter fs/btrfs/volumes.c:3874 [inline]
    RIP: 0010:should_balance_chunk fs/btrfs/volumes.c:4018 [inline]
    RIP: 0010:__btrfs_balance fs/btrfs/volumes.c:4172 [inline]
    RIP: 0010:btrfs_balance+0x2024/0x42b0 fs/btrfs/volumes.c:4604

[CAUSE]
The crash occurs because __btrfs_balance() iterates the on-disk chunk tree, finds the orphaned chunk, calls chunk_usage_filter() (or chunk_usage_range_filter()), which queries the in-memory block group cache via btrfs_lookup_block_group(). Since no block group was ever inserted for this chunk, the lookup returns NULL, and the subsequent dereference of cache->used crashes.

check_chunk_block_group_mappings() uses btrfs_find_chunk_map() to iterate the in-memory chunk map (fs_info->mapping_tree):

    map = btrfs_find_chunk_map(fs_info, start, 1);

With @start = 0 and @length = 1, btrfs_find_chunk_map() looks for a chunk map that *contains* the logical address 0. If no chunk contains logical address 0, btrfs_find_chunk_map(fs_info, 0, 1) returns NULL immediately and the loop breaks after the very first iteration, having checked zero chunks. The entire verification function is therefore a no-op, and the corrupted image passes the mount-time check undetected.

[FIX]
Replace the btrfs_find_chunk_map() based loop with a direct in-order walk of fs_info->mapping_tree using rb_first_cached() + rb_next(). This guarantees that every chunk map in the tree is visited regardless of the logical addresses involved.

No lock is taken around the traversal.
This function is called during mount from btrfs_read_block_groups(), which is invoked from open_ctree() before any background threads (cleaner, transaction kthread, etc.) are started. There are therefore no concurrent writers that could modify mapping_tree at this point. An analogous lockless direct traversal of mapping_tree already exists in fill_dummy_bgs() in the same file. Since we walk the rb-tree directly via rb_entry() without going through btrfs_find_chunk_map(), no reference is taken on each map entry, so the btrfs_free_chunk_map() calls are also removed. Signed-off-by: ZhengYuan Huang <gality369@gmail.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
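The difference between the two iteration strategies can be sketched with a sorted array standing in for the rb-tree. Everything named demo_* is hypothetical, and the array walk only models the in-order traversal; it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_chunk {
	uint64_t start;
	uint64_t len;
	bool has_block_group;
};

/* Models a lookup that only succeeds if some chunk contains 'addr'. */
const struct demo_chunk *demo_find_containing(const struct demo_chunk *maps,
					      size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= maps[i].start && addr - maps[i].start < maps[i].len)
			return &maps[i];
	return NULL;
}

/* Old pattern: start at 0 and hop from chunk to chunk via containment lookups. */
size_t demo_missing_by_lookup(const struct demo_chunk *maps, size_t n)
{
	size_t missing = 0;
	uint64_t start = 0;

	assert(maps != NULL || n == 0);
	for (;;) {
		const struct demo_chunk *map = demo_find_containing(maps, n, start);

		if (!map)
			break;	/* stops at once if nothing covers 'start' */
		if (!map->has_block_group)
			missing++;
		start = map->start + map->len;
	}
	return missing;
}

/* New pattern: visit every entry in order, whatever its address. */
size_t demo_missing_in_order(const struct demo_chunk *maps, size_t n)
{
	size_t missing = 0;

	assert(maps != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!maps[i].has_block_group)
			missing++;
	return missing;
}
```

When the first chunk starts above logical address 0, demo_missing_by_lookup() reports nothing, while demo_missing_in_order() still finds the chunk without a block group.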
2026-05-07smb: client: validate dacloffset before building DACL pointersMichael Bommarito-3/+32
parse_sec_desc(), build_sec_desc(), and the chown path in id_mode_to_cifs_acl() all add the server-supplied dacloffset to pntsd before proving a DACL header fits inside the returned security descriptor. On 32-bit builds a malicious server can return dacloffset near U32_MAX, wrap the derived DACL pointer below end_of_acl, and then slip past the later pointer-based bounds checks. build_sec_desc() and id_mode_to_cifs_acl() can then dereference DACL fields from the wrapped pointer in the chmod/chown rewrite paths. Validate dacloffset numerically before building any DACL pointer and reuse the same helper at the three DACL entry points. Fixes: bc3e9dd9d104 ("cifs: Change SIDs in ACEs while transferring file ownership.") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-6 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
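A minimal sketch of validating an offset numerically before deriving a pointer from it. The helper names are hypothetical and the real helper may check more than this (for example a minimum offset); the point is only that the comparison is done on integers, so a wrapped pointer is never formed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if a header of 'hdr_size' bytes placed at 'offset' lies wholly inside
 * a buffer of 'buf_len' bytes. Written as a subtraction on the trusted side
 * so that no addition involving the untrusted offset can overflow.
 */
bool demo_offset_fits(uint32_t offset, size_t hdr_size, size_t buf_len)
{
	if (hdr_size > buf_len)
		return false;
	return offset <= buf_len - hdr_size;
}

/* Only build the pointer after the numeric check has passed. */
const uint8_t *demo_header_ptr(const uint8_t *buf, size_t buf_len,
			       uint32_t offset, size_t hdr_size)
{
	assert(buf != NULL);
	if (!demo_offset_fits(offset, hdr_size, buf_len))
		return NULL;
	return buf + offset;
}
```

An offset near UINT32_MAX is rejected by the integer comparison regardless of pointer width, which is the property the pointer-based checks lacked on 32-bit builds.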
2026-05-07smb/client: fix out-of-bounds read in smb2_compound_op()Zisen Ye-4/+8
If a server sends a truncated response but a large OutputBufferLength, and terminates the EA list early, check_wsl_eas() returns success without validating that the entire OutputBufferLength fits within iov_len. Then smb2_compound_op() does: memcpy(idata->wsl.eas, data[0], size[0]); Where size[0] is OutputBufferLength. If iov_len is smaller than size[0], memcpy can read beyond the end of the rsp_iov allocation and leak adjacent kernel heap memory. Link: https://lore.kernel.org/linux-cifs/d998240c-aca9-420d-9dbd-f5ba24af19e0@chenxiaosong.com/ Fixes: ea41367b2a60 ("smb: client: introduce SMB2_OP_QUERY_WSL_EA") Cc: stable@vger.kernel.org Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-07smb/client: fix out-of-bounds read in symlink_data()Zisen Ye-1/+2
Since smb2_check_message() returns success without length validation for the symlink error response, in symlink_data() it is possible for iov->iov_len to be smaller than sizeof(struct smb2_err_rsp). If the buffer only contains the base SMB2 header (64 bytes), accessing err->ErrorContextCount (at offset 66) or err->ByteCount later in symlink_data() will cause an out-of-bounds read. Link: https://lore.kernel.org/linux-cifs/297d8d9b-adf7-42fd-a1c2-5b1f230032bc@chenxiaosong.com/ Fixes: 76894f3e2f71 ("cifs: improve symlink handling for smb2+") Cc: Stable@vger.kernel.org Signed-off-by: Zisen Ye <zisenye@stu.xidian.edu.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
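The length check can be illustrated with a stand-in structure. struct demo_err_rsp is a hypothetical model that only reproduces the offsets quoted above (a 64-byte header with ErrorContextCount at offset 66); the field layout beyond that is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the error response layout. */
struct demo_err_rsp {
	uint8_t  hdr[64];		/* base SMB2 header */
	uint16_t structure_size;
	uint8_t  error_context_count;	/* offset 66 */
	uint8_t  reserved;
	uint32_t byte_count;
};

static_assert(offsetof(struct demo_err_rsp, error_context_count) == 66,
	      "model must match the offset quoted in the commit message");
static_assert(sizeof(struct demo_err_rsp) == 72, "no unexpected padding");

/* Returns the context count, or -EINVAL if the buffer is too short to hold it. */
int demo_error_context_count(const uint8_t *buf, size_t len)
{
	if (buf == NULL || len < sizeof(struct demo_err_rsp))
		return -EINVAL;
	return buf[offsetof(struct demo_err_rsp, error_context_count)];
}
```

A buffer holding only the 64-byte header is rejected before any field past the header is touched.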
2026-05-07smb: client: Zero-pad short GSS session keys per MS-SMB2Piyush Sachdeva-5/+18
Per MS-SMB2 section 3.2.5.3, Session.SessionKey is the first 16 bytes of the GSS cryptographic key, right-padded with zero bytes if the key is shorter than 16 bytes. SMB2_auth_kerberos() copies the GSS session key from the cifs.upcall response using kmemdup(msg->data, msg->sesskey_len, ...) and stores the GSS-reported length verbatim in ses->auth_key.len. generate_key() reads SMB2_NTLMV2_SESSKEY_SIZE bytes from this buffer when feeding the HMAC-SHA256 KDF for signing key derivation. If a GSS mechanism returns a session key shorter than 16 bytes (e.g. a deprecated single-DES Kerberos enctype with an 8-byte session key), the KDF call performs an out-of-bounds slab read and derives keys that do not match the server, which pads per the spec. Modern KDCs disable short-key enctypes by default, so this is latent rather than reachable in production, but it is still a kernel heap over-read. Allocate auth_key.response with kzalloc() at a length of max(msg->sesskey_len, SMB2_NTLMV2_SESSKEY_SIZE), copy the GSS key in, and rely on kzalloc()'s zero initialization for the spec-mandated padding. Set ses->auth_key.len to the padded length. Larger GSS keys (e.g. the 32-byte aes256-cts-hmac-sha1-96 session key) continue to be stored at their natural length, preserving the FullSessionKey path. Emit a cifs_dbg(VFS, ...) message when a short key is encountered to surface deprecated-enctype usage. NTLMv2 and NTLMSSP code paths produce a 16-byte session key by construction and are unaffected. Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com> Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
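The padding rule can be sketched in userspace with calloc() standing in for kzalloc(). demo_pad_session_key() and DEMO_SESSKEY_SIZE are hypothetical names; the 16-byte minimum is taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SESSKEY_SIZE 16

/*
 * Returns a zero-initialized copy of 'key' that is at least 16 bytes long,
 * so short keys end up right-padded with zero bytes. Longer keys keep their
 * natural length. The caller frees the result.
 */
uint8_t *demo_pad_session_key(const uint8_t *key, size_t key_len, size_t *out_len)
{
	size_t len = key_len > DEMO_SESSKEY_SIZE ? key_len : DEMO_SESSKEY_SIZE;
	uint8_t *buf;

	assert(key != NULL && out_len != NULL);
	buf = calloc(1, len);
	if (!buf)
		return NULL;
	memcpy(buf, key, key_len);	/* bytes past key_len stay zero */
	*out_len = len;
	return buf;
}
```

An 8-byte key yields a 16-byte buffer whose last 8 bytes are zero; a 32-byte key is stored unchanged at 32 bytes.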
2026-05-07smb: client: Use FullSessionKey for AES-256 encryption key derivationPiyush Sachdeva-10/+27
When Kerberos authentication is used with AES-256 encryption (AES-256-CCM or AES-256-GCM), the SMB3 encryption and decryption keys must be derived using the full session key (Session.FullSessionKey) rather than just the first 16 bytes (Session.SessionKey). Per MS-SMB2 section 3.2.5.3.1, when Connection.Dialect is "3.1.1" and Connection.CipherId is AES-256-CCM or AES-256-GCM, Session.FullSessionKey must be set to the full cryptographic key from the GSS authentication context. The encryption and decryption key derivation (SMBC2SCipherKey, SMBS2CCipherKey) must use this FullSessionKey as the KDF input. The signing key derivation continues to use Session.SessionKey (first 16 bytes) in all cases. Previously, generate_key() hardcoded SMB2_NTLMV2_SESSKEY_SIZE (16) as the HMAC-SHA256 key input length for all derivations. When Kerberos with AES-256 provides a 32-byte session key, the KDF for encryption/decryption was using only the first 16 bytes, producing keys that did not match the server's, causing mount failures with sec=krb5 and require_gcm_256=1. Add a full_key_size parameter to generate_key() and pass the appropriate size from generate_smb3signingkey(): - Signing: always SMB2_NTLMV2_SESSKEY_SIZE (16 bytes) - Encryption/Decryption: ses->auth_key.len when AES-256, otherwise 16 Also fix cifs_dump_full_key() to report the actual session key length for AES-256 instead of hardcoded CIFS_SESS_KEY_SIZE, so that userspace tools like Wireshark receive the correct key for decryption. Cc: <stable@vger.kernel.org> Reviewed-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Piyush Sachdeva <psachdeva@microsoft.com> Signed-off-by: Piyush Sachdeva <s.piyush1024@gmail.com> Signed-off-by: Steve French <stfrench@microsoft.com>
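The size selection described above can be summarised as a small decision function. The enum and function names are hypothetical; only the rule (signing always uses 16 bytes, encryption and decryption use the full key length for AES-256 ciphers) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SESSKEY_SIZE 16

enum demo_cipher {
	DEMO_AES128_CCM,
	DEMO_AES128_GCM,
	DEMO_AES256_CCM,
	DEMO_AES256_GCM,
};

enum demo_key_use {
	DEMO_SIGNING,
	DEMO_CIPHER_KEY,	/* encryption or decryption key */
};

/* Number of session-key bytes fed to the KDF for a given derivation. */
size_t demo_kdf_input_len(enum demo_key_use use, enum demo_cipher cipher,
			  size_t auth_key_len)
{
	/* Assumes the stored key is already at least 16 bytes long. */
	assert(auth_key_len >= DEMO_SESSKEY_SIZE);

	if (use == DEMO_SIGNING)
		return DEMO_SESSKEY_SIZE;
	if (cipher == DEMO_AES256_CCM || cipher == DEMO_AES256_GCM)
		return auth_key_len;
	return DEMO_SESSKEY_SIZE;
}
```

With a 32-byte Kerberos session key, only the AES-256 cipher-key derivations see all 32 bytes; every other derivation keeps using the first 16.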
2026-05-06Merge tag 'v7.1-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds-342/+508
Pull smb server fixes from Steve French:

 - Fix memory leak in connection free
 - Fix inherited ACL ACE validation
 - Minor cleanup
 - Fix for share config
 - Fix durable handle cleanup race
 - Fix close_file_table_ids in session teardown
 - smbdirect fixes:
     - Fix memory region registration
     - Two fixes for out-of-tree builds

* tag 'v7.1-rc3-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: validate inherited ACE SID length
  ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()
  ksmbd: fail share config requests when path allocation fails
  ksmbd: close durable scavenger races against m_fp_list lookups
  ksmbd: harden file lifetime during session teardown
  ksmbd: centralize ksmbd_conn final release to plug transport leak
  smb: smbdirect: fix MR registration for coalesced SG lists
  smb: smbdirect: introduce and use include/linux/smbdirect.h
  smb: smbdirect: make use of DEFAULT_SYMBOL_NAMESPACE and EXPORT_SYMBOL_GPL
2026-05-06ovl: fix verity lazy-load guard broken by fsverity_active() semantic changeColin Walters-1/+1
Commit f77f281b6118 ("fsverity: use a hashtable to find the fsverity_info") made fsverity_active() check whether the inode has the verity flag, rather than whether the inode's fsverity_info is loaded. This broke ovl_ensure_verity_loaded(), which wants to load the fsverity_info for any verity inodes that haven't had it loaded yet. Therefore, to check that the fsverity_info hasn't yet been loaded, use fsverity_get_info(inode) == NULL instead of !fsverity_active(inode). Also, since fsverity_get_info() now involves a hash table lookup, put the more lightweight IS_VERITY() flag check first. Fixes: f77f281b6118 ("fsverity: use a hashtable to find the fsverity_info") Cc: stable@vger.kernel.org Link: https://github.com/bootc-dev/bootc/issues/2174 Signed-off-by: Colin Walters <walters@verbum.org> Acked-by: Amir Goldstein <amir73il@gmail.com> Link: https://patch.msgid.link/20260505224257.23213-1-ebiggers@kernel.org Signed-off-by: Eric Biggers <ebiggers@kernel.org>
2026-05-06Merge tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efiLinus Torvalds-4/+1
Pull EFI fixes from Ard Biesheuvel:

 - Fix issues in EFI graceful recovery on x86 introduced by changes to the kernel mode FPU APIs
 - I-cache coherency fixes for the LoongArch EFI stub
 - Locking fix for EFI pstore
 - Code tweak for efivarfs

* tag 'efi-fixes-for-v7.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
  x86/efi: Restore IRQ state in EFI page fault handler
  x86/efi: Fix graceful fault handling after FPU softirq changes
  efi/libstub: Synchronize instruction cache after kernel relocation
  efi/loongarch: Implement efi_cache_sync_image()
  efi/libstub: Move efi_relocate_kernel() into its only remaining user
  efi: pstore: Drop efivar lock when efi_pstore_open() returns with an error
  efivarfs: use QSTR() in efivarfs_alloc_dentry
2026-05-03smb: client: use kzalloc to zero-initialize security descriptor bufferBjoern Doebel-1/+1
Commit 62e7dd0a39c2d ("smb: common: change the data type of num_aces to le16") split struct smb_acl's __le32 num_aces field into __le16 num_aces and __le16 reserved. The reserved field corresponds to Sbz2 in the MS-DTYP ACL wire format, which must be zero [1]. When building an ACL descriptor in build_sec_desc(), we are using a kmalloc()'ed descriptor buffer and writing the fields explicitly using le16() writes now. This never writes to the 2 byte reserved field, leaving it as uninitialized heap data. When the reserved field happens to contain non-zero slab garbage, Samba rejects the security descriptor with "ndr_pull_security_descriptor failed: Range Error", causing chmod to fail with EINVAL. Change kmalloc() to kzalloc() to ensure the entire buffer is zero-initialized. Fixes: 62e7dd0a39c2d ("smb: common: change the data type of num_aces to le16") Cc: stable@vger.kernel.org Signed-off-by: Bjoern Doebel <doebel@amazon.de> Assisted-by: Kiro:claude-opus-4.6 [1] https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dtyp/20233ed8-a6c6-4097-aafa-dd545ed24428 Signed-off-by: Steve French <stfrench@microsoft.com>
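A userspace sketch of why a zeroing allocator closes this gap. struct demo_acl is a hypothetical model of the four 16-bit fields described above, calloc() stands in for kzalloc(), and byte order is ignored.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of the ACL header after the num_aces split. */
struct demo_acl {
	uint16_t revision;
	uint16_t size;
	uint16_t num_aces;
	uint16_t reserved;	/* must be zero on the wire */
};

static_assert(sizeof(struct demo_acl) == 8, "no padding expected");

/*
 * Builds an ACL header by writing each meaningful field explicitly.
 * 'reserved' is never assigned; it is zero only because calloc() returns
 * zero-filled memory. With malloc() it would hold whatever was on the heap.
 */
struct demo_acl *demo_build_acl(uint16_t revision, uint16_t num_aces)
{
	struct demo_acl *acl = calloc(1, sizeof(*acl));

	if (!acl)
		return NULL;
	acl->revision = revision;
	acl->size = sizeof(*acl);
	acl->num_aces = num_aces;
	return acl;
}
```

Zeroing the whole buffer also covers any field added later that the builder forgets to write, which is why the one-line allocator change is enough here.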
2026-05-03cifs: abort open_cached_dir if we don't request leasesShyam Prasad N-0/+8
It is possible that SMB2_open_init may not set lease context based on the requested oplock level. This can happen when leases have been temporarily or permanently disabled. When this happens, open_cached_dir makes an open without a lease context, and the response is rejected by open_cached_dir anyway (thereby forcing a close to discard this open). That is two unnecessary round-trips to the server.

This change adds a check before making the open request to the server to make sure that SMB2_open_init did add the expected lease context to the open in open_cached_dir.

Cc: <stable@vger.kernel.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-02Merge tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfsLinus Torvalds-25/+72
Pull ntfs fixes from Namjae Jeon:

 - Fix a NULL pointer dereference in ntfs_index_walk_down() by validating index block allocation
 - Fix a memory leak of the symlink target string in ntfs_reparse_set_wsl_symlink() during error paths
 - Prevent VCN overflow and validate lowest_vcn in ntfs_mapping_pairs_decompress() to avoid runlist corruption
 - Fix a page reference leak in ntfs_write_iomap_end_resident() when attribute search context allocation fails
 - Fix an invalid PTR_ERR() usage on a valid folio pointer in __ntfs_bitmap_set_bits_in_run()
 - Correct directory link counting by dropping nlink only when the MFT record link count reaches zero for WIN32/DOS aliases
 - Fix an uninitialized variable in ntfs_mapping_pairs_decompress() by returning an error pointer directly

* tag 'ntfs-for-7.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/ntfs:
  ntfs: Use return instead of goto in ntfs_mapping_pairs_decompress()
  ntfs: drop nlink once for WIN32/DOS aliases
  ntfs: fix invalid PTR_ERR() usage in __ntfs_bitmap_set_bits_in_run()
  ntfs: fix error handling in ntfs_write_iomap_end_resident()
  ntfs: fix VCN overflow in ntfs_mapping_pairs_decompress()
  ntfs: fix WSL symlink target leak on reparse failure
  ntfs: fix NULL dereference in ntfs_index_walk_down()
2026-05-01ksmbd: validate inherited ACE SID lengthShota Zaizen-14/+52
smb_inherit_dacl() walks the parent directory DACL loaded from the security descriptor xattr. It verifies that each ACE contains the fixed SID header before using it, but does not verify that the variable-length SID described by sid.num_subauth is fully contained in the ACE. A malformed inheritable ACE can advertise more subauthorities than are present in the ACE. compare_sids() may then read past the ACE. smb_set_ace() also clamps the copied destination SID, but used the unchecked source SID count to compute the inherited ACE size. That could advance the temporary inherited ACE buffer pointer and nt_size accounting past the allocated buffer. Fix this by validating the parent ACE SID count and SID length before using the SID during inheritance. Compute the inherited ACE size from the copied SID so the size matches the bounded destination SID. Reject the inherited DACL if size accumulation would overflow smb_acl.size or the security descriptor allocation size. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Shota Zaizen <s@zaizen.me> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
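The SID containment check can be sketched as follows. The names are hypothetical, and two layout details are assumptions not stated in the commit message: an 8-byte fixed SID header followed by 4-byte subauthorities, and an upper bound of 15 subauthorities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_SID_BASE_SIZE   8u	/* assumed: revision, count, 6-byte authority */
#define DEMO_SID_MAX_SUBAUTH 15u	/* assumed upper bound */

/* Size in bytes of a SID that advertises 'num_subauth' subauthorities. */
size_t demo_sid_size(uint8_t num_subauth)
{
	assert(num_subauth <= DEMO_SID_MAX_SUBAUTH);
	return DEMO_SID_BASE_SIZE + 4u * (size_t)num_subauth;
}

/* True if the advertised SID is fully contained in 'avail' bytes of the ACE. */
bool demo_sid_fits(uint8_t num_subauth, size_t avail)
{
	if (num_subauth > DEMO_SID_MAX_SUBAUTH)
		return false;
	return avail >= demo_sid_size(num_subauth);
}
```

Checking the count before computing the size keeps a malformed ACE from influencing any later size accumulation.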
2026-05-01ksmbd: fix kernel-doc warnings from ksmbd_conn_get/put()Namjae Jeon-0/+4
The kernel test robot reported W=1 build warnings for ksmbd_conn_get() and ksmbd_conn_put() due to missing parameter descriptions. Add the @conn description to fix these warnings. Reported-by: kernel test robot <lkp@intel.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: fail share config requests when path allocation failsShuhao Fu-4/+8
Non-pipe shares must have a duplicated backing path before they can be published. share_config_request() currently calls kstrndup() for that path, but if the allocation fails it leaves ret unchanged. If veto list parsing succeeds and share->name exists, the partially built share is still inserted into the global share table with share->path left NULL. A later share-root SMB2 create uses tree_conn->share_conf->path as the lookup root. If the share was published with path == NULL, that request passes a NULL pathname into do_getname_kernel()/strlen() and can crash the ksmbd worker. Set ret = -ENOMEM when path duplication fails so the incomplete share is destroyed before publication. Fixes: e2f34481b24d ("cifsd: add server-side procedures for SMB3") Signed-off-by: Shuhao Fu <sfual@cse.ust.hk> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
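The missing error assignment can be shown with a small model in which the duplication step is injectable, so the failure path can be exercised. All demo_* names are hypothetical and the publication step is reduced to a boolean.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

struct demo_share {
	char *path;
	bool published;
};

typedef char *(*demo_dup_fn)(const char *s);

/* Duplicates a string with malloc(); returns NULL on allocation failure. */
char *demo_dup_ok(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	if (p)
		memcpy(p, s, n);
	return p;
}

/* Simulates an allocation failure. */
char *demo_dup_fail(const char *s)
{
	(void)s;
	return NULL;
}

/* Publishes the share only if every setup step succeeded. */
int demo_share_config(struct demo_share *share, const char *path, demo_dup_fn dup)
{
	int ret = 0;

	assert(share != NULL && path != NULL && dup != NULL);
	share->published = false;
	share->path = dup(path);
	if (!share->path)
		ret = -ENOMEM;	/* without this, the NULL path would be published */
	if (ret == 0)
		share->published = true;
	return ret;
}
```

With demo_dup_fail() the function returns -ENOMEM and the share is never marked as published, so no later lookup can see a NULL path.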
2026-05-01ksmbd: close durable scavenger races against m_fp_list lookupsDaeMyung Kang-26/+76
ksmbd_durable_scavenger() has two related races against any walker that iterates f_ci->m_fp_list, including ksmbd_lookup_fd_inode() (used by ksmbd_vfs_rename) and the share-mode checks in fs/smb/server/smb_common.c. (1) fp->node list-head reuse. Durable-preserved handles can remain linked on f_ci->m_fp_list after session teardown so share-mode checks still see them while the handle is reconnectable. The scavenger collected expired handles by adding fp->node to a local scavenger_list after removing them from the global durable idr. Because fp->node is the same list_head used by m_fp_list, list_add(&fp->node, &scavenger_list) overwrites the m_fp_list links and corrupts both lists. CONFIG_DEBUG_LIST can report this on the share-mode walk path. (2) Refcount race against m_fp_list walkers. The scavenger qualifies an expired durable handle with atomic_read(&fp->refcount) > 1 and fp->conn under global_ft.lock, removes fp from global_ft, then drops global_ft.lock before unlinking fp from m_fp_list and freeing it. During that gap fp is still linked on m_fp_list with f_state == FP_INITED. ksmbd_lookup_fd_inode() under m_lock read calls ksmbd_fp_get() (atomic_inc_not_zero on refcount that is still 1) and takes a live reference; the scavenger then unlinks and frees fp while the holder owns a reference, leading to UAF on the holder's subsequent ksmbd_fd_put() and on any field reads performed by a concurrent share-mode walker that iterates m_fp_list without taking ksmbd_fp_get() (smb_check_perm_dleases-like paths). Fix both: * Stop reusing fp->node as a scavenger-private list node. Remove one expired handle from global_ft under global_ft.lock, take an explicit transient reference, drop the lock, unlink fp->node from m_fp_list under f_ci->m_lock, then drop both the durable lifetime and transient references with atomic_sub_and_test(2, &fp->refcount). 
If the scavenger is the last putter the close runs there; otherwise an in-flight holder that already raced through the m_fp_list lookup owns the final close via its ksmbd_fd_put() path. The one-at-a-time disposal can rescan the durable idr when multiple handles expire in the same pass, but durable scavenging is a background expiration path and the final full scan recomputes min_timeout before the next wait. * Clear fp->persistent_id inside __ksmbd_remove_durable_fd() right after idr_remove(), so a delayed final close from a holder that snatched fp does not re-issue idr_remove() on a persistent id that idr_alloc_cyclic() in ksmbd_open_durable_fd() may have already handed out to a brand-new durable handle. * Bypass the per-conn open_files_count decrement in __put_fd_final() when fp is detached from any session table (fp->conn cleared by session_fd_check() at durable preserve -- paired with the volatile_id clear at unpublish, so checking fp->conn alone is sufficient). The walker that owns the final close runs from an unrelated work->conn whose stats.open_files_count never tracked this durable fp; without this guard the holder would underflow that unrelated counter. The two races are folded into one patch because patch (1) alone cleans up the corrupted list but leaves a deterministic UAF window for m_fp_list walkers that the transient-reference and persistent_id discipline in (2) close; bisecting onto an intermediate state would land on a UAF that pre-patch chaos merely made less reproducible. Validation: * CONFIG_DEBUG_LIST coverage for the list_head reuse path. * KASAN-enabled direct SMB2 durable-handle coverage that exercised ksmbd_durable_scavenger() and non-NULL ksmbd_lookup_fd_inode() returns while durable handles expired under concurrent rename lookups, with no KASAN, UAF, list-corruption, ODEBUG, or WARNING reports. 
* checkpatch --strict * make -j$(nproc) M=fs/smb/server Fixes: d484d621d40f ("ksmbd: add durable scavenger timer") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: harden file lifetime during session teardownDaeMyung Kang-25/+164
__close_file_table_ids() is the per-session teardown that closes every fp belonging to a session (or to one tree connect on that session) by walking the session's volatile-id idr. The current loop has three related problems on busy or racing workloads: * Sleeping under ft->lock. The session-teardown skip callback, session_fd_check(), already sleeps in ksmbd_vfs_copy_durable_owner() -> kstrdup(GFP_KERNEL) and down_write(&fp->f_ci->m_lock) (a rw_semaphore). Running the callback inside write_lock(&ft->lock) trips CONFIG_DEBUG_ATOMIC_SLEEP / CONFIG_PROVE_LOCKING on a durable-fd workload. * Refcount accounting blind to f_state. The unconditional atomic_dec_and_test(&fp->refcount) does not distinguish FP_INITED (idr-owned reference still intact) from FP_CLOSED (an earlier ksmbd_close_fd() already consumed the idr-owned reference while leaving fp in the idr because a holder kept refcount non-zero). When the latter races with teardown the same path over-decrements into a holder reference and ksmbd_fd_put() later UAFs that holder. * FP_NEW window. Between __open_id() publishing fp into the session idr and ksmbd_update_fstate(..., FP_INITED) committing the transition at the end of smb2_open(), an fp is in FP_NEW and an intervening teardown that takes a transient reference and unpublishes the volatile id leaves the original idr-owned reference orphaned -- the opener is unaware that fp has been unpublished, returns success to the client, and the fp leaks at refcount = 1. Refactor __close_file_table_ids() to take a transient reference on fp and unpublish fp from the session idr *under ft->lock* before calling skip() outside the lock. A transient ref protects lifetime but not concurrent field mutation, so the idr_remove() is what keeps __ksmbd_lookup_fd() through this session's idr from granting a new ksmbd_fp_get() reference to an fp whose fp->conn / fp->tcon / fp->volatile_id / op->conn / lock_list links are about to be rewritten by session_fd_check(). 
Durable reconnect is unaffected because it reaches fp through the global durable table (ksmbd_lookup_durable_fd -> global_ft). Decide n_to_drop together with any FP_INITED -> FP_CLOSED transition under ft->lock so teardown and ksmbd_close_fd() never both consume the idr-owned reference. See ksmbd_mark_fp_closed() for the per-state accounting. For the FP_NEW path to be safe, the opener has to learn that fp was unpublished: ksmbd_update_fstate() now returns -ENOENT when an FP_NEW -> FP_INITED transition finds f_state already advanced or the volatile id cleared (both committed by teardown under ft->lock); smb2_open() propagates that as STATUS_OBJECT_NAME_INVALID and drops the original reference via ksmbd_fd_put(). The list removal cannot be left for a deferred final putter because fp->volatile_id has already been cleared and __ksmbd_remove_fd() will intentionally skip both idr_remove() and list_del_init(). Move the m_fp_list unlink in __ksmbd_remove_fd() above the volatile-id check so that an FP_NEW fp that happened to be added to m_fp_list (smb2_open() adds fp->node before ksmbd_update_fstate() runs) is still cleaned up on the deferred putter path; list_del_init() on an empty node is a no-op and remains safe for fps that were never added. Add a defensive guard in session_fd_check() that refuses non-FP_INITED fps so that even if a teardown reaches an FP_NEW fp it falls into the close branch (where the n_to_drop = 1 accounting keeps the opener's reference alive) instead of the durable-preserve branch (which mutates fp->conn / fp->tcon). Validation on a debug kernel additionally built with CONFIG_DEBUG_LIST and CONFIG_DEBUG_OBJECTS_WORK used a same-session two-tcon workload (open/write storm on one tcon, 50 tree disconnects on the other) and reported no list-corruption, work_struct ODEBUG, sleep-in-atomic, lockdep or kmemleak reports. 
Reverting only the __close_file_table_ids() hunk while keeping a forced-is_reconnectable() harness produced the expected sleep-in-atomic at vfs_cache.c:1095, confirming the ft->lock-out-of-sleepable-skip discipline. KASAN-enabled direct SMB2 coverage with durable handles enabled exercised ksmbd_close_tree_conn_fds(), ksmbd_close_session_fds(), the FP_NEW failure path, tree_conn_fd_check(), and a non-zero session_fd_check() durable-preserve return. This produced no KASAN, DEBUG_LIST, ODEBUG, or WARNING reports. Fixes: f44158485826 ("cifsd: add file operations") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01ksmbd: centralize ksmbd_conn final release to plug transport leakDaeMyung Kang-30/+156
ksmbd_conn_free() is one of four sites that can observe the last refcount drop of a struct ksmbd_conn. The other three

    fs/smb/server/connection.c  ksmbd_conn_r_count_dec()
    fs/smb/server/oplock.c      __free_opinfo()
    fs/smb/server/vfs_cache.c   session_fd_check()

end the conn with a bare kfree(), skipping ida_destroy(&conn->async_ida) and conn->transport->ops->free_transport(conn->transport). Whenever one of them is the last putter, the embedded async_ida and the entire transport struct leak -- for TCP, that is also the struct socket and the kvec iov.

__free_opinfo() being a final putter is not theoretical. opinfo_put() queues the callback via call_rcu(&opinfo->rcu, free_opinfo_rcu), so ksmbd_server_terminate_conn() can deposit N opinfo releases in RCU and have ksmbd_conn_free() run in the handler thread before any of them fire. ksmbd_conn_free() then observes refcnt > 0 and short-circuits; the last RCU-delivered __free_opinfo() falls onto its bare kfree(conn) branch and the transport is lost.

A/B validation in a QEMU/virtme guest, mounting //127.0.0.1/testshare: each iteration holds 8 files open via sleep processes, force-closes TCP with "ss -K sport = :445", kills the holders, lazy-umounts; repeated 10 times, then ksmbd shutdown and kmemleak scan.

    state       conn_alloc  conn_free  tcp_free  opi_rcu  kmemleak
    ----------  ----------  ---------  --------  -------  --------
    pre-patch           20         20        10      160         7
    with patch          20         20        20      160         0

Pre-patch conn_free=20 with tcp_free=10 directly demonstrates the bare-kfree paths skipping transport cleanup; kmemleak backtraces point into struct tcp_transport / iov. With this patch tcp_free matches conn_free at 20/20 and kmemleak is clean.

Move the per-struct final release into __ksmbd_conn_release_work() and route the three bare-kfree final-put sites through a new ksmbd_conn_put(). Those sites now pair ida_destroy() and free_transport() with kfree(conn) regardless of which holder happens to release the last reference.
stop_sessions() only triggers the transport shutdown and does not itself drop the last conn reference, so it is unaffected. The centralized release reaches sock_release() -> tcp_close() -> lock_sock_nested() (might_sleep) from every final putter, including __free_opinfo() invoked from an RCU softirq callback, which trips CONFIG_DEBUG_ATOMIC_SLEEP. Defer the release to a dedicated ksmbd_conn_wq workqueue so ksmbd_conn_put() is safe from any non-sleeping context. Make ksmbd_file own a strong connection reference while fp->conn is non-NULL so durable-preserve and final-close paths cannot dereference a stale connection. ksmbd_open_fd() and ksmbd_reopen_durable_fd() take the reference via ksmbd_conn_get() (the latter also reorders the fp->conn / fp->tcon assignments before __open_id() so the published fp is never observed with fp->conn == NULL); session_fd_check() and __ksmbd_close_fd() drop it via ksmbd_conn_put(). With that invariant, session_fd_check() can take a local conn pointer once and use it across the m_op_list and lock_list iterations even though op->conn puts may otherwise drop the last reference. At module exit the workqueue is flushed and destroyed after rcu_barrier(), so any release queued by a trailing RCU callback is drained before the inode hash and module text go away. Fixes: ee426bfb9d09 ("ksmbd: add refcnt to ksmbd_conn struct") Signed-off-by: DaeMyung Kang <charsyam@gmail.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
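The idea of routing every final put through one release function can be sketched with C11 atomics. The demo_* names are hypothetical, the transport teardown is reduced to a counter, and the workqueue deferral described above is deliberately left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct demo_conn {
	atomic_int refcnt;
	int *transport_frees;	/* bumped once when the transport is torn down */
};

/* Allocates a connection holding one initial reference. */
struct demo_conn *demo_conn_alloc(int *transport_frees)
{
	struct demo_conn *conn = malloc(sizeof(*conn));

	if (!conn)
		return NULL;
	atomic_init(&conn->refcnt, 1);
	conn->transport_frees = transport_frees;
	return conn;
}

void demo_conn_get(struct demo_conn *conn)
{
	assert(conn != NULL);
	atomic_fetch_add(&conn->refcnt, 1);
}

/* The single place where the full teardown happens. */
static void demo_conn_release(struct demo_conn *conn)
{
	(*conn->transport_frees)++;	/* stands in for the transport cleanup */
	free(conn);
}

/* Every holder drops its reference here; whoever is last runs the release. */
void demo_conn_put(struct demo_conn *conn)
{
	assert(conn != NULL);
	if (atomic_fetch_sub(&conn->refcnt, 1) == 1)
		demo_conn_release(conn);
}
```

Because no caller frees the structure directly, the transport counter advances exactly once per connection no matter which holder drops the last reference.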
2026-05-01smb: smbdirect: fix MR registration for coalesced SG listsYi Kuo-9/+12
ib_dma_map_sg() modifies the provided scatterlist and returns the number of mapped entries, which can be fewer than the requested mr->sgt.nents if the DMA controller coalesces contiguous memory segments. Passing the original, uncoalesced count to ib_map_mr_sg() causes memory registration failures if coalescing actually occurs. Capture the actual mapped count returned by ib_dma_map_sg() and pass it to ib_map_mr_sg() to ensure correct MR registration. Also update the ib_dma_map_sg() error logging to drop the error pointer formatting, since the return value is an integer count rather than an error code. Ensure a proper error code (-EIO) is assigned when DMA mapping or MR registration fails. Fixes: de5ef8ec3c46 ("smb: smbdirect: introduce smbdirect_mr.c with client mr code") Closes: https://bugzilla.kernel.org/show_bug.cgi?id=221408 Reviewed-by: Stefan Metzmacher <metze@samba.org> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Yi Kuo <yi@yikuo.dev> Signed-off-by: Steve French <stfrench@microsoft.com>
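Why the returned count matters can be modelled with a toy coalescing step. demo_map_coalesce() is a hypothetical stand-in for a mapping call that merges contiguous segments in place; it is not the RDMA API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_seg {
	uint64_t addr;
	uint32_t len;
};

/*
 * Merges address-contiguous neighbours in place and returns how many
 * entries remain valid. Entries past the returned count are stale.
 */
size_t demo_map_coalesce(struct demo_seg *segs, size_t n)
{
	size_t out = 0;

	assert(segs != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (out > 0 && segs[out - 1].addr + segs[out - 1].len == segs[i].addr)
			segs[out - 1].len += segs[i].len;
		else
			segs[out++] = segs[i];
	}
	return out;
}

/* Total bytes covered by the first 'count' entries. */
uint64_t demo_total_len(const struct demo_seg *segs, size_t count)
{
	uint64_t total = 0;

	assert(segs != NULL || count == 0);
	for (size_t i = 0; i < count; i++)
		total += segs[i].len;
	return total;
}
```

After coalescing, consuming the original entry count walks into stale entries and covers the wrong number of bytes; consuming the returned count covers exactly the mapped range.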
2026-05-01smb: smbdirect: introduce and use include/linux/smbdirect.hStefan Metzmacher-204/+3
This makes it easier to rebuild cifs.ko and ksmbd.ko against a running kernel. Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/ Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01smb: smbdirect: make use of DEFAULT_SYMBOL_NAMESPACE and EXPORT_SYMBOL_GPLStefan Metzmacher-30/+33
This is a better solution than EXPORT_SYMBOL_FOR_MODULES(__sym, "cifs,ksmbd") as it makes it possible to rebuild smbdirect.ko against a running kernel and then load the existing cifs.ko and ksmbd.ko from the running kernel. Suggested-by: Christoph Hellwig <hch@infradead.org> Link: https://lore.kernel.org/linux-cifs/aehrPuY60VMcYGU8@infradead.org/ Cc: Steve French <smfrench@gmail.com> Cc: Tom Talpey <tom@talpey.com> Cc: Long Li <longli@microsoft.com> Cc: Namjae Jeon <linkinjeon@kernel.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: linux-cifs@vger.kernel.org Cc: samba-technical@lists.samba.org Signed-off-by: Stefan Metzmacher <metze@samba.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2026-05-01Merge tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds-9/+46
Pull smb server fixes from Steve French:

 - Fix shutdown (stop sessions)
 - Fix readdir unsupported info level

* tag 'v7.1-rc2-ksmbd-server-fixes' of git://git.samba.org/ksmbd:
  ksmbd: rewrite stop_sessions() with restartable iteration
  smb: server: handle readdir_info_level_struct_sz() error
2026-04-30Merge tag 'v7.1-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds-33/+12
Pull smb client fixes from Steve French:

 - multichannel crediting fix
 - memory allocation improvement for smb2_compound_op
 - remove some dead code

* tag 'v7.1-rc2-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: change_conf needs to be called for session setup
  smb: client: change allocation requirements in smb2_compound_op
  smb/client: remove unused smb3_parse_opt()