path: root/fs/xfs
Age | Commit message | Author | Lines
2026-02-25 | xfs: add static size checks for ioctl UABI | Wilfred Mallawa | -5/+34
The ioctl structures in libxfs/xfs_fs.h are missing static size checks. It is useful to have static size checks for these structures as adding new fields to them could cause issues (e.g. extra padding that may be inserted by the compiler). So add these checks to xfs/xfs_ondisk.h. Due to different padding/alignment requirements across different architectures, to avoid build failures, some structures are omitted from the size checks. For example, structures with "compat_" definitions in xfs/xfs_ioctl32.h are omitted. Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
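The idea of a static size check can be sketched in plain C11 as follows. This is only an illustration: `struct demo_ioc_args` and `DEMO_CHECK_STRUCT_SIZE` are hypothetical names invented for this example, not the structures or the macro touched by the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument structure; not a real XFS UABI struct. */
struct demo_ioc_args {
	uint64_t	start;	/* 8 bytes */
	uint32_t	flags;	/* 4 bytes */
	uint32_t	pad;	/* explicit padding, so no hidden tail padding */
};

/* Hypothetical macro: break the build if the size drifts. */
#define DEMO_CHECK_STRUCT_SIZE(type, size) \
	static_assert(sizeof(type) == (size), "unexpected size of " #type)

/* Adding a field without updating this number now fails at compile time. */
DEMO_CHECK_STRUCT_SIZE(struct demo_ioc_args, 16);
```

Because the check runs at compile time, a layout change that silently alters the userspace ABI shows up as a build failure rather than as a runtime mismatch.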
2026-02-25 | xfs: remove duplicate static size checks | Wilfred Mallawa | -9/+0
In libxfs/xfs_ondisk.h, remove some duplicate entries of XFS_CHECK_STRUCT_SIZE(). Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Add comments for usages of some macros. | Nirjhar Roy (IBM) | -0/+9
Add comments explaining when to use XFS_IS_CORRUPT() and ASSERT() Suggested-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Update lazy counters in xfs_growfs_rt_bmblock() | Nirjhar Roy (IBM) | -0/+9
Update lazy counters in xfs_growfs_rt_bmblock() similar to the way it is done in xfs_growfs_data_private(). This is because the lazy counters are not always updated, and syncing the counters will avoid inconsistencies between frextents and rtextents (the total realtime extent count). This will be more useful once realtime shrink is implemented, as it will prevent a transient state from occurring where frextents might be greater than the total rtextents. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Add a comment in xfs_log_sb() | Nirjhar Roy (IBM) | -0/+3
Add a comment explaining why the sb_frextents are updated outside the if (xfs_has_lazycount(mp)) check even though it is a lazy counter. RT groups are supported only in v5 filesystems, which always have lazy counters enabled, so putting it inside the if (xfs_has_lazycount(mp)) check is redundant. Suggested-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Fix xfs_last_rt_bmblock() | Nirjhar Roy (IBM) | -6/+24
Bug description: If the size of the last rtgroup, i.e. the rtg passed to xfs_last_rt_bmblock(), is such that the last rtextent falls in the 0th word offset of a bmblock of the bitmap file tracking this (last) rtgroup, then xfs_last_rt_bmblock() incorrectly returns the next bmblock number instead of the current/last used bmblock number. When xfs_last_rt_bmblock() incorrectly returns the next bmblock, the loop to grow/modify the bmblocks in xfs_growfs_rtg() doesn't execute and xfs_growfs basically does a nop in certain cases. xfs_growfs will do a nop when the new size of the fs will have the same number of rtgroups, i.e. we are only growing the last rtgroup.
Reproduce:
    $ mkfs.xfs -m metadir=0 -r rtdev=/dev/loop1 /dev/loop0 \
          -r size=32769b -f
    $ mount -o rtdev=/dev/loop1 /dev/loop0 /mnt/scratch
    $ xfs_growfs -R $(( 32769 + 1 )) /mnt/scratch
    $ xfs_info /mnt/scratch | grep rtextents
    $ # We can see that rtextents hasn't changed
Fix: Fix this by returning the current/last used bmblock when the last rtgroup size is not a multiple of xfs_rtbitmap_rtx_per_rbmblock(), and the next bmblock when the rtgroup size is a multiple of xfs_rtbitmap_rtx_per_rbmblock(), i.e. the existing blocks are completely used up. Also, I have renamed xfs_last_rt_bmblock() to xfs_last_rt_bmblock_to_extend() to signify that this function returns the bmblock number to extend and NOT always the last used bmblock number.
Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
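The rule stated in the fix can be modelled with a few lines of standalone C. This is a sketch of the arithmetic only, not the kernel function: `demo_bmblocks_used()` and `demo_bmblock_to_extend()` are hypothetical helpers, and the value 32768 extents per bitmap block used in the usage note is an assumption chosen to match the reproducer's numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bitmap blocks needed to track nrext extents. */
static uint64_t demo_bmblocks_used(uint64_t nrext, uint64_t rtx_per_block)
{
	return (nrext + rtx_per_block - 1) / rtx_per_block;
}

/* Hypothetical model of the rule described in the commit message. */
static uint64_t demo_bmblock_to_extend(uint64_t nrext, uint64_t rtx_per_block)
{
	uint64_t used = demo_bmblocks_used(nrext, rtx_per_block);

	if (nrext % rtx_per_block != 0)
		return used - 1;	/* last block is partly full: extend it */
	return used;			/* every block is full: start at the next one */
}
```

With 32769 extents and an assumed 32768 extents per block, two blocks are in use and the sketch returns block 1, the partly filled one, so a grow loop starting there has work to do. With exactly 32768 extents it also returns 1, which in that case is the first unused block.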
2026-02-25 | xfs: don't report half-built inodes to fserror | Darrick J. Wong | -3/+14
Sam Sun apparently found a syzbot way to fuzz a filesystem such that xfs_iget_cache_miss would free the inode before the fserror code could catch up. Frustratingly he doesn't use the syzbot dashboard so there's no C reproducer and not even a full error report, so I'm guessing that: Inodes that are being constructed or torn down inside XFS are not visible to the VFS. They should never be reported to fserror. Also, any inode that has been freshly allocated in _cache_miss should be marked INEW immediately because, well, it's an incompletely constructed inode that isn't yet visible to the VFS. Reported-by: Sam Sun <samsun1006219@gmail.com> Fixes: 5eb4cb18e445d0 ("xfs: convey metadata health events to the health monitor") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: don't report metadata inodes to fserror | Darrick J. Wong | -2/+14
Internal metadata inodes are not exposed to userspace programs, so it makes no sense to pass them to the fserror functions (aka fsnotify). Instead, report metadata file problems as general filesystem corruption. Fixes: 5eb4cb18e445d0 ("xfs: convey metadata health events to the health monitor") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: fix potential pointer access race in xfs_healthmon_get | Darrick J. Wong | -5/+8
Pankaj Raghav asks about this code in xfs_healthmon_get: hm = mp->m_healthmon; if (hm && !refcount_inc_not_zero(&hm->ref)) hm = NULL; rcu_read_unlock(); return hm; (slightly edited to compress a mailing list thread) "Nit: Should we do a READ_ONCE(mp->m_healthmon) here to avoid any compiler tricks that can result in an undefined behaviour? I am not sure if I am being paranoid here. "So this is my understanding: RCU guarantees that we get a valid object (actual data of m_healthmon) but does not guarantee the compiler will not reread the pointer between checking if hm is !NULL and accessing the pointer as we are doing it lockless. "So just a barrier() call in rcu_read_lock is enough to make sure this doesn't happen and probably adding a READ_ONCE() is not needed?" After some initial confusion I concluded that he's correct. The compiler could very well eliminate the hm variable in favor of walking the pointers twice, turning the code into: if (mp->m_healthmon && !refcount_inc_not_zero(&mp->m_healthmon->ref)) If this happens, then xfs_healthmon_detach can sneak in between the two sides of the && expression and set mp->m_healthmon to NULL, and thereby cause a null pointer dereference crash. Fix this by using the rcu pointer assignment and dereference functions, which ensure that the proper reordering barriers are in place. Practically speaking, gcc seems to allocate an actual variable for hm and only reads mp->m_healthmon once (as intended), but we ought to be more explicit about requiring this. Reported-by: Pankaj Raghav <pankaj.raghav@linux.dev> Fixes: a48373e7d35a89f6f ("xfs: start creating infrastructure for health monitoring") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Pankaj Raghav <p.raghav@samsung.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
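The hazard described above (the compiler re-reading the shared pointer between the NULL check and the dereference) can be illustrated with a userspace analogue built on C11 atomics. This is a sketch under stated assumptions, not the kernel fix: the patch uses the RCU pointer assignment and dereference helpers, whereas everything named `demo_*` here is hypothetical and the sketch does not model RCU read-side sections or object lifetime.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct demo_mon { atomic_int ref; };
struct demo_mount { _Atomic(struct demo_mon *) mon; };

/* Take a reference unless the count has already dropped to zero. */
static int demo_ref_inc_not_zero(atomic_int *ref)
{
	int old = atomic_load(ref);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(ref, &old, old + 1))
			return 1;
	}
	return 0;
}

static struct demo_mon *demo_mon_get(struct demo_mount *mp)
{
	/* Exactly one load of the shared pointer; only the local copy is used after. */
	struct demo_mon *hm = atomic_load_explicit(&mp->mon, memory_order_acquire);

	if (hm && !demo_ref_inc_not_zero(&hm->ref))
		hm = NULL;
	return hm;
}
```

Because the load is an explicit atomic operation, the compiler may not replace later uses of `hm` with fresh reads of `mp->mon`, so a concurrent writer storing NULL cannot slip in between the check and the dereference.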
2026-02-25 | xfs: fix xfs_group release bug in xfs_dax_notify_dev_failure | Darrick J. Wong | -2/+2
Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Cc: <stable@vger.kernel.org> # v6.0 Fixes: 6f643c57d57c56 ("xfs: implement ->notify_failure() for XFS") Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: fix xfs_group release bug in xfs_verify_report_losses | Darrick J. Wong | -2/+2
Chris Mason reports that his AI tools noticed that we were using xfs_perag_put and xfs_group_put to release the group reference returned by xfs_group_next_range. However, the iterator function returns an object with an active refcount, which means that we must use the correct function to release the active refcount, which is _rele. Fixes: b8accfd65d31f2 ("xfs: add media verification ioctl") Reported-by: Chris Mason <clm@meta.com> Link: https://lore.kernel.org/linux-xfs/20260206030527.2506821-1-clm@meta.com/ Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: fix copy-paste error in previous fix | Darrick J. Wong | -1/+1
Chris Mason noticed that there is a copy-paste error in a recent change to xrep_dir_teardown that nulls out pointers after freeing the resources. Fixes: ba408d299a3bb3c ("xfs: only call xf{array,blob}_destroy if we have a valid pointer") Link: https://lore.kernel.org/linux-xfs/20260205194211.2307232-1-clm@meta.com/ Reported-by: Chris Mason <clm@meta.com> Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Fix error pointer dereference | Ethan Tidmore | -1/+6
The function try_lookup_noperm() can return an error pointer and is not checked for one. Add checks for error pointer in xrep_adoption_check_dcache() and xrep_adoption_zap_dcache(). Detected by Smatch: fs/xfs/scrub/orphanage.c:449 xrep_adoption_check_dcache() error: 'd_child' dereferencing possible ERR_PTR() fs/xfs/scrub/orphanage.c:485 xrep_adoption_zap_dcache() error: 'd_child' dereferencing possible ERR_PTR() Fixes: 73597e3e42b4 ("xfs: ensure dentry consistency when the orphanage adopts a file") Cc: stable@vger.kernel.org # v6.16 Signed-off-by: Ethan Tidmore <ethantidmore06@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
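The error-pointer convention behind this fix can be sketched in userspace C. The `demo_*` helpers below imitate the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idea but are hypothetical re-implementations, `demo_lookup()` is an invented stand-in for a lookup with three possible outcomes, and the `-EEXIST` result is an arbitrary choice for illustration rather than what the patched functions return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno in the top, never-valid range of pointer values. */
static inline void *demo_err_ptr(long err) { return (void *)(intptr_t)err; }
static inline long demo_ptr_err(const void *p) { return (long)(intptr_t)p; }
static inline int demo_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-DEMO_MAX_ERRNO;
}

struct demo_dentry { int positive; };

/* Hypothetical lookup: error pointer on failure, NULL if absent, else the entry. */
static struct demo_dentry *demo_lookup(struct demo_dentry *cached, int fail)
{
	if (fail)
		return demo_err_ptr(-ENOMEM);
	return cached;
}

static int demo_check_dcache(struct demo_dentry *cached, int fail)
{
	struct demo_dentry *d = demo_lookup(cached, fail);

	if (demo_is_err(d))		/* must come before any dereference */
		return (int)demo_ptr_err(d);
	if (!d)
		return 0;		/* nothing cached */
	return d->positive ? -EEXIST : 0;
}
```

A plain NULL test is not enough here: an error pointer is non-NULL, so without the `demo_is_err()` check the code would go on to dereference an invalid address.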
2026-02-25 | xfs: remove metafile inodes from the active inode stat | Christoph Hellwig | -5/+23
The active inode (or active vnode until recently) stat can get much larger than expected on file systems with a lot of metafile inodes like zoned file systems on SMR hard disks with 10.000s of rtg rmap inodes. Remove all metafile inodes from the active counter to make it more useful to track actual workloads and add a separate counter for active metafile inodes. This fixes xfs/177 on SMR hard drives. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: cleanup inode counter stats | Christoph Hellwig | -18/+18
Most of them are unused, so mark them as such. Give the remaining ones names that match their use instead of the historic IRIX ones based on vnodes. Note that the names are purely internal to the XFS code, the user interface is based on section names and arrays of counters. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: fix code alignment issues in xfs_ondisk.c | Wilfred Mallawa | -2/+2
Fixup some code alignment issues in xfs_ondisk.c Signed-off-by: Wilfred Mallawa <wilfred.mallawa@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Replace &rtg->rtg_group with rtg_group() | Nirjhar Roy (IBM) | -8/+8
Use the already existing rtg_group() wrapper instead of directly accessing the struct xfs_group member in struct xfs_rtgroup. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> [cem: Conflict resolution against 06873dbd940d] Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Refactoring the nagcount and delta calculation | Nirjhar Roy (IBM) | -15/+33
Introduce xfs_growfs_compute_delta() to calculate the nagcount and delta blocks and refactor the code from xfs_growfs_data_private(). No functional changes. Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-02-25 | xfs: Replace ASSERT with XFS_IS_CORRUPT in xfs_rtcopy_summary() | Nirjhar Roy (IBM) | -1/+4
Replace ASSERT(sum > 0) with an XFS_IS_CORRUPT() and place it just after the call to xfs_rtget_summary() so that we don't end up using an illegal value of sum. Signed-off-by: Nirjhar Roy (IBM) <nirjhar.roy.lists@gmail.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
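The difference between a debug-only assertion and a runtime corruption check can be sketched as follows. All names are hypothetical, the error value is illustrative, and the condition `sum <= 0` is simply the negation of the quoted assertion; the exact condition used by the patch is not shown in this log.

```c
#include <assert.h>

/* Illustrative error code standing in for the kernel's EFSCORRUPTED. */
#define DEMO_EFSCORRUPTED 117

/* Hypothetical stand-in for a check that stays active in production builds. */
static int demo_is_corrupt(int cond)
{
	return cond != 0;
}

/* Validate the value immediately after fetching it, before any use. */
static int demo_copy_summary(int sum, int *dst)
{
	if (demo_is_corrupt(sum <= 0))
		return -DEMO_EFSCORRUPTED;	/* report instead of continuing */
	*dst = sum;
	return 0;
}
```

An assertion that is compiled out in non-debug builds would let the bad value flow into later code, while the runtime check turns it into an error return the caller can handle.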
2026-02-21 | Convert more 'alloc_obj' cases to default GFP_KERNEL arguments | Linus Torvalds | -4/+2
This converts some of the visually simpler cases that have been split over multiple lines. I only did the ones that are easy to verify the resulting diff by having just that final GFP_KERNEL argument on the next line. Somebody should probably do a proper coccinelle script for this, but for me the trivial script actually resulted in an assertion failure in the middle of the script. I probably had made it a bit _too_ trivial. So after fighting that for a while I decided to just do some of the syntactically simpler cases with variations of the previous 'sed' scripts. The more syntactically complex multi-line cases would mostly really want whitespace cleanup anyway. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 | Convert 'alloc_flex' family to use the new default GFP_KERNEL argument | Linus Torvalds | -1/+1
This is the exact same thing as the 'alloc_obj()' version, only much smaller because there are a lot fewer users of the *alloc_flex() interface. As with alloc_obj() version, this was done entirely with mindless brute force, using the same script, except using 'flex' in the pattern rather than 'objs*'. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 | Convert 'alloc_obj' family to use the new default GFP_KERNEL argument | Linus Torvalds | -15/+15
This was done entirely with mindless brute force, using git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' | xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/' to convert the new alloc_obj() users that had a simple GFP_KERNEL argument to just drop that argument. Note that due to the extreme simplicity of the scripting, any slightly more complex cases spread over multiple lines would not be triggered: they definitely exist, but this covers the vast bulk of the cases, and the resulting diff is also then easier to check automatically. For the same reason the 'flex' versions will be done as a separate conversion. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2026-02-21 | treewide: Replace kmalloc with kmalloc_obj for non-scalar types | Kees Cook | -115/+106
This is the result of running the Coccinelle script from scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to avoid scalar types (which need careful case-by-case checking), and instead replace kmalloc-family calls that allocate struct or union object instances: Single allocations: kmalloc(sizeof(TYPE), ...) are replaced with: kmalloc_obj(TYPE, ...) Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...) are replaced with: kmalloc_objs(TYPE, COUNT, ...) Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...) are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...) (where TYPE may also be *VAR) The resulting allocations no longer return "void *", instead returning "TYPE *". Signed-off-by: Kees Cook <kees@kernel.org>
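The benefit of a typed allocation helper (the result is "TYPE *" rather than "void *") can be shown with a userspace sketch. The `demo_alloc_obj()` and `demo_alloc_objs()` macros are hypothetical and are not the kernel's kmalloc_obj() family: they are built on calloc(), so unlike kmalloc() they zero the memory, they take no GFP flags, and they accept only a type name, not the "*VAR" form mentioned above.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_item {
	int	id;
	long	weight;
};

/* Hypothetical typed wrappers: the size is derived from the type, and the
 * result is already a TYPE pointer, so assigning it to a pointer of the
 * wrong type draws a compiler diagnostic. */
#define demo_alloc_obj(TYPE)		((TYPE *)calloc(1, sizeof(TYPE)))
#define demo_alloc_objs(TYPE, COUNT)	((TYPE *)calloc((COUNT), sizeof(TYPE)))
```

A caller then writes `struct demo_item *it = demo_alloc_obj(struct demo_item);` and never spells out `sizeof` by hand, which removes one place where the type and the size can drift apart.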
2026-02-18 | Merge tag 'mm-stable-2026-02-18-19-48' of ↵ | Linus Torvalds | -4/+5
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull more MM updates from Andrew Morton: - "mm/vmscan: fix demotion targets checks in reclaim/demotion" fixes a couple of issues in the demotion code - pages were failed demotion and were finding themselves demoted into disallowed nodes (Bing Jiao) - "Remove XA_ZERO from error recovery of dup_mmap()" fixes a rare mapledtree race and performs a number of cleanups (Liam Howlett) - "mm: add bitmap VMA flag helpers and convert all mmap_prepare to use them" implements a lot of cleanups following on from the conversion of the VMA flags into a bitmap (Lorenzo Stoakes) - "support batch checking of references and unmapping for large folios" implements batching to greatly improve the performance of reclaiming clean file-backed large folios (Baolin Wang) - "selftests/mm: add memory failure selftests" does as claimed (Miaohe Lin) * tag 'mm-stable-2026-02-18-19-48' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (36 commits) mm/page_alloc: clear page->private in free_pages_prepare() selftests/mm: add memory failure dirty pagecache test selftests/mm: add memory failure clean pagecache test selftests/mm: add memory failure anonymous page test mm: rmap: support batched unmapping for file large folios arm64: mm: implement the architecture-specific clear_flush_young_ptes() arm64: mm: support batch clearing of the young flag for large folios arm64: mm: factor out the address and ptep alignment into a new helper mm: rmap: support batched checks of the references for large folios tools/testing/vma: add VMA userland tests for VMA flag functions tools/testing/vma: separate out vma_internal.h into logical headers tools/testing/vma: separate VMA userland tests into separate files mm: make vm_area_desc utilise vma_flags_t only mm: update all remaining mmap_prepare users to use vma_flags_t mm: update shmem_[kernel]_file_*() functions to use vma_flags_t mm: update secretmem to use VMA flags on mmap_prepare mm: update hugetlbfs to 
use VMA flags on mmap_prepare mm: add basic VMA flag operation helper functions tools: bitmap: add missing bitmap_[subset(), andnot()] mm: add mk_vma_flags() bitmap flag macro helper ...
2026-02-12 | mm: update all remaining mmap_prepare users to use vma_flags_t | Lorenzo Stoakes | -2/+2
We will shortly be removing the vm_flags_t field from vm_area_desc, so we need to update all mmap_prepare users to only use the desc->vma_flags field. This patch achieves that and makes all ancillary changes required to make this possible. This lays the groundwork for future work to eliminate the use of vm_flags_t in vm_area_desc altogether and more broadly throughout the kernel. While we're here, we take the opportunity to replace VM_REMAP_FLAGS with VMA_REMAP_FLAGS, the vma_flags_t equivalent. No functional changes intended. Link: https://lkml.kernel.org/r/fb1f55323799f09fe6a36865b31550c9ec67c225.1769097829.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Damien Le Moal <dlemoal@kernel.org> [zonefs] Acked-by: "Darrick J. Wong" <djwong@kernel.org> Acked-by: Pedro Falcato <pfalcato@suse.de> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zi Yan <ziy@nvidia.com> Cc: Jarkko Sakkinen <jarkko@kernel.org> Cc: Yury Norov <ynorov@nvidia.com> Cc: Chris Mason <clm@fb.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-12 | mm: update shmem_[kernel]_file_*() functions to use vma_flags_t | Lorenzo Stoakes | -2/+3
In order to be able to use only vma_flags_t in vm_area_desc we must adjust shmem file setup functions to operate in terms of vma_flags_t rather than vm_flags_t. This patch makes this change and updates all callers to use the new functions. No functional changes intended. [akpm@linux-foundation.org: comment fixes, per Baolin] Link: https://lkml.kernel.org/r/736febd280eb484d79cef5cf55b8a6f79ad832d2.1769097829.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com> Cc: Barry Song <baohua@kernel.org> Cc: David Hildenbrand <david@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Zi Yan <ziy@nvidia.com> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Damien Le Moal <dlemoal@kernel.org> Cc: Yury Norov <ynorov@nvidia.com> Cc: Chris Mason <clm@fb.com> Cc: Pedro Falcato <pfalcato@suse.de> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2026-02-12 | Merge tag 'mm-stable-2026-02-11-19-22' of ↵ | Linus Torvalds | -9/+0
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - "powerpc/64s: do not re-activate batched TLB flush" makes arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev) It adds a generic enter/leave layer and switches architectures to use it. Various hacks were removed in the process. - "zram: introduce compressed data writeback" implements data compression for zram writeback (Richard Chang and Sergey Senozhatsky) - "mm: folio_zero_user: clear page ranges" adds clearing of contiguous page ranges for hugepages. Large improvements during demand faulting are demonstrated (David Hildenbrand) - "memcg cleanups" tidies up some memcg code (Chen Ridong) - "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos stats" improves DAMOS stat's provided information, deterministic control, and readability (SeongJae Park) - "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few issues in the hugetlb cgroup charging selftests (Li Wang) - "Fix va_high_addr_switch.sh test failure - again" addresses several issues in the va_high_addr_switch test (Chunyu Hu) - "mm/damon/tests/core-kunit: extend existing test scenarios" improves the KUnit test coverage for DAMON (Shu Anzai) - "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to transiently return -EAGAIN (Shivank Garg) - "arch, mm: consolidate hugetlb early reservation" reworks and consolidates a pile of straggly code related to reservation of hugetlb memory from bootmem and creation of CMA areas for hugetlb (Mike Rapoport) - "mm: clean up anon_vma implementation" cleans up the anon_vma implementation in various ways (Lorenzo Stoakes) - "tweaks for __alloc_pages_slowpath()" does a little streamlining of the page allocator's slowpath code (Vlastimil Babka) - "memcg: separate private and public ID namespaces" cleans up the memcg ID code and prevents the internal-only private IDs from 
being exposed to userspace (Shakeel Butt) - "mm: hugetlb: allocate frozen gigantic folio" cleans up the allocation of frozen folios and avoids some atomic refcount operations (Kefeng Wang) - "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement of memory between the active and inactive LRUs and adds auto-tuning of the ratio-based quotas and of monitoring intervals (SeongJae Park) - "Support page table check on PowerPC" makes CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan) - "nodemask: align nodes_and{,not} with underlying bitmap ops" makes nodes_and() and nodes_andnot() propagate the return values from the underlying bit operations, enabling some cleanup in calling code (Yury Norov) - "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up some DAMON internal interfaces (SeongJae Park) - "mm/khugepaged: cleanups and scan limit fix" does some cleanup work in khugepaged and fixes a scan limit accounting issue (Shivank Garg) - "mm: balloon infrastructure cleanups" goes to town on the balloon infrastructure and its page migration function. 
Mainly cleanups, also some locking simplification (David Hildenbrand) - "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds additional tracepoints to the page reclaim code (Jiayuan Chen) - "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is part of Marco's kernel-wide migration from the legacy workqueue APIs over to the preferred unbound workqueues (Marco Crivellari) - "Various mm kselftests improvements/fixes" provides various unrelated improvements/fixes for the mm kselftests (Kevin Brodsky) - "mm: accelerate gigantic folio allocation" greatly speeds up gigantic folio allocation, mainly by avoiding unnecessary work in pfn_range_valid_contig() (Kefeng Wang) - "selftests/damon: improve leak detection and wss estimation reliability" improves the reliability of two of the DAMON selftests (SeongJae Park) - "mm/damon: cleanup kdamond, damon_call(), damos filter and DAMON_MIN_REGION" does some cleanup work in the core DAMON code (SeongJae Park) - "Docs/mm/damon: update intro, modules, maintainer profile, and misc" performs maintenance work on the DAMON documentation (SeongJae Park) - "mm: add and use vma_assert_stabilised() helper" refactors and cleans up the core VMA code. The main aim here is to be able to use the mmap write lock's lockdep state to perform various assertions regarding the locking which the VMA code requires (Lorenzo Stoakes) - "mm, swap: swap table phase II: unify swapin use" removes some old swap code (swap cache bypassing and swap synchronization) which wasn't working very well. Various other cleanups and simplifications were made. The end result is a 20% speedup in one benchmark (Kairui Song) - "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM available on 64-bit alpha, loongarch, mips, parisc, and um. 
Various cleanups were performed along the way (Qi Zheng) * tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits) mm/memory: handle non-split locks correctly in zap_empty_pte_table() mm: move pte table reclaim code to memory.c mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config um: mm: enable MMU_GATHER_RCU_TABLE_FREE parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE mips: mm: enable MMU_GATHER_RCU_TABLE_FREE LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles zsmalloc: make common caches global mm: add SPDX id lines to some mm source files mm/zswap: use %pe to print error pointers mm/vmscan: use %pe to print error pointers mm/readahead: fix typo in comment mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file() mm: refactor vma_map_pages to use vm_insert_pages mm/damon: unify address range representation with damon_addr_range mm/cma: replace snprintf with strscpy in cma_new_area ...
2026-02-09 | Merge tag 'for-7.0/block-stable-pages-20260206' of ↵ | Linus Torvalds | -5/+44
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull bounce buffer dio for stable pages from Jens Axboe:
"This adds support for bounce buffering of dio for stable pages. This was all done by Christoph. In his words:
This series tries to address the problem that under I/O pages can be modified during direct I/O, even when the device or file system require stable pages during I/O to calculate checksums, parity or data operations. It does so by adding block layer helpers to bounce buffer an iov_iter into a bio, then wires that up in iomap and ultimately XFS.
The reason that the file system even needs to know about it, is because reads need a user context to copy the data back, and the infrastructure to defer ioends to a workqueue currently sits in XFS. I'm going to look into moving that into ioend and enabling it for other file systems. Additionally btrfs already has its own infrastructure for this, and actually an urgent need to bounce buffer, so this should be useful there and could be wired up easily. In fact the idea comes from patches by Qu that did this in btrfs.
This patch fixes all but one xfstests failures on T10 PI capable devices (generic/095 seems to have issues with a mix of mmap and splice still, I'm looking into that separately), and makes qemu VMs running Windows, or Linux with swap enabled, work fine on an XFS file on a device using PI.
Performance numbers on my (not exactly state of the art) NVMe PI test setup:
Sequential reads using io_uring, QD=16. Bandwidth and CPU usage (usr/sys):
| size | zero copy                | bounce                   |
+------+--------------------------+--------------------------+
| 4k   | 1316MiB/s (12.65/55.40%) | 1081MiB/s (11.76/49.78%) |
| 64K  | 3370MiB/s ( 5.46/18.20%) | 3365MiB/s ( 4.47/15.68%) |
| 1M   | 3401MiB/s ( 0.76/23.05%) | 3400MiB/s ( 0.80/09.06%) |
+------+--------------------------+--------------------------+
Sequential writes using io_uring, QD=16. Bandwidth and CPU usage (usr/sys):
| size | zero copy                | bounce                   |
+------+--------------------------+--------------------------+
| 4k   |  882MiB/s (11.83/33.88%) |  750MiB/s (10.53/34.08%) |
| 64K  | 2009MiB/s ( 7.33/15.80%) | 2007MiB/s ( 7.47/24.71%) |
| 1M   | 1992MiB/s ( 7.26/ 9.13%) | 1992MiB/s ( 9.21/19.11%) |
+------+--------------------------+--------------------------+
Note that the 64k read numbers look really odd to me for the baseline zero copy case, but are reproducible over many repeated runs.
The bounce read numbers should further improve when moving the PI validation to the file system and removing the double context switch, which I have patches for that will be sent out soon"
* tag 'for-7.0/block-stable-pages-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux:
  xfs: use bounce buffering direct I/O when the device requires stable pages
  iomap: add a flag to bounce buffer direct I/O
  iomap: support ioends for direct reads
  iomap: rename IOMAP_DIO_DIRTY to IOMAP_DIO_USER_BACKED
  iomap: free the bio before completing the dio
  iomap: share code between iomap_dio_bio_end_io and iomap_finish_ioend_direct
  iomap: split out the per-bio logic from iomap_dio_bio_iter
  iomap: simplify iomap_dio_bio_iter
  iomap: fix submission side handling of completion side errors
  block: add helpers to bounce buffer an iov_iter into bios
  block: remove bio_release_page
  iov_iter: extract a iov_iter_extract_bvecs helper from bio code
  block: open code bio_add_page and fix handling of mismatching P2P ranges
  block: refactor get_contig_folio_len
  block: add a BIO_MAX_SIZE constant and use it
2026-02-09 | Merge tag 'xfs-merge-7.0' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux | Linus Torvalds | -1342/+4212
Pull xfs updates from Carlos Maiolino: "This contains several improvements to zoned device support, performance improvements for the parent pointers, and a new health monitoring feature. There are some improvements in the journaling code too but no behavior change expected. Last but not least, some code refactoring and bug fixes are also included in this series" * tag 'xfs-merge-7.0' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (67 commits) xfs: add sysfs stats for zoned GC xfs: give the defer_relog stat a xs_ prefix xfs: add zone reset error injection xfs: refactor zone reset handling xfs: don't mark all discard issued by zoned GC as sync xfs: allow setting errortags at mount time xfs: use WRITE_ONCE/READ_ONCE for m_errortag xfs: move the guts of XFS_ERRORTAG_DELAY out of line xfs: don't validate error tags in the I/O path xfs: allocate m_errortag early xfs: fix the errno sign for the xfs_errortag_{add,clearall} stubs xfs: validate log record version against superblock log version xfs: fix spacing style issues in xfs_alloc.c xfs: remove xfs_zone_gc_space_available xfs: use a seprate member to track space availabe in the GC scatch buffer xfs: check for deleted cursors when revalidating two btrees xfs: fix UAF in xchk_btree_check_block_owner xfs: check return value of xchk_scrub_create_subord xfs: only call xf{array,blob}_destroy if we have a valid pointer xfs: get rid of the xchk_xfile_*_descr calls ...
2026-02-09Merge tag 'vfs-7.0-rc1.fserror' of ↵Linus Torvalds-2/+22
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs error reporting updates from Christian Brauner: "This contains the changes to support generic I/O error reporting. Filesystems currently have no standard mechanism for reporting metadata corruption and file I/O errors to userspace via fsnotify. Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines EFSCORRUPTED, and error reporting to fanotify is inconsistent or absent entirely. This introduces a generic fserror infrastructure built around struct super_block that gives filesystems a standard way to queue metadata and file I/O error reports for delivery to fsnotify. Errors are queued via mempools and queue_work to avoid holding filesystem locks in the notification path; unmount waits for pending events to drain. A new super_operations::report_error callback lets filesystem drivers respond to file I/O errors themselves (to be used by an upcoming XFS self-healing patchset). On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private per-filesystem definitions to canonical errno.h values across all architectures" * tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: ext4: convert to new fserror helpers xfs: translate fsdax media errors into file "data lost" errors when convenient xfs: report fs metadata errors via fsnotify iomap: report file I/O errors to the VFS fs: report filesystem and file I/O errors to fsnotify uapi: promote EFSCORRUPTED and EUCLEAN to errno.h
2026-02-09Merge tag 'vfs-7.0-rc1.leases' of ↵Linus Torvalds-0/+3
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs lease updates from Christian Brauner: "This contains updates for lease support to require filesystems to explicitly opt in to lease support. Currently kernel_setlease() falls through to generic_setlease() when a filesystem does not define ->setlease(), silently granting lease support to every filesystem regardless of whether it is prepared for it. This is a poor default: most filesystems never intended to support leases, and the silent fallthrough makes it impossible to distinguish "supports leases" from "never thought about it". This inverts the default. It adds explicit .setlease = generic_setlease; assignments to every in-tree filesystem that should retain lease support, then changes kernel_setlease() to return -EINVAL when ->setlease is NULL. With the new default in place, simple_nosetlease() is redundant and is removed along with all references to it" * tag 'vfs-7.0-rc1.leases' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits) fuse: add setlease file operation fs: remove simple_nosetlease() filelock: default to returning -EINVAL when ->setlease operation is NULL xfs: add setlease file operation ufs: add setlease file operation udf: add setlease file operation tmpfs: add setlease file operation squashfs: add setlease file operation overlayfs: add setlease file operation orangefs: add setlease file operation ocfs2: add setlease file operation ntfs3: add setlease file operation nilfs2: add setlease file operation jfs: add setlease file operation jffs2: add setlease file operation gfs2: add a setlease file operation fat: add setlease file operation f2fs: add setlease file operation exfat: add setlease file operation ext4: add setlease file operation ...
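The inverted default described in this pull message can be illustrated with a small userspace sketch. This is not the kernel code: every demo_ name below is hypothetical, and the structures are reduced to the single ->setlease slot so that the NULL check is visible.

```c
#include <errno.h>
#include <stddef.h>

struct demo_file;

/* Hypothetical, stripped-down stand-in for struct file_operations. */
struct demo_file_operations {
	int (*setlease)(struct demo_file *file, int arg);
};

struct demo_file {
	const struct demo_file_operations *f_op;
};

/* Stand-in for generic_setlease(); always succeeds in this sketch. */
static int demo_generic_setlease(struct demo_file *file, int arg)
{
	(void)file;
	(void)arg;
	return 0;
}

/*
 * New default: a missing ->setlease means "no lease support" and
 * yields -EINVAL instead of silently falling through to the generic
 * helper.
 */
static int demo_kernel_setlease(struct demo_file *file, int arg)
{
	if (!file->f_op->setlease)
		return -EINVAL;
	return file->f_op->setlease(file, arg);
}

/* A filesystem that opts in explicitly. */
static const struct demo_file_operations demo_optin_fops = {
	.setlease = demo_generic_setlease,
};

/* A filesystem that never thought about leases. */
static const struct demo_file_operations demo_silent_fops = {
	.setlease = NULL,
};
```

With this shape, "supports leases" and "never thought about it" become distinguishable at the call site, which is the point the message makes.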
2026-02-09Merge tag 'vfs-7.0-rc1.nonblocking_timestamps' of ↵Linus Torvalds-43/+35
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs timestamp updates from Christian Brauner: "This contains the changes to support non-blocking timestamp updates. Since commit 66fa3cedf16a ("fs: Add async write file modification handling") file_update_time_flags() unconditionally returns -EAGAIN when any timestamp needs updating and IOCB_NOWAIT is set. This makes non-blocking direct writes impossible on file systems with granular enough timestamps, which in practice means all of them. This reworks the timestamp update path to propagate IOCB_NOWAIT through ->update_time so that file systems which can update timestamps without blocking are no longer penalized. With that groundwork in place, the core change passes IOCB_NOWAIT into ->update_time and returns -EAGAIN only when the file system indicates it would block. XFS implements non-blocking timestamp updates by using the new ->sync_lazytime and open-coding generic_update_time without the S_NOWAIT check, since the lazytime path through the generic helpers can never block in XFS" * tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: xfs: enable non-blocking timestamp updates xfs: implement ->sync_lazytime fs: refactor file_update_time_flags fs: add support for non-blocking timestamp updates fs: add a ->sync_lazytime method fs: factor out a sync_lazytime helper fs: refactor ->update_time handling fat: cleanup the flags for fat_truncate_time nfs: split nfs_update_timestamps fs: allow error returns from generic_update_time fs: remove inode_update_time
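As a rough illustration of the behavior change described above, the following userspace sketch contrasts the old unconditional -EAGAIN with a per-filesystem decision. All demo_ names and the update_would_block field are assumptions made for the example; the real ->update_time interface and its flags differ.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical inode: only the fields this sketch needs. */
struct demo_inode {
	bool update_would_block;  /* would this fs have to sleep to update? */
	long mtime;
};

/* Stand-in for ->update_time with the nowait hint passed through. */
static int demo_update_time(struct demo_inode *inode, long now, bool nowait)
{
	if (nowait && inode->update_would_block)
		return -EAGAIN;
	inode->mtime = now;
	return 0;
}

/* Old behavior: any pending timestamp update fails a nowait caller. */
static int demo_old_file_update_time(struct demo_inode *inode, long now,
				     bool nowait)
{
	if (inode->mtime == now)
		return 0;
	if (nowait)
		return -EAGAIN;
	inode->mtime = now;
	return 0;
}

/* New behavior: let the file system decide whether it would block. */
static int demo_new_file_update_time(struct demo_inode *inode, long now,
				     bool nowait)
{
	if (inode->mtime == now)
		return 0;
	return demo_update_time(inode, now, nowait);
}
```

In this model a file system whose update path never sleeps completes the nowait write, while one that must block still returns -EAGAIN.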
2026-01-30xfs: add sysfs stats for zoned GCChristoph Hellwig-1/+18
Add counters of read, write and zone_reset operations as well as GC written bytes to sysfs. This way they can be easily used for monitoring tools and test cases. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: give the defer_relog stat a xs_ prefixChristoph Hellwig-5/+5
Make this counter naming consistent with all the others. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: add zone reset error injectionChristoph Hellwig-4/+15
Add a new errortag to test that zone reset errors are handled correctly. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: refactor zone reset handlingChristoph Hellwig-21/+28
Include the actual bio submission in the common zone reset handler to share more code and prepare for adding error injection for zone reset. Note that I plan to refactor the block layer submit_bio_wait and bio_await_chain code in the next merge window to remove some of the code duplication added here. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: don't mark all discard issued by zoned GC as syncChristoph Hellwig-1/+2
Discards are not usually sync when issued from zoned garbage collection, so drop the REQ_SYNC flag. Fixes: 080d01c41d44 ("xfs: implement zoned garbage collection") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: allow setting errortags at mount timeChristoph Hellwig-1/+47
Add an errortag mount option that enables an errortag with the default injection frequency. This allows injecting errors into the mount process instead of just on live file systems, and thus test mount error handling. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: use WRITE_ONCE/READ_ONCE for m_errortagChristoph Hellwig-9/+14
There is no synchronization for updating m_errortag, which is fine as it's just a debug tool. It would still be nice to fully avoid the theoretical case of torn values, so use WRITE_ONCE and READ_ONCE to access the members. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
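A minimal userspace sketch of the access pattern follows. The READ_ONCE and WRITE_ONCE macros here are simplified approximations built on volatile casts (they rely on the GCC/Clang __typeof__ extension), and the demo_ structure, tag count, and firing rule are assumptions for illustration rather than the XFS implementation.

```c
/* Simplified userspace approximations of the kernel macros. */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define DEMO_ERRTAG_MAX 4  /* hypothetical number of error tags */

/* Hypothetical mount structure holding per-tag injection frequencies. */
struct demo_mount {
	unsigned int errortag[DEMO_ERRTAG_MAX];
};

/* Writer side: store the frequency with a single marked access. */
static void demo_errortag_set(struct demo_mount *mp, unsigned int tag,
			      unsigned int freq)
{
	WRITE_ONCE(mp->errortag[tag], freq);
}

/*
 * Reader side: load the frequency exactly once into a local, so the
 * zero check and the modulo see the same value even if a writer races.
 * "draw" stands in for a random number; the tag fires when it is a
 * multiple of the frequency.
 */
static int demo_errortag_test(struct demo_mount *mp, unsigned int tag,
			      unsigned int draw)
{
	unsigned int freq = READ_ONCE(mp->errortag[tag]);

	if (!freq)
		return 0;
	return draw % freq == 0;
}
```

Reading into a local first matters here: without it, a concurrent clear between the zero check and the modulo could lead to a division by zero.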
2026-01-30xfs: move the guts of XFS_ERRORTAG_DELAY out of lineChristoph Hellwig-12/+24
Mirror what is done for the more common XFS_ERRORTAG_TEST version, and also only look at the error tag value once now that we can easily have a local variable. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: don't validate error tags in the I/O pathChristoph Hellwig-30/+12
We can trust XFS developers enough to not pass random stuff to XFS_ERROR_TEST/DELAY. Open code the validity check in xfs_errortag_add, which is the only place that receives unvalidated error tag values from user space, and drop the now pointless xfs_errortag_enabled helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: allocate m_errortag earlyChristoph Hellwig-25/+13
Ensure the mount structure always has a valid m_errortag for debug builds. This removes the NULL checking from the runtime code, and prepares for allowing errortags to be set at mount time. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: fix the errno sign for the xfs_errortag_{add,clearall} stubsChristoph Hellwig-2/+2
All errno values should be negative in the kernel. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-30xfs: validate log record version against superblock log versionRaphael Pinsonneault-Thibeault-11/+16
Syzbot creates a fuzzed record where xfs_has_logv2() but the xlog_rec_header h_version != XLOG_VERSION_2. This causes a KASAN: slab-out-of-bounds read in xlog_do_recovery_pass() -> xlog_recover_process() -> xlog_cksum(). Fix by adding a check to xlog_valid_rec_header() to abort journal recovery if the xlog_rec_header h_version does not match the super block log version. A file system with a version 2 log will only ever set XLOG_VERSION_2 in its headers (and v1 will only ever set V_1), so if there is any mismatch, either the journal or the superblock has been corrupted and therefore we abort processing with a -EFSCORRUPTED error immediately. Also, refactor the structure of the validity checks for better readability. At the default error level (LOW), XFS_IS_CORRUPT() emits the condition that failed, the file and line number it is located at, then dumps the stack. This gives us everything we need to know about the failure if we do a single validity check per XFS_IS_CORRUPT(). Reported-by: syzbot+9f6d080dece587cfdd4c@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=9f6d080dece587cfdd4c Tested-by: syzbot+9f6d080dece587cfdd4c@syzkaller.appspotmail.com Fixes: 45cf976008dd ("xfs: fix log recovery buffer allocation for the legacy h_size fixup") Signed-off-by: Raphael Pinsonneault-Thibeault <rpthibeault@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
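The consistency rule stated above (a v2 log only ever carries version 2 headers, a v1 log only version 1) can be sketched as a standalone check. This is a simplified model, not the body of xlog_valid_rec_header(): the demo_ names are hypothetical, the version constants are assumed values, and 117 is used as the error number because that is the value of EUCLEAN on common Linux architectures.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants standing in for the kernel definitions. */
#define DEMO_XLOG_VERSION_1  1u
#define DEMO_XLOG_VERSION_2  2u
#define DEMO_EFSCORRUPTED    117  /* EUCLEAN on common Linux arches */

/*
 * Return 0 if the record header version matches what the superblock
 * says about the log format, or a negative error if they disagree.
 * A mismatch means either the journal or the superblock is corrupt,
 * so recovery should stop instead of trusting the header.
 */
static int demo_check_rec_version(bool sb_has_logv2, uint32_t h_version)
{
	uint32_t expected = sb_has_logv2 ? DEMO_XLOG_VERSION_2
					 : DEMO_XLOG_VERSION_1;

	if (h_version != expected)
		return -DEMO_EFSCORRUPTED;
	return 0;
}
```

Keeping this as one condition per check mirrors the readability argument in the message: a failure report then points at exactly one violated invariant.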
2026-01-29xfs: fix spacing style issues in xfs_alloc.cShin Seong-jun-4/+4
Fix checkpatch.pl errors regarding missing spaces around assignment operators in xfs_alloc_compute_diff() and xfs_alloc_fixup_trees(). Adhere to the Linux kernel coding style by ensuring spaces are placed around the assignment operator '='. Signed-off-by: Shin Seong-jun <shinsj4653@gmail.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-29xfs: remove xfs_zone_gc_space_availableChristoph Hellwig-14/+7
xfs_zone_gc_space_available only has one caller left, so fold it into that. Reorder the checks so that the cheaper scratch_available check is done first. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-29xfs: use a seprate member to track space availabe in the GC scatch bufferChristoph Hellwig-16/+9
When scratch_head wraps back to 0 and scratch_tail is also 0 because no I/O has completed yet, the ring buffer could be mistaken for empty. Fix this by introducing a separate scratch_available member in struct xfs_zone_gc_data. This actually ends up simplifying the code as well. Reported-by: Chris Mason <clm@meta.com> Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Hans Holmberg <hans.holmberg@wdc.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
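The ambiguity is the classic ring buffer problem: head == tail can mean either completely empty or completely full. A minimal sketch of the fix, assuming a hypothetical struct scratch_ring in place of the real struct xfs_zone_gc_data fields, tracks free space in its own counter so the two states never look alike.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the GC scratch buffer bookkeeping. */
struct scratch_ring {
	size_t size;       /* total capacity in bytes */
	size_t head;       /* next offset to hand out */
	size_t tail;       /* next offset to be released */
	size_t available;  /* bytes currently free */
};

static void scratch_init(struct scratch_ring *r, size_t size)
{
	r->size = size;
	r->head = 0;
	r->tail = 0;
	r->available = size;
}

/* Reserve len bytes; fails if fewer than len bytes are free. */
static bool scratch_reserve(struct scratch_ring *r, size_t len)
{
	if (len > r->available)
		return false;
	r->head = (r->head + len) % r->size;
	r->available -= len;
	return true;
}

/* Release len bytes once the I/O that used them has completed. */
static void scratch_release(struct scratch_ring *r, size_t len)
{
	r->tail = (r->tail + len) % r->size;
	r->available += len;
}
```

After reserving the whole buffer, head wraps to 0 while tail is still 0, yet available is 0, so a further reservation is correctly refused instead of the buffer being treated as empty.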
2026-01-28xfs: use bounce buffering direct I/O when the device requires stable pagesChristoph Hellwig-5/+44
Fix direct I/O on devices that require stable pages by asking iomap to bounce buffer. To support this, ioends are used for direct reads in this case to provide a user context for copying data back from the bounce buffer. This fixes qemu when used on devices using T10 protection information and probably other cases like iSCSI using data digests. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Damien Le Moal <dlemoal@kernel.org> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Tested-by: Anuj Gupta <anuj20.g@samsung.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
2026-01-28Merge tag 'scrub-syzbot-fixes-7.0_2026-01-25' of ↵Carlos Maiolino-181/+115
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge xfs: syzbot fixes for online fsck [3/3] Fix various syzbot complaints about scrub that Jiaming Zhang found. With a bit of luck, this should all go splendidly. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2026-01-28Merge tag 'attr-pptr-speedup-7.0_2026-01-25' of ↵Carlos Maiolino-17/+157
https://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into xfs-7.0-merge xfs: improve shortform attr performance [2/3] Improve performance of the xattr (and parent pointer) code when the attr structure is in short format and we can therefore perform all updates in a single transaction. Avoiding the attr intent code brings a very nice speedup in those operations. With a bit of luck, this should all go splendidly. Signed-off-by: "Darrick J. Wong" <djwong@kernel.org> Signed-off-by: Carlos Maiolino <cem@kernel.org>