| Age | Commit message (Collapse) | Author | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "maple_tree: Replace big node with maple copy" (Liam Howlett)
Mainly prepararatory work for ongoing development but it does reduce
stack usage and is an improvement.
- "mm, swap: swap table phase III: remove swap_map" (Kairui Song)
Offers memory savings by removing the static swap_map. It also yields
some CPU savings and implements several cleanups.
- "mm: memfd_luo: preserve file seals" (Pratyush Yadav)
File seal preservation to LUO's memfd code
- "mm: zswap: add per-memcg stat for incompressible pages" (Jiayuan
Chen)
Additional userspace stats reportng to zswap
- "arch, mm: consolidate empty_zero_page" (Mike Rapoport)
Some cleanups for our handling of ZERO_PAGE() and zero_pfn
- "mm/kmemleak: Improve scan_should_stop() implementation" (Zhongqiu
Han)
A robustness improvement and some cleanups in the kmemleak code
- "Improve khugepaged scan logic" (Vernon Yang)
Improve khugepaged scan logic and reduce CPU consumption by
prioritizing scanning tasks that access memory frequently
- "Make KHO Stateless" (Jason Miu)
Simplify Kexec Handover by transitioning KHO from an xarray-based
metadata tracking system with serialization to a radix tree data
structure that can be passed directly to the next kernel
- "mm: vmscan: add PID and cgroup ID to vmscan tracepoints" (Thomas
Ballasi and Steven Rostedt)
Enhance vmscan's tracepointing
- "mm: arch/shstk: Common shadow stack mapping helper and
VM_NOHUGEPAGE" (Catalin Marinas)
Cleanup for the shadow stack code: remove per-arch code in favour of
a generic implementation
- "Fix KASAN support for KHO restored vmalloc regions" (Pasha Tatashin)
Fix a WARN() which can be emitted the KHO restores a vmalloc area
- "mm: Remove stray references to pagevec" (Tal Zussman)
Several cleanups, mainly udpating references to "struct pagevec",
which became folio_batch three years ago
- "mm: Eliminate fake head pages from vmemmap optimization" (Kiryl
Shutsemau)
Simplify the HugeTLB vmemmap optimization (HVO) by changing how tail
pages encode their relationship to the head page
- "mm/damon/core: improve DAMOS quota efficiency for core layer
filters" (SeongJae Park)
Improve two problematic behaviors of DAMOS that makes it less
efficient when core layer filters are used
- "mm/damon: strictly respect min_nr_regions" (SeongJae Park)
Improve DAMON usability by extending the treatment of the
min_nr_regions user-settable parameter
- "mm/page_alloc: pcp locking cleanup" (Vlastimil Babka)
The proper fix for a previously hotfixed SMP=n issue. Code
simplifications and cleanups ensued
- "mm: cleanups around unmapping / zapping" (David Hildenbrand)
A bunch of cleanups around unmapping and zapping. Mostly
simplifications, code movements, documentation and renaming of
zapping functions
- "support batched checking of the young flag for MGLRU" (Baolin Wang)
Batched checking of the young flag for MGLRU. It's part cleanups; one
benchmark shows large performance benefits for arm64
- "memcg: obj stock and slab stat caching cleanups" (Johannes Weiner)
memcg cleanup and robustness improvements
- "Allow order zero pages in page reporting" (Yuvraj Sakshith)
Enhance free page reporting - it is presently and undesirably order-0
pages when reporting free memory.
- "mm: vma flag tweaks" (Lorenzo Stoakes)
Cleanup work following from the recent conversion of the VMA flags to
a bitmap
- "mm/damon: add optional debugging-purpose sanity checks" (SeongJae
Park)
Add some more developer-facing debug checks into DAMON core
- "mm/damon: test and document power-of-2 min_region_sz requirement"
(SeongJae Park)
An additional DAMON kunit test and makes some adjustments to the
addr_unit parameter handling
- "mm/damon/core: make passed_sample_intervals comparisons
overflow-safe" (SeongJae Park)
Fix a hard-to-hit time overflow issue in DAMON core
- "mm/damon: improve/fixup/update ratio calculation, test and
documentation" (SeongJae Park)
A batch of misc/minor improvements and fixups for DAMON
- "mm: move vma_(kernel|mmu)_pagesize() out of hugetlb.c" (David
Hildenbrand)
Fix a possible issue with dax-device when CONFIG_HUGETLB=n. Some code
movement was required.
- "zram: recompression cleanups and tweaks" (Sergey Senozhatsky)
A somewhat random mix of fixups, recompression cleanups and
improvements in the zram code
- "mm/damon: support multiple goal-based quota tuning algorithms"
(SeongJae Park)
Extend DAMOS quotas goal auto-tuning to support multiple tuning
algorithms that users can select
- "mm: thp: reduce unnecessary start_stop_khugepaged()" (Breno Leitao)
Fix the khugpaged sysfs handling so we no longer spam the logs with
reams of junk when starting/stopping khugepaged
- "mm: improve map count checks" (Lorenzo Stoakes)
Provide some cleanups and slight fixes in the mremap, mmap and vma
code
- "mm/damon: support addr_unit on default monitoring targets for
modules" (SeongJae Park)
Extend the use of DAMON core's addr_unit tunable
- "mm: khugepaged cleanups and mTHP prerequisites" (Nico Pache)
Cleanups to khugepaged and is a base for Nico's planned khugepaged
mTHP support
- "mm: memory hot(un)plug and SPARSEMEM cleanups" (David Hildenbrand)
Code movement and cleanups in the memhotplug and sparsemem code
- "mm: remove CONFIG_ARCH_ENABLE_MEMORY_HOTREMOVE and cleanup
CONFIG_MIGRATION" (David Hildenbrand)
Rationalize some memhotplug Kconfig support
- "change young flag check functions to return bool" (Baolin Wang)
Cleanups to change all young flag check functions to return bool
- "mm/damon/sysfs: fix memory leak and NULL dereference issues" (Josh
Law and SeongJae Park)
Fix a few potential DAMON bugs
- "mm/vma: convert vm_flags_t to vma_flags_t in vma code" (Lorenzo
Stoakes)
Convert a lot of the existing use of the legacy vm_flags_t data type
to the new vma_flags_t type which replaces it. Mainly in the vma
code.
- "mm: expand mmap_prepare functionality and usage" (Lorenzo Stoakes)
Expand the mmap_prepare functionality, which is intended to replace
the deprecated f_op->mmap hook which has been the source of bugs and
security issues for some time. Cleanups, documentation, extension of
mmap_prepare into filesystem drivers
- "mm/huge_memory: refactor zap_huge_pmd()" (Lorenzo Stoakes)
Simplify and clean up zap_huge_pmd(). Additional cleanups around
vm_normal_folio_pmd() and the softleaf functionality are performed.
* tag 'mm-stable-2026-04-13-21-45' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm: fix deferred split queue races during migration
mm/khugepaged: fix issue with tracking lock
mm/huge_memory: add and use has_deposited_pgtable()
mm/huge_memory: add and use normal_or_softleaf_folio_pmd()
mm: add softleaf_is_valid_pmd_entry(), pmd_to_softleaf_folio()
mm/huge_memory: separate out the folio part of zap_huge_pmd()
mm/huge_memory: use mm instead of tlb->mm
mm/huge_memory: remove unnecessary sanity checks
mm/huge_memory: deduplicate zap deposited table call
mm/huge_memory: remove unnecessary VM_BUG_ON_PAGE()
mm/huge_memory: add a common exit path to zap_huge_pmd()
mm/huge_memory: handle buggy PMD entry in zap_huge_pmd()
mm/huge_memory: have zap_huge_pmd return a boolean, add kdoc
mm/huge: avoid big else branch in zap_huge_pmd()
mm/huge_memory: simplify vma_is_specal_huge()
mm: on remap assert that input range within the proposed VMA
mm: add mmap_action_map_kernel_pages[_full]()
uio: replace deprecated mmap hook with mmap_prepare in uio_info
drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare
mm: allow handling of stacked mmap_prepare hooks in more drivers
...
|
|
Pull fscrypt updates from Eric Biggers:
- Various cleanups for the interface between fs/crypto/ and
filesystems, from Christoph Hellwig
- Simplify and optimize the implementation of v1 key derivation by
using the AES library instead of the crypto_skcipher API
* tag 'fscrypt-for-linus' of git://git.kernel.org/pub/scm/fs/fscrypt/linux:
fscrypt: use AES library for v1 key derivation
ext4: use a byte granularity cursor in ext4_mpage_readpages
fscrypt: pass a real sector_t to fscrypt_zeroout_range
fscrypt: pass a byte length to fscrypt_zeroout_range
fscrypt: pass a byte offset to fscrypt_zeroout_range
fscrypt: pass a byte length to fscrypt_zeroout_range_inline_crypt
fscrypt: pass a byte offset to fscrypt_zeroout_range_inline_crypt
fscrypt: pass a byte offset to fscrypt_set_bio_crypt_ctx
fscrypt: pass a byte offset to fscrypt_mergeable_bio
fscrypt: pass a byte offset to fscrypt_generate_dun
fscrypt: move fscrypt_set_bio_crypt_ctx_bh to buffer.c
ext4, fscrypt: merge fscrypt_mergeable_bio_bh into io_submit_need_new_bio
ext4: factor out a io_submit_need_new_bio helper
ext4: open code fscrypt_set_bio_crypt_ctx_bh
ext4: initialize the write hint in io_submit_init_bio
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs i_ino updates from Christian Brauner:
"For historical reasons, the inode->i_ino field is an unsigned long,
which means that it's 32 bits on 32 bit architectures. This has caused
a number of filesystems to implement hacks to hash a 64-bit identifier
into a 32-bit field, and deprives us of a universal identifier field
for an inode.
This changes the inode->i_ino field from an unsigned long to a u64.
This shouldn't make any material difference on 64-bit hosts, but
32-bit hosts will see struct inode grow by at least 4 bytes. This
could have effects on slabcache sizes and field alignment.
The bulk of the changes are to format strings and tracepoints, since
the kernel itself doesn't care that much about the i_ino field. The
first patch changes some vfs function arguments, so check that one out
carefully.
With this change, we may be able to shrink some inode structures. For
instance, struct nfs_inode has a fileid field that holds the 64-bit
inode number. With this set of changes, that field could be
eliminated. I'd rather leave that sort of cleanups for later just to
keep this simple"
* tag 'vfs-7.1-rc1.kino' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nilfs2: fix 64-bit division operations in nilfs_bmap_find_target_in_group()
EVM: add comment describing why ino field is still unsigned long
vfs: remove externs from fs.h on functions modified by i_ino widening
treewide: fix missed i_ino format specifier conversions
ext4: fix signed format specifier in ext4_load_inode trace event
treewide: change inode->i_ino from unsigned long to u64
nilfs2: widen trace event i_ino fields to u64
f2fs: widen trace event i_ino fields to u64
ext4: widen trace event i_ino fields to u64
zonefs: widen trace event i_ino fields to u64
hugetlbfs: widen trace event i_ino fields to u64
ext2: widen trace event i_ino fields to u64
cachefiles: widen trace event i_ino fields to u64
vfs: widen trace event i_ino fields to u64
net: change sock.sk_ino and sock_i_ino() to u64
audit: widen ino fields to u64
vfs: widen inode hash/lookup functions to u64
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs writeback updates from Christian Brauner:
"This introduces writeback helper APIs and converts f2fs, gfs2 and nfs
to stop accessing writeback internals directly"
* tag 'vfs-7.1-rc1.writeback' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
nfs: stop using writeback internals for WB_WRITEBACK accounting
gfs2: stop using writeback internals for dirty_exceeded check
f2fs: stop using writeback internals for dirty_exceeded checks
writeback: prep helpers for dirty-limit and writeback accounting
|
|
struct pagevec was removed in commit 1e0877d58b1e ("mm: remove struct
pagevec"). Rename include/linux/pagevec.h to reflect reality and update
includes tree-wide. Add the new filename to MAINTAINERS explicitly, as it
no longer matches the "include/linux/page[-_]*" pattern in MEMORY
MANAGEMENT - CORE.
Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-3-716868cc2d11@columbia.edu
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Remove unused pagevec.h includes from .c files. These were found with
the following command:
grep -rl '#include.*pagevec\.h' --include='*.c' | while read f; do
grep -qE 'PAGEVEC_SIZE|folio_batch' "$f" || echo "$f"
done
There are probably more removal candidates in .h files, but those are
more complex to analyze.
Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-2-716868cc2d11@columbia.edu
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Reviewed-by: Jan Kara <jack@suse.cz>
Acked-by: Zi Yan <ziy@nvidia.com>
Acked-by: Chris Li <chrisl@kernel.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Hildenbrand (Arm) <david@kernel.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Patch series "mm: Remove stray references to pagevec", v2.
struct pagevec was removed in commit 1e0877d58b1e ("mm: remove struct
pagevec"). Remove any stray references to it and rename relevant files
and macros accordingly.
While at it, remove unnecessary #includes of pagevec.h (now folio_batch.h)
in .c files. There are probably more of these that could be removed in .h
files, but those are more complex to verify.
This patch (of 4):
struct pagevec was removed in commit 1e0877d58b1e ("mm: remove struct
pagevec"). Remove remaining forward declarations and change
__folio_batch_release()'s declaration to match its definition.
Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-0-716868cc2d11@columbia.edu
Link: https://lkml.kernel.org/r/20260225-pagevec_cleanup-v2-1-716868cc2d11@columbia.edu
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
Acked-by: Chris Li <chrisl@kernel.org>
Acked-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
While the pblk argument to fscrypt_zeroout_range is declared as a
sector_t, it actually is interpreted as a logical block size unit, which
is highly unusual. Switch to passing the 512 byte units that sector_t is
defined for.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260302141922.370070-14-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Range lengths are usually expressed as bytes in the VFS, switch
fscrypt_zeroout_range to this convention.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260302141922.370070-13-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Logical offsets into an inode are usually expressed as bytes in the VFS.
Switch fscrypt_zeroout_range to that convention.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260302141922.370070-12-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Logical offsets into an inode are usually expressed as bytes in the VFS.
Switch fscrypt_set_bio_crypt_ctx to that convention.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260302141922.370070-9-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Logical offsets into an inode are usually expressed as bytes in the VFS.
Switch fscrypt_mergeable_bio to that convention.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260302141922.370070-8-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
On 32-bit architectures, unsigned long is only 32 bits wide, which
causes 64-bit inode numbers to be silently truncated. Several
filesystems (NFS, XFS, BTRFS, etc.) can generate inode numbers that
exceed 32 bits, and this truncation can lead to inode number collisions
and other subtle bugs on 32-bit systems.
Change the type of inode->i_ino from unsigned long to u64 to ensure that
inode numbers are always represented as 64-bit values regardless of
architecture. Update all format specifiers treewide from %lu/%lx to
%llu/%llx to match the new type, along with corresponding local variable
types.
This is the bulk treewide conversion. Earlier patches in this series
handled trace events separately to allow trace field reordering for
better struct packing on 32-bit.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260304-iino-u64-v3-12-2257ad83d372@kernel.org
Acked-by: Damien Le Moal <dlemoal@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Change the inode hash/lookup VFS API functions to accept u64 parameters
instead of unsigned long for inode numbers and hash values. This is
preparation for widening i_ino itself to u64, which will allow
filesystems to store full 64-bit inode numbers on 32-bit architectures.
Since unsigned long implicitly widens to u64 on all architectures, this
change is backward-compatible with all existing callers.
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260304-iino-u64-v3-1-2257ad83d372@kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Pull fsverity fixes from Eric Biggers:
- Fix a build error on parisc
- Remove the non-large-folio-aware function fsverity_verify_page()
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: fix build error by adding fsverity_readahead() stub
fsverity: remove fsverity_verify_page()
f2fs: make f2fs_verify_cluster() partially large-folio-aware
f2fs: remove unnecessary ClearPageUptodate in f2fs_verify_cluster()
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
f2fs_verify_cluster() is the only remaining caller of the
non-large-folio-aware function fsverity_verify_page(). To unblock the
removal of that function, change f2fs_verify_cluster() to verify the
entire folio of each page and mark it up-to-date.
Note that this doesn't actually make f2fs_verify_cluster()
large-folio-aware, as it is still passed an array of pages. Currently,
it's never called with large folios.
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260218010630.7407-3-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Remove the unnecessary clearing of PG_uptodate. It's guaranteed to
already be clear.
Suggested-by: Matthew Wilcox <willy@infradead.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260218010630.7407-2-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Replace direct dereferences of dirty_exceeded with the core helper
bdi_wb_dirty_exceeded(), removing f2fs dependencies on writeback
internals.
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Reviewed-by: Andreas Gruenbacher <agruenba@redhat.com>
Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Kundan Kumar <kundan.kumar@samsung.com>
Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Link: https://patch.msgid.link/20260213054634.79785-3-kundan.kumar@samsung.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull more misc vfs updates from Christian Brauner:
"Features:
- Optimize close_range() from O(range size) to O(active FDs) by using
find_next_bit() on the open_fds bitmap instead of linearly scanning
the entire requested range. This is a significant improvement for
large-range close operations on sparse file descriptor tables.
- Add FS_XFLAG_VERITY file attribute for fs-verity files, retrievable
via FS_IOC_FSGETXATTR and file_getattr(). The flag is read-only.
Add tracepoints for fs-verity enable and verify operations,
replacing the previously removed debug printk's.
- Prevent nfsd from exporting special kernel filesystems like pidfs
and nsfs. These filesystems have custom ->open() and ->permission()
export methods that are designed for open_by_handle_at(2) only and
are incompatible with nfsd. Update the exportfs documentation
accordingly.
Fixes:
- Fix KMSAN uninit-value in ovl_fill_real() where strcmp() was used
on a non-null-terminated decrypted directory entry name from
fscrypt. This triggered on encrypted lower layers when the
decrypted name buffer contained uninitialized tail data.
The fix also adds VFS-level name_is_dot(), name_is_dotdot(), and
name_is_dot_dotdot() helpers, replacing various open-coded "." and
".." checks across the tree.
- Fix read-only fsflags not being reset together with xflags in
vfs_fileattr_set(). Currently harmless since no read-only xflags
overlap with flags, but this would cause inconsistencies for any
future shared read-only flag
- Return -EREMOTE instead of -ESRCH from PIDFD_GET_INFO when the
target process is in a different pid namespace. This lets userspace
distinguish "process exited" from "process in another namespace",
matching glibc's pidfd_getpid() behavior
Cleanups:
- Use C-string literals in the Rust seq_file bindings, replacing the
kernel::c_str!() macro (available since Rust 1.77)
- Fix typo in d_walk_ret enum comment, add porting notes for the
readlink_copy() calling convention change"
* tag 'vfs-7.0-rc1.misc.2' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
fs: add porting notes about readlink_copy()
pidfs: return -EREMOTE when PIDFD_GET_INFO is called on another ns
nfsd: do not allow exporting of special kernel filesystems
exportfs: clarify the documentation of open()/permission() expotrfs ops
fsverity: add tracepoints
fs: add FS_XFLAG_VERITY for fs-verity files
rust: seq_file: replace `kernel::c_str!` with C-Strings
fs: dcache: fix typo in enum d_walk_ret comment
ovl: use name_is_dot* helpers in readdir code
fs: add helpers name_is_dot{,dot,_dotdot}
ovl: Fix uninit-value in ovl_fill_real
fs: reset read-only fsflags together with xflags
fs/file: optimize close_range() complexity from O(N) to O(Sparse)
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs
Pull f2fs updates from Jaegeuk Kim:
"In this development cycle, we focused on several key performance
optimizations:
- introducing large folio support to enhance read speeds for
immutable files
- reducing checkpoint=enable latency by flushing only committed dirty
pages
- implementing tracepoints to diagnose and resolve lock priority
inversion.
Additionally, we introduced the packed_ssa feature to optimize the SSA
footprint when utilizing large block sizes.
Detail summary:
Enhancements:
- support large folio for immutable non-compressed case
- support non-4KB block size without packed_ssa feature
- optimize f2fs_enable_checkpoint() to avoid long delay
- optimize f2fs_overwrite_io() for f2fs_iomap_begin
- optimize NAT block loading during checkpoint write
- add write latency stats for NAT and SIT blocks in
f2fs_write_checkpoint
- pin files do not require sbi->writepages lock for ordering
- avoid f2fs_map_blocks() for consecutive holes in readpages
- flush plug periodically during GC to maximize readahead effect
- add tracepoints to catch lock overheads
- add several sysfs entries to tune internal lock priorities
Fixes:
- fix lock priority inversion issue
- fix incomplete block usage in compact SSA summaries
- fix to show simulate_lock_timeout correctly
- fix to avoid mapping wrong physical block for swapfile
- fix IS_CHECKPOINTED flag inconsistency issue caused by
concurrent atomic commit and checkpoint writes
- fix to avoid UAF in f2fs_write_end_io()"
* tag 'f2fs-for-7.0-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (61 commits)
f2fs: sysfs: introduce critical_task_priority
f2fs: introduce trace_f2fs_priority_update
f2fs: fix lock priority inversion issue
f2fs: optimize f2fs_overwrite_io() for f2fs_iomap_begin
f2fs: fix incomplete block usage in compact SSA summaries
f2fs: decrease maximum flush retry count in f2fs_enable_checkpoint()
f2fs: optimize NAT block loading during checkpoint write
f2fs: change size parameter of __has_cursum_space() to unsigned int
f2fs: add write latency stats for NAT and SIT blocks in f2fs_write_checkpoint
f2fs: pin files do not require sbi->writepages lock for ordering
f2fs: fix to show simulate_lock_timeout correctly
f2fs: introduce FAULT_SKIP_WRITE
f2fs: check skipped write in f2fs_enable_checkpoint()
Revert "f2fs: add timeout in f2fs_enable_checkpoint()"
f2fs: fix to unlock folio in f2fs_read_data_large_folio()
f2fs: fix error path handling in f2fs_read_data_large_folio()
f2fs: use folio_end_read
f2fs: fix to avoid mapping wrong physical block for swapfile
f2fs: avoid f2fs_map_blocks() for consecutive holes in readpages
f2fs: advance index and offset after zeroing in large folio read
...
|
|
Pull fsverity updates from Eric Biggers:
"fsverity cleanups, speedup, and memory usage optimization from
Christoph Hellwig:
- Move some logic into common code
- Fix btrfs to reject truncates of fsverity files
- Improve the readahead implementation
- Store each inode's fsverity_info in a hash table instead of using a
pointer in the filesystem-specific part of the inode.
This optimizes for memory usage in the usual case where most files
don't have fsverity enabled.
- Look up the fsverity_info fewer times during verification, to
amortize the hash table overhead"
* tag 'fsverity-for-linus' of git://git.kernel.org/pub/scm/fs/fsverity/linux:
fsverity: remove inode from fsverity_verification_ctx
fsverity: use a hashtable to find the fsverity_info
btrfs: consolidate fsverity_info lookup
f2fs: consolidate fsverity_info lookup
ext4: consolidate fsverity_info lookup
fs: consolidate fsverity_info lookup in buffer.c
fsverity: push out fsverity_info lookup
fsverity: deconstify the inode pointer in struct fsverity_info
fsverity: kick off hash readahead at data I/O submission time
ext4: move ->read_folio and ->readahead to readpage.c
readahead: push invalidate_lock out of page_cache_ra_unbounded
fsverity: don't issue readahead for non-ENOENT errors from __filemap_get_folio
fsverity: start consolidating pagecache code
fsverity: pass struct file to ->write_merkle_tree_block
f2fs: don't build the fsverity work handler for !CONFIG_FS_VERITY
ext4: don't build the fsverity work handler for !CONFIG_FS_VERITY
fs,fsverity: clear out fsverity_info from common code
fs,fsverity: reject size changes on fsverity files in setattr_prepare
|
|
This patch introduces /sys/fs/f2fs/<disk>/critical_task_priority, w/
this new sysfs interface, we can tune priority of f2fs_ckpt thread and
f2fs_gc thread.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux
Pull block updates from Jens Axboe:
- Support for batch request processing for ublk, improving the
efficiency of the kernel/ublk server communication. This can yield
nice 7-12% performance improvements
- Support for integrity data for ublk
- Various other ublk improvements and additions, including a ton of
selftests additions and updated
- Move the handling of blk-crypto software fallback from below the
block layer to above it. This reduces the complexity of dealing with
bio splitting
- Series fixing a number of potential deadlocks in blk-mq related to
the queue usage counter and writeback throttling and rq-qos debugfs
handling
- Add an async_depth queue attribute, to resolve a performance
regression that's been around for a qhilw related to the scheduler
depth handling
- Only use task_work for IOPOLL completions on NVMe, if it is necessary
to do so. An earlier fix for an issue resulted in all these
completions being punted to task_work, to guarantee that completions
were only run for a given io_uring ring when it was local to that
ring. With the new changes, we can detect if it's necessary to use
task_work or not, and avoid it if possible.
- rnbd fixes:
- Fix refcount underflow in device unmap path
- Handle PREFLUSH and NOUNMAP flags properly in protocol
- Fix server-side bi_size for special IOs
- Zero response buffer before use
- Fix trace format for flags
- Add .release to rnbd_dev_ktype
- MD pull requests via Yu Kuai
- Fix raid5_run() to return error when log_init() fails
- Fix IO hang with degraded array with llbitmap
- Fix percpu_ref not resurrected on suspend timeout in llbitmap
- Fix GPF in write_page caused by resize race
- Fix NULL pointer dereference in process_metadata_update
- Fix hang when stopping arrays with metadata through dm-raid
- Fix any_working flag handling in raid10_sync_request
- Refactor sync/recovery code path, improve error handling for
badblocks, and remove unused recovery_disabled field
- Consolidate mddev boolean fields into mddev_flags
- Use mempool to allocate stripe_request_ctx and make sure
max_sectors is not less than io_opt in raid5
- Fix return value of mddev_trylock
- Fix memory leak in raid1_run()
- Add Li Nan as mdraid reviewer
- Move phys_vec definitions to the kernel types, mostly in preparation
for some VFIO and RDMA changes
- Improve the speed for secure erase for some devices
- Various little rust updates
- Various other minor fixes, improvements, and cleanups
* tag 'for-7.0/block-20260206' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits)
blk-mq: ABI/sysfs-block: fix docs build warnings
selftests: ublk: organize test directories by test ID
block: decouple secure erase size limit from discard size limit
block: remove redundant kill_bdev() call in set_blocksize()
blk-mq: add documentation for new queue attribute async_dpeth
block, bfq: convert to use request_queue->async_depth
mq-deadline: covert to use request_queue->async_depth
kyber: covert to use request_queue->async_depth
blk-mq: add a new queue sysfs attribute async_depth
blk-mq: factor out a helper blk_mq_limit_depth()
blk-mq-sched: unify elevators checking for async requests
block: convert nr_requests to unsigned int
block: don't use strcpy to copy blockdev name
blk-mq-debugfs: warn about possible deadlock
blk-mq-debugfs: add missing debugfs_mutex in blk_mq_debugfs_register_hctxs()
blk-mq-debugfs: remove blk_mq_debugfs_unregister_rqos()
blk-mq-debugfs: make blk_mq_debugfs_register_rqos() static
blk-rq-qos: fix possible debugfs_mutex deadlock
blk-mq-debugfs: factor out a helper to register debugfs for all rq_qos
blk-wbt: fix possible deadlock to nest pcpu_alloc_mutex under q_usage_counter
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
Pull vfs error reporting updates from Christian Brauner:
"This contains the changes to support generic I/O error reporting.
Filesystems currently have no standard mechanism for reporting
metadata corruption and file I/O errors to userspace via fsnotify.
Each filesystem (xfs, ext4, erofs, f2fs, etc.) privately defines
EFSCORRUPTED, and error reporting to fanotify is inconsistent or
absent entirely.
This introduces a generic fserror infrastructure built around struct
super_block that gives filesystems a standard way to queue metadata
and file I/O error reports for delivery to fsnotify.
Errors are queued via mempools and queue_work to avoid holding
filesystem locks in the notification path; unmount waits for pending
events to drain. A new super_operations::report_error callback lets
filesystem drivers respond to file I/O errors themselves (to be used
by an upcoming XFS self-healing patchset).
On the uapi side, EFSCORRUPTED and EUCLEAN are promoted from private
per-filesystem definitions to canonical errno.h values across all
architectures"
* tag 'vfs-7.0-rc1.fserror' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
ext4: convert to new fserror helpers
xfs: translate fsdax media errors into file "data lost" errors when convenient
xfs: report fs metadata errors via fsnotify
iomap: report file I/O errors to the VFS
fs: report filesystem and file I/O errors to fsnotify
uapi: promote EFSCORRUPTED and EUCLEAN to errno.h
|
|
Use the kernel's resizable hash table (rhashtable) to find the
fsverity_info. This way file systems that want to support fsverity don't
have to bloat every inode in the system with an extra pointer. The
trade-off is that looking up the fsverity_info is a bit more expensive
now, but the main operations are still dominated by I/O and hashing
overhead.
The rhashtable implementations requires no external synchronization, and
the _fast versions of the APIs provide the RCU critical sections required
by the implementation. Because struct fsverity_info is only removed on
inode eviction and does not contain a reference count, there is no need
for an extended critical section to grab a reference or validate the
object state. The file open path uses rhashtable_lookup_get_insert_fast,
which can either find an existing object for the hash key or insert a
new one in a single atomic operation, so that concurrent opens never
instantiate duplicate fsverity_info structure. FS_IOC_ENABLE_VERITY must
already be synchronized by a combination of i_rwsem and file system flags
and uses rhashtable_lookup_insert_fast, which errors out on an existing
object for the hash key as an additional safety check.
Because insertion into the hash table now happens before S_VERITY is set,
fsverity just becomes a barrier and a flag check and doesn't have to look
up the fsverity_info at all, so there is only a single lookup per
->read_folio or ->readahead invocation. For btrfs there is an additional
one for each bio completion, while for ext4 and f2fs the fsverity_info
is stored in the per-I/O context and reused for the completion workqueue.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://lore.kernel.org/r/20260202060754.270269-12-hch@lst.de
[EB: folded in fix for missing fsverity_free_info()]
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Look up the fsverity_info once in f2fs_mpage_readpages, and then use it
for the readahead, local verification of holes and pass it along to the
I/O completion workqueue in struct bio_post_read_ctx. Do the same
thing in f2fs_get_read_data_folio for reads that come from garbage
collection and other background activities.
This amortizes the lookup better once it becomes less efficient.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260202060754.270269-10-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Pass a struct fsverity_info to the verification and readahead helpers,
and push the lookup into the callers. Right now this is a very dumb
almost mechanic move that open codes a lot of fsverity_info_addr() calls
in the file systems. The subsequent patches will clean this up.
This prepares for reducing the number of fsverity_info lookups, which
will allow to amortize them better when using a more expensive lookup
method.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Link: https://lore.kernel.org/r/20260202060754.270269-7-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Currently all reads of the fsverity hashes are kicked off from the data
I/O completion handler, leading to needlessly dependent I/O. This is
worked around a bit by performing readahead on the level 0 nodes, but
still fairly ineffective.
Switch to a model where the ->read_folio and ->readahead methods instead
kick off explicit readahead of the fsverity hashed so they are usually
available at I/O completion time.
For 64k sequential reads on my test VM this improves read performance
from 2.4GB/s - 2.6GB/s to 3.5GB/s - 3.9GB/s. The improvements for
random reads are likely to be even bigger.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Link: https://lore.kernel.org/r/20260202060754.270269-5-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Require the invalidate_lock to be held over calls to
page_cache_ra_unbounded instead of acquiring it in this function.
This prepares for calling page_cache_ra_unbounded from ->readahead for
fsverity read-ahead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Link: https://lore.kernel.org/r/20260202060754.270269-3-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
This patch introduces two new tracepoints for debug purpose.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
If userspace thread has held f2fs rw semaphore, due to its low priority,
it could be runnable or preempted state for long time, during the time,
it will block high priority thread which is trying to grab the same rw
semaphore, e.g. cp_rwsem, io_rwsem...
To fix such issue, let's detect thread's priority when it tries to grab
f2fs_rwsem lock, if the priority is lower than a priority threshold, let's
uplift the priority before it enters into critical region of lock, and
restore the priority after it leaves from critical region.
Meanwhile, introducing two new sysfs nodes:
- /sys/fs/f2fs/<disk>/adjust_lock_priority, it is used to control whether
the functionality is enable or not.
========== ==================
Flag_Value Flag_Description
========== ==================
0x00000000 Disabled (default)
0x00000001 cp_rwsem
0x00000002 node_change
0x00000004 node_write
0x00000008 gc_lock
0x00000010 cp_global
0x00000020 io_rwsem
========== ==================
- /sys/fs/f2fs/<disk>/lock_duration_priority, it is used to control
priority threshold.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
When overwriting already allocated blocks, f2fs_iomap_begin() calls
f2fs_overwrite_io() to check block mappings. However,
f2fs_overwrite_io() iterates through all mapped blocks in the range,
which can be inefficient for fragmented files with large I/O requests.
This patch optimizes f2fs_overwrite_io() by adding a 'check_first'
parameter and introducing __f2fs_overwrite_io() helper. When called from
f2fs_iomap_begin(), we only check the first mapping to determine if the
range is already allocated, which is sufficient for setting
map.m_may_create.
This optimization significantly reduces the number of f2fs_map_blocks()
calls in f2fs_overwrite_io() when called from f2fs_iomap_begin(),
especially for fragmented files with large I/O requests.
Cc: stable@kernel.org
Fixes: 351bc761338d ("f2fs: optimize f2fs DIO overwrites")
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Sunmin Jeong <s_min.jeong@samsung.com>
Signed-off-by: Yeongjin Gil <youngjin.gil@samsung.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In a previous commit, a bug was introduced where compact SSA summaries
failed to utilize the entire block space in non-4KB block size
configurations, leading to inefficient space management.
This patch fixes the calculation logic to ensure that compact SSA
summaries can fully occupy the block regardless of the block size.
Reported-by: Chris Mason <clm@meta.com>
Fixes: e48e16f3e37f ("f2fs: support non-4KB block size without packed_ssa feature")
Signed-off-by: Daeho Jeong <daehojeong@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
ext4 and f2fs are largely using the same code to read a page full
of Merkle tree blocks from the page cache, and the upcoming xfs
fsverity support would add another copy.
Move the ext4 code to fs/verity/ and use it in f2fs as well. For f2fs
this removes the previous f2fs-specific error injection, but otherwise
the behavior remains unchanged.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://lore.kernel.org/r/20260128152630.627409-7-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
This will make an iomap implementation of the method easier.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Andrey Albershteyn <aalbersh@redhat.com>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Link: https://lore.kernel.org/r/20260128152630.627409-6-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Use IS_ENABLED to disable this code, leading to a slight size reduction:
text data bss dec hex filename
25709 2412 24 28145 6df1 fs/f2fs/compress.o.old
25198 2252 24 27474 6b52 fs/f2fs/compress.o
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20260128152630.627409-5-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Free the fsverity_info directly in clear_inode instead of requiring file
systems to handle it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Acked-by: David Sterba <dsterba@suse.com> # btrfs
Link: https://lore.kernel.org/r/20260128152630.627409-3-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Add the check to reject truncates of fsverity files directly to
setattr_prepare instead of requiring the file system to handle it.
Besides removing boilerplate code, this also fixes the complete lack of
such check in btrfs.
Fixes: 146054090b08 ("btrfs: initial fsverity support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Link: https://lore.kernel.org/r/20260128152630.627409-2-hch@lst.de
Signed-off-by: Eric Biggers <ebiggers@kernel.org>
|
|
Rename the helper is_dot_dotdot() into the name_ namespace
and add complementary helpers to check for dot and dotdot
names individually.
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Link: https://patch.msgid.link/20260128132406.23768-3-amir73il@gmail.com
Reviewed-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
It's rare case that sync_inodes_sb() always skips to flush some drity
datas, so it's enough to give extra three more chances to flush data.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Under stress tests with frequent metadata operations, checkpoint write
time can become excessively long. Analysis shows that the slowdown is
caused by synchronous, one-by-one reads of NAT blocks during checkpoint
processing.
The issue can be reproduced with the following workload:
1. seq 1 650000 | xargs -P 16 -n 1 touch
2. sync # avoid checkpoint write during deleting
3. delete 1 file every 455 files
4. echo 3 > /proc/sys/vm/drop_caches
5. sync # trigger checkpoint write
This patch submits read I/O for all NAT blocks required in the
__flush_nat_entry_set() phase in advance, reducing the overhead of
synchronous waiting for individual NAT block reads.
The NAT block flush latency before and after the change is as below:
| |NAT blocks accessed|NAT blocks read|Flush time (ms)|
|-------------|-------------------|---------------|---------------|
|Before change|1205 |1191 |158 |
|After change |1264 |1242 |11 |
With a similar number of NAT blocks accessed and read from disk, adding
NAT block readahead reduces the total NAT block flush time by more than
90%.
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
All callers of __has_cursum_space() pass an unsigned int value as the
size parameter. Change the parameter type to unsigned int accordingly.
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch adds separate write latency accounting for NAT and SIT blocks
in f2fs_write_checkpoint().
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
For pinned files, the file mapping is already established before
writing, and since the writes are in IPU, there is no need to acquire
the sbi->writepages lock to guarantee write ordering.
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
Commit d36de29f4bb5 ("f2fs: sysfs: introduce inject_lock_timeout")
introduces a bug as below, fix it.
cat /sys/fs/f2fs/vdx/inject_lock_timeout
s/fs/f2fs/vdx/inject_lock_timeout: Invalid argument
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
In order to simulate skipped write during enable_checkpoint().
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This patch introduces sbi->nr_pages[F2FS_SKIPPED_WRITE] to record any
skipped write during data flush in f2fs_enable_checkpoint().
So in the loop of data flush, if there is any skipped write in previous
flush, let's retry sync_inode_sb(), otherwise, all dirty data written
before f2fs_enable_checkpoint() should have been persisted, then break
the retry loop.
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|
|
This reverts commit 4bc347779698b5e67e1514bab105c2c083e55502.
Let's apply a better approach to flush the only dirty pages committed by user
to avoid the delay caused by unncessary incoming ones.
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
|