| Age | Commit message (Collapse) | Author | Lines |
|
git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat
Pull exfat updates from Namjae Jeon:
- Implement FALLOC_FL_ALLOCATE_RANGE to add support for preallocating
clusters without zeroing, helping to reduce file fragmentation
- Add a unified block readahead helper for FAT chain conversion, bitmap
allocation, and directory entry lookups
- Optimize exfat_chain_cont_cluster() by caching buffer heads to
minimize mark_buffer_dirty() and mirroring overhead during
NO_FAT_CHAIN to FAT_CHAIN conversion
- Switch to truncate_inode_pages_final() in evict_inode() to prevent
BUG_ON caused by shadow entries during reclaim
- Fix a 32-bit truncation bug in directory entry calculations by
ensuring proper bitwise coercion
- Fix sb->s_maxbytes calculation to correctly reflect the maximum
possible volume size for a given cluster size, resolving xfstests
generic/213
- Introduced exfat_cluster_walk() helper to traverse FAT chains by a
specified step, handling both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN
modes
- Introduced exfat_chain_advance() helper to advance an exfat_chain
structure, updating both the current cluster and remaining size
- Remove dead assignments and fix Smatch warnings
* tag 'exfat-for-7.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/linkinjeon/exfat:
exfat: use exfat_chain_advance helper
exfat: introduce exfat_chain_advance helper
exfat: remove NULL cache pointer case in exfat_ent_get
exfat: use exfat_cluster_walk helper
exfat: introduce exfat_cluster_walk helper
exfat: fix incorrect directory checksum after rename to shorter name
exfat: fix s_maxbytes
exfat: fix passing zero to ERR_PTR() in exfat_mkdir()
exfat: fix error handling for FAT table operations
exfat: optimize exfat_chain_cont_cluster with cached buffer heads
exfat: drop redundant sec parameter from exfat_mirror_bh
exfat: use readahead helper in exfat_get_dentry
exfat: use readahead helper in exfat_allocate_bitmap
exfat: add block readahead in exfat_chain_cont_cluster
exfat: add fallocate FALLOC_FL_ALLOCATE_RANGE support
exfat: Fix bitwise operation having different size
exfat: Drop dead assignment of num_clusters
exfat: use truncate_inode_pages_final() at evict_inode()
|
|
Replace open-coded cluster chain walking logic with exfat_chain_advance()
across exfat_readdir, exfat_find_dir_entry, exfat_count_dir_entries,
exfat_search_empty_slot and exfat_check_dir_empty.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Introduce exfat_chain_advance() to walk a exfat_chain structure by a
given step, updating both ->dir and ->size fields atomically. This
helper handles both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes with
proper boundary checking.
Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Since exfat_get_next_cluster has been updated, no callers pass a NULL
pointer to exfat_ent_get, so remove the handling logic for this case.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Replace the custom exfat_walk_fat_chain() function and open-coded
FAT chain walking logic with the exfat_cluster_walk() helper across
exfat_find_location, __exfat_get_dentry_set, and exfat_map_cluster.
Suggested-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Introduce exfat_cluster_walk() to walk the FAT chain by a given step,
handling both ALLOC_NO_FAT_CHAIN and ALLOC_FAT_CHAIN modes. Also
redefine exfat_get_next_cluster as a thin wrapper around it for
backward compatibility.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
When renaming a file in-place to a shorter name, exfat_remove_entries
marks excess entries as DELETED, but es->num_entries is not updated
accordingly. As a result, exfat_update_dir_chksum iterates over the
deleted entries and computes an incorrect checksum.
This does not lead to persistent corruption because mark_inode_dirty()
is called afterward, and __exfat_write_inode later recomputes the
checksum using the correct num_entries value.
Fix by setting es->num_entries = num_entries in exfat_init_ext_entry.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
With fallocate support, xfstest unit generic/213 fails with
QA output created by 213
We should get: fallocate: No space left on device
Strangely, xfs_io sometimes says "Success" when something went wrong
-fallocate: No space left on device
+fallocate: File too large
because sb->s_maxbytes is set to the volume size.
To be in line with other non-extent-based filesystems, set to max volume
size possible with the cluster size of the volume.
Signed-off-by: David Timber <dxdt@dev.snart.me>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The implementation is now really basic so rename generic_file_fsync()
simple_fsync() and __generic_file_fsync() to simple_fsync_noflush().
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260326095354.16340-56-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
EXFAT never calls mark_buffer_dirty_inode() and thus
invalidate_inode_buffers() never has anything to evict. Drop the
pointless call.
Signed-off-by: Jan Kara <jack@suse.cz>
Link: https://patch.msgid.link/20260326095354.16340-49-jack@suse.cz
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Detected by Smatch.
namei.c:890 exfat_mkdir() warn:
passing zero to 'ERR_PTR'
Signed-off-by: Yang Wen <anmuxixixi@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Fix three error handling issues in FAT table operations:
1. Fix exfat_update_bh() to properly return errors from sync_dirty_buffer
2. Fix exfat_end_bh() to properly return errors from exfat_update_bh()
and exfat_mirror_bh()
3. Fix ignored return values from exfat_chain_cont_cluster() in inode.c
and namei.c
These fixes ensure that FAT table write errors are properly propagated
to the caller instead of being silently ignored.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
When converting files from NO_FAT_CHAIN to FAT_CHAIN format, profiling
reveals significant time spent in mark_buffer_dirty() and exfat_mirror_bh()
operations. This overhead occurs because each FAT entry modification
triggers a full block dirty marking and mirroring operation.
For consecutive clusters that reside in the same block, optimize by caching
the buffer head and performing dirty marking only once at the end of the
block's modifications.
Performance improvements for converting a 30GB file:
| Cluster Size | Before Patch | After Patch | Speedup |
|--------------|--------------|-------------|---------|
| 512 bytes | 4.243s | 1.866s | 2.27x |
| 4KB | 0.863s | 0.236s | 3.66x |
| 32KB | 0.069s | 0.034s | 2.03x |
| 256KB | 0.012s | 0.006s | 2.00x |
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The sector offset can be obtained from bh->b_blocknr, so drop the
redundant sec parameter from exfat_mirror_bh(). Also clean up the
function to use exfat_update_bh() helper.
No functional changes.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Replace the custom exfat_dir_readahead() function with the unified
exfat_blk_readahead() helper in exfat_get_dentry(). This removes
the duplicate readahead implementation and uses the common interface,
also reducing code complexity.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Use the newly added exfat_blk_readahead() helper in exfat_allocate_bitmap()
to simplify the code. This eliminates the duplicate inline readahead logic
and uses the unified readahead interface.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
When a file cannot allocate contiguous clusters, exfat converts the file
from NO_FAT_CHAIN to FAT_CHAIN format. For large files, this conversion
process can take a significant amount of time.
Add simple readahead to read all the FAT blocks in advance, as these
blocks are consecutive, significantly improving the conversion performance.
Test in an empty exfat filesystem:
dd if=/dev/zero of=/mnt/file bs=1M count=30k
dd if=/dev/zero of=/mnt/file2 bs=1M count=1
time cat /mnt/file2 >> /mnt/file
| cluster size | before patch | after patch |
| ------------ | ------------ | ----------- |
| 512 | 47.667s | 4.316s |
| 4k | 6.436s | 0.541s |
| 32k | 0.758s | 0.071s |
| 256k | 0.117s | 0.011s |
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Currently, the Linux (ex)FAT drivers do not employ any cluster
allocation strategy to keep fragmentation at bay. As a result, when
multiple processes are competing for new clusters to expand files in
exfat filesystem on Linux simultaneously, the files end up heavily
fragmented. HDDs are most impacted, but this could also have some
negative impact on various forms of flash memory depending on the
type of underlying technology.
For instance, modern digital cameras produce multiple media files for a
single video stream. If the application does not take the fragmentation
issue into account or the system is under memory pressure, the kernel
end up allocating clusters in said files in a interleaved manner.
Demo script:
for (( i = 0; i < 4; i += 1 ));
do
dd if=/dev/urandom iflag=fullblock bs=1M count=64 of=frag-$i &
done
for (( i = 0; i < 4; i += 1 ));
do
wait
done
filefrag frag-*
Result - Linux kernel native exfat, async mount:
780 extents found
740 extents found
809 extents found
712 extents found
Result - Linux kernel native exfat, sync mount:
1852 extents found
1836 extents found
1846 extents found
1881 extents found
Result - Windows XP:
3 extents found
3 extents found
3 extents found
2 extents found
Windows kernel, on the other hand, regardless of the underlying storage
interface or the medium, seems to space out clusters for each file.
Similar strategy has to be employed by Linux fat filesystems for
efficient utilisation of storage backend.
In the meantime, userspace applications like rsync may
use fallocate to combat this issue.
This patch may introduce a regression-like behaviour to some niche
filesystem-agnostic applications that use fallocate and proceed to
non-sequentially write to the file. Examples:
- libtorrent's use of posix_fallocate() and the first fragment from a
peer is near the end of the file
- "Download accelerators" that do partial content requests(HTTP 206)
in multiple threads writing to the same file
The delay incurred in such use cases is documented in WinAPI. Patches
that add the ioctl equivalents to the WinAPI function
SetFileValidData() and `fsutil file queryvaliddata ...` will follow.
Signed-off-by: David Timber <dxdt@dev.snart.me>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
cpos has type loff_t (long long), while s_blocksize has type u32. The
inversion wil happen on u32, the coercion to s64 happens afterwards and
will do 0-left-paddding, resulting in the upper bits getting masked out.
Cast s_blocksize to loff_t before negating it.
Found by static code analysis using Klocwork.
Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
num_clusters is not used anywhere afterwards. Remove assignment.
Found by static code analysis using Klocwork.
Signed-off-by: Philipp Hahn <phahn-oss@avm.de>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Currently, exfat uses truncate_inode_pages() in exfat_evict_inode().
However, truncate_inode_pages() does not mark the mapping as exiting,
so reclaim may still install shadow entries for the mapping until
the inode teardown completes.
In older kernels like Linux 5.10, if shadow entries are present
at that point,clear_inode() can hit
BUG_ON(inode->i_data.nrexceptional);
To align with VFS eviction semantics and prevent this situation,
switch to truncate_inode_pages_final() in ->evict_inode().
Other filesystems were updated to use truncate_inode_pages_final()
in ->evict_inode() by commit 91b0abe36a7b ("mm + fs: store shadow
entries in page cache")'.
Signed-off-by: Yang Wen <anmuxixixi@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
This converts some of the visually simpler cases that have been split
over multiple lines. I only did the ones that are easy to verify the
resulting diff by having just that final GFP_KERNEL argument on the next
line.
Somebody should probably do a proper coccinelle script for this, but for
me the trivial script actually resulted in an assertion failure in the
middle of the script. I probably had made it a bit _too_ trivial.
So after fighting that far a while I decided to just do some of the
syntactically simpler cases with variations of the previous 'sed'
scripts.
The more syntactically complex multi-line cases would mostly really want
whitespace cleanup anyway.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This was done entirely with mindless brute force, using
git grep -l '\<k[vmz]*alloc_objs*(.*, GFP_KERNEL)' |
xargs sed -i 's/\(alloc_objs*(.*\), GFP_KERNEL)/\1)/'
to convert the new alloc_obj() users that had a simple GFP_KERNEL
argument to just drop that argument.
Note that due to the extreme simplicity of the scripting, any slightly
more complex cases spread over multiple lines would not be triggered:
they definitely exist, but this covers the vast bulk of the cases, and
the resulting diff is also then easier to check automatically.
For the same reason the 'flex' versions will be done as a separate
conversion.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
This is the result of running the Coccinelle script from
scripts/coccinelle/api/kmalloc_objs.cocci. The script is designed to
avoid scalar types (which need careful case-by-case checking), and
instead replace kmalloc-family calls that allocate struct or union
object instances:
Single allocations: kmalloc(sizeof(TYPE), ...)
are replaced with: kmalloc_obj(TYPE, ...)
Array allocations: kmalloc_array(COUNT, sizeof(TYPE), ...)
are replaced with: kmalloc_objs(TYPE, COUNT, ...)
Flex array allocations: kmalloc(struct_size(PTR, FAM, COUNT), ...)
are replaced with: kmalloc_flex(*PTR, FAM, COUNT, ...)
(where TYPE may also be *VAR)
The resulting allocations no longer return "void *", instead returning
"TYPE *".
Signed-off-by: Kees Cook <kees@kernel.org>
|
|
Add a blank line after variable declarations in fatent.c and file.c.
This improves readability and makes code style more consistent
across the exfat subsystem.
Signed-off-by: William Hansen-Baird <william.hansen.baird@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Else-branch is unnecessary after return statement in if-branch.
Remove to enhance readability and reduce indentation.
Signed-off-by: William Hansen-Baird <william.hansen.baird@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
This patch introduces a count parameter to exfat_get_cluster, which
serves as an input parameter for the caller to specify the desired
number of clusters, and as an output parameter to store the length
of consecutive clusters.
This patch can improve read performance by reducing the number of
get_block calls in sequential read scenarios. speacially in small
cluster size.
According to my test data, the performance improvement is
approximately 10% when read FAT_CHAIN file with 512 bytes of
cluster size.
454 MB/s -> 511 MB/s
Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Change exfat_cache_lookup to return the cluster number of the last
cluster before the next cache (i.e., the end of the current cache range)
or the given 'end' if there is no next cache. This allows the caller to
know whether the next cluster after the current cache is cached.
The function signature is changed to accept an 'end' parameter, which
is the upper bound of the search range. The function now stops early
if it finds a cache that starts within the current cache's tail, meaning
caches are contiguous. The return value is the cluster number at which
the next cache starts (minus one) or the original 'end' if no next cache
is found.
The new behavior is illustrated as follows:
cache: [ccccccc-------ccccccccc]
search: [..................]
return: ^
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The current cache mechanism does not support reading clusters starting
from a file offset of zero. This patch enables that feature in
preparation for subsequent reads of contiguous clusters from offset zero.
1. support finding clusters with zero offset.
2. allow clusters with zero offset to be cached.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
This patch introduces a parameter 'count' to support fetching multiple
clusters in exfat_map_cluster. The returned 'count' indicates the number
of consecutive clusters, or 0 when the input cluster offset is past EOF.
And the 'count' is also an input parameter for the caller to specify the
required number of clusters.
Only NO_FAT_CHAIN files enable multi-cluster fetching in this patch.
After this patch, the time proportion of exfat_get_block has decreased,
The performance data is as follows:
Cluster size: 512 bytes
Sequential read of a 30GB NO_FAT_CHAIN file:
2.4GB/s -> 2.5 GB/s
proportion of exfat_get_block:
10.8% -> 0.02%
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Yuezhang said: "exfat_map_cluster() is only used for files. The code
in this 'else' block is never executed and can be cleaned up."
Suggested-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Since exfat_ent_get supports cache buffer head, we can use this option to
reduce sb_bread calls when fetching consecutive entries.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Remove parameter 'fclus' and 'allow_eof':
- The fclus parameter is changed to a local variable as it is not
needed to be returned.
- The passed allow_eof parameter was always 1, remove it and the
associated error handling.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The cache_id remains unchanged on a cache miss; its value is always
exactly what was set by cache_init. Therefore, checking this value
again is meaningless.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The infinite cluster chain loop check is not work because the
loop will terminate when fclus reaches the parameter cluster,
and the parameter cluster value is never greater than
ei->valid_size.
The following relationship holds:
'fclus' < 'cluster' ≤ ei->valid_size ≤ sb->num_clusters
The check would only be triggered if a cluster number greater than
sb->num_clusters is passed, but no caller currently does this.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Since exfat_ent_get support cache buffer head, let's apply it to
exfat_find_last_cluster.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Since exfat_ent_get support cache buffer head, let's apply it to
exfat_count_num_clusters.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
This patch is part 2 of cached buffer head for exfat_ent_get,
it introduces an argument for exfat_ent_get, and make sure this
routine releases buffer head refcount when any error return.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
When multiple entries are obtained consecutively, these entries are mostly
stored adjacent to each other. this patch introduces a "last" parameter to
cache the last opened buffer head, and reuse it when possible, which
reduces the number of sb_bread() calls.
When the passed parameter "last" is NULL, it means cache option is
disabled, the behavior unchanged as it was.
Signed-off-by: Chi Zhiling <chizhiling@kylinos.cn>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
During mmap write, exfat_page_mkwrite() currently extends
valid_size to the end of the VMA range. For a large mapping,
this can push valid_size far beyond the page that actually
triggered the fault, resulting in unnecessary writes.
valid_size only needs to extend to the end of the page
being written.
Signed-off-by: Yuling Dong <yuling-dong@qq.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Change the type of 'ret' from unsigned int to int in
exfat_find_empty_entry(). Although the implicit type conversion
(int -> unsigned int -> int) does not cause actual bugs in
practice, using int directly is more appropriate for storing
error codes returned by exfat_alloc_cluster().
This improves code clarity and consistency with standard error
handling practices.
Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn>
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Add the setlease file_operation to exfat_file_operations and
exfat_dir_operations, pointing to generic_setlease. A future patch
will change the default behavior to reject lease attempts with -EINVAL
when there is no setlease file operation defined. Add generic_setlease
to retain the ability to set leases on this filesystem.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Link: https://patch.msgid.link/20260108-setlease-6-20-v1-7-ea4dec9b67fa@kernel.org
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The kernel test robot reported that the exFAT remount operation
failed. The reason for the failure was that the process's umask
is different between mount and remount, causing fs_fmask and
fs_dmask are changed.
Potentially, both gid and uid may also be changed. Therefore, when
initializing fs_context for remount, inherit these mount options
from the options used during mount.
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202511251637.81670f5c-lkp@intel.com
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
The variable max_ra_count can be 0 in exfat_allocate_bitmap(),
which causes a divide-by-zero error in the subsequent modulo operation
(i % max_ra_count), leading to a system crash.
When max_ra_count is 0, it means that readahead is not used. This patch
load the bitmap without readahead.
Fixes: 9fd688678dd8 ("exfat: optimize allocation bitmap loading time")
Reported-by: Jiaming Zhang <r772577952@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Syzbot created this issue by testing an image that did not have the root
cluster bitmap bit marked. After accessing a file through the root
directory via exfat_lookup, when creating a file again with mkdir,
the root cluster bit can be allocated for direcotry, which can cause
the root cluster to be zeroed out and the same entry can be allocated
in the same cluster. This patch improved this issue by adding
exfat_test_bitmap to validate the cluster bits of the root directory
and directory. And the first cluster bit of the root directory should
never be unset except when storage is corrupted. This bit is set to
allow operations after mount.
Reported-by: syzbot+5216036fc59c43d1ee02@syzkaller.appspotmail.com
Tested-by: syzbot+5216036fc59c43d1ee02@syzkaller.appspotmail.com
Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
xfstests generic/363 was failing due to unzeroed post-EOF page
cache that allowed mmap writes beyond EOF to become visible
after file extension.
For example, in following xfs_io sequence, 0x22 should not be
written to the file but would become visible after the extension:
xfs_io -f -t -c "pwrite -S 0x11 0 8" \
-c "mmap 0 4096" \
-c "mwrite -S 0x22 32 32" \
-c "munmap" \
-c "pwrite -S 0x33 512 32" \
$testfile
This violates the expected behavior where writes beyond EOF via
mmap should not persist after the file is extended. Instead, the
extended region should contain zeros.
Fix this by using truncate_pagecache() to truncate the page cache
after the current EOF when extending the file.
Signed-off-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
Fix refcount leaks in `exfat_find` related to `exfat_get_dentry_set`.
Function `exfat_get_dentry_set` would increase the reference counter of
`es->bh` on success. Therefore, `exfat_put_dentry_set` must be called
after `exfat_get_dentry_set` to ensure refcount consistency. This patch
relocate two checks to avoid possible leaks.
Fixes: 82ebecdc74ff ("exfat: fix improper check of dentry.stream.valid_size")
Fixes: 13940cef9549 ("exfat: add a check for invalid data size")
Signed-off-by: Shuhao Fu <sfual@cse.ust.hk>
Reviewed-by: Yuezhang Mo <Yuezhang.Mo@sony.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|
|
gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs
Pull vfs fixes from Christian Brauner:
- Fix unitialized variable in statmount_string()
- Fix hostfs mounting when passing host root during boot
- Fix dynamic lookup to fail on cell lookup failure
- Fix missing file type when reading bfs inodes from disk
- Enforce checking of sb_min_blocksize() calls and update all callers
accordingly
- Restore write access before closing files opened by open_exec() in
binfmt_misc
- Always freeze efivarfs during suspend/hibernate cycles
- Fix statmount()'s and listmount()'s grab_requested_mnt_ns() helper to
actually allow mount namespace file descriptor in addition to mount
namespace ids
- Fix tmpfs remount when noswap is specified
- Switch Landlock to iput_not_last() to remove false-positives from
might_sleep() annotations in iput()
- Remove dead node_to_mnt_ns() code
- Ensure that per-queue kobjects are successfully created
* tag 'vfs-6.18-rc7.fixes' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs:
landlock: fix splats from iput() after it started calling might_sleep()
fs: add iput_not_last()
shmem: fix tmpfs reconfiguration (remount) when noswap is set
fs/namespace: correctly handle errors returned by grab_requested_mnt_ns
power: always freeze efivarfs
binfmt_misc: restore write access before closing files opened by open_exec()
block: add __must_check attribute to sb_min_blocksize()
virtio-fs: fix incorrect check for fsvq->kobj
xfs: check the return value of sb_min_blocksize() in xfs_fs_fill_super
isofs: check the return value of sb_min_blocksize() in isofs_fill_super
exfat: check return value of sb_min_blocksize in exfat_read_boot_sector
vfat: fix missing sb_min_blocksize() return value checks
mnt: Remove dead code which might prevent from building
bfs: Reconstruct file type when loading from disk
afs: Fix dynamic lookup to fail on cell lookup failure
hostfs: Fix only passing host root in boot stage with new mount
fs: Fix uninitialized 'offp' in statmount_string()
|
|
sb_min_blocksize() may return 0. Check its return value to avoid
accessing the filesystem super block when sb->s_blocksize is 0.
Cc: stable@vger.kernel.org # v6.15
Fixes: 719c1e1829166d ("exfat: add super block operations")
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com>
Link: https://patch.msgid.link/20251104125009.2111925-3-yangyongpeng.storage@gmail.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Since the len argument value passed to exfat_ioctl_set_volume_label()
from exfat_nls_to_utf16() is passed 1 too large, an out-of-bounds read
occurs when dereferencing p_cstring in exfat_nls_to_ucs2() later.
And because of the NLS_NAME_OVERLEN macro, another error occurs when
creating a file with a period at the end using utf8 and other iocharsets.
So to avoid this, you should remove the code that uses NLS_NAME_OVERLEN
macro and make the len argument value be the length of the label string,
but with a maximum length of FSLABEL_MAX - 1.
Reported-by: syzbot+98cc76a76de46b3714d4@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=98cc76a76de46b3714d4
Fixes: d01579d590f7 ("exfat: Add support for FS_IOC_{GET,SET}FSLABEL")
Suggested-by: Pali Rohár <pali@kernel.org>
Signed-off-by: Jeongjun Park <aha310510@gmail.com>
Signed-off-by: Namjae Jeon <linkinjeon@kernel.org>
|