<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/block/fops.c, branch v6.16</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.16</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.16'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-05-06T13:46:43Z</updated>
<entry>
<title>block: expose write streams for block device nodes</title>
<updated>2025-05-06T13:46:43Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2025-05-06T12:17:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c27683da6406031d47a65b344d04a40736490d95'/>
<id>urn:sha1:c27683da6406031d47a65b344d04a40736490d95</id>
<content type='text'>
Use the per-kiocb write stream if provided, or map temperature hints to
write streams (which is a bit questionable, but this shows how it is
done).

Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Nitesh Shetty &lt;nj.shetty@samsung.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
[kbusch: removed statx reporting]
Signed-off-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Kanchan Joshi &lt;joshi.k@samsung.com&gt;
Link: https://lore.kernel.org/r/20250506121732.8211-6-joshi.k@samsung.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: use writeback_iter</title>
<updated>2025-05-02T15:23:00Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2025-04-24T08:27:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=00ef5c728ec05af5f8591016a9d138eab6b6f8e9'/>
<id>urn:sha1:00ef5c728ec05af5f8591016a9d138eab6b6f8e9</id>
<content type='text'>
Use writeback_iter instead of the deprecated write_cache_pages wrapper
in blkdev_writepages.  This removes an indirect call per folio.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: John Garry &lt;john.g.garry@oracle.com&gt;
Link: https://lore.kernel.org/r/20250424082752.1967679-1-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: don't autoload drivers on stat</title>
<updated>2025-04-24T13:35:23Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2025-04-23T05:37:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5f33b5226c9d92359e58e91ad0bf0c1791da36a1'/>
<id>urn:sha1:5f33b5226c9d92359e58e91ad0bf0c1791da36a1</id>
<content type='text'>
blkdev_get_no_open can trigger the legacy autoload of block drivers.  A
simple stat of a block device has not historically done that, so disable
this behavior again.

Fixes: 9abcfbd235f5 ("block: Add atomic write support for statx")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Christian Brauner &lt;brauner@kernel.org&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lore.kernel.org/r/20250423053810.1683309-4-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: fix race between set_blocksize and read paths</title>
<updated>2025-04-23T19:58:06Z</updated>
<author>
<name>Darrick J. Wong</name>
<email>djwong@kernel.org</email>
</author>
<published>2025-04-23T19:53:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c0e473a0d226479e8e925d5ba93f751d8df628e9'/>
<id>urn:sha1:c0e473a0d226479e8e925d5ba93f751d8df628e9</id>
<content type='text'>
With the new large sector size support, it's now the case that
set_blocksize can change i_blksize and the folio order in a manner that
conflicts with a concurrent reader and causes a kernel crash.

Specifically, let's say that udev-worker calls libblkid to detect the
labels on a block device.  The read call can create an order-0 folio to
read the first 4096 bytes from the disk.  But then udev is preempted.

Next, someone tries to mount an 8k-sectorsize filesystem from the same
block device.  The filesystem calls set_blksize, which sets i_blksize to
8192 and the minimum folio order to 1.

Now udev resumes, still holding the order-0 folio it allocated.  It then
tries to schedule a read bio and do_mpage_readahead tries to create
bufferheads for the folio.  Unfortunately, blocks_per_folio == 0 because
the page size is 4096 but the blocksize is 8192 so no bufferheads are
attached and the bh walk never sets bdev.  We then submit the bio with a
NULL block device and crash.

Therefore, truncate the page cache after flushing but before updating
i_blksize.  However, that's not enough -- we also need to lock out file
IO and page faults during the update.  Take both the i_rwsem and the
invalidate_lock in exclusive mode for invalidations, and in shared mode
for read/write operations.

I don't know if this is the correct fix, but xfs/259 found it.

Signed-off-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Tested-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Link: https://lore.kernel.org/r/174543795699.4139148.2086129139322431423.stgit@frogsfrogsfrogs
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: don't revert iter for -EIOCBQUEUED</title>
<updated>2025-01-23T13:18:41Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2025-01-23T13:18:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b13ee668e8280ca5b07f8ce2846b9957a8a10853'/>
<id>urn:sha1:b13ee668e8280ca5b07f8ce2846b9957a8a10853</id>
<content type='text'>
blkdev_read_iter() has a few odd checks, like gating the position and
count adjustment on whether or not the result is bigger-than-or-equal to
zero (where bigger than makes more sense), and not checking the return
value of blkdev_direct_IO() before doing an iov_iter_revert(). The
latter can lead to attempting to revert with a negative value, which
when passed to iov_iter_revert() as an unsigned value will lead to
throwing a WARN_ON() because unroll is bigger than MAX_RW_COUNT.

Be sane and don't revert for -EIOCBQUEUED, like what is done in other
spots.

Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: add support to pass user meta buffer</title>
<updated>2024-12-23T15:17:17Z</updated>
<author>
<name>Kanchan Joshi</name>
<email>joshi.k@samsung.com</email>
</author>
<published>2024-11-28T11:22:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3d8b5a22d40435b4a7e58f06ae2cd3506b222898'/>
<id>urn:sha1:3d8b5a22d40435b4a7e58f06ae2cd3506b222898</id>
<content type='text'>
If an iocb contains metadata, extract that and prepare the bip.
Based on flags specified by the user, set corresponding guard/app/ref
tags to be checked in bip.

Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Anuj Gupta &lt;anuj20.g@samsung.com&gt;
Signed-off-by: Kanchan Joshi &lt;joshi.k@samsung.com&gt;
Reviewed-by: Keith Busch &lt;kbusch@kernel.org&gt;
Link: https://lore.kernel.org/r/20241128112240.8867-11-anuj20.g@samsung.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Don't allow an atomic write be truncated in blkdev_write_iter()</title>
<updated>2024-11-27T22:04:57Z</updated>
<author>
<name>John Garry</name>
<email>john.g.garry@oracle.com</email>
</author>
<published>2024-11-27T09:23:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2cbd51f1f8739fd2fdf4bae1386bcf75ce0176ba'/>
<id>urn:sha1:2cbd51f1f8739fd2fdf4bae1386bcf75ce0176ba</id>
<content type='text'>
A write which goes past the end of the bdev in blkdev_write_iter() will
be truncated. Truncating cannot tolerated for an atomic write, so error
that condition.

Fixes: caf336f81b3a ("block: Add fops atomic write support")
Signed-off-by: John Garry &lt;john.g.garry@oracle.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20241127092318.632790-1-john.g.garry@oracle.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>fs/block: Check for IOCB_DIRECT in generic_atomic_write_valid()</title>
<updated>2024-10-19T22:48:22Z</updated>
<author>
<name>John Garry</name>
<email>john.g.garry@oracle.com</email>
</author>
<published>2024-10-19T12:51:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c3be7ebbbce5201e151f17e28a6c807602f369c9'/>
<id>urn:sha1:c3be7ebbbce5201e151f17e28a6c807602f369c9</id>
<content type='text'>
Currently FMODE_CAN_ATOMIC_WRITE is set if the bdev can atomic write and
the file is open for direct IO. This does not work if the file is not
opened for direct IO, yet fcntl(O_DIRECT) is used on the fd later.

Change to check for direct IO on a per-IO basis in
generic_atomic_write_valid(). Since we want to report -EOPNOTSUPP for
non-direct IO for an atomic write, change to return an error code.

Relocate the block fops atomic write checks to the common write path, as to
catch non-direct IO.

Fixes: c34fc6f26ab8 ("fs: Initial atomic write support")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Signed-off-by: John Garry &lt;john.g.garry@oracle.com&gt;
Link: https://lore.kernel.org/r/20241019125113.369994-3-john.g.garry@oracle.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block/fs: Pass an iocb to generic_atomic_write_valid()</title>
<updated>2024-10-19T22:48:21Z</updated>
<author>
<name>John Garry</name>
<email>john.g.garry@oracle.com</email>
</author>
<published>2024-10-19T12:51:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a8dbdadae509e5717ff6e5aa572ca0974d2101d'/>
<id>urn:sha1:9a8dbdadae509e5717ff6e5aa572ca0974d2101d</id>
<content type='text'>
Darrick and Hannes both thought it better that generic_atomic_write_valid()
should be passed a struct iocb, and not just the member of that struct
which is referenced; see [0] and [1].

I think that makes a more generic and clean API, so make that change.

[0] https://lore.kernel.org/linux-block/680ce641-729b-4150-b875-531a98657682@suse.de/
[1] https://lore.kernel.org/linux-xfs/20240620212401.GA3058325@frogsfrogsfrogs/

Fixes: c34fc6f26ab8 ("fs: Initial atomic write support")
Suggested-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Suggested-by: Hannes Reinecke &lt;hare@suse.de&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: John Garry &lt;john.g.garry@oracle.com&gt;
Link: https://lore.kernel.org/r/20241019125113.369994-2-john.g.garry@oracle.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'vfs-6.12.blocksize' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs</title>
<updated>2024-09-21T00:53:17Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-09-21T00:53:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=171754c3808214d4fd8843eab584599a429deb52'/>
<id>urn:sha1:171754c3808214d4fd8843eab584599a429deb52</id>
<content type='text'>
Pull vfs blocksize updates from Christian Brauner:
 "This contains the vfs infrastructure as well as the xfs bits to enable
  support for block sizes (bs) larger than page sizes (ps) plus a few
  fixes to related infrastructure.

  There has been efforts over the last 16 years to enable enable Large
  Block Sizes (LBS), that is block sizes in filesystems where bs &gt; page
  size. Through these efforts we have learned that one of the main
  blockers to supporting bs &gt; ps in filesystems has been a way to
  allocate pages that are at least the filesystem block size on the page
  cache where bs &gt; ps.

  Thanks to various previous efforts it is possible to support bs &gt; ps
  in XFS with only a few changes in XFS itself. Most changes are to the
  page cache to support minimum order folio support for the target block
  size on the filesystem.

  A motivation for Large Block Sizes today is to support high-capacity
  (large amount of Terabytes) QLC SSDs where the internal Indirection
  Unit (IU) are typically greater than 4k to help reduce DRAM and so in
  turn cost and space. In practice this then allows different
  architectures to use a base page size of 4k while still enabling
  support for block sizes aligned to the larger IUs by relying on high
  order folios on the page cache when needed.

  It also allows to take advantage of the drive's support for atomics
  larger than 4k with buffered IO support in Linux. As described this
  year at LSFMM, supporting large atomics greater than 4k enables
  databases to remove the need to rely on their own journaling, so they
  can disable double buffered writes, which is a feature different cloud
  providers are already enabling through custom storage solutions"

* tag 'vfs-6.12.blocksize' of gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs: (22 commits)
  Documentation: iomap: fix a typo
  iomap: remove the iomap_file_buffered_write_punch_delalloc return value
  iomap: pass the iomap to the punch callback
  iomap: pass flags to iomap_file_buffered_write_punch_delalloc
  iomap: improve shared block detection in iomap_unshare_iter
  iomap: handle a post-direct I/O invalidate race in iomap_write_delalloc_release
  docs:filesystems: fix spelling and grammar mistakes in iomap design page
  filemap: fix htmldoc warning for mapping_align_index()
  iomap: make zero range flush conditional on unwritten mappings
  iomap: fix handling of dirty folios over unwritten extents
  iomap: add a private argument for iomap_file_buffered_write
  iomap: remove set_memor_ro() on zero page
  xfs: enable block size larger than page size support
  xfs: make the calculation generic in xfs_sb_validate_fsb_count()
  xfs: expose block size in stat
  xfs: use kvmalloc for xattr buffers
  iomap: fix iomap_dio_zero() for fs bs &gt; system page size
  filemap: cap PTE range to be created to allowed zero fill in folio_map_range()
  mm: split a folio in minimum folio order chunks
  readahead: allocate folios with mapping_min_order in readahead
  ...
</content>
</entry>
</feed>
