<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/read_write.c, branch v6.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-09-28T18:28:40Z</updated>
<entry>
<title>[coredump] don't use __kernel_write() on kmap_local_page()</title>
<updated>2022-09-28T18:28:40Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2022-09-26T15:59:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=06bbaa6dc53cb72040db952053432541acb9adc7'/>
<id>urn:sha1:06bbaa6dc53cb72040db952053432541acb9adc7</id>
<content type='text'>
passing kmap_local_page() result to __kernel_write() is unsafe -
random -&gt;write_iter() might (and 9p one does) get unhappy when
passed ITER_KVEC with pointer that came from kmap_local_page().

Fix by providing a variant of __kernel_write() that takes an iov_iter
from caller (__kernel_write() becomes a trivial wrapper) and adding
dump_emit_page() that parallels dump_emit(), except that instead of
__kernel_write() it uses __kernel_write_iter() with ITER_BVEC source.

Fixes: 3159ed57792b "fs/coredump: use kmap_local_page()"
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>switch new_sync_{read,write}() to ITER_UBUF</title>
<updated>2022-08-09T02:37:15Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2022-05-22T20:55:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3e20a751aff0e099cff496511fef8cdf655b3360'/>
<id>urn:sha1:3e20a751aff0e099cff496511fef8cdf655b3360</id>
<content type='text'>
Reviewed-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'pull-work.lseek' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2022-08-03T18:35:20Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-03T18:35:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a782e866497217f22c5d9014cbb7be8549151376'/>
<id>urn:sha1:a782e866497217f22c5d9014cbb7be8549151376</id>
<content type='text'>
Pull vfs lseek updates from Al Viro:
 "Jason's lseek series.

  Saner handling of 'lseek should fail with ESPIPE' - this gets rid of
  the magical no_llseek thing and makes checks consistent.

  In particular, the ad-hoc "can we do splice via internal pipe" checks
  got saner (and somewhat more permissive, which is what Jason had been
  after, AFAICT)"

* tag 'pull-work.lseek' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs: remove no_llseek
  fs: check FMODE_LSEEK to control internal pipe splicing
  vfio: do not set FMODE_LSEEK flag
  dma-buf: remove useless FMODE_LSEEK flag
  fs: do not compare against -&gt;llseek
  fs: clear or set FMODE_LSEEK based on llseek function
</content>
</entry>
<entry>
<title>Merge tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block</title>
<updated>2022-08-02T20:27:23Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-02T20:27:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=98e247464088a11ce2328a214fdb87d4c06f8db6'/>
<id>urn:sha1:98e247464088a11ce2328a214fdb87d4c06f8db6</id>
<content type='text'>
Pull io_uring buffered writes support from Jens Axboe:
 "This contains support for buffered writes, specifically for XFS. btrfs
  is in progress, will be coming in the next release.

  io_uring does support buffered writes on any file type, but since the
  buffered write path just always -EAGAIN (or -EOPNOTSUPP) any attempt
  to do so if IOCB_NOWAIT is set, any buffered write will effectively be
  handled by io-wq offload. This isn't very efficient, and we even have
  specific code in io-wq to serialize buffered writes to the same inode
  to avoid further inefficiencies with thread offload.

  This is particularly sad since most buffered writes don't block, they
  simply copy data to a page and dirty it. With this pull request, we
  can handle buffered writes a lot more effiently.

  If balance_dirty_pages() needs to block, we back off on writes as
  indicated.

  This improves buffered write support by 2-3x.

  Jan Kara helped with the mm bits for this, and Stefan handled the
  fs/iomap/xfs/io_uring parts of it"

* tag 'for-5.20/io_uring-buffered-writes-2022-07-29' of git://git.kernel.dk/linux-block:
  mm: honor FGP_NOWAIT for page cache page allocation
  xfs: Add async buffered write support
  xfs: Specify lockmode when calling xfs_ilock_for_iomap()
  io_uring: Add tracepoint for short writes
  io_uring: fix issue with io_write() not always undoing sb_start_write()
  io_uring: Add support for async buffered writes
  fs: Add async write file modification handling.
  fs: Split off inode_needs_update_time and __file_update_time
  fs: add __remove_file_privs() with flags parameter
  fs: add a FMODE_BUF_WASYNC flags for f_mode
  iomap: Return -EAGAIN from iomap_write_iter()
  iomap: Add async buffered write support
  iomap: Add flags parameter to iomap_page_create()
  mm: Add balance_dirty_pages_ratelimited_flags() function
  mm: Move updates of dirty_exceeded into one place
  mm: Move starting of background writeback into the main balancing loop
</content>
</entry>
<entry>
<title>Merge tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2022-07-27T02:38:46Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-07-27T02:38:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=39c3c396f8131f3db454c80e0fcfcdc54ed9ec01'/>
<id>urn:sha1:39c3c396f8131f3db454c80e0fcfcdc54ed9ec01</id>
<content type='text'>
Pull misc fixes from Andrew Morton:
 "Thirteen hotfixes.

  Eight are cc:stable and the remainder are for post-5.18 issues or are
  too minor to warrant backporting"

* tag 'mm-hotfixes-stable-2022-07-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
  mailmap: update Gao Xiang's email addresses
  userfaultfd: provide properly masked address for huge-pages
  Revert "ocfs2: mount shared volume without ha stack"
  hugetlb: fix memoryleak in hugetlb_mcopy_atomic_pte
  fs: sendfile handles O_NONBLOCK of out_fd
  ntfs: fix use-after-free in ntfs_ucsncmp()
  secretmem: fix unhandled fault in truncate
  mm/hugetlb: separate path for hwpoison entry in copy_hugetlb_page_range()
  mm: fix missing wake-up event for FSDAX pages
  mm: fix page leak with multiple threads mapping the same page
  mailmap: update Seth Forshee's email address
  tmpfs: fix the issue that the mount and remount results are inconsistent.
  mm: kfence: apply kmemleak_ignore_phys on early allocated pool
</content>
</entry>
<entry>
<title>fs: add a FMODE_BUF_WASYNC flags for f_mode</title>
<updated>2022-07-25T00:39:31Z</updated>
<author>
<name>Stefan Roesch</name>
<email>shr@fb.com</email>
</author>
<published>2022-06-23T17:51:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8017553980d0bbfef3e66c583363828565afd6da'/>
<id>urn:sha1:8017553980d0bbfef3e66c583363828565afd6da</id>
<content type='text'>
This introduces the flag FMODE_BUF_WASYNC. If devices support async
buffered writes, this flag can be set. It also modifies the check in
generic_write_checks to take async buffered writes into consideration.

Signed-off-by: Stefan Roesch &lt;shr@fb.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Christian Brauner (Microsoft) &lt;brauner@kernel.org&gt;
Reviewed-by: Darrick J. Wong &lt;djwong@kernel.org&gt;
Link: https://lore.kernel.org/r/20220623175157.1715274-8-shr@fb.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>fs: sendfile handles O_NONBLOCK of out_fd</title>
<updated>2022-07-18T22:07:52Z</updated>
<author>
<name>Andrei Vagin</name>
<email>avagin@gmail.com</email>
</author>
<published>2022-07-17T04:37:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bdeb77bc2c405fa9f954c20269db175a0bd2793f'/>
<id>urn:sha1:bdeb77bc2c405fa9f954c20269db175a0bd2793f</id>
<content type='text'>
sendfile has to return EAGAIN if out_fd is nonblocking and the write into
it would block.

Here is a small reproducer for the problem:

#define _GNU_SOURCE /* See feature_test_macros(7) */
#include &lt;fcntl.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;errno.h&gt;
#include &lt;sys/stat.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/sendfile.h&gt;


#define FILE_SIZE (1UL &lt;&lt; 30)
int main(int argc, char **argv) {
        int p[2], fd;

        if (pipe2(p, O_NONBLOCK))
                return 1;

        fd = open(argv[1], O_RDWR | O_TMPFILE, 0666);
        if (fd &lt; 0)
                return 1;
        ftruncate(fd, FILE_SIZE);

        if (sendfile(p[1], fd, 0, FILE_SIZE) == -1) {
                fprintf(stderr, "FAIL\n");
        }
        if (sendfile(p[1], fd, 0, FILE_SIZE) != -1 || errno != EAGAIN) {
                fprintf(stderr, "FAIL\n");
        }
        return 0;
}

It worked before b964bf53e540, it is stuck after b964bf53e540, and it
works again with this fix.

This regression occurred because do_splice_direct() calls pipe_write
that handles O_NONBLOCK.  Here is a trace log from the reproducer:

 1)               |  __x64_sys_sendfile64() {
 1)               |    do_sendfile() {
 1)               |      __fdget()
 1)               |      rw_verify_area()
 1)               |      __fdget()
 1)               |      rw_verify_area()
 1)               |      do_splice_direct() {
 1)               |        rw_verify_area()
 1)               |        splice_direct_to_actor() {
 1)               |          do_splice_to() {
 1)               |            rw_verify_area()
 1)               |            generic_file_splice_read()
 1) + 74.153 us   |          }
 1)               |          direct_splice_actor() {
 1)               |            iter_file_splice_write() {
 1)               |              __kmalloc()
 1)   0.148 us    |              pipe_lock();
 1)   0.153 us    |              splice_from_pipe_next.part.0();
 1)   0.162 us    |              page_cache_pipe_buf_confirm();
... 16 times
 1)   0.159 us    |              page_cache_pipe_buf_confirm();
 1)               |              vfs_iter_write() {
 1)               |                do_iter_write() {
 1)               |                  rw_verify_area()
 1)               |                  do_iter_readv_writev() {
 1)               |                    pipe_write() {
 1)               |                      mutex_lock()
 1)   0.153 us    |                      mutex_unlock();
 1)   1.368 us    |                    }
 1)   1.686 us    |                  }
 1)   5.798 us    |                }
 1)   6.084 us    |              }
 1)   0.174 us    |              kfree();
 1)   0.152 us    |              pipe_unlock();
 1) + 14.461 us   |            }
 1) + 14.783 us   |          }
 1)   0.164 us    |          page_cache_pipe_buf_release();
... 16 times
 1)   0.161 us    |          page_cache_pipe_buf_release();
 1)               |          touch_atime()
 1) + 95.854 us   |        }
 1) + 99.784 us   |      }
 1) ! 107.393 us  |    }
 1) ! 107.699 us  |  }

Link: https://lkml.kernel.org/r/20220415005015.525191-1-avagin@gmail.com
Fixes: b964bf53e540 ("teach sendfile(2) to handle send-to-pipe directly")
Signed-off-by: Andrei Vagin &lt;avagin@gmail.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs: remove no_llseek</title>
<updated>2022-07-16T13:19:47Z</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-06-29T13:07:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=868941b14441282ba08761b770fc6cad69d5bdb7'/>
<id>urn:sha1:868941b14441282ba08761b770fc6cad69d5bdb7</id>
<content type='text'>
Now that all callers of -&gt;llseek are going through vfs_llseek(), we
don't gain anything by keeping no_llseek around. Nothing actually calls
it and setting -&gt;llseek to no_lseek is completely equivalent to
leaving it NULL.

Longer term (== by the end of merge window) we want to remove all such
intializations.  To simplify the merge window this commit does *not*
touch initializers - it only defines no_llseek as NULL (and simplifies
the tests on file opening).

At -rc1 we'll need do a mechanical removal of no_llseek -

git grep -l -w no_llseek | grep -v porting.rst | while read i; do
	sed -i '/\&lt;no_llseek\&gt;/d' $i
done
would do it.

Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>fs: do not compare against -&gt;llseek</title>
<updated>2022-07-16T13:19:15Z</updated>
<author>
<name>Jason A. Donenfeld</name>
<email>Jason@zx2c4.com</email>
</author>
<published>2022-06-29T13:06:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4e3299eaddffd9d7d5b8bae28ad700bb775f02d0'/>
<id>urn:sha1:4e3299eaddffd9d7d5b8bae28ad700bb775f02d0</id>
<content type='text'>
Now vfs_llseek() can simply check for FMODE_LSEEK; if it's set,
we know that -&gt;llseek() won't be NULL and if it's not we should
just fail with -ESPIPE.

A couple of other places where we used to check for special
values of -&gt;llseek() (somewhat inconsistently) switched to
checking FMODE_LSEEK.

Signed-off-by: Jason A. Donenfeld &lt;Jason@zx2c4.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>vfs: fix copy_file_range() regression in cross-fs copies</title>
<updated>2022-06-30T22:16:38Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2022-06-30T19:58:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=868f9f2f8e004bfe0d3935b1976f625b2924893b'/>
<id>urn:sha1:868f9f2f8e004bfe0d3935b1976f625b2924893b</id>
<content type='text'>
A regression has been reported by Nicolas Boichat, found while using the
copy_file_range syscall to copy a tracefs file.

Before commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across
devices") the kernel would return -EXDEV to userspace when trying to
copy a file across different filesystems.  After this commit, the
syscall doesn't fail anymore and instead returns zero (zero bytes
copied), as this file's content is generated on-the-fly and thus reports
a size of zero.

Another regression has been reported by He Zhe - the assertion of
WARN_ON_ONCE(ret == -EOPNOTSUPP) can be triggered from userspace when
copying from a sysfs file whose read operation may return -EOPNOTSUPP.

Since we do not have test coverage for copy_file_range() between any two
types of filesystems, the best way to avoid these sort of issues in the
future is for the kernel to be more picky about filesystems that are
allowed to do copy_file_range().

This patch restores some cross-filesystem copy restrictions that existed
prior to commit 5dae222a5ff0 ("vfs: allow copy_file_range to copy across
devices"), namely, cross-sb copy is not allowed for filesystems that do
not implement -&gt;copy_file_range().

Filesystems that do implement -&gt;copy_file_range() have full control of
the result - if this method returns an error, the error is returned to
the user.  Before this change this was only true for fs that did not
implement the -&gt;remap_file_range() operation (i.e.  nfsv3).

Filesystems that do not implement -&gt;copy_file_range() still fall-back to
the generic_copy_file_range() implementation when the copy is within the
same sb.  This helps the kernel can maintain a more consistent story
about which filesystems support copy_file_range().

nfsd and ksmbd servers are modified to fall-back to the
generic_copy_file_range() implementation in case vfs_copy_file_range()
fails with -EOPNOTSUPP or -EXDEV, which preserves behavior of
server-side-copy.

fall-back to generic_copy_file_range() is not implemented for the smb
operation FSCTL_DUPLICATE_EXTENTS_TO_FILE, which is arguably a correct
change of behavior.

Fixes: 5dae222a5ff0 ("vfs: allow copy_file_range to copy across devices")
Link: https://lore.kernel.org/linux-fsdevel/20210212044405.4120619-1-drinkcat@chromium.org/
Link: https://lore.kernel.org/linux-fsdevel/CANMq1KDZuxir2LM5jOTm0xx+BnvW=ZmpsG47CyHFJwnw7zSX6Q@mail.gmail.com/
Link: https://lore.kernel.org/linux-fsdevel/20210126135012.1.If45b7cdc3ff707bc1efa17f5366057d60603c45f@changeid/
Link: https://lore.kernel.org/linux-fsdevel/20210630161320.29006-1-lhenriques@suse.de/
Reported-by: Nicolas Boichat &lt;drinkcat@chromium.org&gt;
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Signed-off-by: Luis Henriques &lt;lhenriques@suse.de&gt;
Fixes: 64bf5ff58dff ("vfs: no fallback for -&gt;copy_file_range")
Link: https://lore.kernel.org/linux-fsdevel/20f17f64-88cb-4e80-07c1-85cb96c83619@windriver.com/
Reported-by: He Zhe &lt;zhe.he@windriver.com&gt;
Tested-by: Namjae Jeon &lt;linkinjeon@kernel.org&gt;
Tested-by: Luis Henriques &lt;lhenriques@suse.de&gt;
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
