<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/filemap.c, branch v5.6</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.6</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.6'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-01-31T18:30:37Z</updated>
<entry>
<title>mm/filemap.c: clean up filemap_write_and_wait()</title>
<updated>2020-01-31T18:30:37Z</updated>
<author>
<name>Ira Weiny</name>
<email>ira.weiny@intel.com</email>
</author>
<published>2020-01-31T06:12:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ddf8f376d137ba41ca67347a0b80ba0c357a1018'/>
<id>urn:sha1:ddf8f376d137ba41ca67347a0b80ba0c357a1018</id>
<content type='text'>
At some point filemap_write_and_wait() and
filemap_write_and_wait_range() got the exact same implementation with
the exception of the range being specified in *_range()

Similar to other functions in fs.h which call *_range(..., 0,
LLONG_MAX), change filemap_write_and_wait() to be a static inline which
calls filemap_write_and_wait_range()

Link: http://lkml.kernel.org/r/20191129160713.30892-1-ira.weiny@intel.com
Signed-off-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Reviewed-by: Nikolay Borisov &lt;nborisov@suse.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: drop mmap_sem before calling balance_dirty_pages() in write fault</title>
<updated>2019-12-01T14:29:18Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2019-12-01T01:50:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=89b15332af7c0312a41e50846819ca6613b58b4c'/>
<id>urn:sha1:89b15332af7c0312a41e50846819ca6613b58b4c</id>
<content type='text'>
One of our services is observing hanging ps/top/etc under heavy write
IO, and the task states show this is an mmap_sem priority inversion:

A write fault is holding the mmap_sem in read-mode and waiting for
(heavily cgroup-limited) IO in balance_dirty_pages():

    balance_dirty_pages+0x724/0x905
    balance_dirty_pages_ratelimited+0x254/0x390
    fault_dirty_shared_page.isra.96+0x4a/0x90
    do_wp_page+0x33e/0x400
    __handle_mm_fault+0x6f0/0xfa0
    handle_mm_fault+0xe4/0x200
    __do_page_fault+0x22b/0x4a0
    page_fault+0x45/0x50

Somebody tries to change the address space, contending for the mmap_sem in
write-mode:

    call_rwsem_down_write_failed_killable+0x13/0x20
    do_mprotect_pkey+0xa8/0x330
    SyS_mprotect+0xf/0x20
    do_syscall_64+0x5b/0x100
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The waiting writer locks out all subsequent readers to avoid lock
starvation, and several threads can be seen hanging like this:

    call_rwsem_down_read_failed+0x14/0x30
    proc_pid_cmdline_read+0xa0/0x480
    __vfs_read+0x23/0x140
    vfs_read+0x87/0x130
    SyS_read+0x42/0x90
    do_syscall_64+0x5b/0x100
    entry_SYSCALL_64_after_hwframe+0x3d/0xa2

To fix this, do what we do for cache read faults already: drop the
mmap_sem before calling into anything IO bound, in this case the
balance_dirty_pages() function, and return VM_FAULT_RETRY.

Link: http://lkml.kernel.org/r/20190924194238.GA29030@cmpxchg.org
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Hillf Danton &lt;hdanton@sina.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/filemap.c: warn if stale pagecache is left after direct write</title>
<updated>2019-12-01T14:29:18Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2019-12-01T01:49:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9266a14033a81b3096feccd10542c20b3f47fe8e'/>
<id>urn:sha1:9266a14033a81b3096feccd10542c20b3f47fe8e</id>
<content type='text'>
generic_file_direct_write() tries to invalidate pagecache after O_DIRECT
write.  Unlike to similar code in dio_complete() this silently ignores
error returned from invalidate_inode_pages2_range().

According to comment this code here because not all filesystems call
dio_complete() to do proper invalidation after O_DIRECT write.  Noticeable
example is a blkdev_direct_IO().

This patch calls dio_warn_stale_pagecache() if invalidation fails.

Link: http://lkml.kernel.org/r/157270038294.4812.2238891109785106069.stgit@buzz
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/direct-io.c: keep dio_warn_stale_pagecache() when CONFIG_BLOCK=n</title>
<updated>2019-12-01T14:29:18Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2019-12-01T01:49:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a92853b6746fe5ffef20a7c30addf6320561e669'/>
<id>urn:sha1:a92853b6746fe5ffef20a7c30addf6320561e669</id>
<content type='text'>
This helper prints warning if direct I/O write failed to invalidate cache,
and set EIO at inode to warn usersapce about possible data corruption.

See also commit 5a9d929d6e13 ("iomap: report collisions between directio
and buffered writes to userspace").

Direct I/O is supported by non-disk filesystems, for example NFS.  Thus
generic code needs this even in kernel without CONFIG_BLOCK.

Link: http://lkml.kernel.org/r/157270038074.4812.7980855544557488880.stgit@buzz
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/filemap.c: remove redundant cache invalidation after async direct-io write</title>
<updated>2019-12-01T14:29:18Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2019-12-01T01:49:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=80c1fe902691d3ef4786f9e62e47a0aa0deb8b54'/>
<id>urn:sha1:80c1fe902691d3ef4786f9e62e47a0aa0deb8b54</id>
<content type='text'>
generic_file_direct_write() invalidates cache at entry.  Second time this
should be done when request completes.  But this function calls second
invalidation at exit unconditionally even for async requests.

This patch skips second invalidation for async requests (-EIOCBQUEUED).

Link: http://lkml.kernel.org/r/157270037850.4812.15036239021726025572.stgit@buzz
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/filemap.c: include &lt;linux/ramfs.h&gt; for generic_file_vm_ops definition</title>
<updated>2019-10-19T10:32:32Z</updated>
<author>
<name>Ben Dooks</name>
<email>ben.dooks@codethink.co.uk</email>
</author>
<published>2019-10-19T03:20:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d0e6a5821cdf08eea91d8597dce1ad695ebeb8bc'/>
<id>urn:sha1:d0e6a5821cdf08eea91d8597dce1ad695ebeb8bc</id>
<content type='text'>
The generic_file_vm_ops is defined in &lt;linux/ramfs.h&gt; so include it to
fix the following warning:

  mm/filemap.c:2717:35: warning: symbol 'generic_file_vm_ops' was not declared. Should it be static?

Link: http://lkml.kernel.org/r/20191008102311.25432-1-ben.dooks@codethink.co.uk
Signed-off-by: Ben Dooks &lt;ben.dooks@codethink.co.uk&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm,thp: avoid writes to file with THP in pagecache</title>
<updated>2019-09-24T22:54:11Z</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2019-09-23T22:38:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=09d91cda0e8207c1f14ee0d572f61a53dbcdaf85'/>
<id>urn:sha1:09d91cda0e8207c1f14ee0d572f61a53dbcdaf85</id>
<content type='text'>
In previous patch, an application could put part of its text section in
THP via madvise().  These THPs will be protected from writes when the
application is still running (TXTBSY).  However, after the application
exits, the file is available for writes.

This patch avoids writes to file THP by dropping page cache for the file
when the file is open for write.  A new counter nr_thps is added to struct
address_space.  In do_dentry_open(), if the file is open for write and
nr_thps is non-zero, we drop page cache for the whole file.

Link: http://lkml.kernel.org/r/20190801184244.3169074-8-songliubraving@fb.com
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Reported-by: kbuild test robot &lt;lkp@intel.com&gt;
Acked-by: Rik van Riel &lt;riel@surriel.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Hillf Danton &lt;hdanton@sina.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm,thp: add read-only THP support for (non-shmem) FS</title>
<updated>2019-09-24T22:54:11Z</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2019-09-23T22:38:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=99cb0dbd47a15d395bf3faa78dc122bc5efe3fc0'/>
<id>urn:sha1:99cb0dbd47a15d395bf3faa78dc122bc5efe3fc0</id>
<content type='text'>
This patch is (hopefully) the first step to enable THP for non-shmem
filesystems.

This patch enables an application to put part of its text sections to THP
via madvise, for example:

    madvise((void *)0x600000, 0x200000, MADV_HUGEPAGE);

We tried to reuse the logic for THP on tmpfs.

Currently, write is not supported for non-shmem THP.  khugepaged will only
process vma with VM_DENYWRITE.  sys_mmap() ignores VM_DENYWRITE requests
(see ksys_mmap_pgoff).  The only way to create vma with VM_DENYWRITE is
execve().  This requirement limits non-shmem THP to text sections.

The next patch will handle writes, which would only happen when the all
the vmas with VM_DENYWRITE are unmapped.

An EXPERIMENTAL config, READ_ONLY_THP_FOR_FS, is added to gate this
feature.

[songliubraving@fb.com: fix build without CONFIG_SHMEM]
  Link: http://lkml.kernel.org/r/F53407FB-96CC-42E8-9862-105C92CC2B98@fb.com
[songliubraving@fb.com: fix double unlock in collapse_file()]
  Link: http://lkml.kernel.org/r/B960CBFA-8EFC-4DA4-ABC5-1977FFF2CA57@fb.com
Link: http://lkml.kernel.org/r/20190801184244.3169074-7-songliubraving@fb.com
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Acked-by: Rik van Riel &lt;riel@surriel.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Cc: Hillf Danton &lt;hdanton@sina.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>filemap: update offset check in filemap_fault()</title>
<updated>2019-09-24T22:54:11Z</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2019-09-23T22:37:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=520e5ba415906373186bcd3c7cffa3535bfdbdde'/>
<id>urn:sha1:520e5ba415906373186bcd3c7cffa3535bfdbdde</id>
<content type='text'>
With THP, current check of offset:

    VM_BUG_ON_PAGE(page-&gt;index != offset, page);

is no longer accurate. Update it to:

    VM_BUG_ON_PAGE(page_to_pgoff(page) != offset, page);

Link: http://lkml.kernel.org/r/20190801184244.3169074-4-songliubraving@fb.com
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Acked-by: Rik van Riel &lt;riel@surriel.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Hillf Danton &lt;hdanton@sina.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>filemap: check compound_head(page)-&gt;mapping in pagecache_get_page()</title>
<updated>2019-09-24T22:54:11Z</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2019-09-23T22:37:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=31895438e702f48e25b7aa6d88f9c97c795c79c7'/>
<id>urn:sha1:31895438e702f48e25b7aa6d88f9c97c795c79c7</id>
<content type='text'>
Similar to previous patch, pagecache_get_page() avoids race condition with
truncate by checking page-&gt;mapping == mapping.  This does not work for
compound pages.  This patch let it check compound_head(page)-&gt;mapping
instead.

Link: http://lkml.kernel.org/r/20190801184244.3169074-3-songliubraving@fb.com
Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Suggested-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Hillf Danton &lt;hdanton@sina.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Rik van Riel &lt;riel@surriel.com&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
