<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/cachefiles, branch v5.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2018-11-30T16:00:58Z</updated>
<entry>
<title>fscache, cachefiles: remove redundant variable 'cache'</title>
<updated>2018-11-30T16:00:58Z</updated>
<author>
<name>Colin Ian King</name>
<email>colin.king@canonical.com</email>
</author>
<published>2018-07-17T08:53:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=31ffa563833576bd49a8bf53120568312755e6e2'/>
<id>urn:sha1:31ffa563833576bd49a8bf53120568312755e6e2</id>
<content type='text'>
Variable 'cache' is being assigned but is never used hence it is
redundant and can be removed.

Cleans up clang warning:
warning: variable 'cache' set but not used [-Wunused-but-set-variable]

Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: avoid deprecated get_seconds()</title>
<updated>2018-11-30T16:00:58Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2018-07-13T14:27:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=34e06fe4d05bd120556a95d5ebf1bcc97b0a1ca0'/>
<id>urn:sha1:34e06fe4d05bd120556a95d5ebf1bcc97b0a1ca0</id>
<content type='text'>
get_seconds() returns an unsigned long can overflow on some architectures
and is deprecated because of that. In cachefs, we cast that number to
a a 32-bit integer, which will overflow in year 2106 on all architectures.

As confirmed by David Howells, the overflow probably isn't harmful
in the end, since the timestamps are only used to make the file names
unique, but they don't strictly have to be in monotonically increasing
order since the files only exist in order to be deleted as quickly
as possible.

Moving to ktime_get_real_seconds() avoids the deprecated interface.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: Explicitly cast enumerated type in put_object</title>
<updated>2018-11-30T16:00:58Z</updated>
<author>
<name>Nathan Chancellor</name>
<email>natechancellor@gmail.com</email>
</author>
<published>2018-09-24T17:33:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b7e768b7e3522695ed36dcb48ecdcd344bd30a9b'/>
<id>urn:sha1:b7e768b7e3522695ed36dcb48ecdcd344bd30a9b</id>
<content type='text'>
Clang warns when one enumerated type is implicitly converted to another.

fs/cachefiles/namei.c:247:50: warning: implicit conversion from
enumeration type 'enum cachefiles_obj_ref_trace' to different
enumeration type 'enum fscache_obj_ref_trace' [-Wenum-conversion]
        cache-&gt;cache.ops-&gt;put_object(&amp;xobject-&gt;fscache,
cachefiles_obj_put_wait_retry);

Silence this warning by explicitly casting to fscache_obj_ref_trace,
which is also done in put_object.

Reported-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Nathan Chancellor &lt;natechancellor@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: Fix page leak in cachefiles_read_backing_file while vmscan is active</title>
<updated>2018-11-28T14:47:05Z</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2018-09-24T02:02:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a24ce5b66f9c8190d63b15f4473600db4935f1f'/>
<id>urn:sha1:9a24ce5b66f9c8190d63b15f4473600db4935f1f</id>
<content type='text'>
[Description]

In a heavily loaded system where the system pagecache is nearing memory
limits and fscache is enabled, pages can be leaked by fscache while trying
read pages from cachefiles backend.  This can happen because two
applications can be reading same page from a single mount, two threads can
be trying to read the backing page at same time.  This results in one of
the threads finding that a page for the backing file or netfs file is
already in the radix tree.  During the error handling cachefiles does not
clean up the reference on backing page, leading to page leak.

[Fix]
The fix is straightforward, to decrement the reference when error is
encountered.

  [dhowells: Note that I've removed the clearance and put of newpage as
   they aren't attested in the commit message and don't appear to actually
   achieve anything since a new page is only allocated is newpage!=NULL and
   any residual new page is cleared before returning.]

[Testing]
I have tested the fix using following method for 12+ hrs.

1) mkdir -p /mnt/nfs ; mount -o vers=3,fsc &lt;server_ip&gt;:/export /mnt/nfs
2) create 10000 files of 2.8MB in a NFS mount.
3) start a thread to simulate heavy VM presssure
   (while true ; do echo 3 &gt; /proc/sys/vm/drop_caches ; sleep 1 ; done)&amp;
4) start multiple parallel reader for data set at same time
   find /mnt/nfs -type f | xargs -P 80 cat &gt; /dev/null &amp;
   find /mnt/nfs -type f | xargs -P 80 cat &gt; /dev/null &amp;
   find /mnt/nfs -type f | xargs -P 80 cat &gt; /dev/null &amp;
   ..
   ..
   find /mnt/nfs -type f | xargs -P 80 cat &gt; /dev/null &amp;
   find /mnt/nfs -type f | xargs -P 80 cat &gt; /dev/null &amp;
5) finally check using cat /proc/fs/fscache/stats | grep -i pages ;
   free -h , cat /proc/meminfo and page-types -r -b lru
   to ensure all pages are freed.

Reviewed-by: Daniel Axtens &lt;dja@axtens.net&gt;
Signed-off-by: Shantanu Goel &lt;sgoel01@yahoo.com&gt;
Signed-off-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
[dja: forward ported to current upstream]
Signed-off-by: Daniel Axtens &lt;dja@axtens.net&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: Fix an assertion failure when trying to update a failed object</title>
<updated>2018-11-28T13:19:20Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2018-11-27T16:34:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e6bc06faf64a83384cc0abc537df954c9d3ff942'/>
<id>urn:sha1:e6bc06faf64a83384cc0abc537df954c9d3ff942</id>
<content type='text'>
If cachefiles gets an error other then ENOENT when trying to look up an
object in the cache (in this case, EACCES), the object state machine will
eventually transition to the DROP_OBJECT state.

This state invokes fscache_drop_object() which tries to sync the auxiliary
data with the cache (this is done lazily since commit 402cb8dda949d) on an
incomplete cache object struct.

The problem comes when cachefiles_update_object_xattr() is called to
rewrite the xattr holding the data.  There's an assertion there that the
cache object points to a dentry as we're going to update its xattr.  The
assertion trips, however, as dentry didn't get set.

Fix the problem by skipping the update in cachefiles if the object doesn't
refer to a dentry.  A better way to do it could be to skip the update from
the DROP_OBJECT state handler in fscache, but that might deny the cache the
opportunity to update intermediate state.

If this error occurs, the kernel log includes lines that look like the
following:

 CacheFiles: Lookup failed error -13
 CacheFiles:
 CacheFiles: Assertion failed
 ------------[ cut here ]------------
 kernel BUG at fs/cachefiles/xattr.c:138!
 ...
 Workqueue: fscache_object fscache_object_work_func [fscache]
 RIP: 0010:cachefiles_update_object_xattr.cold.4+0x18/0x1a [cachefiles]
 ...
 Call Trace:
  cachefiles_update_object+0xdd/0x1c0 [cachefiles]
  fscache_update_aux_data+0x23/0x30 [fscache]
  fscache_drop_object+0x18e/0x1c0 [fscache]
  fscache_object_work_func+0x74/0x2b0 [fscache]
  process_one_work+0x18d/0x340
  worker_thread+0x2e/0x390
  ? pwq_unbound_release_workfn+0xd0/0xd0
  kthread+0x112/0x130
  ? kthread_bind+0x30/0x30
  ret_from_fork+0x35/0x40

Note that there are actually two issues here: (1) EACCES happened on a
cache object and (2) an oops occurred.  I think that the second is a
consequence of the first (it certainly looks like it ought to be).  This
patch only deals with the second.

Fixes: 402cb8dda949 ("fscache: Attach the index key and aux data to the cookie")
Reported-by: Zhibin Li &lt;zhibli@redhat.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: fix the race between cachefiles_bury_object() and rmdir(2)</title>
<updated>2018-10-18T09:32:21Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2018-10-17T14:23:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=169b803397499be85bdd1e3d07d6f5e3d4bd669e'/>
<id>urn:sha1:169b803397499be85bdd1e3d07d6f5e3d4bd669e</id>
<content type='text'>
the victim might've been rmdir'ed just before the lock_rename();
unlike the normal callers, we do not look the source up after the
parents are locked - we know it beforehand and just recheck that it's
still the child of what used to be its parent.  Unfortunately,
the check is too weak - we don't spot a dead directory since its
-&gt;d_parent is unchanged, dentry is positive, etc.  So we sail all
the way to -&gt;rename(), with hosting filesystems _not_ expecting
to be asked renaming an rmdir'ed subdirectory.

The fix is easy, fortunately - the lock on parent is sufficient for
making IS_DEADDIR() on child safe.

Cc: stable@vger.kernel.org
Fixes: 9ae326a69004 (CacheFiles: A cache that backs onto a mounted filesystem)
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>cachefiles: Wait rather than BUG'ing on "Unexpected object collision"</title>
<updated>2018-07-25T13:49:00Z</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2018-06-21T20:25:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c2412ac45a8f8f1cd582723c1a139608694d410d'/>
<id>urn:sha1:c2412ac45a8f8f1cd582723c1a139608694d410d</id>
<content type='text'>
If we meet a conflicting object that is marked FSCACHE_OBJECT_IS_LIVE in
the active object tree, we have been emitting a BUG after logging
information about it and the new object.

Instead, we should wait for the CACHEFILES_OBJECT_ACTIVE flag to be cleared
on the old object (or return an error).  The ACTIVE flag should be cleared
after it has been removed from the active object tree.  A timeout of 60s is
used in the wait, so we shouldn't be able to get stuck there.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: Fix missing clear of the CACHEFILES_OBJECT_ACTIVE flag</title>
<updated>2018-07-25T13:49:00Z</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2018-06-21T20:25:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5ce83d4bb7d8e11e8c1c687d09f4b5ae67ef3ce3'/>
<id>urn:sha1:5ce83d4bb7d8e11e8c1c687d09f4b5ae67ef3ce3</id>
<content type='text'>
In cachefiles_mark_object_active(), the new object is marked active and
then we try to add it to the active object tree.  If a conflicting object
is already present, we want to wait for that to go away.  After the wait,
we go round again and try to re-mark the object as being active - but it's
already marked active from the first time we went through and a BUG is
issued.

Fix this by clearing the CACHEFILES_OBJECT_ACTIVE flag before we try again.

Analysis from Kiran Kumar Modukuri:

[Impact]
Oops during heavy NFS + FSCache + Cachefiles

CacheFiles: Error: Overlong wait for old active object to go away.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000002

CacheFiles: Error: Object already active kernel BUG at
fs/cachefiles/namei.c:163!

[Cause]
In a heavily loaded system with big files being read and truncated, an
fscache object for a cookie is being dropped and a new object being
looked. The new object being looked for has to wait for the old object
to go away before the new object is moved to active state.

[Fix]
Clear the flag 'CACHEFILES_OBJECT_ACTIVE' for the new object when
retrying the object lookup.

[Testcase]
Have run ~100 hours of NFS stress tests and have not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Signed-off-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>fscache: Fix reference overput in fscache_attach_object() error handling</title>
<updated>2018-07-25T13:49:00Z</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2018-06-21T20:31:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f29507ce66701084c39aeb1b0ae71690cbff3554'/>
<id>urn:sha1:f29507ce66701084c39aeb1b0ae71690cbff3554</id>
<content type='text'>
When a cookie is allocated that causes fscache_object structs to be
allocated, those objects are initialised with the cookie pointer, but
aren't blessed with a ref on that cookie unless the attachment is
successfully completed in fscache_attach_object().

If attachment fails because the parent object was dying or there was a
collision, fscache_attach_object() returns without incrementing the cookie
counter - but upon failure of this function, the object is released which
then puts the cookie, whether or not a ref was taken on the cookie.

Fix this by taking a ref on the cookie when it is assigned in
fscache_object_init(), even when we're creating a root object.


Analysis from Kiran Kumar:

This bug has been seen in 4.4.0-124-generic #148-Ubuntu kernel

BugLink: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1776277

fscache cookie ref count updated incorrectly during fscache object
allocation resulting in following Oops.

kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/internal.h:321!
kernel BUG at /build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!

[Cause]
Two threads are trying to do operate on a cookie and two objects.

(1) One thread tries to unmount the filesystem and in process goes over a
    huge list of objects marking them dead and deleting the objects.
    cookie-&gt;usage is also decremented in following path:

      nfs_fscache_release_super_cookie
       -&gt; __fscache_relinquish_cookie
        -&gt;__fscache_cookie_put
        -&gt;BUG_ON(atomic_read(&amp;cookie-&gt;usage) &lt;= 0);

(2) A second thread tries to lookup an object for reading data in following
    path:

    fscache_alloc_object
    1) cachefiles_alloc_object
        -&gt; fscache_object_init
           -&gt; assign cookie, but usage not bumped.
    2) fscache_attach_object -&gt; fails in cant_attach_object because the
         cookie's backing object or cookie's-&gt;parent object are going away
    3) fscache_put_object
        -&gt; cachefiles_put_object
          -&gt;fscache_object_destroy
            -&gt;fscache_cookie_put
               -&gt;BUG_ON(atomic_read(&amp;cookie-&gt;usage) &lt;= 0);

[NOTE from dhowells] It's unclear as to the circumstances in which (2) can
take place, given that thread (1) is in nfs_kill_super(), however a
conflicting NFS mount with slightly different parameters that creates a
different superblock would do it.  A backtrace from Kiran seems to show
that this is a possibility:

    kernel BUG at/build/linux-Y09MKI/linux-4.4.0/fs/fscache/cookie.c:639!
    ...
    RIP: __fscache_cookie_put+0x3a/0x40 [fscache]
    Call Trace:
     __fscache_relinquish_cookie+0x87/0x120 [fscache]
     nfs_fscache_release_super_cookie+0x2d/0xb0 [nfs]
     nfs_kill_super+0x29/0x40 [nfs]
     deactivate_locked_super+0x48/0x80
     deactivate_super+0x5c/0x60
     cleanup_mnt+0x3f/0x90
     __cleanup_mnt+0x12/0x20
     task_work_run+0x86/0xb0
     exit_to_usermode_loop+0xc2/0xd0
     syscall_return_slowpath+0x4e/0x60
     int_ret_from_sys_call+0x25/0x9f

[Fix] Bump up the cookie usage in fscache_object_init, when it is first
being assigned a cookie atomically such that the cookie is added and bumped
up if its refcount is not zero.  Remove the assignment in
fscache_attach_object().

[Testcase]
I have run ~100 hours of NFS stress tests and not seen this bug recur.

[Regression Potential]
 - Limited to fscache/cachefiles.

Fixes: ccc4fc3d11e9 ("FS-Cache: Implement the cookie management part of the netfs API")
Signed-off-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>cachefiles: Fix refcounting bug in backing-file read monitoring</title>
<updated>2018-07-25T13:49:00Z</updated>
<author>
<name>Kiran Kumar Modukuri</name>
<email>kiran.modukuri@gmail.com</email>
</author>
<published>2017-07-18T23:25:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=934140ab028713a61de8bca58c05332416d037d1'/>
<id>urn:sha1:934140ab028713a61de8bca58c05332416d037d1</id>
<content type='text'>
cachefiles_read_waiter() has the right to access a 'monitor' object by
virtue of being called under the waitqueue lock for one of the pages in its
purview.  However, it has no ref on that monitor object or on the
associated operation.

What it is allowed to do is to move the monitor object to the operation's
to_do list, but once it drops the work_lock, it's actually no longer
permitted to access that object.  However, it is trying to enqueue the
retrieval operation for processing - but it can only do this via a pointer
in the monitor object, something it shouldn't be doing.

If it doesn't enqueue the operation, the operation may not get processed.
If the order is flipped so that the enqueue is first, then it's possible
for the work processor to look at the to_do list before the monitor is
enqueued upon it.

Fix this by getting a ref on the operation so that we can trust that it
will still be there once we've added the monitor to the to_do list and
dropped the work_lock.  The op can then be enqueued after the lock is
dropped.

The bug can manifest in one of a couple of ways.  The first manifestation
looks like:

 FS-Cache:
 FS-Cache: Assertion failed
 FS-Cache: 6 == 5 is false
 ------------[ cut here ]------------
 kernel BUG at fs/fscache/operation.c:494!
 RIP: 0010:fscache_put_operation+0x1e3/0x1f0
 ...
 fscache_op_work_func+0x26/0x50
 process_one_work+0x131/0x290
 worker_thread+0x45/0x360
 kthread+0xf8/0x130
 ? create_worker+0x190/0x190
 ? kthread_cancel_work_sync+0x10/0x10
 ret_from_fork+0x1f/0x30

This is due to the operation being in the DEAD state (6) rather than
INITIALISED, COMPLETE or CANCELLED (5) because it's already passed through
fscache_put_operation().

The bug can also manifest like the following:

 kernel BUG at fs/fscache/operation.c:69!
 ...
    [exception RIP: fscache_enqueue_operation+246]
 ...
 #7 [ffff883fff083c10] fscache_enqueue_operation at ffffffffa0b793c6
 #8 [ffff883fff083c28] cachefiles_read_waiter at ffffffffa0b15a48
 #9 [ffff883fff083c48] __wake_up_common at ffffffff810af028

I'm not entirely certain as to which is line 69 in Lei's kernel, so I'm not
entirely clear which assertion failed.

Fixes: 9ae326a69004 ("CacheFiles: A cache that backs onto a mounted filesystem")
Reported-by: Lei Xue &lt;carmark.dlut@gmail.com&gt;
Reported-by: Vegard Nossum &lt;vegard.nossum@gmail.com&gt;
Reported-by: Anthony DeRobertis &lt;aderobertis@metrics.net&gt;
Reported-by: NeilBrown &lt;neilb@suse.com&gt;
Reported-by: Daniel Axtens &lt;dja@axtens.net&gt;
Reported-by: Kiran Kumar Modukuri &lt;kiran.modukuri@gmail.com&gt;
Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
Reviewed-by: Daniel Axtens &lt;dja@axtens.net&gt;
</content>
</entry>
</feed>
