<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs, branch v4.20</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.20</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.20'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2018-12-23T18:40:41Z</updated>
<entry>
<title>Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2018-12-23T18:40:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-23T18:40:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3c730b1041aefa2a92b96fcba9db237d28585922'/>
<id>urn:sha1:3c730b1041aefa2a92b96fcba9db237d28585922</id>
<content type='text'>
Pull vfs fixes from Al Viro:
 "A couple of fixes - no common topic ;-)"

[ The aio spectre patch also came in from Jens, so now we have that
  doubly fixed .. ]

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
  aio: fix spectre gadget in lookup_ioctx
</content>
</entry>
<entry>
<title>Revert "vfs: Allow userns root to call mknod on owned filesystems."</title>
<updated>2018-12-22T22:18:34Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian@brauner.io</email>
</author>
<published>2018-07-05T15:51:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=94f82008ce30e2624537d240d64ce718255e0b80'/>
<id>urn:sha1:94f82008ce30e2624537d240d64ce718255e0b80</id>
<content type='text'>
This reverts commit 55956b59df336f6738da916dbb520b6e37df9fbd.

commit 55956b59df33 ("vfs: Allow userns root to call mknod on owned filesystems.")
enabled mknod() in user namespaces for userns root if CAP_MKNOD is
available. However, these device nodes are useless since any filesystem
mounted from a non-initial user namespace will set the SB_I_NODEV flag on
the filesystem. Now, when a device node s created in a non-initial user
namespace a call to open() on said device node will fail due to:

bool may_open_dev(const struct path *path)
{
        return !(path-&gt;mnt-&gt;mnt_flags &amp; MNT_NODEV) &amp;&amp;
                !(path-&gt;mnt-&gt;mnt_sb-&gt;s_iflags &amp; SB_I_NODEV);
}

The problem with this is that as of the aforementioned commit mknod()
creates partially functional device nodes in non-initial user namespaces.
In particular, it has the consequence that as of the aforementioned commit
open() will be more privileged with respect to device nodes than mknod().
Before it was the other way around. Specifically, if mknod() succeeded
then it was transparent for any userspace application that a fatal error
must have occured when open() failed.

All of this breaks multiple userspace workloads and a widespread assumption
about how to handle mknod(). Basically, all container runtimes and systemd
live by the slogan "ask for forgiveness not permission" when running user
namespace workloads. For mknod() the assumption is that if the syscall
succeeds the device nodes are useable irrespective of whether it succeeds
in a non-initial user namespace or not. This logic was chosen explicitly
to allow for the glorious day when mknod() will actually be able to create
fully functional device nodes in user namespaces.
A specific problem people are already running into when running 4.18 rc
kernels are failing systemd services. For any distro that is run in a
container systemd services started with the PrivateDevices= property set
will fail to start since the device nodes in question cannot be
opened (cf. the arguments in [1]).

Full disclosure, Seth made the very sound argument that it is already
possible to end up with partially functional device nodes. Any filesystem
mounted with MS_NODEV set will allow mknod() to succeed but will not allow
open() to succeed. The difference to the case here is that the MS_NODEV
case is transparent to userspace since it is an explicitly set mount option
while the SB_I_NODEV case is an implicit property enforced by the kernel
and hence opaque to userspace.

[1]: https://github.com/systemd/systemd/pull/9483

Signed-off-by: Christian Brauner &lt;christian@brauner.io&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Seth Forshee &lt;seth.forshee@canonical.com&gt;
Cc: Serge Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6</title>
<updated>2018-12-21T16:56:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-21T16:56:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=783619556a560919ed2113261042ce53a186d8cc'/>
<id>urn:sha1:783619556a560919ed2113261042ce53a186d8cc</id>
<content type='text'>
Pull smb3 fix from Steve French:
 "An important smb3 fix for an regression to some servers introduced by
  compounding optimization to rmdir.

  This fix has been tested by multiple developers (including me) with
  the usual private xfstesting, but also by the new cifs/smb3 "buildbot"
  xfstest VMs (thank you Ronnie and Aurelien for good work on this
  automation). The automated testing has been updated so that it will
  catch problems like this in the future.

  Note that Pavel discovered (very recently) some unrelated but
  extremely important bugs in credit handling (smb3 flow control problem
  that can lead to disconnects/reconnects) when compounding, that I
  would have liked to send in ASAP but the complete testing of those two
  fixes may not be done in time and have to wait for 4.21"

* tag '4.20-rc7-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  smb3: Fix rmdir compounding regression to strict servers
</content>
</entry>
<entry>
<title>Merge tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs</title>
<updated>2018-12-20T22:17:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-20T22:17:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f57b620a89ad2eee2e25218b5cb2bee0ad1e2e7d'/>
<id>urn:sha1:f57b620a89ad2eee2e25218b5cb2bee0ad1e2e7d</id>
<content type='text'>
Pull UBI/UBIFS fixes from Richard Weinberger:

 - Kconfig dependency fixes for our new auth feature

 - Fix for selecting the right compressor when creating a fs

 - Bugfix for a bug in UBIFS's O_TMPFILE implementation

 - Refcounting fixes for UBI

* tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs:
  ubifs: Handle re-linking of inodes correctly while recovery
  ubi: Do not drop UBI device reference before using
  ubi: Put MTD device after it is not used
  ubifs: Fix default compression selection in ubifs
  ubifs: Fix memory leak on error condition
  ubifs: auth: Add CONFIG_KEYS dependency
  ubifs: CONFIG_UBIFS_FS_AUTHENTICATION should depend on UBIFS_FS
  ubifs: replay: Fix high stack usage
</content>
</entry>
<entry>
<title>iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()"</title>
<updated>2018-12-20T15:22:51Z</updated>
<author>
<name>Dave Chinner</name>
<email>dchinner@redhat.com</email>
</author>
<published>2018-12-20T12:23:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a837eca2412051628c0529768c9bc4f3580b040e'/>
<id>urn:sha1:a837eca2412051628c0529768c9bc4f3580b040e</id>
<content type='text'>
This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3.

The reverted commit added page reference counting to iomap page
structures that are used to track block size &lt; page size state. This
was supposed to align the code with page migration page accounting
assumptions, but what it has done instead is break XFS filesystems.
Every fstests run I've done on sub-page block size XFS filesystems
has since picking up this commit 2 days ago has failed with bad page
state errors such as:

# ./run_check.sh "-m rmapbt=1,reflink=1 -i sparse=1 -b size=1k" "generic/038"
....
SECTION       -- xfs
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test1 4.20.0-rc6-dgc+
MKFS_OPTIONS  -- -f -m rmapbt=1,reflink=1 -i sparse=1 -b size=1k /dev/sdc
MOUNT_OPTIONS -- /dev/sdc /mnt/scratch

generic/038 454s ...
 run fstests generic/038 at 2018-12-20 18:43:05
 XFS (sdc): Unmounting Filesystem
 XFS (sdc): Mounting V5 Filesystem
 XFS (sdc): Ending clean mount
 BUG: Bad page state in process kswapd0  pfn:3a7fa
 page:ffffea0000ccbeb0 count:0 mapcount:0 mapping:ffff88800d9b6360 index:0x1
 flags: 0xfffffc0000000()
 raw: 000fffffc0000000 dead000000000100 dead000000000200 ffff88800d9b6360
 raw: 0000000000000001 0000000000000000 00000000ffffffff
 page dumped because: non-NULL mapping
 CPU: 0 PID: 676 Comm: kswapd0 Not tainted 4.20.0-rc6-dgc+ #915
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
 Call Trace:
  dump_stack+0x67/0x90
  bad_page.cold.116+0x8a/0xbd
  free_pcppages_bulk+0x4bf/0x6a0
  free_unref_page_list+0x10f/0x1f0
  shrink_page_list+0x49d/0xf50
  shrink_inactive_list+0x19d/0x3b0
  shrink_node_memcg.constprop.77+0x398/0x690
  ? shrink_slab.constprop.81+0x278/0x3f0
  shrink_node+0x7a/0x2f0
  kswapd+0x34b/0x6d0
  ? node_reclaim+0x240/0x240
  kthread+0x11f/0x140
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x24/0x30
 Disabling lock debugging due to kernel taint
....

The failures are from anyway that frees pages and empties the
per-cpu page magazines, so it's not a predictable failure or an easy
to debug failure.

generic/038 is a reliable reproducer of this problem - it has a 9 in
10 failure rate on one of my test machines. Failure on other
machines have been at random points in fstests runs but every run
has ended up tripping this problem. Hence generic/038 was used to
bisect the failure because it was the most reliable failure.

It is too close to the 4.20 release (not to mention holidays) to
try to diagnose, fix and test the underlying cause of the problem,
so reverting the commit is the only option we have right now. The
revert has been tested against a current tot 4.20-rc7+ kernel across
multiple machines running sub-page block size XFs filesystems and
none of the bad page state failures have been seen.

Signed-off-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Cc: Piotr Jaroszynski &lt;pjaroszynski@nvidia.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Cc: Brian Foster &lt;bfoster@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>smb3: Fix rmdir compounding regression to strict servers</title>
<updated>2018-12-19T13:55:32Z</updated>
<author>
<name>Ronnie Sahlberg</name>
<email>lsahlber@redhat.com</email>
</author>
<published>2018-12-18T23:49:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=271b9c0c80076bb1dd868dc384ef3aac87ec7dec'/>
<id>urn:sha1:271b9c0c80076bb1dd868dc384ef3aac87ec7dec</id>
<content type='text'>
Some servers require that the setinfo matches the exact size,
and in this case compounding changes introduced by
commit c2e0fe3f5aae ("cifs: make rmdir() use compounding")
caused us to send 8 bytes (padded length) instead of 1 byte
(the size of the structure).  See MS-FSCC section 2.4.11.

Fixing this when we send a SET_INFO command for delete file
disposition, then ends up as an iov of a single byte but this
causes problems with SMB3 and encryption.

To avoid this, instead of creating a one byte iov for the disposition value
and then appending an additional iov with a 7 byte padding we now handle
this as a single 8 byte iov containing both the disposition byte as well as
the padding in one single buffer.

Signed-off-by: Ronnie Sahlberg &lt;lsahlber@redhat.com&gt;
Signed-off-by: Steve French &lt;stfrench@microsoft.com&gt;
Acked-by: Paulo Alcantara &lt;palcantara@suse.de&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2018-12-14T23:35:30Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-14T23:35:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6531e115b7ab84f563fcd7f0d2d05ccf971aaaf9'/>
<id>urn:sha1:6531e115b7ab84f563fcd7f0d2d05ccf971aaaf9</id>
<content type='text'>
Merge misc fixes from Andrew Morton:
 "11 fixes"

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;:
  scripts/spdxcheck.py: always open files in binary mode
  checkstack.pl: fix for aarch64
  userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered
  fs/iomap.c: get/put the page in iomap_page_create/release()
  hugetlbfs: call VM_BUG_ON_PAGE earlier in free_huge_page()
  memblock: annotate memblock_is_reserved() with __init_memblock
  psi: fix reference to kernel commandline enable
  arch/sh/include/asm/io.h: provide prototypes for PCI I/O mapping in asm/io.h
  mm/sparse: add common helper to mark all memblocks present
  mm: introduce common STRUCT_PAGE_MAX_SHIFT define
  alpha: fix hang caused by the bootmem removal
</content>
</entry>
<entry>
<title>userfaultfd: check VM_MAYWRITE was set after verifying the uffd is registered</title>
<updated>2018-12-14T23:05:45Z</updated>
<author>
<name>Andrea Arcangeli</name>
<email>aarcange@redhat.com</email>
</author>
<published>2018-12-14T22:17:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=01e881f5a1fca4677e82733061868c6d6ea05ca7'/>
<id>urn:sha1:01e881f5a1fca4677e82733061868c6d6ea05ca7</id>
<content type='text'>
Calling UFFDIO_UNREGISTER on virtual ranges not yet registered in uffd
could trigger an harmless false positive WARN_ON.  Check the vma is
already registered before checking VM_MAYWRITE to shut off the false
positive warning.

Link: http://lkml.kernel.org/r/20181206212028.18726-2-aarcange@redhat.com
Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 29ec90660d68 ("userfaultfd: shmem/hugetlbfs: only allow to register VM_MAYWRITE vmas")
Signed-off-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Reported-by: syzbot+06c7092e7d71218a2c16@syzkaller.appspotmail.com
Acked-by: Mike Rapoport &lt;rppt@linux.ibm.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Acked-by: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/iomap.c: get/put the page in iomap_page_create/release()</title>
<updated>2018-12-14T23:05:45Z</updated>
<author>
<name>Piotr Jaroszynski</name>
<email>pjaroszynski@nvidia.com</email>
</author>
<published>2018-12-14T22:17:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=61c6de667263184125d5ca75e894fcad632b0dd3'/>
<id>urn:sha1:61c6de667263184125d5ca75e894fcad632b0dd3</id>
<content type='text'>
migrate_page_move_mapping() expects pages with private data set to have
a page_count elevated by 1.  This is what used to happen for xfs through
the buffer_heads code before the switch to iomap in commit 82cb14175e7d
("xfs: add support for sub-pagesize writeback without buffer_heads").
Not having the count elevated causes move_pages() to fail on memory
mapped files coming from xfs.

Make iomap compatible with the migrate_page_move_mapping() assumption by
elevating the page count as part of iomap_page_create() and lowering it
in iomap_page_release().

It causes the move_pages() syscall to misbehave on memory mapped files
from xfs.  It does not not move any pages, which I suppose is "just" a
perf issue, but it also ends up returning a positive number which is out
of spec for the syscall.  Talking to Michal Hocko, it sounds like
returning positive numbers might be a necessary update to move_pages()
anyway though
(https://lkml.kernel.org/r/20181116114955.GJ14706@dhcp22.suse.cz).

I only hit this in tests that verify that move_pages() actually moved
the pages.  The test also got confused by the positive return from
move_pages() (it got treated as a success as positive numbers were not
expected and not handled) making it a bit harder to track down what's
going on.

Link: http://lkml.kernel.org/r/20181115184140.1388751-1-pjaroszynski@nvidia.com
Fixes: 82cb14175e7d ("xfs: add support for sub-pagesize writeback without buffer_heads")
Signed-off-by: Piotr Jaroszynski &lt;pjaroszynski@nvidia.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: William Kucharski &lt;william.kucharski@oracle.com&gt;
Cc: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Cc: Brian Foster &lt;bfoster@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-20181214' of git://git.kernel.dk/linux-block</title>
<updated>2018-12-14T20:18:30Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-12-14T20:18:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=380ef2c9ad4fdd5fdd81055857be21ae5f581877'/>
<id>urn:sha1:380ef2c9ad4fdd5fdd81055857be21ae5f581877</id>
<content type='text'>
Pull block fixes from Jens Axboe:
 "Three small fixes for this week. contains:

   - spectre indexing fix for aio (Jeff)

   - fix for the previous zeroing bio fix, we don't need it for user
     mapped pages, and in fact it breaks some applications if we do
     (Keith)

   - allocation failure fix for null_blk with zoned (Shin'ichiro)"

* tag 'for-linus-20181214' of git://git.kernel.dk/linux-block:
  block: Fix null_blk_zoned creation failure with small number of zones
  aio: fix spectre gadget in lookup_ioctx
  block/bio: Do not zero user pages
</content>
</entry>
</feed>
