<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/proc, branch v5.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-07-19T17:42:02Z</updated>
<entry>
<title>Merge branch 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2019-07-19T17:42:02Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-19T17:42:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=933a90bf4f3505f8ec83bda21a3c7d70d7c2b426'/>
<id>urn:sha1:933a90bf4f3505f8ec83bda21a3c7d70d7c2b426</id>
<content type='text'>
Pull vfs mount updates from Al Viro:
 "The first part of mount updates.

  Convert filesystems to use the new mount API"

* 'work.mount0' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (63 commits)
  mnt_init(): call shmem_init() unconditionally
  constify ksys_mount() string arguments
  don't bother with registering rootfs
  init_rootfs(): don't bother with init_ramfs_fs()
  vfs: Convert smackfs to use the new mount API
  vfs: Convert selinuxfs to use the new mount API
  vfs: Convert securityfs to use the new mount API
  vfs: Convert apparmorfs to use the new mount API
  vfs: Convert openpromfs to use the new mount API
  vfs: Convert xenfs to use the new mount API
  vfs: Convert gadgetfs to use the new mount API
  vfs: Convert oprofilefs to use the new mount API
  vfs: Convert ibmasmfs to use the new mount API
  vfs: Convert qib_fs/ipathfs to use the new mount API
  vfs: Convert efivarfs to use the new mount API
  vfs: Convert configfs to use the new mount API
  vfs: Convert binfmt_misc to use the new mount API
  convenience helper: get_tree_single()
  convenience helper get_tree_nodev()
  vfs: Kill sget_userns()
  ...
</content>
</entry>
<entry>
<title>proc/sysctl: add shared variables for range check</title>
<updated>2019-07-19T00:08:07Z</updated>
<author>
<name>Matteo Croce</name>
<email>mcroce@redhat.com</email>
</author>
<published>2019-07-18T22:58:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eec4844fae7c033a0c1fc1eb3b8517aeb8b6cc49'/>
<id>urn:sha1:eec4844fae7c033a0c1fc1eb3b8517aeb8b6cc49</id>
<content type='text'>
In the sysctl code the proc_dointvec_minmax() function is often used to
validate the user supplied value between an allowed range.  This
function uses the extra1 and extra2 members from struct ctl_table as
minimum and maximum allowed value.

On sysctl handler declaration, in every source file there are some
readonly variables containing just an integer which address is assigned
to the extra1 and extra2 members, so the sysctl range is enforced.

The special values 0, 1 and INT_MAX are very often used as range
boundary, leading duplication of variables like zero=0, one=1,
int_max=INT_MAX in different source files:

    $ git grep -E '\.extra[12].*&amp;(zero|one|int_max)' |wc -l
    248

Add a const int array containing the most commonly used values, some
macros to refer more easily to the correct array member, and use them
instead of creating a local one for every object file.

This is the bloat-o-meter output comparing the old and new binary
compiled with the default Fedora config:

    # scripts/bloat-o-meter -d vmlinux.o.old vmlinux.o
    add/remove: 2/2 grow/shrink: 0/2 up/down: 24/-188 (-164)
    Data                                         old     new   delta
    sysctl_vals                                    -      12     +12
    __kstrtab_sysctl_vals                          -      12     +12
    max                                           14      10      -4
    int_max                                       16       -     -16
    one                                           68       -     -68
    zero                                         128      28    -100
    Total: Before=20583249, After=20583085, chg -0.00%

[mcroce@redhat.com: tipc: remove two unused variables]
  Link: http://lkml.kernel.org/r/20190530091952.4108-1-mcroce@redhat.com
[akpm@linux-foundation.org: fix net/ipv6/sysctl_net_ipv6.c]
[arnd@arndb.de: proc/sysctl: make firmware loader table conditional]
  Link: http://lkml.kernel.org/r/20190617130014.1713870-1-arnd@arndb.de
[akpm@linux-foundation.org: fix fs/eventpoll.c]
Link: http://lkml.kernel.org/r/20190430180111.10688-1-mcroce@redhat.com
Signed-off-by: Matteo Croce &lt;mcroce@redhat.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: thp: fix false negative of shmem vma's THP eligibility</title>
<updated>2019-07-19T00:08:06Z</updated>
<author>
<name>Yang Shi</name>
<email>yang.shi@linux.alibaba.com</email>
</author>
<published>2019-07-18T22:57:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c06306696f8368b08774e2a743dbc52d92a61693'/>
<id>urn:sha1:c06306696f8368b08774e2a743dbc52d92a61693</id>
<content type='text'>
Commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each
vma") introduced THPeligible bit for processes' smaps.  But, when
checking the eligibility for shmem vma, __transparent_hugepage_enabled()
is called to override the result from shmem_huge_enabled().  It may
result in the anonymous vma's THP flag override shmem's.  For example,
running a simple test which create THP for shmem, but with anonymous THP
disabled, when reading the process's smaps, it may show:

  7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
  Size:               4096 kB
  ...
  [snip]
  ...
  ShmemPmdMapped:     4096 kB
  ...
  [snip]
  ...
  THPeligible:    0

And, /proc/meminfo does show THP allocated and PMD mapped too:

  ShmemHugePages:     4096 kB
  ShmemPmdMapped:     4096 kB

This doesn't make too much sense.  The shmem objects should be treated
separately from anonymous THP.  Calling shmem_huge_enabled() with
checking MMF_DISABLE_THP sounds good enough.  And, we could skip stack
and dax vma check since we already checked if the vma is shmem already.

Also check if vma is suitable for THP by calling
transhuge_vma_suitable().

And minor fix to smaps output format and documentation.

Link: http://lkml.kernel.org/r/1560401041-32207-3-git-send-email-yang.shi@linux.alibaba.com
Fixes: 7635d9cbe832 ("mm, thp, proc: report THP eligibility for each vma")
Signed-off-by: Yang Shi &lt;yang.shi@linux.alibaba.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2019-07-17T15:58:04Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-17T15:58:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=57a8ec387e1441ea5e1232bc0749fb99a8cba7e7'/>
<id>urn:sha1:57a8ec387e1441ea5e1232bc0749fb99a8cba7e7</id>
<content type='text'>
Merge more updates from Andrew Morton:
 "VM:
   - z3fold fixes and enhancements by Henry Burns and Vitaly Wool

   - more accurate reclaimed slab caches calculations by Yafang Shao

   - fix MAP_UNINITIALIZED UAPI symbol to not depend on config, by
     Christoph Hellwig

   - !CONFIG_MMU fixes by Christoph Hellwig

   - new novmcoredd parameter to omit device dumps from vmcore, by
     Kairui Song

   - new test_meminit module for testing heap and pagealloc
     initialization, by Alexander Potapenko

   - ioremap improvements for huge mappings, by Anshuman Khandual

   - generalize kprobe page fault handling, by Anshuman Khandual

   - device-dax hotplug fixes and improvements, by Pavel Tatashin

   - enable synchronous DAX fault on powerpc, by Aneesh Kumar K.V

   - add pte_devmap() support for arm64, by Robin Murphy

   - unify locked_vm accounting with a helper, by Daniel Jordan

   - several misc fixes

  core/lib:
   - new typeof_member() macro including some users, by Alexey Dobriyan

   - make BIT() and GENMASK() available in asm, by Masahiro Yamada

   - changed LIST_POISON2 on x86_64 to 0xdead000000000122 for better
     code generation, by Alexey Dobriyan

   - rbtree code size optimizations, by Michel Lespinasse

   - convert struct pid count to refcount_t, by Joel Fernandes

  get_maintainer.pl:
   - add --no-moderated switch to skip moderated ML's, by Joe Perches

  misc:
   - ptrace PTRACE_GET_SYSCALL_INFO interface

   - coda updates

   - gdb scripts, various"

[ Using merge message suggestion from Vlastimil Babka, with some editing - Linus ]

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (100 commits)
  fs/select.c: use struct_size() in kmalloc()
  mm: add account_locked_vm utility function
  arm64: mm: implement pte_devmap support
  mm: introduce ARCH_HAS_PTE_DEVMAP
  mm: clean up is_device_*_page() definitions
  mm/mmap: move common defines to mman-common.h
  mm: move MAP_SYNC to asm-generic/mman-common.h
  device-dax: "Hotremove" persistent memory that is used like normal RAM
  mm/hotplug: make remove_memory() interface usable
  device-dax: fix memory and resource leak if hotplug fails
  include/linux/lz4.h: fix spelling and copy-paste errors in documentation
  ipc/mqueue.c: only perform resource calculation if user valid
  include/asm-generic/bug.h: fix "cut here" for WARN_ON for __WARN_TAINT architectures
  scripts/gdb: add helpers to find and list devices
  scripts/gdb: add lx-genpd-summary command
  drivers/pps/pps.c: clear offset flags in PPS_SETPARAMS ioctl
  kernel/pid.c: convert struct pid count to refcount_t
  drivers/rapidio/devices/rio_mport_cdev.c: NUL terminate some strings
  select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()
  select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR
  ...
</content>
</entry>
<entry>
<title>fs/proc/proc_sysctl.c: fix the default values of i_uid/i_gid on /proc/sys inodes.</title>
<updated>2019-07-17T02:23:21Z</updated>
<author>
<name>Radoslaw Burny</name>
<email>rburny@google.com</email>
</author>
<published>2019-07-16T23:26:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5ec27ec735ba0477d48c80561cc5e856f0c5dfaf'/>
<id>urn:sha1:5ec27ec735ba0477d48c80561cc5e856f0c5dfaf</id>
<content type='text'>
Normally, the inode's i_uid/i_gid are translated relative to s_user_ns,
but this is not a correct behavior for proc.  Since sysctl permission
check in test_perm is done against GLOBAL_ROOT_[UG]ID, it makes more
sense to use these values in u_[ug]id of proc inodes.  In other words:
although uid/gid in the inode is not read during test_perm, the inode
logically belongs to the root of the namespace.  I have confirmed this
with Eric Biederman at LPC and in this thread:
  https://lore.kernel.org/lkml/87k1kzjdff.fsf@xmission.com

Consequences
============

Since the i_[ug]id values of proc nodes are not used for permissions
checks, this change usually makes no functional difference.  However, it
causes an issue in a setup where:

 * a namespace container is created without root user in container -
   hence the i_[ug]id of proc nodes are set to INVALID_[UG]ID

 * container creator tries to configure it by writing /proc/sys files,
   e.g. writing /proc/sys/kernel/shmmax to configure shared memory limit

Kernel does not allow to open an inode for writing if its i_[ug]id are
invalid, making it impossible to write shmmax and thus - configure the
container.

Using a container with no root mapping is apparently rare, but we do use
this configuration at Google.  Also, we use a generic tool to configure
the container limits, and the inability to write any of them causes a
failure.

History
=======

The invalid uids/gids in inodes first appeared due to 81754357770e (fs:
Update i_[ug]id_(read|write) to translate relative to s_user_ns).
However, AFAIK, this did not immediately cause any issues.  The
inability to write to these "invalid" inodes was only caused by a later
commit 0bd23d09b874 (vfs: Don't modify inodes with a uid or gid unknown
to the vfs).

Tested: Used a repro program that creates a user namespace without any
mapping and stat'ed /proc/$PID/root/proc/sys/kernel/shmmax from outside.
Before the change, it shows the overflow uid, with the change it's 0.
The overflow uid indicates that the uid in the inode is not correct and
thus it is not possible to open the file for writing.

Link: http://lkml.kernel.org/r/20190708115130.250149-1-rburny@google.com
Fixes: 0bd23d09b874 ("vfs: Don't modify inodes with a uid or gid unknown to the vfs")
Signed-off-by: Radoslaw Burny &lt;rburny@google.com&gt;
Acked-by: Luis Chamberlain &lt;mcgrof@kernel.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Eric W . Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Seth Forshee &lt;seth.forshee@canonical.com&gt;
Cc: John Sperbeck &lt;jsperbeck@google.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[4.8+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/proc/inode.c: use typeof_member() macro</title>
<updated>2019-07-17T02:23:21Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2019-07-16T23:26:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9af27b28b1da1020e427b626c4967d0206b55100'/>
<id>urn:sha1:9af27b28b1da1020e427b626c4967d0206b55100</id>
<content type='text'>
Don't repeat function signatures twice.

This is a kind-of-precursor for "struct proc_ops".

Note:

	typeof(pde-&gt;proc_fops-&gt;...) ...;

can't be used because -&gt;proc_fops is "const struct file_operations *".
"const" prevents assignment down the code and it can't be deleted in the
type system.

Link: http://lkml.kernel.org/r/20190529191110.GB5703@avx2
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>vmcore: add a kernel parameter novmcoredd</title>
<updated>2019-07-17T02:23:21Z</updated>
<author>
<name>Kairui Song</name>
<email>kasong@redhat.com</email>
</author>
<published>2019-07-16T23:26:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c6c405336bd3b0ebd1d76aaf9ea88b35dba77e61'/>
<id>urn:sha1:c6c405336bd3b0ebd1d76aaf9ea88b35dba77e61</id>
<content type='text'>
Since commit 2724273e8fd0 ("vmcore: add API to collect hardware dump in
second kernel"), drivers are allowed to add device related dump data to
vmcore as they want by using the device dump API.  This has a potential
issue, the data is stored in memory, drivers may append too much data
and use too much memory.  The vmcore is typically used in a kdump kernel
which runs in a pre-reserved small chunk of memory.  So as a result it
will make kdump unusable at all due to OOM issues.

So introduce new 'novmcoredd' command line option.  User can disable
device dump to reduce memory usage.  This is helpful if device dump is
using too much memory, disabling device dump could make sure a regular
vmcore without device dump data is still available.

[akpm@linux-foundation.org: tweak documentation]
[akpm@linux-foundation.org: vmcore.c needs moduleparam.h]
Link: http://lkml.kernel.org/r/20190528111856.7276-1-kasong@redhat.com
Signed-off-by: Kairui Song &lt;kasong@redhat.com&gt;
Acked-by: Dave Young &lt;dyoung@redhat.com&gt;
Reviewed-by: Bhupesh Sharma &lt;bhsharma@redhat.com&gt;
Cc: Rahul Lakkireddy &lt;rahul.lakkireddy@chelsio.com&gt;
Cc: "David S . Miller" &lt;davem@davemloft.net&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media</title>
<updated>2019-07-16T19:21:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-16T19:21:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c309b6f24222246c18a8b65d3950e6e755440865'/>
<id>urn:sha1:c309b6f24222246c18a8b65d3950e6e755440865</id>
<content type='text'>
Pull rst conversion of docs from Mauro Carvalho Chehab:
 "As agreed with Jon, I'm sending this big series directly to you, c/c
  him, as this series required a special care, in order to avoid
  conflicts with other trees"

* tag 'docs/v5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (77 commits)
  docs: kbuild: fix build with pdf and fix some minor issues
  docs: block: fix pdf output
  docs: arm: fix a breakage with pdf output
  docs: don't use nested tables
  docs: gpio: add sysfs interface to the admin-guide
  docs: locking: add it to the main index
  docs: add some directories to the main documentation index
  docs: add SPDX tags to new index files
  docs: add a memory-devices subdir to driver-api
  docs: phy: place documentation under driver-api
  docs: serial: move it to the driver-api
  docs: driver-api: add remaining converted dirs to it
  docs: driver-api: add xilinx driver API documentation
  docs: driver-api: add a series of orphaned documents
  docs: admin-guide: add a series of orphaned documents
  docs: cgroup-v1: add it to the admin-guide book
  docs: aoe: add it to the driver-api book
  docs: add some documentation dirs to the driver-api book
  docs: driver-model: move it to the driver-api book
  docs: lp855x-driver.rst: add it to the driver-api book
  ...
</content>
</entry>
<entry>
<title>Merge branch 'proc-cmdline' (/proc/&lt;pid&gt;/cmdline fixes)</title>
<updated>2019-07-16T17:37:27Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-16T17:37:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2954152298c37804dab49d630aa959625b50cf64'/>
<id>urn:sha1:2954152298c37804dab49d630aa959625b50cf64</id>
<content type='text'>
This fixes two problems reported with the cmdline simplification and
cleanup last year:

 - the setproctitle() special cases didn't quite match the original
   semantics, and it can be noticeable:

      https://lore.kernel.org/lkml/alpine.LNX.2.21.1904052326230.3249@kich.toxcorp.com/

 - it could leak an uninitialized byte from the temporary buffer under
   the right (wrong) circustances:

      https://lore.kernel.org/lkml/20190712160913.17727-1-izbyshev@ispras.ru/

It rewrites the logic entirely, splitting it into two separate commits
(and two separate functions) for the two different cases ("unedited
cmdline" vs "setproctitle() has been used to change the command line").

* proc-cmdline:
  /proc/&lt;pid&gt;/cmdline: add back the setproctitle() special case
  /proc/&lt;pid&gt;/cmdline: remove all the special cases
</content>
</entry>
<entry>
<title>/proc/&lt;pid&gt;/cmdline: add back the setproctitle() special case</title>
<updated>2019-07-16T16:57:52Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-07-13T21:27:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d26d0cd97c88eb1a5704b42e41ab443406807810'/>
<id>urn:sha1:d26d0cd97c88eb1a5704b42e41ab443406807810</id>
<content type='text'>
This makes the setproctitle() special case very explicit indeed, and
handles it with a separate helper function entirely.  In the process, it
re-instates the original semantics of simply stopping at the first NUL
character when the original last NUL character is no longer there.

[ The original semantics can still be seen in mm/util.c: get_cmdline()
  that is limited to a fixed-size buffer ]

This makes the logic about when we use the string lengths etc much more
obvious, and makes it easier to see what we do and what the two very
different cases are.

Note that even when we allow walking past the end of the argument array
(because the setproctitle() might have overwritten and overflowed the
original argv[] strings), we only allow it when it overflows into the
environment region if it is immediately adjacent.

[ Fixed for missing 'count' checks noted by Alexey Izbyshev ]

Link: https://lore.kernel.org/lkml/alpine.LNX.2.21.1904052326230.3249@kich.toxcorp.com/
Fixes: 5ab827189965 ("fs/proc: simplify and clarify get_mm_cmdline() function")
Cc: Jakub Jankowski &lt;shasta@toxcorp.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Alexey Izbyshev &lt;izbyshev@ispras.ru&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
