<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/proc/base.c, branch v4.3</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.3</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.3'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-09-10T20:29:01Z</updated>
<entry>
<title>proc: convert to kstrto*()/kstrto*_from_user()</title>
<updated>2015-09-10T20:29:01Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2015-09-09T22:36:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=774636e19ed514cdf560006813c0473409616de8'/>
<id>urn:sha1:774636e19ed514cdf560006813c0473409616de8</id>
<content type='text'>
Convert from manual allocation/copy_from_user/...  to kstrto*() family
which were designed for exactly that.

One case can not be converted to kstrto*_from_user() to make code even
more simpler because of whitespace stripping, oh well...

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>procfs: always expose /proc/&lt;pid&gt;/map_files/ and make it readable</title>
<updated>2015-09-10T20:29:01Z</updated>
<author>
<name>Calvin Owens</name>
<email>calvinowens@fb.com</email>
</author>
<published>2015-09-09T22:35:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bdb4d100afe9818aebd1d98ced575c5ef143456c'/>
<id>urn:sha1:bdb4d100afe9818aebd1d98ced575c5ef143456c</id>
<content type='text'>
Currently, /proc/&lt;pid&gt;/map_files/ is restricted to CAP_SYS_ADMIN, and is
only exposed if CONFIG_CHECKPOINT_RESTORE is set.

Each mapped file region gets a symlink in /proc/&lt;pid&gt;/map_files/
corresponding to the virtual address range at which it is mapped.  The
symlinks work like the symlinks in /proc/&lt;pid&gt;/fd/, so you can follow them
to the backing file even if that backing file has been unlinked.

Currently, files which are mapped, unlinked, and closed are impossible to
stat() from userspace.  Exposing /proc/&lt;pid&gt;/map_files/ closes this
functionality "hole".

Not being able to stat() such files makes noticing and explicitly
accounting for the space they use on the filesystem impossible.  You can
work around this by summing up the space used by every file in the
filesystem and subtracting that total from what statfs() tells you, but
that obviously isn't great, and it becomes unworkable once your filesystem
becomes large enough.

This patch moves map_files/ out from behind CONFIG_CHECKPOINT_RESTORE, and
adjusts the permissions enforced on it as follows:

* proc_map_files_lookup()
* proc_map_files_readdir()
* map_files_d_revalidate()

	Remove the CAP_SYS_ADMIN restriction, leaving only the current
	restriction requiring PTRACE_MODE_READ. The information made
	available to userspace by these three functions is already
	available in /proc/PID/maps with MODE_READ, so I don't see any
	reason to limit them any further (see below for more detail).

* proc_map_files_follow_link()

	This stub has been added, and requires that the user have
	CAP_SYS_ADMIN in order to follow the links in map_files/,
	since there was concern on LKML both about the potential for
	bypassing permissions on ancestor directories in the path to
	files pointed to, and about what happens with more exotic
	memory mappings created by some drivers (ie dma-buf).

In older versions of this patch, I changed every permission check in
the four functions above to enforce MODE_ATTACH instead of MODE_READ.
This was an oversight on my part, and after revisiting the discussion
it seems that nobody was concerned about anything outside of what is
made possible by -&gt;follow_link(). So in this version, I've left the
checks for PTRACE_MODE_READ as-is.

[akpm@linux-foundation.org: catch up with concurrent proc_pid_follow_link() changes]
Signed-off-by: Calvin Owens &lt;calvinowens@fb.com&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Joe Perches &lt;joe@perches.com&gt;
Cc: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>/proc/$PID/cmdline: fixup empty ARGV case</title>
<updated>2015-07-17T23:39:54Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2015-07-17T23:24:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3581d458c39bc5e58ceeeaf51796625bc978d2eb'/>
<id>urn:sha1:3581d458c39bc5e58ceeeaf51796625bc978d2eb</id>
<content type='text'>
/proc/*/cmdline code checks if it should look at ENVP area by checking
last byte of ARGV area:

	rv = access_remote_vm(mm, arg_end - 1, &amp;c, 1, 0);
	if (rv &lt;= 0)
		goto out_free_page;

If ARGV is somehow made empty (by doing execve(..., NULL, ...) or
manually setting -&gt;arg_start and -&gt;arg_end to equal values), the decision
will be based on byte which doesn't even belong to ARGV/ENVP.

So, quickly check if ARGV area is empty and report 0 to match previous
behaviour.

Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2015-07-04T15:56:53Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-07-04T15:56:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=22a093b2fb52fb656658a32adc80c24ddc200ca4'/>
<id>urn:sha1:22a093b2fb52fb656658a32adc80c24ddc200ca4</id>
<content type='text'>
Pull scheduler fixes from Ingo Molnar:
 "Debug info and other statistics fixes and related enhancements"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/numa: Fix numa balancing stats in /proc/pid/sched
  sched/numa: Show numa_group ID in /proc/sched_debug task listings
  sched/debug: Move print_cfs_rq() declaration to kernel/sched/sched.h
  sched/stat: Expose /proc/pid/schedstat if CONFIG_SCHED_INFO=y
  sched/stat: Simplify the sched_info accounting dependency
</content>
</entry>
<entry>
<title>sched/stat: Expose /proc/pid/schedstat if CONFIG_SCHED_INFO=y</title>
<updated>2015-07-04T08:04:31Z</updated>
<author>
<name>Naveen N. Rao</name>
<email>naveen.n.rao@linux.vnet.ibm.com</email>
</author>
<published>2015-06-30T09:06:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5968cecedd7a09f23e9fcb5f9fb4e893712f35ba'/>
<id>urn:sha1:5968cecedd7a09f23e9fcb5f9fb4e893712f35ba</id>
<content type='text'>
Expand /proc/pid/schedstat output:

 - enable it on CONFIG_TASK_DELAY_ACCT=y &amp;&amp; !CONFIG_SCHEDSTATS kernels.

 - dump all zeroes on kernels that are booted with the 'nodelayacct'
   option, which boot option disables delay accounting on
   CONFIG_TASK_DELAY_ACCT=y kernels.

Signed-off-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Cc: Balbir Singh &lt;bsingharora@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: a.p.zijlstra@chello.nl
Cc: ricklind@us.ibm.com
Link: http://lkml.kernel.org/r/5ccbef17d4bc841084ea6e6421d4e4a23b7b806f.1435654789.git.naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>fs, proc: introduce CONFIG_PROC_CHILDREN</title>
<updated>2015-06-26T00:00:37Z</updated>
<author>
<name>Iago López Galeiras</name>
<email>iago@endocode.com</email>
</author>
<published>2015-06-25T22:00:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2e13ba54a2682eea24918b87ad3edf70c2cf085b'/>
<id>urn:sha1:2e13ba54a2682eea24918b87ad3edf70c2cf085b</id>
<content type='text'>
Commit 818411616baf ("fs, proc: introduce /proc/&lt;pid&gt;/task/&lt;tid&gt;/children
entry") introduced the children entry for checkpoint restore and the
file is only available on kernels configured with CONFIG_EXPERT and
CONFIG_CHECKPOINT_RESTORE.

This is available in most distributions (Fedora, Debian, Ubuntu, CoreOS)
because they usually enable CONFIG_EXPERT and CONFIG_CHECKPOINT_RESTORE.
But Arch does not enable CONFIG_EXPERT or CONFIG_CHECKPOINT_RESTORE.

However, the children proc file is useful outside of checkpoint restore.
I would like to use it in rkt.  The rkt process exec() another program
it does not control, and that other program will fork()+exec() a child
process.  I would like to find the pid of the child process from an
external tool without iterating in /proc over all processes to find
which one has a parent pid equal to rkt.

This commit introduces CONFIG_PROC_CHILDREN and makes
CONFIG_CHECKPOINT_RESTORE select it.  This allows enabling
/proc/&lt;pid&gt;/task/&lt;tid&gt;/children without needing to enable
CONFIG_CHECKPOINT_RESTORE and CONFIG_EXPERT.

Alban tested that /proc/&lt;pid&gt;/task/&lt;tid&gt;/children is present when the
kernel is configured with CONFIG_PROC_CHILDREN=y but without
CONFIG_CHECKPOINT_RESTORE

Signed-off-by: Iago López Galeiras &lt;iago@endocode.com&gt;
Tested-by: Alban Crequy &lt;alban@endocode.com&gt;
Reviewed-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Djalal Harouni &lt;djalal@endocode.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>proc: fix PAGE_SIZE limit of /proc/$PID/cmdline</title>
<updated>2015-06-26T00:00:37Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2015-06-25T22:00:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c2c0bb44620dece7ec97e28167e32c343da22867'/>
<id>urn:sha1:c2c0bb44620dece7ec97e28167e32c343da22867</id>
<content type='text'>
/proc/$PID/cmdline truncates output at PAGE_SIZE. It is easy to see with

	$ cat /proc/self/cmdline $(seq 1037) 2&gt;/dev/null

However, command line size was never limited to PAGE_SIZE but to 128 KB
and relatively recently limitation was removed altogether.

People noticed and ask questions:
http://stackoverflow.com/questions/199130/how-do-i-increase-the-proc-pid-cmdline-4096-byte-limit

seq file interface is not OK, because it kmalloc's for whole output and
open + read(, 1) + sleep will pin arbitrary amounts of kernel memory.  To
not do that, limit must be imposed which is incompatible with arbitrary
sized command lines.

I apologize for hairy code, but this it direct consequence of command line
layout in memory and hacks to support things like "init [3]".

The loops are "unrolled" otherwise it is either macros which hide control
flow or functions with 7-8 arguments with equal line count.

There should be real setproctitle(2) or something.

[akpm@linux-foundation.org: fix a billion min() warnings]
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Tested-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Acked-by: Jarod Wilson &lt;jarod@redhat.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Jan Stancek &lt;jstancek@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>don't pass nameidata to -&gt;follow_link()</title>
<updated>2015-05-11T02:20:15Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-05-02T17:37:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6e77137b363b8d866ac29c5a0c95e953614fb2d8'/>
<id>urn:sha1:6e77137b363b8d866ac29c5a0c95e953614fb2d8</id>
<content type='text'>
its only use is getting passed to nd_jump_link(), which can obtain
it from current-&gt;nameidata

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>new -&gt;follow_link() and -&gt;put_link() calling conventions</title>
<updated>2015-05-11T02:19:45Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2015-05-02T17:32:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=680baacbca69d18a6d7315374ad83d05ac9c0977'/>
<id>urn:sha1:680baacbca69d18a6d7315374ad83d05ac9c0977</id>
<content type='text'>
a) instead of storing the symlink body (via nd_set_link()) and returning
an opaque pointer later passed to -&gt;put_link(), -&gt;follow_link() _stores_
that opaque pointer (into void * passed by address by caller) and returns
the symlink body.  Returning ERR_PTR() on error, NULL on jump (procfs magic
symlinks) and pointer to symlink body for normal symlinks.  Stored pointer
is ignored in all cases except the last one.

Storing NULL for opaque pointer (or not storing it at all) means no call
of -&gt;put_link().

b) the body used to be passed to -&gt;put_link() implicitly (via nameidata).
Now only the opaque pointer is.  In the cases when we used the symlink body
to free stuff, -&gt;follow_link() now should store it as opaque pointer in addition
to returning it.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2015-04-27T00:22:07Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-04-26T22:48:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9ec3a646fe09970f801ab15e0f1694060b9f19af'/>
<id>urn:sha1:9ec3a646fe09970f801ab15e0f1694060b9f19af</id>
<content type='text'>
Pull fourth vfs update from Al Viro:
 "d_inode() annotations from David Howells (sat in for-next since before
  the beginning of merge window) + four assorted fixes"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  RCU pathwalk breakage when running into a symlink overmounting something
  fix I_DIO_WAKEUP definition
  direct-io: only inc/dec inode-&gt;i_dio_count for file systems
  fs/9p: fix readdir()
  VFS: assorted d_backing_inode() annotations
  VFS: fs/inode.c helpers: d_inode() annotations
  VFS: fs/cachefiles: d_backing_inode() annotations
  VFS: fs library helpers: d_inode() annotations
  VFS: assorted weird filesystems: d_inode() annotations
  VFS: normal filesystems (and lustre): d_inode() annotations
  VFS: security/: d_inode() annotations
  VFS: security/: d_backing_inode() annotations
  VFS: net/: d_inode() annotations
  VFS: net/unix: d_backing_inode() annotations
  VFS: kernel/: d_inode() annotations
  VFS: audit: d_backing_inode() annotations
  VFS: Fix up some -&gt;d_inode accesses in the chelsio driver
  VFS: Cachefiles should perform fs modifications on the top layer only
  VFS: AF_UNIX sockets should call mknod on the top layer only
</content>
</entry>
</feed>
