<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/fork.c, branch v4.1</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.1</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.1'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-04-17T13:04:11Z</updated>
<entry>
<title>oprofile: reduce mmap_sem hold for mm-&gt;exe_file</title>
<updated>2015-04-17T13:04:11Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@stgolabs.net</email>
</author>
<published>2015-04-16T19:49:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=11163348a23cdbcdca5fb42485418e75f8566a5c'/>
<id>urn:sha1:11163348a23cdbcdca5fb42485418e75f8566a5c</id>
<content type='text'>
sync_buffer() needs the mmap_sem for two distinct operations, both only
occurring upon user context switch handling:

 1) Dealing with the exe_file.

 2) Adding the dcookie data as we need to lookup the vma that
   backs it. This is done via add_sample() and add_data().

This patch isolates 1), for it will no longer need the mmap_sem for
serialization.  However, for now, make of the more standard
get_mm_exe_file(), requiring only holding the mmap_sem to read the value,
and relying on reference counting to make sure that the exe file won't
dissappear underneath us while doing the get dcookie.

As a consequence, for 2) we move the mmap_sem locking into where we really
need it, in lookup_dcookie().  The benefits are twofold: reduce mmap_sem
hold times, and cleaner code.

[akpm@linux-foundation.org: export get_mm_exe_file for arch/x86/oprofile/oprofile.ko]
Signed-off-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Robert Richter &lt;rric@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>prctl: avoid using mmap_sem for exe_file serialization</title>
<updated>2015-04-17T13:04:07Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@stgolabs.net</email>
</author>
<published>2015-04-16T19:47:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6e399cd144d8500ffb5d40fa6848890e2580a80a'/>
<id>urn:sha1:6e399cd144d8500ffb5d40fa6848890e2580a80a</id>
<content type='text'>
Oleg cleverly suggested using xchg() to set the new mm-&gt;exe_file instead
of calling set_mm_exe_file() which requires some form of serialization --
mmap_sem in this case.  For archs that do not have atomic rmw instructions
we still fallback to a spinlock alternative, so this should always be
safe.  As such, we only need the mmap_sem for looking up the backing
vm_file, which can be done sharing the lock.  Naturally, this means we
need to manually deal with both the new and old file reference counting,
and we need not worry about the MMF_EXE_FILE_CHANGED bits, which can
probably be deleted in the future anyway.

Signed-off-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: rcu-protected get_mm_exe_file()</title>
<updated>2015-04-17T13:04:07Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>khlebnikov@yandex-team.ru</email>
</author>
<published>2015-04-16T19:47:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=90f31d0ea88880f780574f3d0bb1a227c4c66ca3'/>
<id>urn:sha1:90f31d0ea88880f780574f3d0bb1a227c4c66ca3</id>
<content type='text'>
This patch removes mm-&gt;mmap_sem from mm-&gt;exe_file read side.
Also it kills dup_mm_exe_file() and moves exe_file duplication into
dup_mmap() where both mmap_sems are locked.

[akpm@linux-foundation.org: fix comment typo]
Signed-off-by: Konstantin Khlebnikov &lt;khlebnikov@yandex-team.ru&gt;
Cc: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/sysctl.c: threads-max observe limits</title>
<updated>2015-04-17T13:04:07Z</updated>
<author>
<name>Heinrich Schuchardt</name>
<email>xypron.glpk@gmx.de</email>
</author>
<published>2015-04-16T19:47:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=16db3d3f1170fb0efca652c9378ce7c5f5cb4232'/>
<id>urn:sha1:16db3d3f1170fb0efca652c9378ce7c5f5cb4232</id>
<content type='text'>
Users can change the maximum number of threads by writing to
/proc/sys/kernel/threads-max.

With the patch the value entered is checked against the same limits that
apply when fork_init is called.

Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/fork.c: avoid division by zero</title>
<updated>2015-04-17T13:04:07Z</updated>
<author>
<name>Heinrich Schuchardt</name>
<email>xypron.glpk@gmx.de</email>
</author>
<published>2015-04-16T19:47:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ac1b398de1ef94aeee8ba87b0120763526572a6e'/>
<id>urn:sha1:ac1b398de1ef94aeee8ba87b0120763526572a6e</id>
<content type='text'>
PAGE_SIZE is not guaranteed to be equal to or less than 8 times the
THREAD_SIZE.

E.g.  architecture hexagon may have page size 1M and thread size 4096.
This would lead to a division by zero in the calculation of max_threads.

With 32-bit calculation there is no solution which delivers valid results
for all possible combinations of the parameters.  The code is only called
once.  Hence a 64-bit calculation can be used as solution.

[akpm@linux-foundation.org: use clamp_t(), per Oleg]
Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/fork.c: new function for max_threads</title>
<updated>2015-04-17T13:04:06Z</updated>
<author>
<name>Heinrich Schuchardt</name>
<email>xypron.glpk@gmx.de</email>
</author>
<published>2015-04-16T19:47:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ff691f6e03815dc8f99461ea509df863a879fc3a'/>
<id>urn:sha1:ff691f6e03815dc8f99461ea509df863a879fc3a</id>
<content type='text'>
PAGE_SIZE is not guaranteed to be equal to or less than 8 times the
THREAD_SIZE.

E.g.  architecture hexagon may have page size 1M and thread size 4096.
This would lead to a division by zero in the calculation of max_threads.

With this patch the buggy code is moved to a separate function
set_max_threads.  The error is not fixed.

After fixing the problem in a separate patch the new function can be
reused to adjust max_threads after adding or removing memory.

Argument mempages of function fork_init() is removed as totalram_pages is
an exported symbol.

The creation of separate patches for refactoring to a new function and for
fixing the logic was suggested by Ingo Molnar.

Signed-off-by: Heinrich Schuchardt &lt;xypron.glpk@gmx.de&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fork_init: update max_threads comment</title>
<updated>2015-04-17T13:04:06Z</updated>
<author>
<name>Jean Delvare</name>
<email>jdelvare@suse.de</email>
</author>
<published>2015-04-16T19:47:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3ea7f5e25ec271909451b7dc17be37581b888de6'/>
<id>urn:sha1:3ea7f5e25ec271909451b7dc17be37581b888de6</id>
<content type='text'>
The comment explaining what value max_threads is set to is outdated.  The
maximum memory consumption ratio for thread structures was 1/2 until
February 2002, then it was briefly changed to 1/16 before being set to 1/8
which we still use today.  The comment was never updated to reflect that
change, it's about time.

Signed-off-by: Jean Delvare &lt;jdelvare@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fork: report pid reservation failure properly</title>
<updated>2015-04-17T13:04:06Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.cz</email>
</author>
<published>2015-04-16T19:47:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35f71bc0a09a45924bed268d8ccd0d3407bc476f'/>
<id>urn:sha1:35f71bc0a09a45924bed268d8ccd0d3407bc476f</id>
<content type='text'>
copy_process will report any failure in alloc_pid as ENOMEM currently
which is misleading because the pid allocation might fail not only when
the memory is short but also when the pid space is consumed already.

The current man page even mentions this case:

: EAGAIN
:
:       A system-imposed limit on the number of threads was encountered.
:       There are a number of limits that may trigger this error: the
:       RLIMIT_NPROC soft resource limit (set via setrlimit(2)), which
:       limits the number of processes and threads for a real user ID, was
:       reached; the kernel's system-wide limit on the number of processes
:       and threads, /proc/sys/kernel/threads-max, was reached (see
:       proc(5)); or the maximum number of PIDs, /proc/sys/kernel/pid_max,
:       was reached (see proc(5)).

so the current behavior is also incorrect wrt.  documentation.  POSIX man
page also suggest returing EAGAIN when the process count limit is reached.

This patch simply propagates error code from alloc_pid and makes sure we
return -EAGAIN due to reservation failure.  This will make behavior of
fork closer to both our documentation and POSIX.

alloc_pid might alsoo fail when the reaper in the pid namespace is dead
(the namespace basically disallows all new processes) and there is no
good error code which would match documented ones. We have traditionally
returned ENOMEM for this case which is misleading as well but as per
Eric W. Biederman this behavior is documented in man pid_namespaces(7)

: If the "init" process of a PID namespace terminates, the kernel
: terminates all of the processes in the namespace via a SIGKILL signal.
: This behavior reflects the fact that the "init" process is essential for
: the correct operation of a PID namespace.  In this case, a subsequent
: fork(2) into this PID namespace will fail with the error ENOMEM; it is
: not possible to create a new processes in a PID namespace whose "init"
: process has terminated.

and introducing a new error code would be too risky so let's stick to
ENOMEM for this case.

Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Remove execution domain support</title>
<updated>2015-04-12T18:58:24Z</updated>
<author>
<name>Richard Weinberger</name>
<email>richard@nod.at</email>
</author>
<published>2015-03-30T06:14:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=973f911f55a0e510dd6db8bbb29cd82ff138d3c0'/>
<id>urn:sha1:973f911f55a0e510dd6db8bbb29cd82ff138d3c0</id>
<content type='text'>
All users of exec_domain are gone, now we can get rid
of that abandoned feature.
To not break existing userspace we keep a dummy
/proc/execdomains file which will always contain
"0-0     Linux                   [kernel]".

Signed-off-by: Richard Weinberger &lt;richard@nod.at&gt;
</content>
</entry>
<entry>
<title>mm: do not use mm-&gt;nr_pmds on !MMU configurations</title>
<updated>2015-02-13T02:54:10Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-02-12T22:59:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d2f5119b8bb057595e18f5b2f07aa097ea1b233'/>
<id>urn:sha1:2d2f5119b8bb057595e18f5b2f07aa097ea1b233</id>
<content type='text'>
mm-&gt;nr_pmds doesn't make sense on !MMU configurations

Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
