<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sysctl.c, branch v2.6.24</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v2.6.24</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v2.6.24'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2007-12-18T14:21:13Z</updated>
<entry>
<title>sched: sysctl, proc_dointvec_minmax() expects int values for</title>
<updated>2007-12-18T14:21:13Z</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2007-12-18T14:21:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=73c4efd2c88a41c8a4810904266a34423b5584e5'/>
<id>urn:sha1:73c4efd2c88a41c8a4810904266a34423b5584e5</id>
<content type='text'>
min_sched_granularity_ns, max_sched_granularity_ns,
min_wakeup_granularity_ns and max_wakeup_granularity_ns are declared
"unsigned long".

This is incorrect since proc_dointvec_minmax() expects plain "int" guard
values.

This bug only triggers on big endian 64 bit arches.

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>Revert "hugetlb: Add hugetlb_dynamic_pool sysctl"</title>
<updated>2007-12-18T03:28:17Z</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@us.ibm.com</email>
</author>
<published>2007-12-18T00:20:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=368d2c6358c3c62b3820a8a73f9fe9c8b540cdea'/>
<id>urn:sha1:368d2c6358c3c62b3820a8a73f9fe9c8b540cdea</id>
<content type='text'>
This reverts commit 54f9f80d6543fb7b157d3b11e2e7911dc1379790 ("hugetlb:
Add hugetlb_dynamic_pool sysctl")

Given the new sysctl nr_overcommit_hugepages, the boolean dynamic pool
sysctl is not needed, as its semantics can be expressed by 0 in the
overcommit sysctl (no dynamic pool) and non-0 in the overcommit sysctl
(pool enabled).

(Needed in 2.6.24 since it reverts a post-2.6.23 userspace-visible change)

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlb: introduce nr_overcommit_hugepages sysctl</title>
<updated>2007-12-18T03:28:17Z</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@us.ibm.com</email>
</author>
<published>2007-12-18T00:20:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d1c3fb1f8f29c41b0d098d7cfb3c32939043631f'/>
<id>urn:sha1:d1c3fb1f8f29c41b0d098d7cfb3c32939043631f</id>
<content type='text'>
hugetlb: introduce nr_overcommit_hugepages sysctl

While examining the code to support /proc/sys/vm/hugetlb_dynamic_pool, I
became convinced that having a boolean sysctl was insufficient:

1) To support per-node control of hugepages, I have previously submitted
patches to add a sysfs attribute related to nr_hugepages. However, with
a boolean global value and per-mount quota enforcement constraining the
dynamic pool, adding corresponding control of the dynamic pool on a
per-node basis seems inconsistent to me.

2) Administration of the hugetlb dynamic pool with multiple hugetlbfs
mount points is, arguably, more arduous than it needs to be. Each quota
would need to be set separately, and the sum would need to be monitored.

To ease the administration, and to help make the way for per-node
control of the static &amp; dynamic hugepage pool, I added a separate
sysctl, nr_overcommit_hugepages. This value serves as a high watermark
for the overall hugepage pool, while nr_hugepages serves as a low
watermark. The boolean sysctl can then be removed, as the condition

	nr_overcommit_hugepages &gt; 0

indicates the same administrative setting as

	hugetlb_dynamic_pool == 1

Quotas still serve as local enforcement of the size of the pool on a
per-mount basis.

A few caveats:

1) There is a race whereby the global surplus huge page counter is
incremented before a hugepage has allocated. Another process could then
try grow the pool, and fail to convert a surplus huge page to a normal
huge page and instead allocate a fresh huge page. I believe this is
benign, as no memory is leaked (the actual pages are still tracked
correctly) and the counters won't go out of sync.

2) Shrinking the static pool while a surplus is in effect will allow the
number of surplus huge pages to exceed the overcommit value. As long as
this condition holds, however, no more surplus huge pages will be
allowed on the system until one of the two sysctls are increased
sufficiently, or the surplus huge pages go out of use and are freed.

Successfully tested on x86_64 with the current libhugetlbfs snapshot,
modified to use the new sysctl.

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Avoid potential NULL dereference in unregister_sysctl_table</title>
<updated>2007-12-05T17:21:20Z</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2007-12-05T07:45:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f1dad166e88a5ddca0acf8f11dea0e2bd92d8bf3'/>
<id>urn:sha1:f1dad166e88a5ddca0acf8f11dea0e2bd92d8bf3</id>
<content type='text'>
register_sysctl_table() can return NULL sometimes, e.g.  when kmalloc()
returns NULL or when sysctl check fails.

I've also noticed, that many (most?) code in the kernel doesn't check for
the return value from register_sysctl_table() and later simply calls the
unregister_sysctl_table() with potentially NULL argument.

This is unlikely on a common kernel configuration, but in case we're
dealing with modules and/or fault-injection support, there's a slight
possibility of an OOPS.

Changing all the users to check for return code from the registering does
not look like a good solution - there are too many code doing this and
failure in sysctl tables registration is not a good reason to abort module
loading (in most of the cases).

So I think, that we can just have this check in unregister_sysctl_table
just to avoid accidental OOPS-es (actually, the unregister_sysctl_table()
did exactly this, before the start_unregistering() appeared).

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sysctl: check length at deprecated_sysctl_warning</title>
<updated>2007-11-15T02:45:37Z</updated>
<author>
<name>Tetsuo Handa</name>
<email>penguin-kernel@i-love.sakura.ne.jp</email>
</author>
<published>2007-11-15T00:58:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6fc48af82cef55546d640778698943b6227b7fb0'/>
<id>urn:sha1:6fc48af82cef55546d640778698943b6227b7fb0</id>
<content type='text'>
Original patch assumed args-&gt;nlen &lt; CTL_MAXNAME, but it can be false.

Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched: avoid large irq-latencies in smp-balancing</title>
<updated>2007-11-09T21:39:39Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2007-11-09T21:39:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b82d9fdd848abfbe7263a4ecd9bbb55e575100a6'/>
<id>urn:sha1:b82d9fdd848abfbe7263a4ecd9bbb55e575100a6</id>
<content type='text'>
SMP balancing is done with IRQs disabled and can iterate the full rq.
When rqs are large this can cause large irq-latencies. Limit the nr of
iterations on each run.

This fixes a scheduling latency regression reported by the -rt folks.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Tested-by: Gregory Haskins &lt;ghaskins@novell.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: cleanup, use NSEC_PER_MSEC and NSEC_PER_SEC</title>
<updated>2007-11-09T21:39:38Z</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2007-11-09T21:39:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d6322faf296ab5bbb597f8b0abcb50153754cd08'/>
<id>urn:sha1:d6322faf296ab5bbb597f8b0abcb50153754cd08</id>
<content type='text'>
1) hardcoded 1000000000 value is used five times in places where
   NSEC_PER_SEC might be more readable.

2) A conversion from nsec to msec uses the hardcoded 1000000 value,
   which is a candidate for NSEC_PER_MSEC.

no code changed:

    text    data     bss     dec     hex filename
   44359    3326      36   47721    ba69 sched.o.before
   44359    3326      36   47721    ba69 sched.o.after

Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: reintroduce the sched_min_granularity tunable</title>
<updated>2007-11-09T21:39:37Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2007-11-09T21:39:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b2be5e96dc0b5a179cf4cb98e65cfb605752ca26'/>
<id>urn:sha1:b2be5e96dc0b5a179cf4cb98e65cfb605752ca26</id>
<content type='text'>
we lost the sched_min_granularity tunable to a clever optimization
that uses the sched_latency/min_granularity ratio - but the ratio
is quite unintuitive to users and can also crash the kernel if the
ratio is set to 0. So reintroduce the min_granularity tunable,
while keeping the ratio maintained internally.

no functionality changed.

[ mingo@elte.hu: some fixlets. ]

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>pid namespaces: changes to show virtual ids to user</title>
<updated>2007-10-19T18:53:40Z</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2007-10-19T06:40:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b488893a390edfe027bae7a46e9af8083e740668'/>
<id>urn:sha1:b488893a390edfe027bae7a46e9af8083e740668</id>
<content type='text'>
This is the largest patch in the set. Make all (I hope) the places where
the pid is shown to or get from user operate on the virtual pids.

The idea is:
 - all in-kernel data structures must store either struct pid itself
   or the pid's global nr, obtained with pid_nr() call;
 - when seeking the task from kernel code with the stored id one
   should use find_task_by_pid() call that works with global pids;
 - when showing pid's numerical value to the user the virtual one
   should be used, but however when one shows task's pid outside this
   task's namespace the global one is to be used;
 - when getting the pid from userspace one need to consider this as
   the virtual one and use appropriate task/pid-searching functions.

[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: nuther build fix]
[akpm@linux-foundation.org: yet nuther build fix]
[akpm@linux-foundation.org: remove unneeded casts]
Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Signed-off-by: Alexey Dobriyan &lt;adobriyan@openvz.org&gt;
Cc: Sukadev Bhattiprolu &lt;sukadev@us.ibm.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>pid namespaces: define is_global_init() and is_container_init()</title>
<updated>2007-10-19T18:53:37Z</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2007-10-19T06:39:52Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b460cbc581a53cc088ceba80608021dd49c63c43'/>
<id>urn:sha1:b460cbc581a53cc088ceba80608021dd49c63c43</id>
<content type='text'>
is_init() is an ambiguous name for the pid==1 check.  Split it into
is_global_init() and is_container_init().

A cgroup init has it's tsk-&gt;pid == 1.

A global init also has it's tsk-&gt;pid == 1 and it's active pid namespace
is the init_pid_ns.  But rather than check the active pid namespace,
compare the task structure with 'init_pid_ns.child_reaper', which is
initialized during boot to the /sbin/init process and never changes.

Changelog:

	2.6.22-rc4-mm2-pidns1:
	- Use 'init_pid_ns.child_reaper' to determine if a given task is the
	  global init (/sbin/init) process. This would improve performance
	  and remove dependence on the task_pid().

	2.6.21-mm2-pidns2:

	- [Sukadev Bhattiprolu] Changed is_container_init() calls in {powerpc,
	  ppc,avr32}/traps.c for the _exception() call to is_global_init().
	  This way, we kill only the cgroup if the cgroup's init has a
	  bug rather than force a kernel panic.

[akpm@linux-foundation.org: fix comment]
[sukadev@us.ibm.com: Use is_global_init() in arch/m32r/mm/fault.c]
[bunk@stusta.de: kernel/pid.c: remove unused exports]
[sukadev@us.ibm.com: Fix capability.c to work with threaded init]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Sukadev Bhattiprolu &lt;sukadev@us.ibm.com&gt;
Acked-by: Pavel Emelianov &lt;xemul@openvz.org&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Cedric Le Goater &lt;clg@fr.ibm.com&gt;
Cc: Dave Hansen &lt;haveblue@us.ibm.com&gt;
Cc: Herbert Poetzel &lt;herbert@13thfloor.at&gt;
Cc: Kirill Korotaev &lt;dev@sw.ru&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
