<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/sysctl.c, branch v2.6.25</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v2.6.25</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v2.6.25'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2008-03-04T16:54:06Z</updated>
<entry>
<title>sched: revert load_balance_monitor() changes</title>
<updated>2008-03-04T16:54:06Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2008-02-25T16:34:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=62fb185130e4d420f71a30ff59d8b16b74ef5d2b'/>
<id>urn:sha1:62fb185130e4d420f71a30ff59d8b16b74ef5d2b</id>
<content type='text'>
The following commits cause a number of regressions:

  commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
  Author: Srivatsa Vaddagiri &lt;vatsa@linux.vnet.ibm.com&gt;
  Date:   Fri Jan 25 21:08:00 2008 +0100
  sched: group scheduling, change how cpu load is calculated

  commit 6b2d7700266b9402e12824e11e0099ae6a4a6a79
  Author: Srivatsa Vaddagiri &lt;vatsa@linux.vnet.ibm.com&gt;
  Date:   Fri Jan 25 21:08:00 2008 +0100
  sched: group scheduler, fix fairness of cpu bandwidth allocation for task groups

Namely:
 - very frequent wakeups on SMP, reported by PowerTop users.
 - cacheline trashing on (large) SMP
 - some latencies larger than 500ms

While there is a mergeable patch to fix the latter, the former issues
are not fixable in a manner suitable for .25 (we're at -rc3 now).

Hence we revert them and try again in v2.6.26.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
CC: Srivatsa Vaddagiri &lt;vatsa@linux.vnet.ibm.com&gt;
Tested-by: Alexey Zaytsev &lt;alexey.zaytsev@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>hugetlb: fix overcommit locking</title>
<updated>2008-02-14T00:21:18Z</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@us.ibm.com</email>
</author>
<published>2008-02-13T23:03:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=064d9efe947542097be669581f82d6b097e81d1a'/>
<id>urn:sha1:064d9efe947542097be669581f82d6b097e81d1a</id>
<content type='text'>
proc_doulongvec_minmax() calls copy_to_user()/copy_from_user(), so we can't
hold hugetlb_lock over the call.  Use a dummy variable to store the sysctl
result, like in hugetlb_sysctl_handler(), then grab the lock to update
nr_overcommit_huge_pages.

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Reported-by: Miles Lane &lt;miles.lane@gmail.com&gt;
Cc: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched: rt-group: interface</title>
<updated>2008-02-13T14:45:39Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2008-02-13T14:45:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9f0c1e560c43327b70998e6c702b2f01321130d9'/>
<id>urn:sha1:9f0c1e560c43327b70998e6c702b2f01321130d9</id>
<content type='text'>
Change the rt_ratio interface to rt_runtime_us, to match rt_period_us.
This avoids picking a granularity for the ratio.

Extend the /sys/kernel/uids/&lt;uid&gt;/ interface to allow setting
the group's rt_runtime.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>printk_ratelimit() functions should use CONFIG_PRINTK</title>
<updated>2008-02-08T17:22:39Z</updated>
<author>
<name>Joe Perches</name>
<email>joe@perches.com</email>
</author>
<published>2008-02-08T12:21:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7ef3d2fd17c377ef64a2aa19677d17576606c3b4'/>
<id>urn:sha1:7ef3d2fd17c377ef64a2aa19677d17576606c3b4</id>
<content type='text'>
Makes an embedded image a bit smaller.

Signed-off-by: Joe Perches &lt;joe@perches.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Nuke duplicate header from sysctl.c</title>
<updated>2008-02-08T17:22:34Z</updated>
<author>
<name>Jesper Juhl</name>
<email>jesper.juhl@gmail.com</email>
</author>
<published>2008-02-08T12:20:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=efae09f3e99fcc1bdead7bc23a508b3bade3f82f'/>
<id>urn:sha1:efae09f3e99fcc1bdead7bc23a508b3bade3f82f</id>
<content type='text'>
Don't include linux/security.h twice in kernel/sysctl.c

Signed-off-by: Jesper Juhl &lt;jesper.juhl@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Pidns: make full use of xxx_vnr() calls</title>
<updated>2008-02-08T17:22:29Z</updated>
<author>
<name>Pavel Emelyanov</name>
<email>xemul@openvz.org</email>
</author>
<published>2008-02-08T12:19:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6c5f3e7b43300508fe3947ff3cfff0f86043bb57'/>
<id>urn:sha1:6c5f3e7b43300508fe3947ff3cfff0f86043bb57</id>
<content type='text'>
Some time ago the xxx_vnr() calls (e.g.  pid_vnr or find_task_by_vpid) were
_all_ converted to operate on the current pid namespace.  After this each call
like xxx_nr_ns(foo, current-&gt;nsproxy-&gt;pid_ns) is nothing but a xxx_vnr(foo)
one.

Switch all the xxx_nr_ns() callers to use the xxx_vnr() calls where
appropriate.

Signed-off-by: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Balbir Singh &lt;balbir@in.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hugetlb: add locking for overcommit sysctl</title>
<updated>2008-02-08T17:22:23Z</updated>
<author>
<name>Nishanth Aravamudan</name>
<email>nacc@us.ibm.com</email>
</author>
<published>2008-02-08T12:18:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a3d0c6aa1bb342b9b2c7b123b52ac2f48a4d4d0a'/>
<id>urn:sha1:a3d0c6aa1bb342b9b2c7b123b52ac2f48a4d4d0a</id>
<content type='text'>
When I replaced hugetlb_dynamic_pool with nr_overcommit_hugepages I used
proc_doulongvec_minmax() directly.  However, hugetlb.c's locking rules
require that all counter modifications occur under the hugetlb_lock.  Add a
callback into the hugetlb code similar to the one for nr_hugepages.  Grab
the lock around the manipulation of nr_overcommit_hugepages in
proc_doulongvec_minmax().

Signed-off-by: Nishanth Aravamudan &lt;nacc@us.ibm.com&gt;
Acked-by: Adam Litke &lt;agl@us.ibm.com&gt;
Cc: David Gibson &lt;david@gibson.dropbear.id.au&gt;
Cc: William Lee Irwin III &lt;wli@holomorphy.com&gt;
Cc: &lt;stable@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: add sysctl to enable task memory dump</title>
<updated>2008-02-07T16:42:19Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2008-02-07T08:14:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fef1bdd68c81b71882ccb6f47c70980a03182063'/>
<id>urn:sha1:fef1bdd68c81b71882ccb6f47c70980a03182063</id>
<content type='text'>
Adds a new sysctl, 'oom_dump_tasks', that enables the kernel to produce a
dump of all system tasks (excluding kernel threads) when performing an
OOM-killing.  Information includes pid, uid, tgid, vm size, rss, cpu,
oom_adj score, and name.

This is helpful for determining why there was an OOM condition and which
rogue task caused it.

It is configurable so that large systems, such as those with several
thousand tasks, do not incur a performance penalty associated with dumping
data they may not desire.

If an OOM was triggered as a result of a memory controller, the tasklist
shall be filtered to exclude tasks that are not a member of the same
cgroup.

Cc: Andrea Arcangeli &lt;andrea@suse.de&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Balbir Singh &lt;balbir@linux.vnet.ibm.com&gt;
Signed-off-by: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>get rid of NR_OPEN and introduce a sysctl_nr_open</title>
<updated>2008-02-06T18:41:06Z</updated>
<author>
<name>Eric Dumazet</name>
<email>dada1@cosmosbay.com</email>
</author>
<published>2008-02-06T09:37:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9cfe015aa424b3c003baba3841a60dd9b5ad319b'/>
<id>urn:sha1:9cfe015aa424b3c003baba3841a60dd9b5ad319b</id>
<content type='text'>
NR_OPEN (historically set to 1024*1024) actually forbids processes to open
more than 1024*1024 handles.

Unfortunatly some production servers hit the not so 'ridiculously high
value' of 1024*1024 file descriptors per process.

Changing NR_OPEN is not considered safe because of vmalloc space potential
exhaust.

This patch introduces a new sysctl (/proc/sys/fs/nr_open) wich defaults to
1024*1024, so that admins can decide to change this limit if their workload
needs it.

[akpm@linux-foundation.org: export it for sparc64]
Signed-off-by: Eric Dumazet &lt;dada1@cosmosbay.com&gt;
Cc: Alan Cox &lt;alan@lxorguk.ukuu.org.uk&gt;
Cc: Richard Henderson &lt;rth@twiddle.net&gt;
Cc: Ivan Kokshaysky &lt;ink@jurassic.park.msu.ru&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Ralf Baechle &lt;ralf@linux-mips.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>capabilities: introduce per-process capability bounding set</title>
<updated>2008-02-05T17:44:20Z</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serue@us.ibm.com</email>
</author>
<published>2008-02-05T06:29:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3b7391de67da515c91f48aa371de77cb6cc5c07e'/>
<id>urn:sha1:3b7391de67da515c91f48aa371de77cb6cc5c07e</id>
<content type='text'>
The capability bounding set is a set beyond which capabilities cannot grow.
 Currently cap_bset is per-system.  It can be manipulated through sysctl,
but only init can add capabilities.  Root can remove capabilities.  By
default it includes all caps except CAP_SETPCAP.

This patch makes the bounding set per-process when file capabilities are
enabled.  It is inherited at fork from parent.  Noone can add elements,
CAP_SETPCAP is required to remove them.

One example use of this is to start a safer container.  For instance, until
device namespaces or per-container device whitelists are introduced, it is
best to take CAP_MKNOD away from a container.

The bounding set will not affect pP and pE immediately.  It will only
affect pP' and pE' after subsequent exec()s.  It also does not affect pI,
and exec() does not constrain pI'.  So to really start a shell with no way
of regain CAP_MKNOD, you would do

	prctl(PR_CAPBSET_DROP, CAP_MKNOD);
	cap_t cap = cap_get_proc();
	cap_value_t caparray[1];
	caparray[0] = CAP_MKNOD;
	cap_set_flag(cap, CAP_INHERITABLE, 1, caparray, CAP_DROP);
	cap_set_proc(cap);
	cap_free(cap);

The following test program will get and set the bounding
set (but not pI).  For instance

	./bset get
		(lists capabilities in bset)
	./bset drop cap_net_raw
		(starts shell with new bset)
		(use capset, setuid binary, or binary with
		file capabilities to try to increase caps)

************************************************************
cap_bound.c
************************************************************
 #include &lt;sys/prctl.h&gt;
 #include &lt;linux/capability.h&gt;
 #include &lt;sys/types.h&gt;
 #include &lt;unistd.h&gt;
 #include &lt;stdio.h&gt;
 #include &lt;stdlib.h&gt;
 #include &lt;string.h&gt;

 #ifndef PR_CAPBSET_READ
 #define PR_CAPBSET_READ 23
 #endif

 #ifndef PR_CAPBSET_DROP
 #define PR_CAPBSET_DROP 24
 #endif

int usage(char *me)
{
	printf("Usage: %s get\n", me);
	printf("       %s drop &lt;capability&gt;\n", me);
	return 1;
}

 #define numcaps 32
char *captable[numcaps] = {
	"cap_chown",
	"cap_dac_override",
	"cap_dac_read_search",
	"cap_fowner",
	"cap_fsetid",
	"cap_kill",
	"cap_setgid",
	"cap_setuid",
	"cap_setpcap",
	"cap_linux_immutable",
	"cap_net_bind_service",
	"cap_net_broadcast",
	"cap_net_admin",
	"cap_net_raw",
	"cap_ipc_lock",
	"cap_ipc_owner",
	"cap_sys_module",
	"cap_sys_rawio",
	"cap_sys_chroot",
	"cap_sys_ptrace",
	"cap_sys_pacct",
	"cap_sys_admin",
	"cap_sys_boot",
	"cap_sys_nice",
	"cap_sys_resource",
	"cap_sys_time",
	"cap_sys_tty_config",
	"cap_mknod",
	"cap_lease",
	"cap_audit_write",
	"cap_audit_control",
	"cap_setfcap"
};

int getbcap(void)
{
	int comma=0;
	unsigned long i;
	int ret;

	printf("i know of %d capabilities\n", numcaps);
	printf("capability bounding set:");
	for (i=0; i&lt;numcaps; i++) {
		ret = prctl(PR_CAPBSET_READ, i);
		if (ret &lt; 0)
			perror("prctl");
		else if (ret==1)
			printf("%s%s", (comma++) ? ", " : " ", captable[i]);
	}
	printf("\n");
	return 0;
}

int capdrop(char *str)
{
	unsigned long i;

	int found=0;
	for (i=0; i&lt;numcaps; i++) {
		if (strcmp(captable[i], str) == 0) {
			found=1;
			break;
		}
	}
	if (!found)
		return 1;
	if (prctl(PR_CAPBSET_DROP, i)) {
		perror("prctl");
		return 1;
	}
	return 0;
}

int main(int argc, char *argv[])
{
	if (argc&lt;2)
		return usage(argv[0]);
	if (strcmp(argv[1], "get")==0)
		return getbcap();
	if (strcmp(argv[1], "drop")!=0 || argc&lt;3)
		return usage(argv[0]);
	if (capdrop(argv[2])) {
		printf("unknown capability\n");
		return 1;
	}
	return execl("/bin/bash", "/bin/bash", NULL);
}
************************************************************

[serue@us.ibm.com: fix typo]
Signed-off-by: Serge E. Hallyn &lt;serue@us.ibm.com&gt;
Signed-off-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Cc: Stephen Smalley &lt;sds@tycho.nsa.gov&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: Chris Wright &lt;chrisw@sous-sol.org&gt;
Cc: Casey Schaufler &lt;casey@schaufler-ca.com&gt;a
Signed-off-by: "Serge E. Hallyn" &lt;serue@us.ibm.com&gt;
Tested-by: Jiri Slaby &lt;jirislaby@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
