<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/fork.c, branch v4.5</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.5</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.5'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2016-01-15T00:00:49Z</updated>
<entry>
<title>mm: rework virtual memory accounting</title>
<updated>2016-01-15T00:00:49Z</updated>
<author>
<name>Konstantin Khlebnikov</name>
<email>koct9i@gmail.com</email>
</author>
<published>2016-01-14T23:22:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=84638335900f1995495838fe1bd4870c43ec1f67'/>
<id>urn:sha1:84638335900f1995495838fe1bd4870c43ec1f67</id>
<content type='text'>
When inspecting a vague code inside prctl(PR_SET_MM_MEM) call (which
testing the RLIMIT_DATA value to figure out if we're allowed to assign
new @start_brk, @brk, @start_data, @end_data from mm_struct) it's been
commited that RLIMIT_DATA in a form it's implemented now doesn't do
anything useful because most of user-space libraries use mmap() syscall
for dynamic memory allocations.

Linus suggested to convert RLIMIT_DATA rlimit into something suitable
for anonymous memory accounting.  But in this patch we go further, and
the changes are bundled together as:

 * keep vma counting if CONFIG_PROC_FS=n, will be used for limits
 * replace mm-&gt;shared_vm with better defined mm-&gt;data_vm
 * account anonymous executable areas as executable
 * account file-backed growsdown/up areas as stack
 * drop struct file* argument from vm_stat_account
 * enforce RLIMIT_DATA for size of data areas

This way code looks cleaner: now code/stack/data classification depends
only on vm_flags state:

 VM_EXEC &amp; ~VM_WRITE            -&gt; code  (VmExe + VmLib in proc)
 VM_GROWSUP | VM_GROWSDOWN      -&gt; stack (VmStk)
 VM_WRITE &amp; ~VM_SHARED &amp; !stack -&gt; data  (VmData)

The rest (VmSize - VmData - VmStk - VmExe - VmLib) could be called
"shared", but that might be strange beast like readonly-private or VM_IO
area.

 - RLIMIT_AS            limits whole address space "VmSize"
 - RLIMIT_STACK         limits stack "VmStk" (but each vma individually)
 - RLIMIT_DATA          now limits "VmData"

Signed-off-by: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Quentin Casasnovas &lt;quentin.casasnovas@oracle.com&gt;
Cc: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Kees Cook &lt;keescook@google.com&gt;
Cc: Vladimir Davydov &lt;vdavydov@virtuozzo.com&gt;
Cc: Pavel Emelyanov &lt;xemul@virtuozzo.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kmemcg: account certain kmem allocations to memcg</title>
<updated>2016-01-15T00:00:49Z</updated>
<author>
<name>Vladimir Davydov</name>
<email>vdavydov@virtuozzo.com</email>
</author>
<published>2016-01-14T23:18:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5d097056c9a017a3b720849efb5432f37acabbac'/>
<id>urn:sha1:5d097056c9a017a3b720849efb5432f37acabbac</id>
<content type='text'>
Mark those kmem allocations that are known to be easily triggered from
userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
memcg.  For the list, see below:

 - threadinfo
 - task_struct
 - task_delay_info
 - pid
 - cred
 - mm_struct
 - vm_area_struct and vm_region (nommu)
 - anon_vma and anon_vma_chain
 - signal_struct
 - sighand_struct
 - fs_struct
 - files_struct
 - fdtable and fdtable-&gt;full_fds_bits
 - dentry and external_name
 - inode for all filesystems. This is the most tedious part, because
   most filesystems overwrite the alloc_inode method.

The list is far from complete, so feel free to add more objects.
Nevertheless, it should be close to "account everything" approach and
keep most workloads within bounds.  Malevolent users will be able to
breach the limit, but this was possible even with the former "account
everything" approach (simply because it did not account everything in
fact).

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Vladimir Davydov &lt;vdavydov@virtuozzo.com&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2016-01-13T03:20:32Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-01-13T03:20:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=34a9304a96d6351c2d35dcdc9293258378fc0bd8'/>
<id>urn:sha1:34a9304a96d6351c2d35dcdc9293258378fc0bd8</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:

 - cgroup v2 interface is now official.  It's no longer hidden behind a
   devel flag and can be mounted using the new cgroup2 fs type.

   Unfortunately, cpu v2 interface hasn't made it yet due to the
   discussion around in-process hierarchical resource distribution and
   only memory and io controllers can be used on the v2 interface at the
   moment.

 - The existing documentation which has always been a bit of mess is
   relocated under Documentation/cgroup-v1/. Documentation/cgroup-v2.txt
   is added as the authoritative documentation for the v2 interface.

 - Some features are added through for-4.5-ancestor-test branch to
   enable netfilter xt_cgroup match to use cgroup v2 paths.  The actual
   netfilter changes will be merged through the net tree which pulled in
   the said branch.

 - Various cleanups

* 'for-4.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: rename cgroup documentations
  cgroup: fix a typo.
  cgroup: Remove resource_counter.txt in Documentation/cgroup-legacy/00-INDEX.
  cgroup: demote subsystem init messages to KERN_DEBUG
  cgroup: Fix uninitialized variable warning
  cgroup: put controller Kconfig options in meaningful order
  cgroup: clean up the kernel configuration menu nomenclature
  cgroup_pids: fix a typo.
  Subject: cgroup: Fix incomplete dd command in blkio documentation
  cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends
  cpuset: Replace all instances of time_t with time64_t
  cgroup: replace unified-hierarchy.txt with a proper cgroup v2 documentation
  cgroup: rename Documentation/cgroups/ to Documentation/cgroup-legacy/
  cgroup: replace __DEVEL__sane_behavior with cgroup2 fs type
</content>
</entry>
<entry>
<title>Merge branch 'sched/urgent' into sched/core, to pick up fixes before merging new patches</title>
<updated>2016-01-06T10:02:29Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2016-01-06T10:02:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=567bee2803cb46caeb6011de5b738fde33dc3896'/>
<id>urn:sha1:567bee2803cb46caeb6011de5b738fde33dc3896</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Reset task's lockless wake-queues on fork()</title>
<updated>2016-01-06T10:01:07Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2015-12-21T17:17:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=093e5840ae76f1082633503964d035f40ed0216d'/>
<id>urn:sha1:093e5840ae76f1082633503964d035f40ed0216d</id>
<content type='text'>
In the following commit:

  7675104990ed ("sched: Implement lockless wake-queues")

we gained lockless wake-queues.

The -RT kernel managed to lockup itself with those. There could be multiple
attempts for task X to enqueue it for a wakeup _even_ if task X is already
running.

The reason is that task X could be runnable but not yet on CPU. The the
task performing the wakeup did not leave the CPU it could performe
multiple wakeups.

With the proper timming task X could be running and enqueued for a
wakeup. If this happens while X is performing a fork() then its its
child will have a !NULL `wake_q` member copied.

This is not a problem as long as the child task does not participate in
lockless wakeups :)

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 7675104990ed ("sched: Implement lockless wake-queues")
Link: http://lkml.kernel.org/r/20151221171710.GA5499@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/cputime: Convert vtime_seqlock to seqcount</title>
<updated>2015-12-04T09:34:46Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2015-11-19T15:47:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b7ce2277f087fd052e7e1bbf432f7fecbee82bb6'/>
<id>urn:sha1:b7ce2277f087fd052e7e1bbf432f7fecbee82bb6</id>
<content type='text'>
The cputime can only be updated by the current task itself, even in
vtime case. So we can safely use seqcount instead of seqlock as there
is no writer concurrency involved.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Chris Metcalf &lt;cmetcalf@ezchip.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Hiroshi Shimamoto &lt;h-shimamoto@ct.jp.nec.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Luiz Capitulino &lt;lcapitulino@redhat.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Paul E . McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1447948054-28668-8-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/cputime: Clarify vtime symbols and document them</title>
<updated>2015-12-04T09:34:44Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>fweisbec@gmail.com</email>
</author>
<published>2015-11-19T15:47:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7098c1eac75dc03fdbb7249171a6e68ce6044a5a'/>
<id>urn:sha1:7098c1eac75dc03fdbb7249171a6e68ce6044a5a</id>
<content type='text'>
VTIME_SLEEPING state happens either when:

1) The task is sleeping and no tickless delta is to be added on the task
   cputime stats.
2) The CPU isn't running vtime at all, so the same properties of 1) applies.

Lets rename the vtime symbol to reflect both states.

Signed-off-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Chris Metcalf &lt;cmetcalf@ezchip.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Hiroshi Shimamoto &lt;h-shimamoto@ct.jp.nec.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Luiz Capitulino &lt;lcapitulino@redhat.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Paul E . McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1447948054-28668-4-git-send-email-fweisbec@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: kill cgrp_ss_priv[CGROUP_CANFORK_COUNT] and friends</title>
<updated>2015-12-03T15:24:08Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2015-12-03T15:24:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b53202e6308939d33ba0c78712e850f891b4e76f'/>
<id>urn:sha1:b53202e6308939d33ba0c78712e850f891b4e76f</id>
<content type='text'>
Now that nobody use the "priv" arg passed to can_fork/cancel_fork/fork we can
kill CGROUP_CANFORK_COUNT/SUBSYS_TAG/etc and cgrp_ss_priv[] in copy_process().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup: pids: fix race between cgroup_post_fork() and cgroup_migrate()</title>
<updated>2015-11-30T14:48:18Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2015-11-27T18:57:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c9e75f0492b248aeaa7af8991a6fc9a21506bc96'/>
<id>urn:sha1:c9e75f0492b248aeaa7af8991a6fc9a21506bc96</id>
<content type='text'>
If the new child migrates to another cgroup before cgroup_post_fork() calls
subsys-&gt;fork(), then both pids_can_attach() and pids_fork() will do the same
pids_uncharge(old_pids) + pids_charge(pids) sequence twice.

Change copy_process() to call threadgroup_change_begin/threadgroup_change_end
unconditionally. percpu_down_read() is cheap and this allows other cleanups,
see the next changes.

Also, this way we can unify cgroup_threadgroup_rwsem and dup_mmap_sem.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2015-11-06T07:10:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-11-06T07:10:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2e3078af2c67730c479f1d183af5b367f5d95337'/>
<id>urn:sha1:2e3078af2c67730c479f1d183af5b367f5d95337</id>
<content type='text'>
Merge patch-bomb from Andrew Morton:

 - inotify tweaks

 - some ocfs2 updates (many more are awaiting review)

 - various misc bits

 - kernel/watchdog.c updates

 - Some of mm.  I have a huge number of MM patches this time and quite a
   lot of it is quite difficult and much will be held over to next time.

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (162 commits)
  selftests: vm: add tests for lock on fault
  mm: mlock: add mlock flags to enable VM_LOCKONFAULT usage
  mm: introduce VM_LOCKONFAULT
  mm: mlock: add new mlock system call
  mm: mlock: refactor mlock, munlock, and munlockall code
  kasan: always taint kernel on report
  mm, slub, kasan: enable user tracking by default with KASAN=y
  kasan: use IS_ALIGNED in memory_is_poisoned_8()
  kasan: Fix a type conversion error
  lib: test_kasan: add some testcases
  kasan: update reference to kasan prototype repo
  kasan: move KASAN_SANITIZE in arch/x86/boot/Makefile
  kasan: various fixes in documentation
  kasan: update log messages
  kasan: accurately determine the type of the bad access
  kasan: update reported bug types for kernel memory accesses
  kasan: update reported bug types for not user nor kernel memory accesses
  mm/kasan: prevent deadlock in kasan reporting
  mm/kasan: don't use kasan shadow pointer in generic functions
  mm/kasan: MODULE_VADDR is not available on all archs
  ...
</content>
</entry>
</feed>
