<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/Documentation/vm, branch v4.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v4.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v4.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2015-02-12T01:06:00Z</updated>
<entry>
<title>mm:add KPF_ZERO_PAGE flag for /proc/kpageflags</title>
<updated>2015-02-12T01:06:00Z</updated>
<author>
<name>Wang, Yalin</name>
<email>Yalin.Wang@sonymobile.com</email>
</author>
<published>2015-02-11T23:24:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=56873f43abdcd574b25105867a990f067747b2f4'/>
<id>urn:sha1:56873f43abdcd574b25105867a990f067747b2f4</id>
<content type='text'>
Add a KPF_ZERO_PAGE flag for the zero page, so that userspace processes
can detect the zero page in /proc/kpageflags and do memory analysis more
accurately.
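
As a rough illustration of how userspace might use the new flag, the
Python sketch below decodes raw /proc/kpageflags data.  It assumes one
native-endian 64-bit word per page frame, indexed by PFN, and that
KPF_ZERO_PAGE is bit 24 as in include/uapi/linux/kernel-page-flags.h.
The helper names are hypothetical and not part of this patch.

```python
import sys

KPF_ZERO_PAGE = 24   # assumed bit number, taken from kernel-page-flags.h
ENTRY_SIZE = 8       # assumed layout: one 64-bit flags word per page frame


def is_zero_page(flags):
    """Return True if the KPF_ZERO_PAGE bit is set in a flags word."""
    return (flags >> KPF_ZERO_PAGE) % 2 == 1


def zero_page_pfns(raw, first_pfn=0):
    """List the PFNs whose entry in raw (bytes read from kpageflags) has the bit set."""
    pfns = []
    for offset in range(0, len(raw) - ENTRY_SIZE + 1, ENTRY_SIZE):
        flags = int.from_bytes(raw[offset:offset + ENTRY_SIZE], sys.byteorder)
        if is_zero_page(flags):
            pfns.append(first_pfn + offset // ENTRY_SIZE)
    return pfns
```

To scan real memory, one would open /proc/kpageflags in binary mode
(normally as root), seek to PFN * 8, and pass the bytes read to
zero_page_pfns() together with the starting PFN.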

Signed-off-by: Yalin Wang &lt;yalin.wang@sonymobile.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Cc: Konstantin Khlebnikov &lt;koct9i@gmail.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial</title>
<updated>2015-02-11T02:57:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-02-11T02:57:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=29afc4e9a408f2304e09c6dd0dbcfbd2356d0faa'/>
<id>urn:sha1:29afc4e9a408f2304e09c6dd0dbcfbd2356d0faa</id>
<content type='text'>
Pull trivial tree changes from Jiri Kosina:
 "Patches from trivial.git that keep the world turning around.

  Mostly documentation and comment fixes, and two corner-case code
  fixes from Alan Cox"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial:
  kexec, Kconfig: spell "architecture" properly
  mm: fix cleancache debugfs directory path
  blackfin: mach-common: ints-priority: remove unused function
  doubletalk: probe failure causes OOPS
  ARM: cache-l2x0.c: Make it clear that cache-l2x0 handles L310 cache controller
  msdos_fs.h: fix 'fields' in comment
  scsi: aic7xxx: fix comment
  ARM: l2c: fix comment
  ibmraid: fix writeable attribute with no store method
  dynamic_debug: fix comment
  doc: usbmon: fix spelling s/unpriviledged/unprivileged/
  x86: init_mem_mapping(): use capital BIOS in comment
</content>
</entry>
<entry>
<title>mm: replace remap_file_pages() syscall with emulation</title>
<updated>2015-02-10T22:30:30Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2015-02-10T22:09:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c8d78c1823f46519473949d33f0d1d33fe21ea16'/>
<id>urn:sha1:c8d78c1823f46519473949d33f0d1d33fe21ea16</id>
<content type='text'>
remap_file_pages(2) was invented to efficiently map parts of a huge file
into a limited 32-bit virtual address space, such as in database
workloads.

Nonlinear mappings are a pain to support, and it seems there are no
legitimate use cases nowadays since 64-bit systems are widely available.

Let's drop it and get rid of all this special-cased code.

The patch replaces the syscall with an emulation which creates a new VMA
on each remap_file_pages() call, unless it can be merged with an adjacent
one.

I didn't find *any* real code that uses remap_file_pages(2) to test the
emulation's impact on.  I've checked Debian code search and the source of
all packages in ALT Linux.  No real users: only libc wrappers and mentions
in strace, gdb, valgrind and this kind of stuff.

There are a few basic tests in LTP for the syscall.  They work just fine
with the emulation.

To test the performance impact, I've written a small test case which
demonstrates pretty much the worst-case scenario: map a 4G shmfs file,
write the page's pgoff to the beginning of every page, remap the pages in
reverse order, read every page.

The test creates 1 million VMAs if the emulation is in use, so I had to
set vm.max_map_count to 1100000 to avoid -ENOMEM.
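
The VMA count follows from the mapping size: in the worst case the
emulation needs one VMA per remapped page.  A quick check of that
arithmetic in Python, assuming 4 KiB pages and a default
vm.max_map_count of 65530 (both are assumptions, not stated in this
message):

```python
SIZE = 4096 * 1024 * 1024      # 4 GiB mapping, as in the test case
PAGE_SIZE = 4096               # assumed page size
DEFAULT_MAX_MAP_COUNT = 65530  # assumed default value of vm.max_map_count

# Worst case: every page is remapped out of order, so no VMAs can be merged.
worst_case_vmas = SIZE // PAGE_SIZE

print(worst_case_vmas)                          # 1048576
print(worst_case_vmas > DEFAULT_MAX_MAP_COUNT)  # True
```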

Before:		23.3 ( +-  4.31% ) seconds
After:		43.9 ( +-  0.85% ) seconds
Slowdown:	1.88x

I believe we can live with that.

Test case:

        #define _GNU_SOURCE
        #include &lt;assert.h&gt;
        #include &lt;stdlib.h&gt;
        #include &lt;stdio.h&gt;
        #include &lt;sys/mman.h&gt;

        #define MB	(1024UL * 1024)
        #define SIZE	(4096 * MB)

        int main(int argc, char **argv)
        {
                unsigned long *p;
                long i, pass;

                for (pass = 0; pass &lt; 10; pass++) {
                        p = mmap(NULL, SIZE, PROT_READ|PROT_WRITE,
                                        MAP_SHARED | MAP_ANONYMOUS, -1, 0);
                        if (p == MAP_FAILED) {
                                perror("mmap");
                                return -1;
                        }

                        for (i = 0; i &lt; SIZE / 4096; i++)
                                p[i * 4096 / sizeof(*p)] = i;

                        for (i = 0; i &lt; SIZE / 4096; i++) {
                                if (remap_file_pages(p + i * 4096 / sizeof(*p), 4096,
                                                0, (SIZE - 4096 * (i + 1)) &gt;&gt; 12, 0)) {
                                        perror("remap_file_pages");
                                        return -1;
                                }
                        }

                        for (i = SIZE / 4096 - 1; i &gt;= 0; i--)
                                assert(p[i * 4096 / sizeof(*p)] == SIZE / 4096 - i - 1);

                        munmap(p, SIZE);
                }

                return 0;
        }

[akpm@linux-foundation.org: fix spello]
[sasha.levin@oracle.com: initialize populate before usage]
[sasha.levin@oracle.com: grab file ref to prevent race while mmaping]
Signed-off-by: "Kirill A. Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Armin Rigo &lt;arigo@tunes.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: fix cleancache debugfs directory path</title>
<updated>2015-01-20T13:08:31Z</updated>
<author>
<name>Marcin Jabrzyk</name>
<email>m.jabrzyk@samsung.com</email>
</author>
<published>2015-01-07T10:14:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8fc8f4d57c8d146971e4d1456f8e93a22e1487c3'/>
<id>urn:sha1:8fc8f4d57c8d146971e4d1456f8e93a22e1487c3</id>
<content type='text'>
Minor fixes for cleancache about wrong debugfs paths
in documentation and code comment.

Signed-off-by: Marcin Jabrzyk &lt;m.jabrzyk@samsung.com&gt;
Signed-off-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (second patch-bomb from Andrew)</title>
<updated>2014-12-13T21:00:36Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-12-13T21:00:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=78a45c6f067824cf5d0a9fedea7339ac2e28603c'/>
<id>urn:sha1:78a45c6f067824cf5d0a9fedea7339ac2e28603c</id>
<content type='text'>
Merge second patchbomb from Andrew Morton:
 - the rest of MM
 - misc fs fixes
 - add execveat() syscall
 - new ratelimit feature for fault-injection
 - decompressor updates
 - ipc/ updates
 - fallocate feature creep
 - fsnotify cleanups
 - a few other misc things

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (99 commits)
  cgroups: Documentation: fix trivial typos and wrong paragraph numberings
  parisc: percpu: update comments referring to __get_cpu_var
  percpu: update local_ops.txt to reflect this_cpu operations
  percpu: remove __get_cpu_var and __raw_get_cpu_var macros
  fsnotify: remove destroy_list from fsnotify_mark
  fsnotify: unify inode and mount marks handling
  fallocate: create FAN_MODIFY and IN_MODIFY events
  mm/cma: make kmemleak ignore CMA regions
  slub: fix cpuset check in get_any_partial
  slab: fix cpuset check in fallback_alloc
  shmdt: use i_size_read() instead of -&gt;i_size
  ipc/shm.c: fix overly aggressive shmdt() when calls span multiple segments
  ipc/msg: increase MSGMNI, remove scaling
  ipc/sem.c: increase SEMMSL, SEMMNI, SEMOPM
  ipc/sem.c: change memory barrier in sem_lock() to smp_rmb()
  lib/decompress.c: consistency of compress formats for kernel image
  decompress_bunzip2: off by one in get_next_block()
  usr/Kconfig: make initrd compression algorithm selection not expert
  fault-inject: add ratelimit option
  ratelimit: add initialization macro
  ...
</content>
</entry>
<entry>
<title>Documentation: add new page_owner document</title>
<updated>2014-12-13T20:42:48Z</updated>
<author>
<name>Joonsoo Kim</name>
<email>iamjoonsoo.kim@lge.com</email>
</author>
<published>2014-12-13T00:56:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=16a7ade8af3b4ad30aec880177ff291bb5ea86d1'/>
<id>urn:sha1:16a7ade8af3b4ad30aec880177ff291bb5ea86d1</id>
<content type='text'>
Page owner tracks who allocated each page.  This document explains what
the page owner feature is and what its merits are.  A simple HOW-TO is
also included.  See the document for detailed information.

Signed-off-by: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Dave Hansen &lt;dave@sr71.net&gt;
Cc: Michal Nazarewicz &lt;mina86@mina86.com&gt;
Cc: Jungsoo Son &lt;jungsoo.son@lge.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Documentation: vm: Add 1GB large page support information</title>
<updated>2014-11-06T20:14:11Z</updated>
<author>
<name>Masanari Iida</name>
<email>standby24x7@gmail.com</email>
</author>
<published>2014-11-06T15:31:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c0d7305cb3e5e77dba822706e21898314e893fb7'/>
<id>urn:sha1:c0d7305cb3e5e77dba822706e21898314e893fb7</id>
<content type='text'>
This patch adds 1GB large page support information in
Documentation/vm/hugetlbpage.txt

Reference:
https://lkml.org/lkml/2014/10/31/366

Signed-off-by: Masanari Iida &lt;standby24x7@gmail.com&gt;
Reviewed-by: Luiz Capitulino &lt;lcapitulino@redhat.com&gt;
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
</content>
</entry>
<entry>
<title>Docs: Document that the sticky bit is understood by hugetlbfs</title>
<updated>2014-10-22T18:26:04Z</updated>
<author>
<name>Kirill Smelkov</name>
<email>kirr@nexedi.com</email>
</author>
<published>2014-10-22T15:54:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=011bc4870fb2265fcd0cc941d0c5d06a3fd581b5'/>
<id>urn:sha1:011bc4870fb2265fcd0cc941d0c5d06a3fd581b5</id>
<content type='text'>
Commit 75897d60 (hugetlb: allow sticky directory mount option) added
support for mounting hugetlbfs with sticky option set, like /tmp is
usually mounted, but forgot to document that.

Acked-by: Ken Chen &lt;kenchen@google.com&gt;
Signed-off-by: Kirill Smelkov &lt;kirr@nexedi.com&gt;
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
</content>
</entry>
<entry>
<title>mm: mark remap_file_pages() syscall as deprecated</title>
<updated>2014-06-06T23:08:17Z</updated>
<author>
<name>Kirill A. Shutemov</name>
<email>kirill.shutemov@linux.intel.com</email>
</author>
<published>2014-06-06T21:38:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=33041a0d76d3c3e0aff28ac95a2ffdedf1282dbc'/>
<id>urn:sha1:33041a0d76d3c3e0aff28ac95a2ffdedf1282dbc</id>
<content type='text'>
The remap_file_pages() system call is used to create a nonlinear
mapping, that is, a mapping in which the pages of the file are mapped
into a nonsequential order in memory.  The advantage of using
remap_file_pages() over using repeated calls to mmap(2) is that the
former approach does not require the kernel to create additional VMA
(Virtual Memory Area) data structures.

Supporting nonlinear mappings requires a significant amount of
non-trivial code in the kernel virtual memory subsystem, including hot
paths.  Also, to make nonlinear mappings work, the kernel needs a way to
distinguish normal page table entries from entries with a file offset
(pte_file).  The kernel reserves a flag in the PTE for this purpose.  PTE
flags are a scarce resource, especially on some CPU architectures.  It
would be nice to free up the flag for other usage.

Fortunately, there are not many users of remap_file_pages() in the wild.
It's only known that one enterprise RDBMS implementation uses the
syscall on 32-bit systems to map files bigger than can linearly fit into
32-bit virtual address space.  This use-case is not critical anymore
since 64-bit systems are widely available.

The plan is to deprecate the syscall and replace it with an emulation.
The emulation will create new VMAs instead of nonlinear mappings.  It's
going to be slower for the rare users of remap_file_pages(), but the ABI
is preserved.

One side effect of the emulation (apart from performance) is that a user
can hit the vm.max_map_count limit more easily due to the additional VMAs.
See the comment for DEFAULT_MAX_MAP_COUNT for more details on the limit.

[akpm@linux-foundation.org: fix spello]
Signed-off-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Armin Rigo &lt;arigo@tunes.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory-failure.c: support use of a dedicated thread to handle SIGBUS(BUS_MCEERR_AO)</title>
<updated>2014-06-04T23:54:13Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>n-horiguchi@ah.jp.nec.com</email>
</author>
<published>2014-06-04T23:11:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3ba08129e38437561df44c36b7ea9081185d5333'/>
<id>urn:sha1:3ba08129e38437561df44c36b7ea9081185d5333</id>
<content type='text'>
Currently the memory error handler handles action optional errors in a
deferred manner by default.  If a recovery-aware application wants to
handle them immediately, it can do so by setting the PF_MCE_EARLY flag.
However, such a signal can be sent only to the main thread, so it's
problematic if the application wants to have a dedicated thread to
handle such signals.

So this patch adds dedicated thread support to the memory error handler.
We have a PF_MCE_EARLY flag for each thread separately, so with this patch
the AO signal is sent to the thread with the PF_MCE_EARLY flag set, not the
main thread.  If you want to implement a dedicated thread, you call prctl()
to set PF_MCE_EARLY on the thread.
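
A minimal sketch of that prctl() call, written in Python with ctypes.
The numeric values are assumed from the uapi header linux/prctl.h, and
request_early_kill() is a hypothetical helper name, not something this
patch adds:

```python
import ctypes
import os

# Values assumed from the Linux uapi header linux/prctl.h.
PR_MCE_KILL = 33
PR_MCE_KILL_SET = 1
PR_MCE_KILL_EARLY = 1


def request_early_kill():
    """Set the early-kill policy (PF_MCE_EARLY) on the calling thread."""
    libc = ctypes.CDLL(None, use_errno=True)
    # prctl() takes unsigned long arguments; the last two must be zero here.
    args = [ctypes.c_ulong(v) for v in (PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0)]
    if libc.prctl(PR_MCE_KILL, *args) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
```

The call affects only the thread that makes it, so with this patch a
dedicated handler thread would issue it itself rather than relying on the
main thread.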

The memory error handler collects processes to be killed, so this patch
lets it check the PF_MCE_EARLY flag on each thread in the collecting routines.

No behavioral change for all non-early kill cases.

Tony said:

: The old behavior was crazy - someone with a multithreaded process might
: well expect that if they call prctl(PF_MCE_EARLY) in just one thread, then
: that thread would see the SIGBUS with si_code = BUS_MCEERR_AO - even if
: that thread wasn't the main thread for the process.

[akpm@linux-foundation.org: coding-style fixes]
Signed-off-by: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Reviewed-by: Tony Luck &lt;tony.luck@intel.com&gt;
Cc: Kamil Iskra &lt;iskra@mcs.anl.gov&gt;
Cc: Andi Kleen &lt;andi@firstfloor.org&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: Chen Gong &lt;gong.chen@linux.jf.intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[3.2+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
