<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/include/asm-generic/pgtable.h, branch v3.19</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v3.19</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v3.19'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2014-10-27T12:27:30Z</updated>
<entry>
<title>s390/mm: pmdp_get_and_clear_full optimization</title>
<updated>2014-10-27T12:27:30Z</updated>
<author>
<name>Martin Schwidefsky</name>
<email>schwidefsky@de.ibm.com</email>
</author>
<published>2014-10-24T08:52:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fcbe08d66f57c368e77ca729dd01e6b539ffb3ff'/>
<id>urn:sha1:fcbe08d66f57c368e77ca729dd01e6b539ffb3ff</id>
<content type='text'>
Analog to ptep_get_and_clear_full define a variant of the
pmpd_get_and_clear primitive which gets the full hint from the
mmu_gather struct. This allows s390 to avoid a costly instruction
when destroying an address space.

Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>mm: softdirty: enable write notifications on VMAs after VM_SOFTDIRTY cleared</title>
<updated>2014-10-14T00:18:28Z</updated>
<author>
<name>Peter Feiner</name>
<email>pfeiner@google.com</email>
</author>
<published>2014-10-13T22:55:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=64e455079e1bd7787cc47be30b7f601ce682a5f6'/>
<id>urn:sha1:64e455079e1bd7787cc47be30b7f601ce682a5f6</id>
<content type='text'>
For VMAs that don't want write notifications, PTEs created for read faults
have their write bit set.  If the read fault happens after VM_SOFTDIRTY is
cleared, then the PTE's softdirty bit will remain clear after subsequent
writes.

Here's a simple code snippet to demonstrate the bug:

  char* m = mmap(NULL, getpagesize(), PROT_READ | PROT_WRITE,
                 MAP_ANONYMOUS | MAP_SHARED, -1, 0);
  system("echo 4 &gt; /proc/$PPID/clear_refs"); /* clear VM_SOFTDIRTY */
  assert(*m == '\0');     /* new PTE allows write access */
  assert(!soft_dirty(x));
  *m = 'x';               /* should dirty the page */
  assert(soft_dirty(x));  /* fails */

With this patch, write notifications are enabled when VM_SOFTDIRTY is
cleared.  Furthermore, to avoid unnecessary faults, write notifications
are disabled when VM_SOFTDIRTY is set.

As a side effect of enabling and disabling write notifications with
care, this patch fixes a bug in mprotect where vm_page_prot bits set by
drivers were zapped on mprotect.  An analogous bug was fixed in mmap by
commit c9d0bf241451 ("mm: uncached vma support with writenotify").

Signed-off-by: Peter Feiner &lt;pfeiner@google.com&gt;
Reported-by: Peter Feiner &lt;pfeiner@google.com&gt;
Suggested-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Jamie Liu &lt;jamieliu@google.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Naoya Horiguchi &lt;n-horiguchi@ah.jp.nec.com&gt;
Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (fixes from Andrew Morton)</title>
<updated>2014-10-10T02:26:14Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-10-10T02:26:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0cf744bc7ae8e0072159a901f6e1a159bbc30ffa'/>
<id>urn:sha1:0cf744bc7ae8e0072159a901f6e1a159bbc30ffa</id>
<content type='text'>
Merge patch-bomb from Andrew Morton:
 - part of OCFS2 (review is laggy again)
 - procfs
 - slab
 - all of MM
 - zram, zbud
 - various other random things: arch, filesystems.

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (164 commits)
  nosave: consolidate __nosave_{begin,end} in &lt;asm/sections.h&gt;
  include/linux/screen_info.h: remove unused ORIG_* macros
  kernel/sys.c: compat sysinfo syscall: fix undefined behavior
  kernel/sys.c: whitespace fixes
  acct: eliminate compile warning
  kernel/async.c: switch to pr_foo()
  include/linux/blkdev.h: use NULL instead of zero
  include/linux/kernel.h: deduplicate code implementing clamp* macros
  include/linux/kernel.h: rewrite min3, max3 and clamp using min and max
  alpha: use Kbuild logic to include &lt;asm-generic/sections.h&gt;
  frv: remove deprecated IRQF_DISABLED
  frv: remove unused cpuinfo_frv and friends to fix future build error
  zbud: avoid accessing last unused freelist
  zsmalloc: simplify init_zspage free obj linking
  mm/zsmalloc.c: correct comment for fullness group computation
  zram: use notify_free to account all free notifications
  zram: report maximum used memory
  zram: zram memory size limitation
  zsmalloc: change return value unit of zs_get_total_size_bytes
  zsmalloc: move pages_allocated to zs_pool
  ...
</content>
</entry>
<entry>
<title>mm: remove misleading ARCH_USES_NUMA_PROT_NONE</title>
<updated>2014-10-10T02:25:52Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-10-09T22:26:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6a33979d5bd7521497121c5ae4435d7003115a0f'/>
<id>urn:sha1:6a33979d5bd7521497121c5ae4435d7003115a0f</id>
<content type='text'>
ARCH_USES_NUMA_PROT_NONE was defined for architectures that implemented
_PAGE_NUMA using _PROT_NONE.  This saved using an additional PTE bit and
relied on the fact that PROT_NONE vmas were skipped by the NUMA hinting
fault scanner.  This was found to be conceptually confusing with a lot of
implicit assumptions and it was asked that an alternative be found.

Commit c46a7c81 "x86: define _PAGE_NUMA by reusing software bits on the
PMD and PTE levels" redefined _PAGE_NUMA on x86 to be one of the swap PTE
bits and shrunk the maximum possible swap size but it did not go far
enough.  There are no architectures that reuse _PROT_NONE as _PROT_NUMA
but the relics still exist.

This patch removes ARCH_USES_NUMA_PROT_NONE and removes some unnecessary
duplication in powerpc vs the generic implementation by defining the types
the core NUMA helpers expected to exist from x86 with their ppc64
equivalent.  This necessitated that a PTE bit mask be created that
identified the bits that distinguish present from NUMA pte entries but it
is expected this will only differ between arches based on _PAGE_PROTNONE.
The naming for the generic helpers was taken from x86 originally but ppc64
has types that are equivalent for the purposes of the helper so they are
mapped instead of duplicating code.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A. Shutemov" &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>PCI: Add pci_remap_iospace() to map bus I/O resources</title>
<updated>2014-09-30T23:08:57Z</updated>
<author>
<name>Liviu Dudau</name>
<email>Liviu.Dudau@arm.com</email>
</author>
<published>2014-09-29T14:29:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8b921acfeffdb0b45085da862fc301a2d25ed2cf'/>
<id>urn:sha1:8b921acfeffdb0b45085da862fc301a2d25ed2cf</id>
<content type='text'>
Add pci_remap_iospace() to map bus I/O resources into the CPU virtual
address space.  Architectures with special needs may provide their own
version, but most should be able to use this one.

This function is useful for PCI host bridge drivers that need to map the
PCI I/O resources into virtual memory space.

[bhelgaas: phys_addr description, drop temporary "err" variable]
Signed-off-by: Liviu Dudau &lt;Liviu.Dudau@arm.com&gt;
Signed-off-by: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Reviewed-by: Rob Herring &lt;robh@kernel.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
CC: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>x86: define _PAGE_NUMA by reusing software bits on the PMD and PTE levels</title>
<updated>2014-06-04T23:53:55Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-06-04T23:06:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c46a7c817e662a820373bb76b88d0ad67d6abe5d'/>
<id>urn:sha1:c46a7c817e662a820373bb76b88d0ad67d6abe5d</id>
<content type='text'>
_PAGE_NUMA is currently an alias of _PROT_PROTNONE to trap NUMA hinting
faults on x86.  Care is taken such that _PAGE_NUMA is used only in
situations where the VMA flags distinguish between NUMA hinting faults
and prot_none faults.  This decision was x86-specific and conceptually
it is difficult requiring special casing to distinguish between PROTNONE
and NUMA ptes based on context.

Fundamentally, we only need the _PAGE_NUMA bit to tell the difference
between an entry that is really unmapped and a page that is protected
for NUMA hinting faults as if the PTE is not present then a fault will
be trapped.

Swap PTEs on x86-64 use the bits after _PAGE_GLOBAL for the offset.
This patch shrinks the maximum possible swap size and uses the bit to
uniquely distinguish between NUMA hinting ptes and swap ptes.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Steven Noonan &lt;steven@uplinklabs.net&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: use paravirt friendly ops for NUMA hinting ptes</title>
<updated>2014-04-18T23:40:09Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-04-18T22:07:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=29c7787075c92ca8af353acd5301481e6f37082f'/>
<id>urn:sha1:29c7787075c92ca8af353acd5301481e6f37082f</id>
<content type='text'>
David Vrabel identified a regression when using automatic NUMA balancing
under Xen whereby page table entries were getting corrupted due to the
use of native PTE operations.  Quoting him

	Xen PV guest page tables require that their entries use machine
	addresses if the preset bit (_PAGE_PRESENT) is set, and (for
	successful migration) non-present PTEs must use pseudo-physical
	addresses.  This is because on migration MFNs in present PTEs are
	translated to PFNs (canonicalised) so they may be translated back
	to the new MFN in the destination domain (uncanonicalised).

	pte_mknonnuma(), pmd_mknonnuma(), pte_mknuma() and pmd_mknuma()
	set and clear the _PAGE_PRESENT bit using pte_set_flags(),
	pte_clear_flags(), etc.

	In a Xen PV guest, these functions must translate MFNs to PFNs
	when clearing _PAGE_PRESENT and translate PFNs to MFNs when setting
	_PAGE_PRESENT.

His suggested fix converted p[te|md]_[set|clear]_flags to using
paravirt-friendly ops but this is overkill.  He suggested an alternative
of using p[te|md]_modify in the NUMA page table operations but this is
does more work than necessary and would require looking up a VMA for
protections.

This patch modifies the NUMA page table operations to use paravirt
friendly operations to set/clear the flags of interest.  Unfortunately
this will take a performance hit when updating the PTEs on
CONFIG_PARAVIRT but I do not see a way around it that does not break
Xen.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Acked-by: David Vrabel &lt;david.vrabel@citrix.com&gt;
Tested-by: David Vrabel &lt;david.vrabel@citrix.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Fengguang Wu &lt;fengguang.wu@intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Steven Noonan &lt;steven@uplinklabs.net&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@gmail.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: add support for discard of unused ptes</title>
<updated>2014-02-21T07:50:18Z</updated>
<author>
<name>Konstantin Weitz</name>
<email>konstantin.weitz@gmail.com</email>
</author>
<published>2013-04-17T11:59:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=45961722f8e30ceab9d135b1ddc0947d53aef7c3'/>
<id>urn:sha1:45961722f8e30ceab9d135b1ddc0947d53aef7c3</id>
<content type='text'>
In a virtualized environment and given an appropriate interface the guest
can mark pages as unused while they are free (for the s390 implementation
see git commit 45e576b1c3d00206 "guest page hinting light"). For the host
the unused state is a property of the pte.

This patch adds the primitive 'pte_unused' and code to the host swap out
handler so that pages marked as unused by all mappers are not swapped out
but discarded instead, thus saving one IO for swap out and potentially
another one for swap in.

[ Martin Schwidefsky: patch reordering and simplification ]

Signed-off-by: Konstantin Weitz &lt;konstantin.weitz@gmail.com&gt;
Signed-off-by: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
</content>
</entry>
<entry>
<title>mm: Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit</title>
<updated>2014-02-17T00:19:36Z</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2014-02-12T03:43:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=56eecdb912b536a4fa97fb5bfe5a940a54d79be6'/>
<id>urn:sha1:56eecdb912b536a4fa97fb5bfe5a940a54d79be6</id>
<content type='text'>
Archs like ppc64 doesn't do tlb flush in set_pte/pmd functions when using
a hash table MMU for various reasons (the flush is handled as part of
the PTE modification when necessary).

ppc64 thus doesn't implement flush_tlb_range for hash based MMUs.

Additionally ppc64 require the tlb flushing to be batched within ptl locks.

The reason to do that is to ensure that the hash page table is in sync with
linux page table.

We track the hpte index in linux pte and if we clear them without flushing
hash and drop the ptl lock, we can have another cpu update the pte and can
end up with duplicate entry in the hash table, which is fatal.

We also want to keep set_pte_at simpler by not requiring them to do hash
flush for performance reason. We do that by assuming that set_pte_at() is
never *ever* called on a PTE that is already valid.

This was the case until the NUMA code went in which broke that assumption.

Fix that by introducing a new pair of helpers to set _PAGE_NUMA in a
way similar to ptep/pmdp_set_wrprotect(), with a generic implementation
using set_pte_at() and a powerpc specific one using the appropriate
mechanism needed to keep the hash table in sync.

Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
</entry>
<entry>
<title>powerpc/thp: Fix crash on mremap</title>
<updated>2014-01-15T04:46:38Z</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.vnet.ibm.com</email>
</author>
<published>2014-01-13T06:04:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b3084f4db3aeb991c507ca774337c7e7893ed04f'/>
<id>urn:sha1:b3084f4db3aeb991c507ca774337c7e7893ed04f</id>
<content type='text'>
This patch fix the below crash

NIP [c00000000004cee4] .__hash_page_thp+0x2a4/0x440
LR [c0000000000439ac] .hash_page+0x18c/0x5e0
...
Call Trace:
[c000000736103c40] [00001ffffb000000] 0x1ffffb000000(unreliable)
[437908.479693] [c000000736103d50] [c0000000000439ac] .hash_page+0x18c/0x5e0
[437908.479699] [c000000736103e30] [c00000000000924c] .do_hash_page+0x4c/0x58

On ppc64 we use the pgtable for storing the hpte slot information and
store address to the pgtable at a constant offset (PTRS_PER_PMD) from
pmd. On mremap, when we switch the pmd, we need to withdraw and deposit
the pgtable again, so that we find the pgtable at PTRS_PER_PMD offset
from new pmd.

We also want to move the withdraw and deposit before the set_pmd so
that, when page fault find the pmd as trans huge we can be sure that
pgtable can be located at the offset.

Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.vnet.ibm.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Signed-off-by: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
</content>
</entry>
</feed>
