<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/mm/memory.c, branch v5.10</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.10</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.10'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-10-18T16:27:10Z</updated>
<entry>
<title>mm: allow a NULL fn callback in apply_to_page_range</title>
<updated>2020-10-18T16:27:10Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-10-17T23:15:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eeb4a05fcef39a720d24846356cf65a07e71d7a1'/>
<id>urn:sha1:eeb4a05fcef39a720d24846356cf65a07e71d7a1</id>
<content type='text'>
Besides calling the callback on each page, apply_to_page_range also has
the effect of pre-faulting all PTEs for the range.  To support callers
that only need the pre-faulting, make the callback optional.

Based on a patch from Minchan Kim &lt;minchan@kernel.org&gt;.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Jani Nikula &lt;jani.nikula@linux.intel.com&gt;
Cc: Joonas Lahtinen &lt;joonas.lahtinen@linux.intel.com&gt;
Cc: Juergen Gross &lt;jgross@suse.com&gt;
Cc: Matthew Auld &lt;matthew.auld@intel.com&gt;
Cc: "Matthew Wilcox (Oracle)" &lt;willy@infradead.org&gt;
Cc: Nitin Gupta &lt;ngupta@vflare.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rodrigo Vivi &lt;rodrigo.vivi@intel.com&gt;
Cc: Stefano Stabellini &lt;sstabellini@kernel.org&gt;
Cc: Tvrtko Ursulin &lt;tvrtko.ursulin@intel.com&gt;
Cc: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Link: https://lkml.kernel.org/r/20201002122204.1534411-5-hch@lst.de
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory: remove page fault assumption of compound page size</title>
<updated>2020-10-16T18:11:15Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2020-10-16T03:05:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d01ac3c35214ce362f50cada37cb7bab8c801896'/>
<id>urn:sha1:d01ac3c35214ce362f50cada37cb7bab8c801896</id>
<content type='text'>
A compound page in the page cache will not necessarily be of PMD size,
so check explicitly.

[willy@infradead.org: fix remove page fault assumption of compound page size]
  Link: https://lkml.kernel.org/r/20201001152259.14932-1-willy@infradead.org

Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/20200908195539.25896-3-willy@infradead.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping</title>
<updated>2020-10-15T21:43:29Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-10-15T21:43:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5a32c3413d3340f90c82c84b375ad4b335a59f28'/>
<id>urn:sha1:5a32c3413d3340f90c82c84b375ad4b335a59f28</id>
<content type='text'>
Pull dma-mapping updates from Christoph Hellwig:

 - rework the non-coherent DMA allocator

 - move private definitions out of &lt;linux/dma-mapping.h&gt;

 - lower CMA_ALIGNMENT (Paul Cercueil)

 - remove the omap1 dma address translation in favor of the common code

 - make dma-direct aware of multiple dma offset ranges (Jim Quinlan)

 - support per-node DMA CMA areas (Barry Song)

 - increase the default seg boundary limit (Nicolin Chen)

 - misc fixes (Robin Murphy, Thomas Tai, Xu Wang)

 - various cleanups

* tag 'dma-mapping-5.10' of git://git.infradead.org/users/hch/dma-mapping: (63 commits)
  ARM/ixp4xx: add a missing include of dma-map-ops.h
  dma-direct: simplify the DMA_ATTR_NO_KERNEL_MAPPING handling
  dma-direct: factor out a dma_direct_alloc_from_pool helper
  dma-direct check for highmem pages in dma_direct_alloc_pages
  dma-mapping: merge &lt;linux/dma-noncoherent.h&gt; into &lt;linux/dma-map-ops.h&gt;
  dma-mapping: move large parts of &lt;linux/dma-direct.h&gt; to kernel/dma
  dma-mapping: move dma-debug.h to kernel/dma/
  dma-mapping: remove &lt;asm/dma-contiguous.h&gt;
  dma-mapping: merge &lt;linux/dma-contiguous.h&gt; into &lt;linux/dma-map-ops.h&gt;
  dma-contiguous: remove dma_contiguous_set_default
  dma-contiguous: remove dev_set_cma_area
  dma-contiguous: remove dma_declare_contiguous
  dma-mapping: split &lt;linux/dma-mapping.h&gt;
  cma: decrease CMA_ALIGNMENT lower limit to 2
  firewire-ohci: use dma_alloc_pages
  dma-iommu: implement -&gt;alloc_noncoherent
  dma-mapping: add new {alloc,free}_noncoherent dma_map_ops methods
  dma-mapping: add a new dma_alloc_pages API
  dma-mapping: remove dma_cache_sync
  53c700: convert to dma_alloc_noncoherent
  ...
</content>
</entry>
<entry>
<title>mm: remove src/dst mm parameter in copy_page_range()</title>
<updated>2020-10-14T01:38:32Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2020-10-13T23:54:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c78f463649d60f4ed12df97a32def4d77b583853'/>
<id>urn:sha1:c78f463649d60f4ed12df97a32def4d77b583853</id>
<content type='text'>
Both of the mm pointers are not needed after commit 7a4830c380f3
("mm/fork: Pass new vma pointer into copy_page_range()").

Jason Gunthorpe also reported that the ordering of copy_page_range() is
odd.  Since working at it, reorder the parameters to be logical, by (1)
always put the dst_* fields to be before src_* fields, and (2) keep the
same type of parameters together.

[peterx@redhat.com: further reorder some parameters and line format, per Jason]
  Link: https://lkml.kernel.org/r/20201002192647.7161-1-peterx@redhat.com
[peterx@redhat.com: fix warnings]
  Link: https://lkml.kernel.org/r/20201006200138.GA6026@xz-x1

Reported-by: Kirill A. Shutemov &lt;kirill@shutemov.name&gt;
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Acked-by: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/20200930204950.6668-1-peterx@redhat.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory.c: fix spello of "function"</title>
<updated>2020-10-14T01:38:31Z</updated>
<author>
<name>Randy Dunlap</name>
<email>rdunlap@infradead.org</email>
</author>
<published>2020-10-13T23:54:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f1dc1685f6ca25d435d58d16e53bb6f336f54cac'/>
<id>urn:sha1:f1dc1685f6ca25d435d58d16e53bb6f336f54cac</id>
<content type='text'>
Fix typo/spello of "function".

Signed-off-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: https://lkml.kernel.org/r/e7bf180e-c558-b1d5-9a15-6d9708823c9c@infradead.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory.c: replace vmf-&gt;vma with variable vma</title>
<updated>2020-10-14T01:38:31Z</updated>
<author>
<name>Yanfei Xu</name>
<email>yanfei.xu@windriver.com</email>
</author>
<published>2020-10-13T23:53:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a7069ee3f891b22727303dfe0857ae47ef0c4936'/>
<id>urn:sha1:a7069ee3f891b22727303dfe0857ae47ef0c4936</id>
<content type='text'>
The code has declared a vma_struct named vma which is assigned a value of
vmf-&gt;vma.  Thus, use variable vma directly here.

Signed-off-by: Yanfei Xu &lt;yanfei.xu@windriver.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Link: http://lkml.kernel.org/r/20200818084607.37616-1-yanfei.xu@windriver.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory.c: fix typo in __do_fault() comment</title>
<updated>2020-10-14T01:38:31Z</updated>
<author>
<name>Yanfei Xu</name>
<email>yanfei.xu@windriver.com</email>
</author>
<published>2020-10-13T23:53:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d383807aaf7796d328a43f20b98b99c2fcc664d7'/>
<id>urn:sha1:d383807aaf7796d328a43f20b98b99c2fcc664d7</id>
<content type='text'>
It's "pte_alloc_one", not "pte_alloc_pne". Let's fix that.

Signed-off-by: Yanfei Xu &lt;yanfei.xu@windriver.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Link: http://lkml.kernel.org/r/20200818104339.5310-1-yanfei.xu@windriver.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: avoid early COW write protect games during fork()</title>
<updated>2020-10-08T17:11:32Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-09-28T19:50:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f3c64eda3e5097ec3198cb271f5f504d65d67131'/>
<id>urn:sha1:f3c64eda3e5097ec3198cb271f5f504d65d67131</id>
<content type='text'>
In commit 70e806e4e645 ("mm: Do early cow for pinned pages during fork()
for ptes") we write-protected the PTE before doing the page pinning
check, in order to avoid a race with concurrent fast-GUP pinning (which
doesn't take the mm semaphore or the page table lock).

That trick doesn't actually work - it doesn't handle memory ordering
properly, and doing so would be prohibitively expensive.

It also isn't really needed.  While we're moving in the direction of
allowing and supporting page pinning without marking the pinned area
with MADV_DONTFORK, the fact is that we've never really supported this
kind of odd "concurrent fork() and page pinning", and doing the
serialization on a pte level is just wrong.

We can add serialization with a per-mm sequence counter, so we know how
to solve that race properly, but we'll do that at a more appropriate
time.  Right now this just removes the write protect games.

It also turns out that the write protect games actually break on Power,
as reported by Aneesh Kumar:

 "Architecture like ppc64 expects set_pte_at to be not used for updating
  a valid pte. This is further explained in commit 56eecdb912b5 ("mm:
  Use ptep/pmdp_set_numa() for updating _PAGE_NUMA bit")"

and the code triggered a warning there:

  WARNING: CPU: 0 PID: 30613 at arch/powerpc/mm/pgtable.c:185 set_pte_at+0x2a8/0x3a0 arch/powerpc/mm/pgtable.c:185
  Call Trace:
    copy_present_page mm/memory.c:857 [inline]
    copy_present_pte mm/memory.c:899 [inline]
    copy_pte_range mm/memory.c:1014 [inline]
    copy_pmd_range mm/memory.c:1092 [inline]
    copy_pud_range mm/memory.c:1127 [inline]
    copy_p4d_range mm/memory.c:1150 [inline]
    copy_page_range+0x1f6c/0x2cc0 mm/memory.c:1212
    dup_mmap kernel/fork.c:592 [inline]
    dup_mm+0x77c/0xab0 kernel/fork.c:1355
    copy_mm kernel/fork.c:1411 [inline]
    copy_process+0x1f00/0x2740 kernel/fork.c:2070
    _do_fork+0xc4/0x10b0 kernel/fork.c:2429

Link: https://lore.kernel.org/lkml/CAHk-=wiWr+gO0Ro4LvnJBMs90OiePNyrE3E+pJvc9PzdBShdmw@mail.gmail.com/
Link: https://lore.kernel.org/linuxppc-dev/20201008092541.398079-1-aneesh.kumar@linux.ibm.com/
Reported-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Tested-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: John Hubbard &lt;jhubbard@nvidia.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Kirill Shutemov &lt;kirill@shutemov.name&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>dma-mapping: move dma-debug.h to kernel/dma/</title>
<updated>2020-10-06T05:07:05Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-09-11T08:12:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a1fd09e8e6ae35228ecc7c1e4bfff1fd725f78a0'/>
<id>urn:sha1:a1fd09e8e6ae35228ecc7c1e4bfff1fd725f78a0</id>
<content type='text'>
Most of dma-debug.h is not required by anything outside of kernel/dma.
Move the four declarations needed by dma-mappin.h or dma-ops providers
into dma-mapping.h and dma-map-ops.h, and move the remainder of the
file to kernel/dma/debug.h.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
</content>
</entry>
<entry>
<title>mm: Do early cow for pinned pages during fork() for ptes</title>
<updated>2020-09-27T18:21:35Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2020-09-25T22:25:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=70e806e4e645019102d0e09d4933654fb5fb58ce'/>
<id>urn:sha1:70e806e4e645019102d0e09d4933654fb5fb58ce</id>
<content type='text'>
This allows copy_pte_range() to do early cow if the pages were pinned on
the source mm.

Currently we don't have an accurate way to know whether a page is pinned
or not.  The only thing we have is page_maybe_dma_pinned().  However
that's good enough for now.  Especially, with the newly added
mm-&gt;has_pinned flag to make sure we won't affect processes that never
pinned any pages.

It would be easier if we can do GFP_KERNEL allocation within
copy_one_pte().  Unluckily, we can't because we're with the page table
locks held for both the parent and child processes.  So the page
allocation needs to be done outside copy_one_pte().

Some trick is there in copy_present_pte(), majorly the wrprotect trick
to block concurrent fast-gup.  Comments in the function should explain
better in place.

Oleg Nesterov reported a (probably harmless) bug during review that we
didn't reset entry.val properly in copy_pte_range() so that potentially
there's chance to call add_swap_count_continuation() multiple times on
the same swp entry.  However that should be harmless since even if it
happens, the same function (add_swap_count_continuation()) will return
directly noticing that there're enough space for the swp counter.  So
instead of a standalone stable patch, it is touched up in this patch
directly.

Link: https://lore.kernel.org/lkml/20200914143829.GA1424636@nvidia.com/
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
