<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/proc, branch v6.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-08-20T22:17:45Z</updated>
<entry>
<title>mm/smaps: don't access young/dirty bit if pte unpresent</title>
<updated>2022-08-20T22:17:45Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2022-08-05T16:00:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=efd4149342db2df41b1bbe68972ead853b30e444'/>
<id>urn:sha1:efd4149342db2df41b1bbe68972ead853b30e444</id>
<content type='text'>
These bits should only be valid when the ptes are present.  Introducing
two booleans for it and set it to false when !pte_present() for both pte
and pmd accountings.

The bug is found during code reading and no real world issue reported, but
logically such an error can cause incorrect readings for either smaps or
smaps_rollup output on quite a few fields.

For example, it could cause over-estimate on values like Shared_Dirty,
Private_Dirty, Referenced.  Or it could also cause under-estimate on
values like LazyFree, Shared_Clean, Private_Clean.

Link: https://lkml.kernel.org/r/20220805160003.58929-1-peterx@redhat.com
Fixes: b1d4d9e0cbd0 ("proc/smaps: carefully handle migration entries")
Fixes: c94b6923fa0a ("/proc/PID/smaps: Add PMD migration entry parsing")
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Konstantin Khlebnikov &lt;khlebnikov@openvz.org&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>take care to handle NULL -&gt;proc_lseek()</title>
<updated>2022-08-14T19:16:18Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2022-08-14T19:16:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3f61631d47f115b83c935d0039f95cb68b0c8ab7'/>
<id>urn:sha1:3f61631d47f115b83c935d0039f95cb68b0c8ab7</id>
<content type='text'>
Easily done now, just by clearing FMODE_LSEEK in -&gt;f_mode
during proc_reg_open() for such entries.

Fixes: 868941b14441 "fs: remove no_llseek"
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2022-08-07T17:03:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-07T17:03:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eb5699ba31558bdb2cee6ebde3d0a68091e47dce'/>
<id>urn:sha1:eb5699ba31558bdb2cee6ebde3d0a68091e47dce</id>
<content type='text'>
Pull misc updates from Andrew Morton:
 "Updates to various subsystems which I help look after. lib, ocfs2,
  fatfs, autofs, squashfs, procfs, etc. A relatively small amount of
  material this time"

* tag 'mm-nonmm-stable-2022-08-06-2' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (72 commits)
  scripts/gdb: ensure the absolute path is generated on initial source
  MAINTAINERS: kunit: add David Gow as a maintainer of KUnit
  mailmap: add linux.dev alias for Brendan Higgins
  mailmap: update Kirill's email
  profile: setup_profiling_timer() is moslty not implemented
  ocfs2: fix a typo in a comment
  ocfs2: use the bitmap API to simplify code
  ocfs2: remove some useless functions
  lib/mpi: fix typo 'the the' in comment
  proc: add some (hopefully) insightful comments
  bdi: remove enum wb_congested_state
  kernel/hung_task: fix address space of proc_dohung_task_timeout_secs
  lib/lzo/lzo1x_compress.c: replace ternary operator with min() and min_t()
  squashfs: support reading fragments in readahead call
  squashfs: implement readahead
  squashfs: always build "file direct" version of page actor
  Revert "squashfs: provide backing_dev_info in order to disable read-ahead"
  fs/ocfs2: Fix spelling typo in comment
  ia64: old_rr4 added under CONFIG_HUGETLB_PAGE
  proc: fix test for "vsyscall=xonly" boot option
  ...
</content>
</entry>
<entry>
<title>proc: add some (hopefully) insightful comments</title>
<updated>2022-07-30T01:12:35Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2022-07-23T17:09:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ed8fb78d7ecdeb3e2e86df0027e2c2cc55f9908b'/>
<id>urn:sha1:ed8fb78d7ecdeb3e2e86df0027e2c2cc55f9908b</id>
<content type='text'>
* /proc/${pid}/net status
* removing PDE vs last close stuff (again!)
* random small stuff

Link: https://lkml.kernel.org/r/YtwrM6sDC0OQ53YB@localhost.localdomain
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>proc: fix a dentry lock race between release_task and lookup</title>
<updated>2022-07-18T00:31:43Z</updated>
<author>
<name>Zhihao Cheng</name>
<email>chengzhihao1@huawei.com</email>
</author>
<published>2022-07-13T13:00:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d919a1e79bac890421537cf02ae773007bf55e6b'/>
<id>urn:sha1:d919a1e79bac890421537cf02ae773007bf55e6b</id>
<content type='text'>
Commit 7bc3e6e55acf06 ("proc: Use a list of inodes to flush from proc")
moved proc_flush_task() behind __exit_signal().  Then, process systemd can
take long period high cpu usage during releasing task in following
concurrent processes:

  systemd                                 ps
kernel_waitid                 stat(/proc/tgid)
  do_wait                       filename_lookup
    wait_consider_task            lookup_fast
      release_task
        __exit_signal
          __unhash_process
            detach_pid
              __change_pid // remove task-&gt;pid_links
                                     d_revalidate -&gt; pid_revalidate  // 0
                                     d_invalidate(/proc/tgid)
                                       shrink_dcache_parent(/proc/tgid)
                                         d_walk(/proc/tgid)
                                           spin_lock_nested(/proc/tgid/fd)
                                           // iterating opened fd
        proc_flush_pid                                    |
           d_invalidate (/proc/tgid/fd)                   |
              shrink_dcache_parent(/proc/tgid/fd)         |
                shrink_dentry_list(subdirs)               ↓
                  shrink_lock_dentry(/proc/tgid/fd) --&gt; race on dentry lock

Function d_invalidate() will remove dentry from hash firstly, but why does
proc_flush_pid() process dentry '/proc/tgid/fd' before dentry
'/proc/tgid'?  That's because proc_pid_make_inode() adds proc inode in
reverse order by invoking hlist_add_head_rcu().  But proc should not add
any inodes under '/proc/tgid' except '/proc/tgid/task/pid', fix it by
adding inode into 'pid-&gt;inodes' only if the inode is /proc/tgid or
/proc/tgid/task/pid.

Performance regression:
Create 200 tasks, each task open one file for 50,000 times. Kill all
tasks when opened files exceed 10,000,000 (cat /proc/sys/fs/file-nr).

Before fix:
$ time killall -wq aa
  real    4m40.946s   # During this period, we can see 'ps' and 'systemd'
			taking high cpu usage.

After fix:
$ time killall -wq aa
  real    1m20.732s   # During this period, we can see 'systemd' taking
			high cpu usage.

Link: https://lkml.kernel.org/r/20220713130029.4133533-1-chengzhihao1@huawei.com
Fixes: 7bc3e6e55acf06 ("proc: Use a list of inodes to flush from proc")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216054
Signed-off-by: Zhihao Cheng &lt;chengzhihao1@huawei.com&gt;
Signed-off-by: Zhang Yi &lt;yi.zhang@huawei.com&gt;
Suggested-by: Brian Foster &lt;bfoster@redhat.com&gt;
Reviewed-by: Brian Foster &lt;bfoster@redhat.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Kalesh Singh &lt;kaleshsingh@google.com&gt;
Cc: Yu Kuai &lt;yukuai3@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>proc: delete unused &lt;linux/uaccess.h&gt; includes</title>
<updated>2022-07-18T00:31:39Z</updated>
<author>
<name>Alexey Dobriyan</name>
<email>adobriyan@gmail.com</email>
</author>
<published>2022-06-15T11:22:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=376b0c266143a1dda162db6d5bc9b3a7f0ae97c9'/>
<id>urn:sha1:376b0c266143a1dda162db6d5bc9b3a7f0ae97c9</id>
<content type='text'>
Those aren't necessary after seq files won.

Link: https://lkml.kernel.org/r/YqnA3mS7KBt8Z4If@localhost.localdomain
Signed-off-by: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: thp: kill __transhuge_page_enabled()</title>
<updated>2022-07-18T00:14:33Z</updated>
<author>
<name>Yang Shi</name>
<email>shy828301@gmail.com</email>
</author>
<published>2022-06-16T17:48:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7da4e2cb8b1ff8221759bfc7512d651ee69516dc'/>
<id>urn:sha1:7da4e2cb8b1ff8221759bfc7512d651ee69516dc</id>
<content type='text'>
The page fault path checks THP eligibility with __transhuge_page_enabled()
which does the similar thing as hugepage_vma_check(), so use
hugepage_vma_check() instead.

However page fault allows DAX and !anon_vma cases, so added a new flag,
in_pf, to hugepage_vma_check() to make page fault work correctly.

The in_pf flag is also used to skip shmem and file THP for page fault
since shmem handles THP in its own shmem_fault() and file THP allocation
on fault is not supported yet.

Also remove hugepage_vma_enabled() since hugepage_vma_check() is the only
caller now, it is not necessary to have a helper function.

Link: https://lkml.kernel.org/r/20220616174840.1202070-6-shy828301@gmail.com
Signed-off-by: Yang Shi &lt;shy828301@gmail.com&gt;
Reviewed-by: Zach O'Keefe &lt;zokeefe@google.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: thp: kill transparent_hugepage_active()</title>
<updated>2022-07-18T00:14:33Z</updated>
<author>
<name>Yang Shi</name>
<email>shy828301@gmail.com</email>
</author>
<published>2022-06-16T17:48:37Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9fec51689ff60d9766b38051a0b1692f93d95364'/>
<id>urn:sha1:9fec51689ff60d9766b38051a0b1692f93d95364</id>
<content type='text'>
The transparent_hugepage_active() was introduced to show THP eligibility
bit in smaps in proc, smaps is the only user.  But it actually does the
similar check as hugepage_vma_check() which is used by khugepaged.  We
definitely don't have to maintain two similar checks, so kill
transparent_hugepage_active().

This patch also fixed the wrong behavior for VM_NO_KHUGEPAGED vmas.

Also move hugepage_vma_check() to huge_memory.c and huge_mm.h since it
is not only for khugepaged anymore.

[akpm@linux-foundation.org: check vma-&gt;vm_mm, per Zach]
[akpm@linux-foundation.org: add comment to vdso check]
Link: https://lkml.kernel.org/r/20220616174840.1202070-5-shy828301@gmail.com
Signed-off-by: Yang Shi &lt;shy828301@gmail.com&gt;
Reviewed-by: Zach O'Keefe &lt;zokeefe@google.com&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: handling Non-LRU pages returned by vm_normal_pages</title>
<updated>2022-07-18T00:14:28Z</updated>
<author>
<name>Alex Sierra</name>
<email>alex.sierra@amd.com</email>
</author>
<published>2022-07-15T15:05:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3218f8712d6bba1812efd5e0d66c1e15134f2a91'/>
<id>urn:sha1:3218f8712d6bba1812efd5e0d66c1e15134f2a91</id>
<content type='text'>
With DEVICE_COHERENT, we'll soon have vm_normal_pages() return
device-managed anonymous pages that are not LRU pages.  Although they
behave like normal pages for purposes of mapping in CPU page, and for COW.
They do not support LRU lists, NUMA migration or THP.

Callers to follow_page() currently don't expect ZONE_DEVICE pages,
however, with DEVICE_COHERENT we might now return ZONE_DEVICE.  Check for
ZONE_DEVICE pages in applicable users of follow_page() as well.

Link: https://lkml.kernel.org/r/20220715150521.18165-5-alex.sierra@amd.com
Signed-off-by: Alex Sierra &lt;alex.sierra@amd.com&gt;
Acked-by: Felix Kuehling &lt;Felix.Kuehling@amd.com&gt;	[v2]
Reviewed-by: Alistair Popple &lt;apopple@nvidia.com&gt;	[v6]
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jason Gunthorpe &lt;jgg@nvidia.com&gt;
Cc: Jerome Glisse &lt;jglisse@redhat.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/smaps: add Pss_Dirty</title>
<updated>2022-07-04T01:08:49Z</updated>
<author>
<name>Vincent Whitchurch</name>
<email>vincent.whitchurch@axis.com</email>
</author>
<published>2022-06-20T08:12:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=30934843019a18df975217e4a819f1c362994764'/>
<id>urn:sha1:30934843019a18df975217e4a819f1c362994764</id>
<content type='text'>
Pss is the sum of the sizes of clean and dirty private pages, and the
proportional sizes of clean and dirty shared pages:

 Private = Private_Dirty + Private_Clean
 Shared_Proportional = Shared_Dirty_Proportional + Shared_Clean_Proportional
 Pss = Private + Shared_Proportional

The Shared*Proportional fields are not present in smaps, so it is not
always possible to determine how much of the Pss is from dirty pages and
how much is from clean pages.  This information can be useful for
measuring memory usage for the purpose of optimisation, since clean pages
can usually be discarded by the kernel immediately while dirty pages
cannot.

The smaps routines in the kernel already have access to this data, so add
a Pss_Dirty to show it to userspace.  Pss_Clean is not added since it can
be calculated from Pss and Pss_Dirty.

Link: https://lkml.kernel.org/r/20220620081251.2928103-1-vincent.whitchurch@axis.com
Signed-off-by: Vincent Whitchurch &lt;vincent.whitchurch@axis.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
