<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/Documentation/admin-guide/sysctl, branch v6.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-04-02T23:36:59Z</updated>
<entry>
<title>Merge tag 'fuse-update-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse</title>
<updated>2025-04-02T23:36:59Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-04-02T23:36:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5e17b5c71729d8ce936c83a579ed45f65efcb456'/>
<id>urn:sha1:5e17b5c71729d8ce936c83a579ed45f65efcb456</id>
<content type='text'>
Pull fuse updates from Miklos Szeredi:

 - Allow connection to server to time out (Joanne Koong)

 - If server doesn't support creating a hard link, return EPERM rather
   than ENOSYS (Matt Johnston)

 - Allow file names longer than 1024 chars (Bernd Schubert)

 - Fix a possible race if request on io_uring queue is interrupted
   (Bernd Schubert)

 - Misc fixes and cleanups

* tag 'fuse-update-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  fuse: remove unneeded atomic set in uring creation
  fuse: fix uring race condition for null dereference of fc
  fuse: Increase FUSE_NAME_MAX to PATH_MAX
  fuse: Allocate only namelen buf memory in fuse_notify_
  fuse: add default_request_timeout and max_request_timeout sysctls
  fuse: add kernel-enforced timeout option for requests
  fuse: optmize missing FUSE_LINK support
  fuse: Return EPERM rather than ENOSYS from link()
  fuse: removed unused function fuse_uring_create() from header
  fuse: {io-uring} Fix a possible req cancellation race
</content>
</entry>
<entry>
<title>fuse: add default_request_timeout and max_request_timeout sysctls</title>
<updated>2025-03-31T12:59:27Z</updated>
<author>
<name>Joanne Koong</name>
<email>joannelkoong@gmail.com</email>
</author>
<published>2025-01-22T21:55:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9b17cb59a7db983a967c3658fe9a2f250f588bbd'/>
<id>urn:sha1:9b17cb59a7db983a967c3658fe9a2f250f588bbd</id>
<content type='text'>
Introduce two new sysctls, "default_request_timeout" and
"max_request_timeout". These control how long (in seconds) a server can
take to reply to a request. If the server does not reply by the timeout,
then the connection will be aborted. The upper bound on these sysctl
values is 65535.

"default_request_timeout" sets the default timeout if no timeout is
specified by the fuse server on mount. 0 (default) indicates no default
timeout should be enforced. If the server did specify a timeout, then
default_request_timeout will be ignored.

"max_request_timeout" sets the max amount of time the server may take to
reply to a request. 0 (default) indicates no maximum timeout. If
max_request_timeout is set and the fuse server attempts to set a
timeout greater than max_request_timeout, the system will use
max_request_timeout as the timeout. Similarly, if default_request_timeout
is greater than max_request_timeout, the system will use
max_request_timeout as the timeout. If the server does not request a
timeout and default_request_timeout is set to 0 but max_request_timeout
is set, then the timeout will be max_request_timeout.

Please note that these timeouts are not 100% precise. The request may
take roughly an extra FUSE_TIMEOUT_TIMER_FREQ seconds beyond the set max
timeout due to how it's internally implemented.

$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 0

$ echo 65536 | sudo tee /proc/sys/fs/fuse/default_request_timeout
tee: /proc/sys/fs/fuse/default_request_timeout: Invalid argument

$ echo 65535 | sudo tee /proc/sys/fs/fuse/default_request_timeout
65535

$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 65535

$ echo 0 | sudo tee /proc/sys/fs/fuse/default_request_timeout
0

$ sysctl -a | grep fuse.default_request_timeout
fs.fuse.default_request_timeout = 0

[Luis Henriques: Limit the timeout to the range [FUSE_TIMEOUT_TIMER_FREQ,
fuse_max_req_timeout]]

Signed-off-by: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Reviewed-by: Bernd Schubert &lt;bschubert@ddn.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: Sergey Senozhatsky &lt;senozhatsky@chromium.org&gt;
Reviewed-by: Luis Henriques &lt;luis@igalia.com&gt;
Reviewed-by: Jeff Layton &lt;jlayton@kernel.org&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
<entry>
<title>mm: page_alloc: defrag_mode</title>
<updated>2025-03-18T05:07:07Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2025-03-13T21:05:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e3aa7df331bca08742a212764348246e8e8a874e'/>
<id>urn:sha1:e3aa7df331bca08742a212764348246e8e8a874e</id>
<content type='text'>
The page allocator groups requests by migratetype to stave off
fragmentation.  However, in practice this is routinely defeated by the
fact that it gives up *before* invoking reclaim and compaction - which may
well produce suitable pages.  As a result, fragmentation of physical
memory is a common ongoing process in many load scenarios.

Fragmentation deteriorates compaction's ability to produce huge pages. 
Depending on the lifetime of the fragmenting allocations, those effects
can be long-lasting or even permanent, requiring drastic measures like
forcible idle states or even reboots as the only reliable ways to recover
the address space for THP production.

In a kernel build test with supplemental THP pressure, the THP allocation
rate steadily declines over 15 runs:

    thp_fault_alloc
    61988
    56474
    57258
    50187
    52388
    55409
    52925
    47648
    43669
    40621
    36077
    41721
    36685
    34641
    33215

This is a hurdle in adopting THP in any environment where hosts are shared
between multiple overlapping workloads (cloud environments), and rarely
experience true idle periods.  To make THP a reliable and predictable
optimization, there needs to be a stronger guarantee to avoid such
fragmentation.

Introduce defrag_mode.  When enabled, reclaim/compaction is invoked to its
full extent *before* falling back.  Specifically, ALLOC_NOFRAGMENT is
enforced on the allocator fastpath and the reclaiming slowpath.

For now, fallbacks are permitted to avert OOMs.  There is a plan to add
defrag_mode=2 to prefer OOMs over fragmentation, but this requires
additional prep work in compaction and the reserve management to make it
ready for all possible allocation contexts.

The following test results are from a kernel build with periodic bursts of
THP allocations, over 15 runs:

                                        vanilla    defrag_mode=1
@claimer[unmovable]:                        189              103
@claimer[movable]:                           92              103
@claimer[reclaimable]:                      207               61
@pollute[unmovable from movable]:            25                0
@pollute[unmovable from reclaimable]:        28                0
@pollute[movable from unmovable]:         38835                0
@pollute[movable from reclaimable]:      147136                0
@pollute[reclaimable from unmovable]:       178                0
@pollute[reclaimable from movable]:          33                0
@steal[unmovable from movable]:              11                0
@steal[unmovable from reclaimable]:           5                0
@steal[reclaimable from unmovable]:         107                0
@steal[reclaimable from movable]:            90                0
@steal[movable from reclaimable]:           354                0
@steal[movable from unmovable]:             130                0

Both types of polluting fallbacks are eliminated in this workload.

Interestingly, whole block conversions are reduced as well.  This is
because once a block is claimed for a type, its empty space remains
available for future allocations, instead of being padded with fallbacks;
this allows the native type to group up instead of spreading out to new
blocks.  The assumption in the allocator has been that pollution from
movable allocations is less harmful than from other types, since they can
be reclaimed or migrated out should the space be needed.  However, since
fallbacks occur *before* reclaim/compaction is invoked, movable pollution
will still cause non-movable allocations to spread out and claim more
blocks.

Without fragmentation, THP rates hold steady with defrag_mode=1:

    thp_fault_alloc
    32478
    20725
    45045
    32130
    14018
    21711
    40791
    29134
    34458
    45381
    28305
    17265
    22584
    28454
    30850

While the downward trend is eliminated, the keen reader will of course
notice that the baseline rate is much smaller than the vanilla kernel's to
begin with.  This is due to deficiencies in how reclaim and compaction are
currently driven: ALLOC_NOFRAGMENT increases the extent to which smaller
allocations are competing with THPs for pageblocks, while making no effort
themselves to reclaim or compact beyond their own request size.  This
effect already exists with the current usage of ALLOC_NOFRAGMENT, but is
amplified by defrag_mode insisting on whole block stealing much more
strongly.

Subsequent patches will address defrag_mode reclaim strategy to raise the
THP success baseline above the vanilla kernel.

Link: https://lkml.kernel.org/r/20250313210647.1314586-4-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>coredump: Only sort VMAs when core_sort_vma sysctl is set</title>
<updated>2025-02-24T18:51:57Z</updated>
<author>
<name>Kees Cook</name>
<email>kees@kernel.org</email>
</author>
<published>2025-02-19T19:53:16Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=39ec9eaaa165d297d008d1fa385748430bd18e4d'/>
<id>urn:sha1:39ec9eaaa165d297d008d1fa385748430bd18e4d</id>
<content type='text'>
The sorting of VMAs by size in commit 7d442a33bfe8 ("binfmt_elf: Dump
smaller VMAs first in ELF cores") breaks elfutils[1]. Instead, sort
based on the setting of the new sysctl, core_sort_vma, which defaults
to 0, no sorting.

Reported-by: Michael Stapelberg &lt;michael@stapelberg.ch&gt;
Closes: https://lore.kernel.org/all/20250218085407.61126-1-michael@stapelberg.de/ [1]
Fixes: 7d442a33bfe8 ("binfmt_elf: Dump smaller VMAs first in ELF cores")
Signed-off-by: Kees Cook &lt;kees@kernel.org&gt;
</content>
</entry>
<entry>
<title>Documentation/sysctl: Add timer_migration to kernel.rst</title>
<updated>2025-01-16T18:18:04Z</updated>
<author>
<name>Phil Auld</name>
<email>pauld@redhat.com</email>
</author>
<published>2025-01-14T19:05:25Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=e129fdc599093457648eccf981d672fade55a9c8'/>
<id>urn:sha1:e129fdc599093457648eccf981d672fade55a9c8</id>
<content type='text'>
There is no mention of timer_migration in the docs. Add
a short description.

Signed-off-by: Phil Auld &lt;pauld@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: linux-doc@vger.kernel.org
Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Link: https://lore.kernel.org/r/20250114190525.169022-1-pauld@redhat.com
</content>
</entry>
<entry>
<title>docs: remove duplicate word</title>
<updated>2024-12-11T16:07:40Z</updated>
<author>
<name>Ruffalo Lavoisier</name>
<email>ruffalolavoisier@gmail.com</email>
</author>
<published>2024-11-20T04:34:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5a3f0a11b2f1d01fd4675c28fdcc72ea0d149385'/>
<id>urn:sha1:5a3f0a11b2f1d01fd4675c28fdcc72ea0d149385</id>
<content type='text'>
- Remove duplicate word, 'to'.

Signed-off-by: Jonathan Corbet &lt;corbet@lwn.net&gt;
Link: https://lore.kernel.org/r/20241120043414.78811-1-RuffaloLavoisier@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'fuse-update-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse</title>
<updated>2024-11-26T20:41:27Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-26T20:41:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fb527fc1f36e252cd1f62a26be4906949e7708ff'/>
<id>urn:sha1:fb527fc1f36e252cd1f62a26be4906949e7708ff</id>
<content type='text'>
Pull fuse updates from Miklos Szeredi:

 - Add page -&gt; folio conversions (Joanne Koong, Josef Bacik)

 - Allow max size of fuse requests to be configurable with a sysctl
   (Joanne Koong)

 - Allow FOPEN_DIRECT_IO to take advantage of async code path (yangyun)

 - Fix large kernel reads (like a module load) in virtio_fs (Hou Tao)

 - Fix attribute inconsistency in case readdirplus (and plain lookup in
   corner cases) is racing with inode eviction (Zhang Tianci)

 - Fix a WARN_ON triggered by virtio_fs (Asahi Lina)

* tag 'fuse-update-6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse: (30 commits)
  virtiofs: dax: remove -&gt;writepages() callback
  fuse: check attributes staleness on fuse_iget()
  fuse: remove pages for requests and exclusively use folios
  fuse: convert direct io to use folios
  mm/writeback: add folio_mark_dirty_lock()
  fuse: convert writebacks to use folios
  fuse: convert retrieves to use folios
  fuse: convert ioctls to use folios
  fuse: convert writes (non-writeback) to use folios
  fuse: convert reads to use folios
  fuse: convert readdir to use folios
  fuse: convert readlink to use folios
  fuse: convert cuse to use folios
  fuse: add support in virtio for requests using folios
  fuse: support folios in struct fuse_args_pages and fuse_copy_pages()
  fuse: convert fuse_notify_store to use folios
  fuse: convert fuse_retrieve to use folios
  fuse: use the folio based vmstat helpers
  fuse: convert fuse_writepage_need_send to take a folio
  fuse: convert fuse_do_readpage to use folios
  ...
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2024-11-26T00:09:48Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-11-26T00:09:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f5f4745a7f057b58c9728ee4e2c5d6d79f382fe7'/>
<id>urn:sha1:f5f4745a7f057b58c9728ee4e2c5d6d79f382fe7</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:

 - The series "resource: A couple of cleanups" from Andy Shevchenko
   performs some cleanups in the resource management code

 - The series "Improve the copy of task comm" from Yafang Shao addresses
   possible race-induced overflows in the management of
   task_struct.comm[]

 - The series "Remove unnecessary header includes from
   {tools/}lib/list_sort.c" from Kuan-Wei Chiu adds some cleanups and a
   small fix to the list_sort library code and to its selftest

 - The series "Enhance min heap API with non-inline functions and
   optimizations" also from Kuan-Wei Chiu optimizes and cleans up the
   min_heap library code

 - The series "nilfs2: Finish folio conversion" from Ryusuke Konishi
   finishes off nilfs2's folioification

 - The series "add detect count for hung tasks" from Lance Yang adds
   more userspace visibility into the hung-task detector's activity

 - Apart from that, singelton patches in many places - please see the
   individual changelogs for details

* tag 'mm-nonmm-stable-2024-11-24-02-05' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (71 commits)
  gdb: lx-symbols: do not error out on monolithic build
  kernel/reboot: replace sprintf() with sysfs_emit()
  lib: util_macros_kunit: add kunit test for util_macros.h
  util_macros.h: fix/rework find_closest() macros
  Improve consistency of '#error' directive messages
  ocfs2: fix uninitialized value in ocfs2_file_read_iter()
  hung_task: add docs for hung_task_detect_count
  hung_task: add detect count for hung tasks
  dma-buf: use atomic64_inc_return() in dma_buf_getfile()
  fs/proc/kcore.c: fix coccinelle reported ERROR instances
  resource: avoid unnecessary resource tree walking in __region_intersects()
  ocfs2: remove unused errmsg function and table
  ocfs2: cluster: fix a typo
  lib/scatterlist: use sg_phys() helper
  checkpatch: always parse orig_commit in fixes tag
  nilfs2: convert metadata aops from writepage to writepages
  nilfs2: convert nilfs_recovery_copy_block() to take a folio
  nilfs2: convert nilfs_page_count_clean_buffers() to take a folio
  nilfs2: remove nilfs_writepage
  nilfs2: convert checkpoint file to be folio-based
  ...
</content>
</entry>
<entry>
<title>hung_task: add docs for hung_task_detect_count</title>
<updated>2024-11-12T01:17:03Z</updated>
<author>
<name>Lance Yang</name>
<email>ioworker0@gmail.com</email>
</author>
<published>2024-10-27T12:07:47Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=62bf7065cc6056a51a240c810b95d887e5bb7c8c'/>
<id>urn:sha1:62bf7065cc6056a51a240c810b95d887e5bb7c8c</id>
<content type='text'>
This commit introduces documentation for hung_task_detect_count in
kernel.rst.

Link: https://lkml.kernel.org/r/20241027120747.42833-3-ioworker0@gmail.com
Signed-off-by: Mingzhe Yang &lt;mingzhe.yang@ly.com&gt;
Signed-off-by: Lance Yang &lt;ioworker0@gmail.com&gt;
Cc: Bang Li &lt;libang.li@antgroup.com&gt;
Cc: Baolin Wang &lt;baolin.wang@linux.alibaba.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Huang Cun &lt;cunhuang@tencent.com&gt;
Cc: Joel Granados &lt;j.granados@samsung.com&gt;
Cc: Joel Granados &lt;joel.granados@kernel.org&gt;
Cc: John Siddle &lt;jsiddle@redhat.com&gt;
Cc: Kent Overstreet &lt;kent.overstreet@linux.dev&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Thomas Weißschuh &lt;linux@weissschuh.net&gt;
Cc: Yongliang Gao &lt;leonylgao@tencent.com&gt;
Cc: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fuse: enable dynamic configuration of fuse max pages limit (FUSE_MAX_MAX_PAGES)</title>
<updated>2024-10-25T15:05:49Z</updated>
<author>
<name>Joanne Koong</name>
<email>joannelkoong@gmail.com</email>
</author>
<published>2024-09-23T17:13:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2b3933b1e0a0a4b758fbc164bb31db0c113a7e2c'/>
<id>urn:sha1:2b3933b1e0a0a4b758fbc164bb31db0c113a7e2c</id>
<content type='text'>
Introduce the capability to dynamically configure the max pages limit
(FUSE_MAX_MAX_PAGES) through a sysctl. This allows system administrators
to dynamically set the maximum number of pages that can be used for
servicing requests in fuse.

Previously, this is gated by FUSE_MAX_MAX_PAGES which is statically set
to 256 pages. One result of this is that the buffer size for a write
request is limited to 1 MiB on a 4k-page system.

The default value for this sysctl is the original limit (256 pages).

$ sysctl -a | grep max_pages_limit
fs.fuse.max_pages_limit = 256

$ sysctl -n fs.fuse.max_pages_limit
256

$ echo 1024 | sudo tee /proc/sys/fs/fuse/max_pages_limit
1024

$ sysctl -n fs.fuse.max_pages_limit
1024

$ echo 65536 | sudo tee /proc/sys/fs/fuse/max_pages_limit
tee: /proc/sys/fs/fuse/max_pages_limit: Invalid argument

$ echo 0 | sudo tee /proc/sys/fs/fuse/max_pages_limit
tee: /proc/sys/fs/fuse/max_pages_limit: Invalid argument

$ echo 65535 | sudo tee /proc/sys/fs/fuse/max_pages_limit
65535

$ sysctl -n fs.fuse.max_pages_limit
65535

Signed-off-by: Joanne Koong &lt;joannelkoong@gmail.com&gt;
Reviewed-by: Josef Bacik &lt;josef@toxicpanda.com&gt;
Reviewed-by: Sweet Tea Dorminy &lt;sweettea-kernel@dorminy.me&gt;
Signed-off-by: Miklos Szeredi &lt;mszeredi@redhat.com&gt;
</content>
</entry>
</feed>
