<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/ipc, branch v5.17</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.17</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.17'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-02-04T17:25:05Z</updated>
<entry>
<title>ipc/sem: do not sleep with a spin lock held</title>
<updated>2022-02-04T17:25:05Z</updated>
<author>
<name>Minghao Chi</name>
<email>chi.minghao@zte.com.cn</email>
</author>
<published>2022-02-04T04:49:33Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=520ba724061cef59763e2b6f5b26e8387c2e5822'/>
<id>urn:sha1:520ba724061cef59763e2b6f5b26e8387c2e5822</id>
<content type='text'>
We can't call kvfree() with a spin lock held, so defer it.

Link: https://lkml.kernel.org/r/20211223031207.556189-1-chi.minghao@zte.com.cn
Fixes: fc37a3b8b438 ("[PATCH] ipc sem: use kvmalloc for sem_undo allocation")
Reported-by: Zeal Robot &lt;zealci@zte.com.cn&gt;
Signed-off-by: Minghao Chi &lt;chi.minghao@zte.com.cn&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Reviewed-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Yang Guang &lt;cgel.zte@gmail.com&gt;
Cc: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Cc: Bhaskar Chowdhury &lt;unixbhaskar@gmail.com&gt;
Cc: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>proc: remove PDE_DATA() completely</title>
<updated>2022-01-22T06:33:37Z</updated>
<author>
<name>Muchun Song</name>
<email>songmuchun@bytedance.com</email>
</author>
<published>2022-01-22T06:14:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=359745d78351c6f5442435f81549f0207ece28aa'/>
<id>urn:sha1:359745d78351c6f5442435f81549f0207ece28aa</id>
<content type='text'>
Remove PDE_DATA() completely and replace it with pde_data().

[akpm@linux-foundation.org: fix naming clash in drivers/nubus/proc.c]
[akpm@linux-foundation.org: now fix it properly]

Link: https://lkml.kernel.org/r/20211124081956.87711-2-songmuchun@bytedance.com
Signed-off-by: Muchun Song &lt;songmuchun@bytedance.com&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Alexey Gladkov &lt;gladkov.alexey@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>shm: extend forced shm destroy to support objects from several IPC nses</title>
<updated>2021-11-20T18:35:54Z</updated>
<author>
<name>Alexander Mikhalitsyn</name>
<email>alexander.mikhalitsyn@virtuozzo.com</email>
</author>
<published>2021-11-20T00:43:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=85b6d24646e4125c591639841169baa98a2da503'/>
<id>urn:sha1:85b6d24646e4125c591639841169baa98a2da503</id>
<content type='text'>
Currently, the exit_shm() function not designed to work properly when
task-&gt;sysvshm.shm_clist holds shm objects from different IPC namespaces.

This is a real pain when sysctl kernel.shm_rmid_forced = 1, because it
leads to use-after-free (reproducer exists).

This is an attempt to fix the problem by extending exit_shm mechanism to
handle shm's destroy from several IPC ns'es.

To achieve that we do several things:

1. add a namespace (non-refcounted) pointer to the struct shmid_kernel

2. during new shm object creation (newseg()/shmget syscall) we
   initialize this pointer by current task IPC ns

3. exit_shm() fully reworked such that it traverses over all shp's in
   task-&gt;sysvshm.shm_clist and gets IPC namespace not from current task
   as it was before but from shp's object itself, then call
   shm_destroy(shp, ns).

Note: We need to be really careful here, because as it was said before
(1), our pointer to IPC ns non-refcnt'ed.  To be on the safe side we
using special helper get_ipc_ns_not_zero() which allows to get IPC ns
refcounter only if IPC ns not in the "state of destruction".

Q/A

Q: Why can we access shp-&gt;ns memory using non-refcounted pointer?
A: Because shp object lifetime is always shorther than IPC namespace
   lifetime, so, if we get shp object from the task-&gt;sysvshm.shm_clist
   while holding task_lock(task) nobody can steal our namespace.

Q: Does this patch change semantics of unshare/setns/clone syscalls?
A: No. It's just fixes non-covered case when process may leave IPC
   namespace without getting task-&gt;sysvshm.shm_clist list cleaned up.

Link: https://lkml.kernel.org/r/67bb03e5-f79c-1815-e2bf-949c67047418@colorfullife.com
Link: https://lkml.kernel.org/r/20211109151501.4921-1-manfred@colorfullife.com
Fixes: ab602f79915 ("shm: make exit_shm work proportional to task activity")
Co-developed-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Alexander Mikhalitsyn &lt;alexander.mikhalitsyn@virtuozzo.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Greg KH &lt;gregkh@linuxfoundation.org&gt;
Cc: Andrei Vagin &lt;avagin@gmail.com&gt;
Cc: Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;
Cc: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: WARN if trying to remove ipc object which is absent</title>
<updated>2021-11-20T18:35:54Z</updated>
<author>
<name>Alexander Mikhalitsyn</name>
<email>alexander.mikhalitsyn@virtuozzo.com</email>
</author>
<published>2021-11-20T00:43:18Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=126e8bee943e9926238c891e2df5b5573aee76bc'/>
<id>urn:sha1:126e8bee943e9926238c891e2df5b5573aee76bc</id>
<content type='text'>
Patch series "shm: shm_rmid_forced feature fixes".

Some time ago I met kernel crash after CRIU restore procedure,
fortunately, it was CRIU restore, so, I had dump files and could do
restore many times and crash reproduced easily.  After some
investigation I've constructed the minimal reproducer.  It was found
that it's use-after-free and it happens only if sysctl
kernel.shm_rmid_forced = 1.

The key of the problem is that the exit_shm() function not handles shp's
object destroy when task-&gt;sysvshm.shm_clist contains items from
different IPC namespaces.  In most cases this list will contain only
items from one IPC namespace.

How can this list contain object from different namespaces? The
exit_shm() function is designed to clean up this list always when
process leaves IPC namespace.  But we made a mistake a long time ago and
did not add a exit_shm() call into the setns() syscall procedures.

The first idea was just to add this call to setns() syscall but it
obviously changes semantics of setns() syscall and that's
userspace-visible change.  So, I gave up on this idea.

The first real attempt to address the issue was just to omit forced
destroy if we meet shp object not from current task IPC namespace [1].
But that was not the best idea because task-&gt;sysvshm.shm_clist was
protected by rwsem which belongs to current task IPC namespace.  It
means that list corruption may occur.

Second approach is just extend exit_shm() to properly handle shp's from
different IPC namespaces [2].  This is really non-trivial thing, I've
put a lot of effort into that but not believed that it's possible to
make it fully safe, clean and clear.

Thanks to the efforts of Manfred Spraul working an elegant solution was
designed.  Thanks a lot, Manfred!

Eric also suggested the way to address the issue in ("[RFC][PATCH] shm:
In shm_exit destroy all created and never attached segments") Eric's
idea was to maintain a list of shm_clists one per IPC namespace, use
lock-less lists.  But there is some extra memory consumption-related
concerns.

An alternative solution which was suggested by me was implemented in
("shm: reset shm_clist on setns but omit forced shm destroy").  The idea
is pretty simple, we add exit_shm() syscall to setns() but DO NOT
destroy shm segments even if sysctl kernel.shm_rmid_forced = 1, we just
clean up the task-&gt;sysvshm.shm_clist list.

This chages semantics of setns() syscall a little bit but in comparision
to the "naive" solution when we just add exit_shm() without any special
exclusions this looks like a safer option.

[1] https://lkml.org/lkml/2021/7/6/1108
[2] https://lkml.org/lkml/2021/7/14/736

This patch (of 2):

Let's produce a warning if we trying to remove non-existing IPC object
from IPC namespace kht/idr structures.

This allows us to catch possible bugs when the ipc_rmid() function was
called with inconsistent struct ipc_ids*, struct kern_ipc_perm*
arguments.

Link: https://lkml.kernel.org/r/20211027224348.611025-1-alexander.mikhalitsyn@virtuozzo.com
Link: https://lkml.kernel.org/r/20211027224348.611025-2-alexander.mikhalitsyn@virtuozzo.com
Co-developed-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Alexander Mikhalitsyn &lt;alexander.mikhalitsyn@virtuozzo.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Greg KH &lt;gregkh@linuxfoundation.org&gt;
Cc: Andrei Vagin &lt;avagin@gmail.com&gt;
Cc: Pavel Tikhomirov &lt;ptikhomirov@virtuozzo.com&gt;
Cc: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc/ipc_sysctl.c: remove fallback for !CONFIG_PROC_SYSCTL</title>
<updated>2021-11-09T18:02:53Z</updated>
<author>
<name>Manfred Spraul</name>
<email>manfred@colorfullife.com</email>
</author>
<published>2021-11-09T02:36:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0e9beb8a96f21a6df1579cb3a679e150e3269d80'/>
<id>urn:sha1:0e9beb8a96f21a6df1579cb3a679e150e3269d80</id>
<content type='text'>
Compilation of ipc/ipc_sysctl.c is controlled by
obj-$(CONFIG_SYSVIPC_SYSCTL)
[see ipc/Makefile]

And CONFIG_SYSVIPC_SYSCTL depends on SYSCTL
[see init/Kconfig]

An SYSCTL is selected by PROC_SYSCTL.
[see fs/proc/Kconfig]

Thus: #ifndef CONFIG_PROC_SYSCTL in ipc/ipc_sysctl.c is impossible, the
fallback can be removed.

Link: https://lkml.kernel.org/r/20210918145337.3369-1-manfred@colorfullife.com
Signed-off-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: check checkpoint_restore_ns_capable() to modify C/R proc files</title>
<updated>2021-11-09T18:02:53Z</updated>
<author>
<name>Michal Clapinski</name>
<email>mclapinski@google.com</email>
</author>
<published>2021-11-09T02:35:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5563cabdde7ee53c34ec7e5e0283bfcc9a1bc893'/>
<id>urn:sha1:5563cabdde7ee53c34ec7e5e0283bfcc9a1bc893</id>
<content type='text'>
This commit removes the requirement to be root to modify sem_next_id,
msg_next_id and shm_next_id and checks checkpoint_restore_ns_capable
instead.

Since those files are specific to the IPC namespace, there is no reason
they should require root privileges.  This is similar to ns_last_pid,
which also only checks checkpoint_restore_ns_capable.

[akpm@linux-foundation.org: ipc/ipc_sysctl.c needs capability.h for checkpoint_restore_ns_capable()]

Link: https://lkml.kernel.org/r/20210916163717.3179496-1-mclapinski@google.com
Signed-off-by: Michal Clapinski &lt;mclapinski@google.com&gt;
Reviewed-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Reviewed-by: Manfred Spraul &lt;manfred@colorfullife.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm,hugetlb: remove mlock ulimit for SHM_HUGETLB</title>
<updated>2021-11-09T18:02:48Z</updated>
<author>
<name>zhangyiru</name>
<email>zhangyiru3@huawei.com</email>
</author>
<published>2021-11-09T02:31:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=83c1fd763b3220458652f7d656d5047292ebcde4'/>
<id>urn:sha1:83c1fd763b3220458652f7d656d5047292ebcde4</id>
<content type='text'>
Commit 21a3c273f88c ("mm, hugetlb: add thread name and pid to
SHM_HUGETLB mlock rlimit warning") marked this as deprecated in 2012,
but it is not deleted yet.

Mike says he still sees that message in log files on occasion, so maybe we
should preserve this warning.

Also remove hugetlbfs related user_shm_unlock in ipc/shm.c and remove the
user_shm_unlock after out.

Link: https://lkml.kernel.org/r/20211103105857.25041-1-zhangyiru3@huawei.com
Signed-off-by: zhangyiru &lt;zhangyiru3@huawei.com&gt;
Reviewed-by: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Liu Zixian &lt;liuzixian4@huawei.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: wuxu.wu &lt;wuxu.wu@huawei.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ipc: remove memcg accounting for sops objects in do_semtimedop()</title>
<updated>2021-09-14T17:22:11Z</updated>
<author>
<name>Vasily Averin</name>
<email>vvs@virtuozzo.com</email>
</author>
<published>2021-09-11T07:40:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6a4746ba06191e23d30230738e94334b26590a8a'/>
<id>urn:sha1:6a4746ba06191e23d30230738e94334b26590a8a</id>
<content type='text'>
Linus proposes to revert an accounting for sops objects in
do_semtimedop() because it's really just a temporary buffer
for a single semtimedop() system call.

This object can consume up to 2 pages, syscall is sleeping
one, size and duration can be controlled by user, and this
allocation can be repeated by many thread at the same time.

However Shakeel Butt pointed that there are much more popular
objects with the same life time and similar memory
consumption, the accounting of which was decided to be
rejected for performance reasons.

Considering at least 2 pages for task_struct and 2 pages for
the kernel stack, a back of the envelope calculation gives a
footprint amplification of &lt;1.5 so this temporal buffer can be
safely ignored.

The factor would IMO be interesting if it was &gt;&gt; 2 (from the
PoV of excessive (ab)use, fine-grained accounting seems to be
currently unfeasible due to performance impact).

Link: https://lore.kernel.org/lkml/90e254df-0dfe-f080-011e-b7c53ee7fd20@virtuozzo.com/
Fixes: 18319498fdd4 ("memcg: enable accounting of ipc resources")
Signed-off-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reviewed-by: Michal Koutný &lt;mkoutny@suse.com&gt;
Acked-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm</title>
<updated>2021-09-09T20:25:49Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-09-09T20:25:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=35776f10513c0d523c5dd2f1b415f642497779e2'/>
<id>urn:sha1:35776f10513c0d523c5dd2f1b415f642497779e2</id>
<content type='text'>
Pull ARM development updates from Russell King:

 - Rename "mod_init" and "mod_exit" so that initcall debug output is
   actually useful (Randy Dunlap)

 - Update maintainers entries for linux-arm-kernel to indicate it is
   moderated for non-subscribers (Randy Dunlap)

 - Move install rules to arch/arm/Makefile (Masahiro Yamada)

 - Drop unnecessary ARCH_NR_GPIOS definition (Linus Walleij)

 - Don't warn about atags_to_fdt() stack size (David Heidelberg)

 - Speed up unaligned copy_{from,to}_kernel_nofault (Arnd Bergmann)

 - Get rid of set_fs() usage (Arnd Bergmann)

 - Remove checks for GCC prior to v4.6 (Geert Uytterhoeven)

* tag 'for-linus' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 9118/1: div64: Remove always-true __div64_const32_is_OK() duplicate
  ARM: 9117/1: asm-generic: div64: Remove always-true __div64_const32_is_OK()
  ARM: 9116/1: unified: Remove check for gcc &lt; 4
  ARM: 9110/1: oabi-compat: fix oabi epoll sparse warning
  ARM: 9113/1: uaccess: remove set_fs() implementation
  ARM: 9112/1: uaccess: add __{get,put}_kernel_nofault
  ARM: 9111/1: oabi-compat: rework fcntl64() emulation
  ARM: 9114/1: oabi-compat: rework sys_semtimedop emulation
  ARM: 9108/1: oabi-compat: rework epoll_wait/epoll_pwait emulation
  ARM: 9107/1: syscall: always store thread_info-&gt;abi_syscall
  ARM: 9109/1: oabi-compat: add epoll_pwait handler
  ARM: 9106/1: traps: use get_kernel_nofault instead of set_fs()
  ARM: 9115/1: mm/maccess: fix unaligned copy_{from,to}_kernel_nofault
  ARM: 9105/1: atags_to_fdt: don't warn about stack size
  ARM: 9103/1: Drop ARCH_NR_GPIOS definition
  ARM: 9102/1: move theinstall rules to arch/arm/Makefile
  ARM: 9100/1: MAINTAINERS: mark all linux-arm-kernel@infradead list as moderated
  ARM: 9099/1: crypto: rename 'mod_init' &amp; 'mod_exit' functions to be module-specific
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (patches from Andrew)</title>
<updated>2021-09-08T19:55:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-09-08T19:55:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d338201d5311bcd79d42f66df4cecbcbc5f4f2c'/>
<id>urn:sha1:2d338201d5311bcd79d42f66df4cecbcbc5f4f2c</id>
<content type='text'>
Merge more updates from Andrew Morton:
 "147 patches, based on 7d2a07b769330c34b4deabeed939325c77a7ec2f.

  Subsystems affected by this patch series: mm (memory-hotplug, rmap,
  ioremap, highmem, cleanups, secretmem, kfence, damon, and vmscan),
  alpha, percpu, procfs, misc, core-kernel, MAINTAINERS, lib,
  checkpatch, epoll, init, nilfs2, coredump, fork, pids, criu, kconfig,
  selftests, ipc, and scripts"

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (94 commits)
  scripts: check_extable: fix typo in user error message
  mm/workingset: correct kernel-doc notations
  ipc: replace costly bailout check in sysvipc_find_ipc()
  selftests/memfd: remove unused variable
  Kconfig.debug: drop selecting non-existing HARDLOCKUP_DETECTOR_ARCH
  configs: remove the obsolete CONFIG_INPUT_POLLDEV
  prctl: allow to setup brk for et_dyn executables
  pid: cleanup the stale comment mentioning pidmap_init().
  kernel/fork.c: unexport get_{mm,task}_exe_file
  coredump: fix memleak in dump_vma_snapshot()
  fs/coredump.c: log if a core dump is aborted due to changed file permissions
  nilfs2: use refcount_dec_and_lock() to fix potential UAF
  nilfs2: fix memory leak in nilfs_sysfs_delete_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_create_snapshot_group
  nilfs2: fix memory leak in nilfs_sysfs_delete_##name##_group
  nilfs2: fix memory leak in nilfs_sysfs_create_##name##_group
  nilfs2: fix NULL pointer in nilfs_##name##_attr_release
  nilfs2: fix memory leak in nilfs_sysfs_create_device_group
  trap: cleanup trap_init()
  init: move usermodehelper_enable() to populate_rootfs()
  ...
</content>
</entry>
</feed>
