<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/notify, branch v5.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2020-12-28T10:58:59Z</updated>
<entry>
<title>fanotify: Fix sys_fanotify_mark() on native x86-32</title>
<updated>2020-12-28T10:58:59Z</updated>
<author>
<name>Brian Gerst</name>
<email>brgerst@gmail.com</email>
</author>
<published>2020-11-30T22:30:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2ca408d9c749c32288bc28725f9f12ba30299e8f'/>
<id>urn:sha1:2ca408d9c749c32288bc28725f9f12ba30299e8f</id>
<content type='text'>
Commit

  121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")

converted native x86-32 which take 64-bit arguments to use the
compat handlers to allow conversion to passing args via pt_regs.
sys_fanotify_mark() was however missed, as it has a general compat
handler. Add a config option that will use the syscall wrapper that
takes the split args for native 32-bit.

 [ bp: Fix typo in Kconfig help text. ]

Fixes: 121b32a58a3a ("x86/entry/32: Use IA32-specific wrappers for syscalls taking 64-bit arguments")
Reported-by: Paweł Jasiak &lt;pawel@jasiak.xyz&gt;
Signed-off-by: Brian Gerst &lt;brgerst@gmail.com&gt;
Signed-off-by: Borislav Petkov &lt;bp@suse.de&gt;
Acked-by: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Link: https://lkml.kernel.org/r/20201130223059.101286-1-brgerst@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs</title>
<updated>2020-12-17T18:56:27Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-12-17T18:56:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=14bd41e41899cbd1de4bb5ddfa46c85b08091a69'/>
<id>urn:sha1:14bd41e41899cbd1de4bb5ddfa46c85b08091a69</id>
<content type='text'>
Pull fsnotify updates from Jan Kara:
 "A few fsnotify fixes from Amir fixing fallout from big fsnotify
  overhaul a few months back and an improvement of defaults limiting
  maximum number of inotify watches from Waiman"

* tag 'fsnotify_for_v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  fsnotify: fix events reported to watching parent and child
  inotify: convert to handle_inode_event() interface
  fsnotify: generalize handle_inode_event()
  inotify: Increase default inotify.max_user_watches limit to 1048576
</content>
</entry>
<entry>
<title>Merge branch 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace</title>
<updated>2020-12-16T03:29:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-12-16T03:29:43Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=faf145d6f3f3d6f2c066f65602ba9d0a03106915'/>
<id>urn:sha1:faf145d6f3f3d6f2c066f65602ba9d0a03106915</id>
<content type='text'>
Pull execve updates from Eric Biederman:
 "This set of changes ultimately fixes the interaction of posix file
  lock and exec. Fundamentally most of the change is just moving where
  unshare_files is called during exec, and tweaking the users of
  files_struct so that the count of files_struct is not unnecessarily
  played with.

  Along the way fcheck and related helpers were renamed to more
  accurately reflect what they do.

  There were also many other small changes that fell out, as this is the
  first time in a long time much of this code has been touched.

  Benchmarks haven't turned up any practical issues but Al Viro has
  observed a possibility for a lot of pounding on task_lock. So I have
  some changes in progress to convert put_files_struct to always rcu
  free files_struct. That wasn't ready for the merge window so that will
  have to wait until next time"

* 'exec-for-v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (27 commits)
  exec: Move io_uring_task_cancel after the point of no return
  coredump: Document coredump code exclusively used by cell spufs
  file: Remove get_files_struct
  file: Rename __close_fd_get_file close_fd_get_file
  file: Replace ksys_close with close_fd
  file: Rename __close_fd to close_fd and remove the files parameter
  file: Merge __alloc_fd into alloc_fd
  file: In f_dupfd read RLIMIT_NOFILE once.
  file: Merge __fd_install into fd_install
  proc/fd: In fdinfo seq_show don't use get_files_struct
  bpf/task_iter: In task_file_seq_get_next use task_lookup_next_fd_rcu
  proc/fd: In proc_readfd_common use task_lookup_next_fd_rcu
  file: Implement task_lookup_next_fd_rcu
  kcmp: In get_file_raw_ptr use task_lookup_fd_rcu
  proc/fd: In tid_fd_mode use task_lookup_fd_rcu
  file: Implement task_lookup_fd_rcu
  file: Rename fcheck lookup_fd_rcu
  file: Replace fcheck_files with files_lookup_fd_rcu
  file: Factor files_lookup_fd_locked out of fcheck_files
  file: Rename __fcheck_files to files_lookup_fd_raw
  ...
</content>
</entry>
<entry>
<title>fsnotify: fix events reported to watching parent and child</title>
<updated>2020-12-11T10:40:43Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-12-02T12:07:09Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fecc4559780d52d174ea05e3bf543669165389c3'/>
<id>urn:sha1:fecc4559780d52d174ea05e3bf543669165389c3</id>
<content type='text'>
fsnotify_parent() used to send two separate events to backends when a
parent inode is watching children and the child inode is also watching.
In an attempt to avoid duplicate events in fanotify, we unified the two
backend callbacks to a single callback and handled the reporting of the
two separate events for the relevant backends (inotify and dnotify).
However the handling is buggy and can result in inotify and dnotify
listeners receiving events of the type they never asked for or spurious
events.

The problem is the unified event callback with two inode marks (parent and
child) is called when any of the parent and child inodes are watched and
interested in the event, but the parent inode's mark that is interested
in the event on the child is not necessarily the one we are currently
reporting to (it could belong to a different group).

So before reporting the parent or child event flavor to backend we need
to check that the mark is really interested in that event flavor.

The semantics of INODE and CHILD marks were hard to follow and made the
logic more complicated than it should have been.  Replace it with INODE
and PARENT marks semantics to hopefully make the logic more clear.

Thanks to Hugh Dickins for spotting a bug in the earlier version of this
patch.

Fixes: 497b0c5a7c06 ("fsnotify: send event to parent and child with single callback")
CC: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201202120713.702387-4-amir73il@gmail.com
Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>file: Rename fcheck lookup_fd_rcu</title>
<updated>2020-12-10T18:40:07Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2020-11-20T23:14:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=460b4f812a9d473d4b39d87d37844f9fc30a9eb3'/>
<id>urn:sha1:460b4f812a9d473d4b39d87d37844f9fc30a9eb3</id>
<content type='text'>
Also remove the confusing comment about checking if a fd exists.  I
could not find one instance in the entire kernel that still matches
the description or the reason for the name fcheck.

The need for better names became apparent in the last round of
discussion of this set of changes[1].

[1] https://lkml.kernel.org/r/CAHk-=wj8BQbgJFLa+J0e=iT-1qpmCRTbPAJ8gd6MJQ=kbRPqyQ@mail.gmail.com
Link: https://lkml.kernel.org/r/20201120231441.29911-10-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>inotify: convert to handle_inode_event() interface</title>
<updated>2020-12-03T14:41:29Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-12-02T12:07:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1a2620a99803ad660edc5d22fd9c66cce91ceb1c'/>
<id>urn:sha1:1a2620a99803ad660edc5d22fd9c66cce91ceb1c</id>
<content type='text'>
Convert inotify to use the simple handle_inode_event() interface to
get rid of the code duplication between the generic helper
fsnotify_handle_event() and the inotify_handle_event() callback, which
also happen to be buggy code.

The bug will be fixed in the generic helper.

Link: https://lore.kernel.org/r/20201202120713.702387-3-amir73il@gmail.com
CC: stable@vger.kernel.org
Fixes: b9a1b9772509 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fsnotify: generalize handle_inode_event()</title>
<updated>2020-12-03T13:58:35Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-12-02T12:07:07Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=950cc0d2bef078e1f6459900ca4d4b2a2e0e3c37'/>
<id>urn:sha1:950cc0d2bef078e1f6459900ca4d4b2a2e0e3c37</id>
<content type='text'>
The handle_inode_event() interface was added as (quoting comment):
"a simple variant of handle_event() for groups that only have inode
marks and don't have ignore mask".

In other words, all backends except fanotify.  The inotify backend
also falls under this category, but because it required extra arguments
it was left out of the initial pass of backends conversion to the
simple interface.

This results in code duplication between the generic helper
fsnotify_handle_event() and the inotify_handle_event() callback
which also happen to be buggy code.

Generalize the handle_inode_event() arguments and add the check for
FS_EXCL_UNLINK flag to the generic helper, so inotify backend could
be converted to use the simple interface.

Link: https://lore.kernel.org/r/20201202120713.702387-2-amir73il@gmail.com
CC: stable@vger.kernel.org
Fixes: b9a1b9772509 ("fsnotify: create method handle_inode_event() in fsnotify_operations")
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>inotify: Increase default inotify.max_user_watches limit to 1048576</title>
<updated>2020-11-09T14:03:21Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2020-11-09T03:59:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=92890123749bafc317bbfacbe0a62ce08d78efb7'/>
<id>urn:sha1:92890123749bafc317bbfacbe0a62ce08d78efb7</id>
<content type='text'>
The default value of inotify.max_user_watches sysctl parameter was set
to 8192 since the introduction of the inotify feature in 2005 by
commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
small for many modern usage. As a result, users have to explicitly set
it to a larger value to make it work.

After some searching around the web, these are the
inotify.max_user_watches values used by some projects:
 - vscode:  524288
 - dropbox support: 100000
 - users on stackexchange: 12228
 - lsyncd user: 2000000
 - code42 support: 1048576
 - monodevelop: 16384
 - tectonic: 524288
 - openshift origin: 65536

Each watch point adds an inotify_inode_mark structure to an inode to
be watched. It also pins the watched inode.

Modeled after the epoll.max_user_watches behavior to adjust the default
value according to the amount of addressable memory available, make
inotify.max_user_watches behave in a similar way to make it use no more
than 1% of addressable memory within the range [8192, 1048576].

We estimate the amount of memory used by inotify mark to size of
inotify_inode_mark plus two times the size of struct inode (we double
the inode size to cover the additional filesystem private inode part).
That means that a 64-bit system with 128GB or more memory will likely
have the maximum value of 1048576 for inotify.max_user_watches. This
default should be big enough for most use cases.

Link: https://lore.kernel.org/r/20201109035931.4740-1-longman@redhat.com
Reviewed-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>fanotify: fix logic of reporting name info with watched parent</title>
<updated>2020-11-09T14:03:08Z</updated>
<author>
<name>Amir Goldstein</name>
<email>amir73il@gmail.com</email>
</author>
<published>2020-11-08T10:59:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7372e79c9eb9d7034e498721eb2861ae4fdbc618'/>
<id>urn:sha1:7372e79c9eb9d7034e498721eb2861ae4fdbc618</id>
<content type='text'>
The victim inode's parent and name info is required when an event
needs to be delivered to a group interested in filename info OR
when the inode's parent is interested in an event on its children.

Let us call the first condition 'parent_needed' and the second
condition 'parent_interested'.

In fsnotify_parent(), the condition where the inode's parent is
interested in some events on its children, but not necessarily
interested the specific event is called 'parent_watched'.

fsnotify_parent() tests the condition (!parent_watched &amp;&amp; !parent_needed)
for sending the event without parent and name info, which is correct.

It then wrongly assumes that parent_watched implies !parent_needed
and tests the condition (parent_watched &amp;&amp; !parent_interested)
for sending the event without parent and name info, which is wrong,
because parent may still be needed by some group.

For example, after initializing a group with FAN_REPORT_DFID_NAME and
adding a FAN_MARK_MOUNT with FAN_OPEN mask, open events on non-directory
children of "testdir" are delivered with file name info.

After adding another mark to the same group on the parent "testdir"
with FAN_CLOSE|FAN_EVENT_ON_CHILD mask, open events on non-directory
children of "testdir" are no longer delivered with file name info.

Fix the logic and use auxiliary variables to clarify the conditions.

Fixes: 9b93f33105f5 ("fsnotify: send event with parent/name info to sb/mount/non-dir marks")
Cc: stable@vger.kernel.org#v5.9
Link: https://lore.kernel.org/r/20201108105906.8493-1-amir73il@gmail.com
Signed-off-by: Amir Goldstein &lt;amir73il@gmail.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
</content>
</entry>
<entry>
<title>mm, memcg: rework remote charging API to support nesting</title>
<updated>2020-10-18T16:27:09Z</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2020-10-17T23:13:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b87d8cefe43c7f22e8aa13919c1dfa2b4b4b4e01'/>
<id>urn:sha1:b87d8cefe43c7f22e8aa13919c1dfa2b4b4b4e01</id>
<content type='text'>
Currently the remote memcg charging API consists of two functions:
memalloc_use_memcg() and memalloc_unuse_memcg(), which set and clear the
memcg value, which overwrites the memcg of the current task.

  memalloc_use_memcg(target_memcg);
  &lt;...&gt;
  memalloc_unuse_memcg();

It works perfectly for allocations performed from a normal context,
however an attempt to call it from an interrupt context or just nest two
remote charging blocks will lead to an incorrect accounting.  On exit from
the inner block the active memcg will be cleared instead of being
restored.

  memalloc_use_memcg(target_memcg);

  memalloc_use_memcg(target_memcg_2);
    &lt;...&gt;
    memalloc_unuse_memcg();

    Error: allocation here are charged to the memcg of the current
    process instead of target_memcg.

  memalloc_unuse_memcg();

This patch extends the remote charging API by switching to a single
function: struct mem_cgroup *set_active_memcg(struct mem_cgroup *memcg),
which sets the new value and returns the old one.  So a remote charging
block will look like:

  old_memcg = set_active_memcg(target_memcg);
  &lt;...&gt;
  set_active_memcg(old_memcg);

This patch is heavily based on the patch by Johannes Weiner, which can be
found here: https://lkml.org/lkml/2020/5/28/806 .

Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Shakeel Butt &lt;shakeelb@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Dan Schatzberg &lt;dschatzberg@fb.com&gt;
Link: https://lkml.kernel.org/r/20200821212056.3769116-1-guro@fb.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
