<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/fs/select.c, branch v5.11</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.11</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.11'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-01-08T19:06:29Z</updated>
<entry>
<title>poll: fix performance regression due to out-of-line __put_user()</title>
<updated>2021-01-08T19:06:29Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-01-07T17:43:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ef0ba05538299f1391cbe097de36895bb36ecfe6'/>
<id>urn:sha1:ef0ba05538299f1391cbe097de36895bb36ecfe6</id>
<content type='text'>
The kernel test robot reported a -5.8% performance regression on the
"poll2" test of will-it-scale, and bisected it to commit d55564cfc222
("x86: Make __put_user() generate an out-of-line call").

I didn't expect an out-of-line __put_user() to matter, because no normal
core code should use that non-checking legacy version of user access any
more.  But I had overlooked the very odd poll() usage, which does a
__put_user() to update the 'revents' values of the poll array.

Now, Al Viro correctly points out that instead of updating just the
'revents' field, it would be much simpler to just copy the _whole_
pollfd entry, and then we could just use "copy_to_user()" on the whole
array of entries, the same way we use "copy_from_user()" a few lines
earlier to get the original values.

But that is not what we've traditionally done, and I worry that threaded
applications might be concurrently modifying the other fields of the
pollfd array.  So while Al's suggestion is simpler - and perhaps worth
trying in the future - this instead keeps the "just update revents"
model.

To fix the performance regression, use the modern "unsafe_put_user()"
instead of __put_user(), with the proper "user_write_access_begin()"
guarding in place. This improves code generation enormously.

Link: https://lore.kernel.org/lkml/20210107134723.GA28532@xsang-OptiPlex-9020/
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Tested-by: Oliver Sang &lt;oliver.sang@intel.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: David Laight &lt;David.Laight@aculab.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs: Replace zero-length array with flexible-array member</title>
<updated>2020-10-29T22:22:59Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavoars@kernel.org</email>
</author>
<published>2020-08-31T13:25:42Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5e01fdff04b7f7c3b8d456c11c8a9f978b4ddf65'/>
<id>urn:sha1:5e01fdff04b7f7c3b8d456c11c8a9f978b4ddf65</id>
<content type='text'>
There is a regular need in the kernel to provide a way to declare having a
dynamically sized set of trailing elements in a structure. Kernel code should
always use “flexible array members”[1] for these cases. The older style of
one-element or zero-length arrays should no longer be used[2].

[1] https://en.wikipedia.org/wiki/Flexible_array_member
[2] https://www.kernel.org/doc/html/v5.9-rc1/process/deprecated.html#zero-length-and-one-element-arrays

Signed-off-by: Gustavo A. R. Silva &lt;gustavoars@kernel.org&gt;
</content>
</entry>
<entry>
<title>pselect6() and friends: take handling the combined 6th/7th args into helper</title>
<updated>2020-05-29T23:10:42Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2020-02-19T14:54:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7e71609f64ec81b8367b7fa59ab06bb571d17e3b'/>
<id>urn:sha1:7e71609f64ec81b8367b7fa59ab06bb571d17e3b</id>
<content type='text'>
... and use unsafe_get_user(), while we are at it.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>y2038: syscalls: change remaining timeval to __kernel_old_timeval</title>
<updated>2019-11-15T13:38:29Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2019-10-25T20:56:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=75d319c06e6a76f67549c0ae1007dc3167804f4e'/>
<id>urn:sha1:75d319c06e6a76f67549c0ae1007dc3167804f4e</id>
<content type='text'>
All of the remaining syscalls that pass a timeval (gettimeofday, utime,
futimesat) can trivially be changed to pass a __kernel_old_timeval
instead, which has a compatible layout, but avoids ambiguity with
the timeval type in user space.

Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Acked-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>fs/select.c: use struct_size() in kmalloc()</title>
<updated>2019-07-17T02:23:25Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavo@embeddedor.com</email>
</author>
<published>2019-07-16T23:30:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=43e11fa2d1d3b6e35629fa556eb7d571edba2010'/>
<id>urn:sha1:43e11fa2d1d3b6e35629fa556eb7d571edba2010</id>
<content type='text'>
One of the more common cases of allocation size calculations is finding
the size of a structure that has a zero-sized array at the end, along
with memory for some number of elements for that array.  For example:

  struct foo {
       int stuff;
       struct boo entry[];
  };

  size = sizeof(struct foo) + count * sizeof(struct boo);
  instance = kmalloc(size, GFP_KERNEL);

Instead of leaving these open-coded and prone to type mistakes, we can now
use the new struct_size() helper:

  instance = kmalloc(struct_size(instance, entry, count), GFP_KERNEL);

Also, notice that variable size is unnecessary, hence it is removed.

This code was detected with the help of Coccinelle.

Link: http://lkml.kernel.org/r/20190604164226.GA13823@embeddedor
Signed-off-by: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>select: shift restore_saved_sigmask_unless() into poll_select_copy_remaining()</title>
<updated>2019-07-17T02:23:24Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2019-07-16T23:29:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ac301020627e258a304f40cab5b35b6814a6f033'/>
<id>urn:sha1:ac301020627e258a304f40cab5b35b6814a6f033</id>
<content type='text'>
Now that restore_saved_sigmask_unless() is always called with the same
argument right before poll_select_copy_remaining() we can move it into
poll_select_copy_remaining() and make it the only caller of restore() in
fs/select.c.

The patch also renames poll_select_copy_remaining(),
poll_select_finish() looks better after this change.

kern_select() doesn't use set_user_sigmask(), so in this case
poll_select_finish() does restore_saved_sigmask_unless() "for no
reason".  But this won't hurt, and WARN_ON(!TIF_SIGPENDING) is still
valid.

Link: http://lkml.kernel.org/r/20190606140915.GC13440@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: David Laight &lt;David.Laight@aculab.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Eric Wong &lt;e@80x24.org&gt;
Cc: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>select: change do_poll() to return -ERESTARTNOHAND rather than -EINTR</title>
<updated>2019-07-17T02:23:24Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2019-07-16T23:29:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8cf8b5539a414da3257db6d121bcee2d883135cb'/>
<id>urn:sha1:8cf8b5539a414da3257db6d121bcee2d883135cb</id>
<content type='text'>
do_poll() returns -EINTR if interrupted and after that all its callers
have to translate it into -ERESTARTNOHAND.  Change do_poll() to return
-ERESTARTNOHAND and update (simplify) the callers.

Note that this also unifies all users of restore_saved_sigmask_unless(),
see the next patch.

Linus:

: The *right* return value will actually be then chosen by
: poll_select_copy_remaining(), which will turn ERESTARTNOHAND to EINTR
: when it can't update the timeout.
:
: Except for the cases that use restart_block and do that instead and
: don't have the whole timeout restart issue as a result.

Link: http://lkml.kernel.org/r/20190606140852.GB13440@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: David Laight &lt;David.Laight@aculab.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Eric Wong &lt;e@80x24.org&gt;
Cc: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>signal: simplify set_user_sigmask/restore_user_sigmask</title>
<updated>2019-07-17T02:23:24Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2019-07-16T23:29:53Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b772434be0891ed1081a08ae7cfd4666728f8e82'/>
<id>urn:sha1:b772434be0891ed1081a08ae7cfd4666728f8e82</id>
<content type='text'>
task-&gt;saved_sigmask and -&gt;restore_sigmask are only used in the ret-from-
syscall paths.  This means that set_user_sigmask() can save -&gt;blocked in
-&gt;saved_sigmask and do set_restore_sigmask() to indicate that -&gt;blocked
was modified.

This way the callers do not need 2 sigset_t's passed to set/restore and
restore_user_sigmask() renamed to restore_saved_sigmask_unless() turns
into the trivial helper which just calls restore_saved_sigmask().

Link: http://lkml.kernel.org/r/20190606113206.GA9464@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Eric Wong &lt;e@80x24.org&gt;
Cc: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: David Laight &lt;David.Laight@aculab.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>signal: remove the wrong signal_pending() check in restore_user_sigmask()</title>
<updated>2019-06-29T08:43:45Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2019-06-28T19:06:50Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=97abc889ee296faf95ca0e978340fb7b942a3e32'/>
<id>urn:sha1:97abc889ee296faf95ca0e978340fb7b942a3e32</id>
<content type='text'>
This is the minimal fix for stable, I'll send cleanups later.

Commit 854a6ed56839 ("signal: Add restore_user_sigmask()") introduced
the visible change which breaks user-space: a signal temporary unblocked
by set_user_sigmask() can be delivered even if the caller returns
success or timeout.

Change restore_user_sigmask() to accept the additional "interrupted"
argument which should be used instead of signal_pending() check, and
update the callers.

Eric said:

: For clarity.  I don't think this is required by posix, or fundamentally to
: remove the races in select.  It is what linux has always done and we have
: applications who care so I agree this fix is needed.
:
: Further in any case where the semantic change that this patch rolls back
: (aka where allowing a signal to be delivered and the select like call to
: complete) would be advantage we can do as well if not better by using
: signalfd.
:
: Michael is there any chance we can get this guarantee of the linux
: implementation of pselect and friends clearly documented.  The guarantee
: that if the system call completes successfully we are guaranteed that no
: signal that is unblocked by using sigmask will be delivered?

Link: http://lkml.kernel.org/r/20190604134117.GA29963@redhat.com
Fixes: 854a6ed56839a40f6b5d02a2962f48841482eec4 ("signal: Add restore_user_sigmask()")
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Eric Wong &lt;e@80x24.org&gt;
Tested-by: Eric Wong &lt;e@80x24.org&gt;
Acked-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Acked-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Acked-by: Deepa Dinamani &lt;deepa.kernel@gmail.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Jason Baron &lt;jbaron@akamai.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: David Laight &lt;David.Laight@ACULAB.COM&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[5.0+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>y2038: syscalls: rename y2038 compat syscalls</title>
<updated>2019-02-06T23:13:27Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2019-01-06T23:33:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8dabe7245bbc134f2cfcc12cde75c019dab924cc'/>
<id>urn:sha1:8dabe7245bbc134f2cfcc12cde75c019dab924cc</id>
<content type='text'>
A lot of system calls that pass a time_t somewhere have an implementation
using a COMPAT_SYSCALL_DEFINEx() on 64-bit architectures, and have
been reworked so that this implementation can now be used on 32-bit
architectures as well.

The missing step is to redefine them using the regular SYSCALL_DEFINEx()
to get them out of the compat namespace and make it possible to build them
on 32-bit architectures.

Any system call that ends in 'time' gets a '32' suffix on its name for
that version, while the others get a '_time32' suffix, to distinguish
them from the normal version, which takes a 64-bit time argument in the
future.

In this step, only 64-bit architectures are changed, doing this rename
first lets us avoid touching the 32-bit architectures twice.

Acked-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
</feed>
