<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel, branch v5.12</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.12</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.12'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-04-25T16:10:10Z</updated>
<entry>
<title>Merge tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-04-25T16:10:10Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-25T16:10:10Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0146da0d4cecad571f69f02fe35d75d6dba9723c'/>
<id>urn:sha1:0146da0d4cecad571f69f02fe35d75d6dba9723c</id>
<content type='text'>
Pull locking fix from Borislav Petkov:
 "Fix ordering in the queued writer lock's slowpath"

* tag 'locking_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
</content>
</entry>
<entry>
<title>Merge tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2021-04-25T16:08:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-25T16:08:19Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=682b26bd80f96c2e4da3eb6dcec8bf684b79151c'/>
<id>urn:sha1:682b26bd80f96c2e4da3eb6dcec8bf684b79151c</id>
<content type='text'>
Pull scheduler fix from Borislav Petkov:
 "Fix a typo in a macro ifdeffery"

* tag 'sched_urgent_for_v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  preempt/dynamic: Fix typo in macro conditional statement
</content>
</entry>
<entry>
<title>Merge tag 'trace-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2021-04-20T21:38:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-20T21:38:35Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=1fe5501ba1abf2b7e78295df73675423bd6899a0'/>
<id>urn:sha1:1fe5501ba1abf2b7e78295df73675423bd6899a0</id>
<content type='text'>
Pull tracing fix from Steven Rostedt:
 "Fix tp_printk command line and trace events

  Masami added a wrapper to be able to unhash trace event pointers as
  they are only read by root anyway, and they can also be extracted by
  the raw trace data buffers. But this wrapper utilized the iterator to
  have a temporary buffer to manipulate the text with.

  tp_printk is a kernel command line option that will send the trace
  output of a trace event to the console on boot up (useful when the
  system crashes before finishing the boot). But the code used the same
  wrapper that Masami added, and its iterator did not have a buffer, and
  this caused the system to crash.

  Have the wrapper just print the trace event normally if the iterator
  has no temporary buffer"

* tag 'trace-v5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix checking event hash pointer logic when tp_printk is enabled
</content>
</entry>
<entry>
<title>capabilities: require CAP_SETFCAP to map uid 0</title>
<updated>2021-04-20T21:28:33Z</updated>
<author>
<name>Serge E. Hallyn</name>
<email>serge@hallyn.com</email>
</author>
<published>2021-04-20T13:43:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=db2e718a47984b9d71ed890eb2ea36ecf150de18'/>
<id>urn:sha1:db2e718a47984b9d71ed890eb2ea36ecf150de18</id>
<content type='text'>
cap_setfcap is required to create file capabilities.

Since commit 8db6c34f1dbc ("Introduce v3 namespaced file capabilities"),
a process running as uid 0 but without cap_setfcap is able to work
around this as follows: unshare a new user namespace which maps parent
uid 0 into the child namespace.

While this task will not have new capabilities against the parent
namespace, there is a loophole due to the way namespaced file
capabilities are represented as xattrs.  File capabilities valid in
userns 1 are distinguished from file capabilities valid in userns 2 by
the kuid which underlies uid 0.  Therefore the restricted root process
can unshare a new self-mapping namespace, add a namespaced file
capability onto a file, then use that file capability in the parent
namespace.

To prevent that, do not allow mapping parent uid 0 if the process which
opened the uid_map file does not have CAP_SETFCAP, which is the
capability for setting file capabilities.

As a further wrinkle: a task can unshare its user namespace, then open
its uid_map file itself, and map (only) its own uid.  In this case we do
not have the credential from before unshare, which was potentially more
restricted.  So, when creating a user namespace, we record whether the
creator had CAP_SETFCAP.  Then we can use that during map_write().

With this patch:

1. Unprivileged user can still unshare -Ur

   ubuntu@caps:~$ unshare -Ur
   root@caps:~# logout

2. Root user can still unshare -Ur

   ubuntu@caps:~$ sudo bash
   root@caps:/home/ubuntu# unshare -Ur
   root@caps:/home/ubuntu# logout

3. Root user without CAP_SETFCAP cannot unshare -Ur:

   root@caps:/home/ubuntu# /sbin/capsh --drop=cap_setfcap --
   root@caps:/home/ubuntu# /sbin/setcap cap_setfcap=p /sbin/setcap
   unable to set CAP_SETFCAP effective capability: Operation not permitted
   root@caps:/home/ubuntu# unshare -Ur
   unshare: write failed /proc/self/uid_map: Operation not permitted

Note: an alternative solution would be to allow uid 0 mappings by
processes without CAP_SETFCAP, but to prevent such a namespace from
writing any file capabilities.  This approach can be seen at [1].

Background history: commit 95ebabde382 ("capabilities: Don't allow
writing ambiguous v3 file capabilities") tried to fix the issue by
preventing v3 fscaps to be written to disk when the root uid would map
to the same uid in nested user namespaces.  This led to regressions for
various workloads.  For example, see [2].  Ultimately this is a valid
use-case we have to support meaning we had to revert this change in
3b0c2d3eaa83 ("Revert 95ebabde382c ("capabilities: Don't allow writing
ambiguous v3 file capabilities")").

Link: https://git.kernel.org/pub/scm/linux/kernel/git/sergeh/linux.git/log/?h=2021-04-15/setfcap-nsfscaps-v4 [1]
Link: https://github.com/containers/buildah/issues/3071 [2]
Signed-off-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Reviewed-by: Andrew G. Morgan &lt;morgan@kernel.org&gt;
Tested-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Tested-by: Giuseppe Scrivano &lt;gscrivan@redhat.com&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>tracing: Fix checking event hash pointer logic when tp_printk is enabled</title>
<updated>2021-04-20T14:56:58Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-04-19T18:23:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0e1e71d34901a633825cd5ae78efaf8abd9215c6'/>
<id>urn:sha1:0e1e71d34901a633825cd5ae78efaf8abd9215c6</id>
<content type='text'>
Pointers in events that are printed are unhashed if the flags allow it,
and the logic to do so is called before processing the event output from
the raw ring buffer. In most cases, this is done when a user reads one of
the trace files.

But if tp_printk is added on the kernel command line, this logic is done
for trace events when they are triggered, and their output goes out via
printk. The unhash logic (and even the validation of the output) did not
support the tp_printk output, and would crash.

Link: https://lore.kernel.org/linux-tegra/9835d9f1-8d3a-3440-c53f-516c2606ad07@nvidia.com/

Fixes: efbbdaa22bb7 ("tracing: Show real address for trace event arguments")
Reported-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Tested-by: Jon Hunter &lt;jonathanh@nvidia.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>Revert "gcov: clang: fix clang-11+ build"</title>
<updated>2021-04-19T22:08:49Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-19T22:08:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7af08140979a6e7e12b78c93b8625c8d25b084e2'/>
<id>urn:sha1:7af08140979a6e7e12b78c93b8625c8d25b084e2</id>
<content type='text'>
This reverts commit 04c53de57cb6435738961dace8b1b71d3ecd3c39.

Nathan Chancellor points out that it should not have been merged into
mainline by itself. It was a fix for "gcov: use kvmalloc()", which is
still in -mm/-next. Merging it alone has broken the build.

Link: https://github.com/ClangBuiltLinux/continuous-integration2/runs/2384465683?check_suite_focus=true
Reported-by: Nathan Chancellor &lt;nathan@kernel.org&gt;
Cc: Johannes Berg &lt;johannes.berg@intel.com&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>preempt/dynamic: Fix typo in macro conditional statement</title>
<updated>2021-04-19T18:02:57Z</updated>
<author>
<name>Zhouyi Zhou</name>
<email>zhouzhouyi@gmail.com</email>
</author>
<published>2021-04-10T07:35:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0c89d87d1d43d9fa268d1dc489518564d58bf497'/>
<id>urn:sha1:0c89d87d1d43d9fa268d1dc489518564d58bf497</id>
<content type='text'>
Commit 40607ee97e4e ("preempt/dynamic: Provide irqentry_exit_cond_resched()
static call") tried to provide irqentry_exit_cond_resched() static call
in irqentry_exit, but has a typo in macro conditional statement.

Fixes: 40607ee97e4e ("preempt/dynamic: Provide irqentry_exit_cond_resched() static call")
Signed-off-by: Zhouyi Zhou &lt;zhouzhouyi@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20210410073523.5493-1-zhouzhouyi@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2021-04-17T16:57:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-04-17T16:57:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=88a5af943985fb43b4c9472b5abd9c0b9705533d'/>
<id>urn:sha1:88a5af943985fb43b4c9472b5abd9c0b9705533d</id>
<content type='text'>
Pull networking fixes from Jakub Kicinski:
 "Networking fixes for 5.12-rc8, including fixes from netfilter, and
  bpf. BPF verifier changes stand out, otherwise things have slowed
  down.

  Current release - regressions:

   - gro: ensure frag0 meets IP header alignment

   - Revert "net: stmmac: re-init rx buffers when mac resume back"

   - ethernet: macb: fix the restore of cmp registers

  Previous releases - regressions:

   - ixgbe: Fix NULL pointer dereference in ethtool loopback test

   - ixgbe: fix unbalanced device enable/disable in suspend/resume

   - phy: marvell: fix detection of PHY on Topaz switches

   - make tcp_allowed_congestion_control readonly in non-init netns

   - xen-netback: Check for hotplug-status existence before watching

  Previous releases - always broken:

   - bpf: mitigate a speculative oob read of up to map value size by
     tightening the masking window

   - sctp: fix race condition in sctp_destroy_sock

   - sit, ip6_tunnel: Unregister catch-all devices

   - netfilter: nftables: clone set element expression template

   - netfilter: flowtable: fix NAT IPv6 offload mangling

   - net: geneve: check skb is large enough for IPv4/IPv6 header

   - netlink: don't call -&gt;netlink_bind with table lock held"

* tag 'net-5.12-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (52 commits)
  netlink: don't call -&gt;netlink_bind with table lock held
  MAINTAINERS: update my email
  bpf: Update selftests to reflect new error states
  bpf: Tighten speculative pointer arithmetic mask
  bpf: Move sanitize_val_alu out of op switch
  bpf: Refactor and streamline bounds check into helper
  bpf: Improve verifier error messages for users
  bpf: Rework ptr_limit into alu_limit and add common error path
  bpf: Ensure off_reg has no mixed signed bounds for all types
  bpf: Move off_reg into sanitize_ptr_alu
  bpf: Use correct permission flag for mixed signed bounds arithmetic
  ch_ktls: do not send snd_una update to TCB in middle
  ch_ktls: tcb close causes tls connection failure
  ch_ktls: fix device connection close
  ch_ktls: Fix kernel panic
  i40e: fix the panic when running bpf in xdpdrv mode
  net/mlx5e: fix ingress_ifindex check in mlx5e_flower_parse_meta
  net/mlx5e: Fix setting of RS FEC mode
  net/mlx5: Fix setting of devlink traps in switchdev mode
  Revert "net: stmmac: re-init rx buffers when mac resume back"
  ...
</content>
</entry>
<entry>
<title>locking/qrwlock: Fix ordering in queued_write_lock_slowpath()</title>
<updated>2021-04-17T11:40:50Z</updated>
<author>
<name>Ali Saidi</name>
<email>alisaidi@amazon.com</email>
</author>
<published>2021-04-15T17:27:11Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896'/>
<id>urn:sha1:84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896</id>
<content type='text'>
While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock.  The writer side loops
checking the value with the atomic_cond_read_acquire(), but only truly
acquires the lock when the compare-and-exchange is completed
successfully which isn’t ordered. This exposes the window between the
acquire and the cmpxchg to an A-B-A problem which allows reads
following the lock acquisition to observe values speculatively before
the write lock is truly acquired.

We've seen a problem in epoll where the reader does a xchg while
holding the read lock, but the writer can see a value change out from
under it.

  Writer                                | Reader
  --------------------------------------------------------------------------------
  ep_scan_ready_list()                  |
  |- write_lock_irq()                   |
      |- queued_write_lock_slowpath()   |
	|- atomic_cond_read_acquire()   |
				        | read_lock_irqsave(&amp;ep-&gt;lock, flags);
     --&gt; (observes value before unlock) |  chain_epi_lockless()
     |                                  |    epi-&gt;next = xchg(&amp;ep-&gt;ovflist, epi);
     |                                  | read_unlock_irqrestore(&amp;ep-&gt;lock, flags);
     |                                  |
     |     atomic_cmpxchg_relaxed()     |
     |-- READ_ONCE(ep-&gt;ovflist);        |

A core can order the read of the ovflist ahead of the
atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire
semantics addresses this issue at which point the atomic_cond_read can
be switched to use relaxed semantics.

Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock")
Signed-off-by: Ali Saidi &lt;alisaidi@amazon.com&gt;
[peterz: use try_cmpxchg()]
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Steve Capper &lt;steve.capper@arm.com&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Acked-by: Waiman Long &lt;longman@redhat.com&gt;
Tested-by: Steve Capper &lt;steve.capper@arm.com&gt;
</content>
</entry>
<entry>
<title>gcov: clang: fix clang-11+ build</title>
<updated>2021-04-16T23:10:37Z</updated>
<author>
<name>Johannes Berg</name>
<email>johannes.berg@intel.com</email>
</author>
<published>2021-04-16T22:46:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=04c53de57cb6435738961dace8b1b71d3ecd3c39'/>
<id>urn:sha1:04c53de57cb6435738961dace8b1b71d3ecd3c39</id>
<content type='text'>
With clang-11+, the code is broken due to my kvmalloc() conversion
(which predated the clang-11 support code) leaving one vmalloc() in
place.  Fix that.

Link: https://lkml.kernel.org/r/20210412214210.6e1ecca9cdc5.I24459763acf0591d5e6b31c7e3a59890d802f79c@changeid
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Reviewed-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Tested-by: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
