<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/arch/x86/kernel, branch for-next</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=for-next</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=for-next'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2025-01-14T22:16:36Z</updated>
<entry>
<title>x86/fred: Fix the FRED RSP0 MSR out of sync with its per-CPU cache</title>
<updated>2025-01-14T22:16:36Z</updated>
<author>
<name>Xin Li (Intel)</name>
<email>xin@zytor.com</email>
</author>
<published>2025-01-10T17:46:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=de31b3cd706347044e1a57d68c3a683d58e8cca4'/>
<id>urn:sha1:de31b3cd706347044e1a57d68c3a683d58e8cca4</id>
<content type='text'>
The FRED RSP0 MSR is only used for delivering events when running
userspace.  Linux leverages this property to reduce expensive MSR
writes and optimize context switches.  The kernel only writes the
MSR when about to run userspace *and* when the MSR has actually
changed since the last time userspace ran.

This optimization is implemented by maintaining a per-CPU cache of
FRED RSP0 and then checking that against the value for the top of
current task stack before running userspace.

However cpu_init_fred_exceptions() writes the MSR without updating
the per-CPU cache.  This means that the kernel might return to
userspace with MSR_IA32_FRED_RSP0==0 when it needed to point to the
top of current task stack.  This would induce a double fault (#DF),
which is bad.

A context switch after cpu_init_fred_exceptions() can paper over
the issue since it updates the cached value.  That evidently
happens most of the time explaining how this bug got through.

Fix the bug through resynchronizing the FRED RSP0 MSR with its
per-CPU cache in cpu_init_fred_exceptions().

Fixes: fe85ee391966 ("x86/entry: Set FRED RSP0 on return to userspace instead of context switch")
Signed-off-by: Xin Li (Intel) &lt;xin@zytor.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250110174639.1250829-1-xin%40zytor.com
</content>
</entry>
<entry>
<title>x86/fpu: Ensure shadow stack is active before "getting" registers</title>
<updated>2025-01-07T23:55:51Z</updated>
<author>
<name>Rick Edgecombe</name>
<email>rick.p.edgecombe@intel.com</email>
</author>
<published>2025-01-07T23:30:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=a9d9c33132d49329ada647e4514d210d15e31d81'/>
<id>urn:sha1:a9d9c33132d49329ada647e4514d210d15e31d81</id>
<content type='text'>
The x86 shadow stack support has its own set of registers. Those registers
are XSAVE-managed, but they are "supervisor state components" which means
that userspace can not touch them with XSAVE/XRSTOR.  It also means that
they are not accessible from the existing ptrace ABI for XSAVE state.
Thus, there is a new ptrace get/set interface for it.

The regset code that ptrace uses provides an -&gt;active() handler in
addition to the get/set ones. For shadow stack this -&gt;active() handler
verifies that shadow stack is enabled via the ARCH_SHSTK_SHSTK bit in the
thread struct. The -&gt;active() handler is checked from some call sites of
the regset get/set handlers, but not the ptrace ones. This was not
understood when shadow stack support was put in place.

As a result, both the set/get handlers can be called with
XFEATURE_CET_USER in its init state, which would cause get_xsave_addr() to
return NULL and trigger a WARN_ON(). The ssp_set() handler luckily has an
ssp_active() check to avoid surprising the kernel with shadow stack
behavior when the kernel is not ready for it (ARCH_SHSTK_SHSTK==0). That
check just happened to avoid the warning.

But the -&gt;get() side wasn't so lucky. It can be called with shadow stacks
disabled, triggering the warning in practice, as reported by Christina
Schimpe:

WARNING: CPU: 5 PID: 1773 at arch/x86/kernel/fpu/regset.c:198 ssp_get+0x89/0xa0
[...]
Call Trace:
&lt;TASK&gt;
? show_regs+0x6e/0x80
? ssp_get+0x89/0xa0
? __warn+0x91/0x150
? ssp_get+0x89/0xa0
? report_bug+0x19d/0x1b0
? handle_bug+0x46/0x80
? exc_invalid_op+0x1d/0x80
? asm_exc_invalid_op+0x1f/0x30
? __pfx_ssp_get+0x10/0x10
? ssp_get+0x89/0xa0
? ssp_get+0x52/0xa0
__regset_get+0xad/0xf0
copy_regset_to_user+0x52/0xc0
ptrace_regset+0x119/0x140
ptrace_request+0x13c/0x850
? wait_task_inactive+0x142/0x1d0
? do_syscall_64+0x6d/0x90
arch_ptrace+0x102/0x300
[...]

Ensure that shadow stacks are active in a thread before looking them up
in the XSAVE buffer. Since ARCH_SHSTK_SHSTK and user_ssp[SHSTK_EN] are
set at the same time, the active check ensures that there will be
something to find in the XSAVE buffer.

[ dhansen: changelog/subject tweaks ]

Fixes: 2fab02b25ae7 ("x86: Add PTRACE interface for shadow stack")
Reported-by: Christina Schimpe &lt;christina.schimpe@intel.com&gt;
Signed-off-by: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Tested-by: Christina Schimpe &lt;christina.schimpe@intel.com&gt;
Cc:stable@vger.kernel.org
Link: https://lore.kernel.org/all/20250107233056.235536-1-rick.p.edgecombe%40intel.com
</content>
</entry>
<entry>
<title>x86/static-call: Remove early_boot_irqs_disabled check to fix Xen PVH dom0</title>
<updated>2025-01-02T16:11:29Z</updated>
<author>
<name>Andrew Cooper</name>
<email>andrew.cooper3@citrix.com</email>
</author>
<published>2024-12-21T21:10:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5cc2db37124bb33914996d6fdbb2ddb3811f2945'/>
<id>urn:sha1:5cc2db37124bb33914996d6fdbb2ddb3811f2945</id>
<content type='text'>
__static_call_update_early() has a check for early_boot_irqs_disabled, but
is used before early_boot_irqs_disabled is set up in start_kernel().

Xen PV has always special cased early_boot_irqs_disabled, but Xen PVH does
not and falls over the BUG when booting as dom0.

It is very suspect that early_boot_irqs_disabled starts as 0, becomes 1 for
a time, then becomes 0 again, but as this needs backporting to fix a
breakage in a security fix, dropping the BUG_ON() is the far safer option.

Fixes: 0ef8047b737d ("x86/static-call: provide a way to do very early static-call updates")
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=219620
Reported-by: Alex Zenla &lt;alex@edera.dev&gt;
Suggested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Signed-off-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Reviewed-by: Juergen Gross &lt;jgross@suse.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Alex Zenla &lt;alex@edera.dev&gt;
Link: https://lore.kernel.org/r/20241221211046.6475-1-andrew.cooper3@citrix.com
</content>
</entry>
<entry>
<title>x86/fred: Clear WFE in missing-ENDBRANCH #CPs</title>
<updated>2024-12-29T09:18:10Z</updated>
<author>
<name>Xin Li (Intel)</name>
<email>xin@zytor.com</email>
</author>
<published>2024-11-13T17:59:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dc81e556f2a017d681251ace21bf06c126d5a192'/>
<id>urn:sha1:dc81e556f2a017d681251ace21bf06c126d5a192</id>
<content type='text'>
An indirect branch instruction sets the CPU indirect branch tracker
(IBT) into WAIT_FOR_ENDBRANCH (WFE) state and WFE stays asserted
across the instruction boundary.  When the decoder finds an
inappropriate instruction while WFE is set ENDBR, the CPU raises a #CP
fault.

For the "kernel IBT no ENDBR" selftest where #CPs are deliberately
triggered, the WFE state of the interrupted context needs to be
cleared to let execution continue.  Otherwise when the CPU resumes
from the instruction that just caused the previous #CP, another
missing-ENDBRANCH #CP is raised and the CPU enters a dead loop.

This is not a problem with IDT because it doesn't preserve WFE and
IRET doesn't set WFE.  But FRED provides space on the entry stack
(in an expanded CS area) to save and restore the WFE state, thus the
WFE state is no longer clobbered, so software must clear it.

Clear WFE to avoid dead looping in ibt_clear_fred_wfe() and the
!ibt_fatal code path when execution is allowed to continue.

Clobbering WFE in any other circumstance is a security-relevant bug.

[ dhansen: changelog rewording ]

Fixes: a5f6c2ace997 ("x86/shstk: Add user control-protection fault handler")
Signed-off-by: Xin Li (Intel) &lt;xin@zytor.com&gt;
Signed-off-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20241113175934.3897541-1-xin%40zytor.com
</content>
</entry>
<entry>
<title>Merge tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux</title>
<updated>2024-12-18T17:55:55Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-12-18T17:55:55Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=37cb0c76ac6c3bf3d82d25a7c95950f2dac3ca41'/>
<id>urn:sha1:37cb0c76ac6c3bf3d82d25a7c95950f2dac3ca41</id>
<content type='text'>
Pull hyperv fixes from Wei Liu:

 - Various fixes to Hyper-V tools in the kernel tree (Dexuan Cui, Olaf
   Hering, Vitaly Kuznetsov)

 - Fix a bug in the Hyper-V TSC page based sched_clock() (Naman Jain)

 - Two bug fixes in the Hyper-V utility functions (Michael Kelley)

 - Convert open-coded timeouts to secs_to_jiffies() in Hyper-V drivers
   (Easwar Hariharan)

* tag 'hyperv-fixes-signed-20241217' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  tools/hv: reduce resource usage in hv_kvp_daemon
  tools/hv: add a .gitignore file
  tools/hv: reduce resouce usage in hv_get_dns_info helper
  hv/hv_kvp_daemon: Pass NIC name to hv_get_dns_info as well
  Drivers: hv: util: Avoid accessing a ringbuffer not initialized yet
  Drivers: hv: util: Don't force error code to ENODEV in util_probe()
  tools/hv: terminate fcopy daemon if read from uio fails
  drivers: hv: Convert open-coded timeouts to secs_to_jiffies()
  tools: hv: change permissions of NetworkManager configuration file
  x86/hyperv: Fix hv tsc page based sched_clock for hibernation
  tools: hv: Fix a complier warning in the fcopy uio daemon
</content>
</entry>
<entry>
<title>x86/xen: remove hypercall page</title>
<updated>2024-12-17T07:23:42Z</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2024-10-17T13:27:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7fa0da5373685e7ed249af3fa317ab1e1ba8b0a6'/>
<id>urn:sha1:7fa0da5373685e7ed249af3fa317ab1e1ba8b0a6</id>
<content type='text'>
The hypercall page is no longer needed. It can be removed, as from the
Xen perspective it is optional.

But, from Linux's perspective, it removes naked RET instructions that
escape the speculative protections that Call Depth Tracking and/or
Untrain Ret are trying to achieve.

This is part of XSA-466 / CVE-2024-53241.

Reported-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Reviewed-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Reviewed-by: Jan Beulich &lt;jbeulich@suse.com&gt;
</content>
</entry>
<entry>
<title>x86/static-call: provide a way to do very early static-call updates</title>
<updated>2024-12-13T08:28:32Z</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2024-11-29T15:15:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=0ef8047b737d7480a5d4c46d956e97c190f13050'/>
<id>urn:sha1:0ef8047b737d7480a5d4c46d956e97c190f13050</id>
<content type='text'>
Add static_call_update_early() for updating static-call targets in
very early boot.

This will be needed for support of Xen guest type specific hypercall
functions.

This is part of XSA-466 / CVE-2024-53241.

Reported-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
Co-developed-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Co-developed-by: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
</content>
</entry>
<entry>
<title>x86: make get_cpu_vendor() accessible from Xen code</title>
<updated>2024-12-13T08:28:10Z</updated>
<author>
<name>Juergen Gross</name>
<email>jgross@suse.com</email>
</author>
<published>2024-10-17T06:29:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=efbcd61d9bebb771c836a3b8bfced8165633db7c'/>
<id>urn:sha1:efbcd61d9bebb771c836a3b8bfced8165633db7c</id>
<content type='text'>
In order to be able to differentiate between AMD and Intel based
systems for very early hypercalls without having to rely on the Xen
hypercall page, make get_cpu_vendor() non-static.

Refactor early_cpu_init() for the same reason by splitting out the
loop initializing cpu_devs() into an externally callable function.

This is part of XSA-466 / CVE-2024-53241.

Reported-by: Andrew Cooper &lt;andrew.cooper3@citrix.com&gt;
Signed-off-by: Juergen Gross &lt;jgross@suse.com&gt;
</content>
</entry>
<entry>
<title>x86/hyperv: Fix hv tsc page based sched_clock for hibernation</title>
<updated>2024-12-09T18:42:42Z</updated>
<author>
<name>Naman Jain</name>
<email>namjain@linux.microsoft.com</email>
</author>
<published>2024-09-17T05:39:17Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=bcc80dec91ee745b3d66f3e48f0ec2efdea97149'/>
<id>urn:sha1:bcc80dec91ee745b3d66f3e48f0ec2efdea97149</id>
<content type='text'>
read_hv_sched_clock_tsc() assumes that the Hyper-V clock counter is
bigger than the variable hv_sched_clock_offset, which is cached during
early boot, but depending on the timing this assumption may be false
when a hibernated VM starts again (the clock counter starts from 0
again) and is resuming back (Note: hv_init_tsc_clocksource() is not
called during hibernation/resume); consequently,
read_hv_sched_clock_tsc() may return a negative integer (which is
interpreted as a huge positive integer since the return type is u64)
and new kernel messages are prefixed with huge timestamps before
read_hv_sched_clock_tsc() grows big enough (which typically takes
several seconds).

Fix the issue by saving the Hyper-V clock counter just before the
suspend, and using it to correct the hv_sched_clock_offset in
resume. This makes hv tsc page based sched_clock continuous and ensures
that post resume, it starts from where it left off during suspend.
Override x86_platform.save_sched_clock_state and
x86_platform.restore_sched_clock_state routines to correct this as soon
as possible.

Note: if Invariant TSC is available, the issue doesn't happen because
1) we don't register read_hv_sched_clock_tsc() for sched clock:
See commit e5313f1c5404 ("clocksource/drivers/hyper-v: Rework
clocksource and sched clock setup");
2) the common x86 code adjusts TSC similarly: see
__restore_processor_state() -&gt;  tsc_verify_tsc_adjust(true) and
x86_platform.restore_sched_clock_state().

Cc: stable@vger.kernel.org
Fixes: 1349401ff1aa ("clocksource/drivers/hyper-v: Suspend/resume Hyper-V clocksource for hibernation")
Co-developed-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Signed-off-by: Dexuan Cui &lt;decui@microsoft.com&gt;
Signed-off-by: Naman Jain &lt;namjain@linux.microsoft.com&gt;
Reviewed-by: Michael Kelley &lt;mhklinux@outlook.com&gt;
Link: https://lore.kernel.org/r/20240917053917.76787-1-namjain@linux.microsoft.com
Signed-off-by: Wei Liu &lt;wei.liu@kernel.org&gt;
Message-ID: &lt;20240917053917.76787-1-namjain@linux.microsoft.com&gt;
</content>
</entry>
<entry>
<title>x86: Fix build regression with CONFIG_KEXEC_JUMP enabled</title>
<updated>2024-12-09T18:13:28Z</updated>
<author>
<name>Damien Le Moal</name>
<email>dlemoal@kernel.org</email>
</author>
<published>2024-12-08T23:53:32Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=aeb68937614f4aeceaaa762bd7f0212ce842b797'/>
<id>urn:sha1:aeb68937614f4aeceaaa762bd7f0212ce842b797</id>
<content type='text'>
Build 6.13-rc12 for x86_64 with gcc 14.2.1 fails with the error:

  ld: vmlinux.o: in function `virtual_mapped':
  linux/arch/x86/kernel/relocate_kernel_64.S:249:(.text+0x5915b): undefined reference to `saved_context_gdt_desc'

when CONFIG_KEXEC_JUMP is enabled.

This was introduced by commit 07fa619f2a40 ("x86/kexec: Restore GDT on
return from ::preserve_context kexec") which introduced a use of
saved_context_gdt_desc without a declaration for it.

Fix that by including asm/asm-offsets.h where saved_context_gdt_desc
is defined (indirectly in include/generated/asm-offsets.h which
asm/asm-offsets.h includes).

Fixes: 07fa619f2a40 ("x86/kexec: Restore GDT on return from ::preserve_context kexec")
Signed-off-by: Damien Le Moal &lt;dlemoal@kernel.org&gt;
Acked-by: Borislav Petkov (AMD) &lt;bp@alien8.de&gt;
Acked-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Closes: https://lore.kernel.org/oe-kbuild-all/202411270006.ZyyzpYf8-lkp@intel.com/
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
