<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/tools/objtool, branch v6.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2023-04-28T21:02:54Z</updated>
<entry>
<title>Merge tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-04-28T21:02:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-28T21:02:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2aff7c706c7483f4895ca250c92c1d71e45b6e82'/>
<id>urn:sha1:2aff7c706c7483f4895ca250c92c1d71e45b6e82</id>
<content type='text'>
Pull objtool updates from Ingo Molnar:

 - Mark arch_cpu_idle_dead() __noreturn, make all architectures &amp;
   drivers that did this inconsistently follow this new, common
   convention, and fix all the fallout that objtool can now detect
   statically

 - Fix/improve the ORC unwinder becoming unreliable due to
   UNWIND_HINT_EMPTY ambiguity, split it into UNWIND_HINT_END_OF_STACK
   and UNWIND_HINT_UNDEFINED to resolve it

 - Fix noinstr violations in the KCSAN code and the lkdtm/stackleak code

 - Generate ORC data for __pfx code

 - Add more __noreturn annotations to various kernel startup/shutdown
   and panic functions

 - Misc improvements &amp; fixes

* tag 'objtool-core-2023-04-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  x86/hyperv: Mark hv_ghcb_terminate() as noreturn
  scsi: message: fusion: Mark mpt_halt_firmware() __noreturn
  x86/cpu: Mark {hlt,resume}_play_dead() __noreturn
  btrfs: Mark btrfs_assertfail() __noreturn
  objtool: Include weak functions in global_noreturns check
  cpu: Mark nmi_panic_self_stop() __noreturn
  cpu: Mark panic_smp_self_stop() __noreturn
  arm64/cpu: Mark cpu_park_loop() and friends __noreturn
  x86/head: Mark *_start_kernel() __noreturn
  init: Mark start_kernel() __noreturn
  init: Mark [arch_call_]rest_init() __noreturn
  objtool: Generate ORC data for __pfx code
  x86/linkage: Fix padding for typed functions
  objtool: Separate prefix code from stack validation code
  objtool: Remove superfluous dead_end_function() check
  objtool: Add symbol iteration helpers
  objtool: Add WARN_INSN()
  scripts/objdump-func: Support multiple functions
  context_tracking: Fix KCSAN noinstr violation
  objtool: Add stackleak instrumentation to uaccess safe list
  ...
</content>
</entry>
<entry>
<title>Merge tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux</title>
<updated>2023-04-26T16:13:44Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-26T16:13:44Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=85d7ab2463822a4ab096c0b7b59feec962552572'/>
<id>urn:sha1:85d7ab2463822a4ab096c0b7b59feec962552572</id>
<content type='text'>
Pull btrfs updates from David Sterba:
 "Mostly core changes and cleanups, some notable fixes and two
  performance improvements in directory logging.

  The IO path cleanups are removing or refactoring old code, scrub main
  loop has been completely rewritten also refactoring old code.

  There are some changes to non-btrfs code, mostly trivial, the cgroup
  punt bio logic is only moved from generic code.

  Performance improvements:

   - improve logging changes in a directory during one transaction,
     avoid iterating over items and reduce lock contention (fsync time
     4x lower)

   - when logging directory entries during one transaction, reduce
     locking of subvolume trees by checking tree-log instead
     (improvement in throughput and latency for concurrent access to a
     subvolume)

  Notable fixes:

   - dev-replace:
      - properly honor read mode when requested to avoid reading from
        source device
      - target device won't be used for eventual read repair, this is
        unreliable for NODATASUM files
      - when there are unpaired (and unrepairable) metadata during
        replace, exit early with error and don't try to finish whole
        operation

   - scrub ioctl properly rejects unknown flags

   - fix global block reserve calculations

   - fix partial direct io write when there's a page fault in the
     middle, iomap will try to continue with partial request but the
     btrfs part did not match that, this can lead to zeros written
     instead of data

  Core changes:

   - io path:
      - continued cleanups and refactoring around bio handling
      - extent io submit path simplifications and cleanups
      - flush write path simplifications and cleanups
      - rework logic of passing sync mode of bio, with further cleanups

   - rewrite scrub code flow, restructure how the stripes are enumerated
     and verified in a more unified way

   - allow to set lower threshold for block group reclaim in debug mode
     to aid zoned mode testing

   - remove obsolete time-based delayed ref throttling logic when
     truncating items

   - DREW locks are not using percpu variables anymore

   - more warning fixes (-Wmaybe-uninitialized)

   - u64 division simplifications

   - error handling improvements

  Non-btrfs code changes:

   - push cgroup punt bio logic to btrfs code (there was no other user
     of that), the functionality can be now selected separately by
     BLK_CGROUP_PUNT_BIO

   - crc32c_impl removed after removing last uses in btrfs code

   - add btrfs_assertfail() to objtool table"

* tag 'for-6.4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits)
  btrfs: mark btrfs_assertfail() __noreturn
  btrfs: fix uninitialized variable warnings
  btrfs: use log root when iterating over index keys when logging directory
  btrfs: avoid iterating over all indexes when logging directory
  btrfs: dev-replace: error out if we have unrepaired metadata error during
  btrfs: remove pointless loop at btrfs_get_next_valid_item()
  btrfs: scrub: reject unsupported scrub flags
  btrfs: reinterpret async discard iops_limit=0 as no delay
  btrfs: set default discard iops_limit to 1000
  btrfs: remove unused raid56 functions which were dedicated for scrub
  btrfs: scrub: remove scrub_bio structure
  btrfs: scrub: remove scrub_block and scrub_sector structures
  btrfs: scrub: remove the old scrub recheck code
  btrfs: scrub: remove the old writeback infrastructure
  btrfs: scrub: remove scrub_parity structure
  btrfs: scrub: use scrub_stripe to implement RAID56 P/Q scrub
  btrfs: scrub: switch scrub_simple_mirror() to scrub_stripe infrastructure
  btrfs: scrub: introduce helper to queue a stripe for scrub
  btrfs: scrub: introduce error reporting functionality for scrub_stripe
  btrfs: scrub: introduce a writeback helper for scrub_stripe
  ...
</content>
</entry>
<entry>
<title>Merge tag 'docs-6.4' of git://git.lwn.net/linux</title>
<updated>2023-04-24T19:35:49Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-24T19:35:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c23f28975abc2eb02cecc8bc1f2c95473a59ed2e'/>
<id>urn:sha1:c23f28975abc2eb02cecc8bc1f2c95473a59ed2e</id>
<content type='text'>
Pull documentation updates from Jonathan Corbet:
 "Commit volume in documentation is relatively low this time, but there
  is still a fair amount going on, including:

   - Reorganize the architecture-specific documentation under
     Documentation/arch

     This makes the structure match the source directory and helps to
     clean up the mess that is the top-level Documentation directory a
     bit. This work creates the new directory and moves x86 and most of
     the less-active architectures there.

     The current plan is to move the rest of the architectures in 6.5,
     with the patches going through the appropriate subsystem trees.

   - Some more Spanish translations and maintenance of the Italian
     translation

   - A new "Kernel contribution maturity model" document from Ted

   - A new tutorial on quickly building a trimmed kernel from Thorsten

  Plus the usual set of updates and fixes"

* tag 'docs-6.4' of git://git.lwn.net/linux: (47 commits)
  media: Adjust column width for pdfdocs
  media: Fix building pdfdocs
  docs: clk: add documentation to log which clocks have been disabled
  docs: trace: Fix typo in ftrace.rst
  Documentation/process: always CC responsible lists
  docs: kmemleak: adjust to config renaming
  ELF: document some de-facto PT_* ABI quirks
  Documentation: arm: remove stih415/stih416 related entries
  docs: turn off "smart quotes" in the HTML build
  Documentation: firmware: Clarify firmware path usage
  docs/mm: Physical Memory: Fix grammar
  Documentation: Add document for false sharing
  dma-api-howto: typo fix
  docs: move m68k architecture documentation under Documentation/arch/
  docs: move parisc documentation under Documentation/arch/
  docs: move ia64 architecture docs under Documentation/arch/
  docs: Move arc architecture docs under Documentation/arch/
  docs: move nios2 documentation under Documentation/arch/
  docs: move openrisc documentation under Documentation/arch/
  docs: move superh documentation under Documentation/arch/
  ...
</content>
</entry>
<entry>
<title>x86: improve on the non-rep 'copy_user' function</title>
<updated>2023-04-19T00:05:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-17T01:23:06Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=427fda2c8a4977d9dbd9bc108bbe6e21ec84648d'/>
<id>urn:sha1:427fda2c8a4977d9dbd9bc108bbe6e21ec84648d</id>
<content type='text'>
The old 'copy_user_generic_unrolled' function was oddly implemented for
largely historical reasons: it had been largely based on the uncached
copy case, which has some other concerns.

For example, the __copy_user_nocache() function uses 'movnti' for the
destination stores, and those want the destination to be aligned.  In
contrast, the regular copy function doesn't really care, and trying to
align things only complicates matters.

Also, like the clear_user function, the copy function had some odd
handling of the repeat counts, complicating the exception handling for
no really good reason.  So as with clear_user, just write it to keep all
the byte counts in the %rcx register, exactly like the 'rep movs'
functionality that this replaces.

Unlike a real 'rep movs', we do allow for this to trash a few temporary
registers to not have to unnecessarily save/restore registers on the
stack.

And like the clearing case, rename this to what it now clearly is:
'rep_movs_alternative', and make it one coherent function, so that it
shows up as such in profiles (instead of the odd split between
"copy_user_generic_unrolled" and "copy_user_short_string", the latter of
which was not about strings at all, and which was shared with the
uncached case).

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86: improve on the non-rep 'clear_user' function</title>
<updated>2023-04-19T00:05:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-16T21:06:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8c9b6a88b7e2f33c656cd667a081bfd4dc8f5005'/>
<id>urn:sha1:8c9b6a88b7e2f33c656cd667a081bfd4dc8f5005</id>
<content type='text'>
The old version was oddly written to have the repeat count in multiple
registers.  So instead of taking advantage of %rax being zero, it had
some sub-counts in it.  All just for a "single word clearing" loop,
which isn't even efficient to begin with.

So get rid of those games, and just keep all the state in the same
registers we got it in (and that we should return things in).  That not
only makes this act much more like 'rep stos' (which this function is
replacing), but makes it much easier to actually do the obvious loop
unrolling.

Also rename the function from the now nonsensical 'clear_user_original'
to what it now clearly is: 'rep_stos_alternative'.

End result: if we don't have a fast 'rep stosb', at least we can have a
fast fallback for it.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86: inline the 'rep movs' in user copies for the FSRM case</title>
<updated>2023-04-19T00:05:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-16T02:31:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=577e6a7fd50d519c201d20968b6a027a6563dc4c'/>
<id>urn:sha1:577e6a7fd50d519c201d20968b6a027a6563dc4c</id>
<content type='text'>
This does the same thing for the user copies as commit 0db7058e8e23
("x86/clear_user: Make it faster") did for clear_user().  In other
words, it inlines the "rep movs" case when X86_FEATURE_FSRM is set,
avoiding the function call entirely.

In order to do that, it makes the calling convention for the out-of-line
case ("copy_user_generic_unrolled") match the 'rep movs' calling
convention, although it does also end up clobbering a number of
additional registers.

Also, to simplify code sharing in the low-level assembly with the
__copy_user_nocache() function (that uses the normal C calling
convention), we end up with a kind of mixed return value for the
low-level asm code: it will return the result in both %rcx (to work as
an alternative for the 'rep movs' case), _and_ in %rax (for the nocache
case).

We could avoid this by wrapping __copy_user_nocache() callers in an
inline asm, but since the cost is just an extra register copy, it's
probably not worth it.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86: move stac/clac from user copy routines into callers</title>
<updated>2023-04-19T00:05:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-15T20:39:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3639a535587d7aac449cdce9710dfdc97a3c8c8e'/>
<id>urn:sha1:3639a535587d7aac449cdce9710dfdc97a3c8c8e</id>
<content type='text'>
This is preparatory work for inlining the 'rep movs' case, but also a
cleanup.  The __copy_user_nocache() function was mis-used by the rdma
code to do uncached kernel copies that don't actually want user copies
at all, and as a result doesn't want the stac/clac either.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>x86: don't use REP_GOOD or ERMS for user memory clearing</title>
<updated>2023-04-19T00:05:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-04-15T20:22:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d2c95f9d6802cc518d71d9795f4d9da54fb4e24d'/>
<id>urn:sha1:d2c95f9d6802cc518d71d9795f4d9da54fb4e24d</id>
<content type='text'>
The modern target to use is FSRS (Fast Short REP STOS), and the other
cases should only be used for bigger areas (ie mainly things like page
clearing).

Note! This changes the conditional for the inlining from FSRM ("fast
short rep movs") to FSRS ("fast short rep stos").

We'll have a separate fixup for AMD microarchitectures that have a good
'rep stosb' yet do not set the new Intel-specific FSRS bit (because FSRM
was there first).

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>btrfs: mark btrfs_assertfail() __noreturn</title>
<updated>2023-04-17T17:52:19Z</updated>
<author>
<name>Josh Poimboeuf</name>
<email>jpoimboe@kernel.org</email>
</author>
<published>2023-04-12T23:49:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f372463124df5f980de6ee0cd6000a3e43df0e01'/>
<id>urn:sha1:f372463124df5f980de6ee0cd6000a3e43df0e01</id>
<content type='text'>
Fixes a bunch of warnings including:

  vmlinux.o: warning: objtool: select_reloc_root+0x314: unreachable instruction
  vmlinux.o: warning: objtool: finish_inode_if_needed+0x15b1: unreachable instruction
  vmlinux.o: warning: objtool: get_bio_sector_nr+0x259: unreachable instruction
  vmlinux.o: warning: objtool: raid_wait_read_end_io+0xc26: unreachable instruction
  vmlinux.o: warning: objtool: raid56_parity_alloc_scrub_rbio+0x37b: unreachable instruction
  ...

Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Link: https://lore.kernel.org/oe-kbuild-all/202302210709.IlXfgMpX-lkp@intel.com/
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Reviewed-by: David Sterba &lt;dsterba@suse.com&gt;
Signed-off-by: David Sterba &lt;dsterba@suse.com&gt;
</content>
</entry>
<entry>
<title>x86/hyperv: Mark hv_ghcb_terminate() as noreturn</title>
<updated>2023-04-14T15:31:28Z</updated>
<author>
<name>Guilherme G. Piccoli</name>
<email>gpiccoli@igalia.com</email>
</author>
<published>2023-04-12T23:49:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=611d4c716db0141cfc436994dc5aff1d69c924ad'/>
<id>urn:sha1:611d4c716db0141cfc436994dc5aff1d69c924ad</id>
<content type='text'>
Annotate the function prototype and definition as noreturn to prevent
objtool warnings like:

vmlinux.o: warning: objtool: hyperv_init+0x55c: unreachable instruction

Also, as per Josh's suggestion, add it to the global_noreturns list.
As a comparison, an objdump output without the annotation:

[...]
1b63:  mov    $0x1,%esi
1b68:  xor    %edi,%edi
1b6a:  callq  ffffffff8102f680 &lt;hv_ghcb_terminate&gt;
1b6f:  jmpq   ffffffff82f217ec &lt;hyperv_init+0x9c&gt; # unreachable
1b74:  cmpq   $0xffffffffffffffff,-0x702a24(%rip)
[...]

Now, after adding the __noreturn to the function prototype:

[...]
17df:  callq  ffffffff8102f6d0 &lt;hv_ghcb_negotiate_protocol&gt;
17e4:  test   %al,%al
17e6:  je     ffffffff82f21bb9 &lt;hyperv_init+0x469&gt;
[...]  &lt;many insns&gt;
1bb9:  mov    $0x1,%esi
1bbe:  xor    %edi,%edi
1bc0:  callq  ffffffff8102f680 &lt;hv_ghcb_terminate&gt;
1bc5:  nopw   %cs:0x0(%rax,%rax,1) # end of function

Reported-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Guilherme G. Piccoli &lt;gpiccoli@igalia.com&gt;
Signed-off-by: Josh Poimboeuf &lt;jpoimboe@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Michael Kelley &lt;mikelley@microsoft.com&gt;
Link: https://lore.kernel.org/r/32453a703dfcf0d007b473c9acbf70718222b74b.1681342859.git.jpoimboe@kernel.org
</content>
</entry>
</feed>
