<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/tools/perf, branch v5.19</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.19</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.19'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2022-07-27T14:19:39Z</updated>
<entry>
<title>perf bpf: Remove undefined behavior from bpf_perf_object__next()</title>
<updated>2022-07-27T14:19:39Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2022-07-26T22:09:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9a241805673ec0a826b7ddf84b00f4e03adb0a5e'/>
<id>urn:sha1:9a241805673ec0a826b7ddf84b00f4e03adb0a5e</id>
<content type='text'>
bpf_perf_object__next() folded the last element in the list test with the
empty list test. However, this meant that offsets were computed against
null and that a struct list_head was compared against a 'struct
bpf_perf_object'.

Working around this with clang's undefined behavior sanitizer required
-fno-sanitize=null and -fno-sanitize=object-size.

Remove the undefined behavior by using the regular Linux list APIs and
handling the starting case separately from the end testing case.

Looking at uses like bpf_perf_object__for_each(), as the constant NULL
or non-NULL argument can be constant propagated, the code is no less
efficient.

Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Andrii Nakryiko &lt;andrii@kernel.org&gt;
Cc: Christy Lee &lt;christylee@fb.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Miaoqian Lin &lt;linmq006@gmail.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nathan Chancellor &lt;nathan@kernel.org&gt;
Cc: Nick Desaulniers &lt;ndesaulniers@google.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Tom Rix &lt;trix@redhat.com&gt;
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20220726220921.2567761-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf symbol: Skip symbols if SHF_ALLOC flag is not set</title>
<updated>2022-07-27T14:17:50Z</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@linaro.org</email>
</author>
<published>2022-07-24T06:00:13Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=882528d2e77687c3ef26abb9c490f77a9c1f6e1a'/>
<id>urn:sha1:882528d2e77687c3ef26abb9c490f77a9c1f6e1a</id>
<content type='text'>
Some symbols are observed with the 'st_value' field zeroed.  E.g.
libc.so.6 in Ubuntu contains a symbol '__evoke_link_warning_getwd' which
resides in the '.gnu.warning.getwd' section.

Unlike normal sections, such kind of sections are used for linker
warning when a file calls deprecated functions, but they are not part of
memory images, the symbols in these sections should be dropped.

This patch checks the section attribute SHF_ALLOC bit, if the bit is not
set, it skips symbols to avoid spurious ones.

Suggested-by: Fangrui Song &lt;maskray@google.com&gt;
Signed-off-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Chang Rui &lt;changruinj@gmail.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220724060013.171050-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf symbol: Correct address for bss symbols</title>
<updated>2022-07-27T14:17:50Z</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@linaro.org</email>
</author>
<published>2022-07-24T06:00:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2d86612aacb7805f72873691a2644d7279ed0630'/>
<id>urn:sha1:2d86612aacb7805f72873691a2644d7279ed0630</id>
<content type='text'>
When using 'perf mem' and 'perf c2c', an issue is observed that tool
reports the wrong offset for global data symbols.  This is a common
issue on both x86 and Arm64 platforms.

Let's see an example, for a test program, below is the disassembly for
its .bss section which is dumped with objdump:

  ...

  Disassembly of section .bss:

  0000000000004040 &lt;completed.0&gt;:
  	...

  0000000000004080 &lt;buf1&gt;:
  	...

  00000000000040c0 &lt;buf2&gt;:
  	...

  0000000000004100 &lt;thread&gt;:
  	...

First we used 'perf mem record' to run the test program and then used
'perf --debug verbose=4 mem report' to observe what's the symbol info
for 'buf1' and 'buf2' structures.

  # ./perf mem record -e ldlat-loads,ldlat-stores -- false_sharing.exe 8
  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf2 0x30a8-0x30e8
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 sh_addr: 0x4040 sh_offset: 0x3028
    symbol__new: buf1 0x3068-0x30a8
    ...

The perf tool relies on libelf to parse symbols, in executable and
shared object files, 'st_value' holds a virtual address; 'sh_addr' is
the address at which section's first byte should reside in memory, and
'sh_offset' is the byte offset from the beginning of the file to the
first byte in the section.  The perf tool uses below formula to convert
a symbol's memory address to a file address:

  file_address = st_value - sh_addr + sh_offset
                    ^
                    ` Memory address

We can see the final adjusted address ranges for buf1 and buf2 are
[0x30a8-0x30e8) and [0x3068-0x30a8) respectively, apparently this is
incorrect, in the code, the structure for 'buf1' and 'buf2' specifies
compiler attribute with 64-byte alignment.

The problem happens for 'sh_offset', libelf returns it as 0x3028 which
is not 64-byte aligned, combining with disassembly, it's likely libelf
doesn't respect the alignment for .bss section, therefore, it doesn't
return the aligned value for 'sh_offset'.

Suggested by Fangrui Song, ELF file contains program header which
contains PT_LOAD segments, the fields p_vaddr and p_offset in PT_LOAD
segments contain the execution info.  A better choice for converting
memory address to file address is using the formula:

  file_address = st_value - p_vaddr + p_offset

This patch introduces elf_read_program_header() which returns the
program header based on the passed 'st_value', then it uses the formula
above to calculate the symbol file address; and the debugging log is
updated respectively.

After applying the change:

  # ./perf --debug verbose=4 mem report
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x40c0 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf2 0x30c0-0x3100
    ...
    dso__load_sym_internal: adjusting symbol: st_value: 0x4080 p_vaddr: 0x3d28 p_offset: 0x2d28
    symbol__new: buf1 0x3080-0x30c0
    ...

Fixes: f17e04afaff84b5c ("perf report: Fix ELF symbol parsing")
Reported-by: Chang Rui &lt;changruinj@gmail.com&gt;
Suggested-by: Fangrui Song &lt;maskray@google.com&gt;
Signed-off-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220724060013.171050-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf scripts python: Let script to be python2 compliant</title>
<updated>2022-07-27T14:17:50Z</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@linaro.org</email>
</author>
<published>2022-07-25T10:42:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=b226521923aee7051f4b24df9be5bf07d53f0a2b'/>
<id>urn:sha1:b226521923aee7051f4b24df9be5bf07d53f0a2b</id>
<content type='text'>
The mainline kernel can be used for relative old distros, e.g. RHEL 7.
The distro doesn't upgrade from python2 to python3, this causes the
building error that the python script is not python2 compliant.

To fix the building failure, this patch changes from the python f-string
format to traditional string format.

Fixes: 12fdd6c009da0d02 ("perf scripts python: Support Arm CoreSight trace data disassembly")
Reported-by: Akemi Yagi &lt;toracat@elrepo.org&gt;
Signed-off-by: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: ElRepo &lt;contact@elrepo.org&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Leo Yan &lt;leo.yan@linaro.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220725104220.1106663-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf trace: Fix SIGSEGV when processing syscall args</title>
<updated>2022-07-17T13:59:52Z</updated>
<author>
<name>Naveen N. Rao</name>
<email>naveen.n.rao@linux.vnet.ibm.com</email>
</author>
<published>2022-07-07T09:09:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=4b335e1e0d6f8fa91dac615a44b123c9f26e93d3'/>
<id>urn:sha1:4b335e1e0d6f8fa91dac615a44b123c9f26e93d3</id>
<content type='text'>
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf.data file created with 'perf trace record -p':

  #0  0x00000001225b8988 in syscall_arg__scnprintf_augmented_string &lt;snip&gt; at builtin-trace.c:1492
  #1  syscall_arg__scnprintf_filename &lt;snip&gt; at builtin-trace.c:1492
  #2  syscall_arg__scnprintf_filename &lt;snip&gt; at builtin-trace.c:1486
  #3  0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val &lt;snip&gt; at builtin-trace.c:1973
  #4  syscall__scnprintf_args &lt;snip&gt; at builtin-trace.c:2041
  #5  0x00000001225bff04 in trace__sys_enter &lt;snip&gt; at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:
	/*
	 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
	 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
	 * this breaks syscall__augmented_args() check for augmented args, as we calculate
	 * syscall-&gt;args_size using each syscalls:sys_enter_NAME tracefs format file,
	 * so when handling, say the openat syscall, we end up getting 6 args for the
	 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
	 * thinking that the extra 2 u64 args are the augmented filename, so just check
	 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
	 */
	if (evsel != trace-&gt;syscalls.events.sys_enter)
		augmented_args = syscall__augmented_args(sc, sample, &amp;augmented_args_size, trace-&gt;raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args
for raw_syscalls. However, when processing a perf.data file, we are not
initializing those properly. Fix the same.

Reported-by: Claudio Carvalho &lt;cclaudio@linux.ibm.com&gt;
Signed-off-by: Naveen N. Rao &lt;naveen.n.rao@linux.vnet.ibm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf tests: Fix Convert perf time to TSC test for hybrid</title>
<updated>2022-07-17T13:57:07Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-07-13T12:34:59Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=deb44a6249f696106645c63c0603eab08a6122af'/>
<id>urn:sha1:deb44a6249f696106645c63c0603eab08a6122af</id>
<content type='text'>
The test does not always correctly determine the number of events for
hybrids, nor allow for more than 1 evsel when parsing.

Fix by iterating the events actually created and getting the correct
evsel for the events processed.

Fixes: d9da6f70eb235110 ("perf tests: Support 'Convert perf time to TSC' test for hybrid")
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220713123459.24145-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf tests: Stop Convert perf time to TSC test opening events twice</title>
<updated>2022-07-17T13:56:58Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-07-13T12:34:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=498c7a54f169b2699104d3060604d840424f15d2'/>
<id>urn:sha1:498c7a54f169b2699104d3060604d840424f15d2</id>
<content type='text'>
Do not call evlist__open() twice.

Fixes: 5bb017d4b97a0f13 ("perf test: Fix error message for test case 71 on s390, where it is not supported")
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Link: https://lore.kernel.org/r/20220713123459.24145-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf synthetic-events: Ignore dead threads during event synthesis</title>
<updated>2022-07-02T12:22:26Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-07-01T20:54:58Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ff898552fb32d255517fb0676f9fa500664c484d'/>
<id>urn:sha1:ff898552fb32d255517fb0676f9fa500664c484d</id>
<content type='text'>
When it synthesize various task events, it scans the list of task
first and then accesses later.  There's a window threads can die
between the two and proc entries may not be available.

Instead of bailing out, we can ignore that thread and move on.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20220701205458.985106-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf synthetic-events: Don't sort the task scan result from /proc</title>
<updated>2022-07-02T12:21:41Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-07-01T20:54:57Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=363afa3aef24f5e08df6a539f5dc3aae4cddcc1a'/>
<id>urn:sha1:363afa3aef24f5e08df6a539f5dc3aae4cddcc1a</id>
<content type='text'>
It should not sort the result as procfs already returns a proper
ordering of tasks.  Actually sorting the order caused problems that it
doesn't guararantee to process the main thread first.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20220701205458.985106-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf unwind: Fix unitialized 'offset' variable on aarch64</title>
<updated>2022-07-02T12:16:52Z</updated>
<author>
<name>Ivan Babrou</name>
<email>ivan@cloudflare.com</email>
</author>
<published>2022-07-01T18:20:46Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5eb502b2e1ae1ab052cdf6bdd7615217e8517360'/>
<id>urn:sha1:5eb502b2e1ae1ab052cdf6bdd7615217e8517360</id>
<content type='text'>
Commit dc2cf4ca866f5715 ("perf unwind: Fix segbase for ld.lld linked
objects") uncovered the following issue on aarch64:

    util/unwind-libunwind-local.c: In function 'find_proc_info':
    util/unwind-libunwind-local.c:386:28: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    386 |                         if (ofs &gt; 0) {
        |                            ^
    util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
    199 |         u64 address, offset;
        |                      ^~~~~~
    util/unwind-libunwind-local.c:371:20: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    371 |                 if (ofs &lt;= 0) {
        |                    ^
    util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
    199 |         u64 address, offset;
        |                      ^~~~~~
    util/unwind-libunwind-local.c:363:20: error: 'offset' may be used uninitialized in this function [-Werror=maybe-uninitialized]
    363 |                 if (ofs &lt;= 0) {
        |                    ^
    util/unwind-libunwind-local.c:199:22: note: 'offset' was declared here
    199 |         u64 address, offset;
        |                      ^~~~~~
    In file included from util/libunwind/arm64.c:37:

Fixes: dc2cf4ca866f5715 ("perf unwind: Fix segbase for ld.lld linked objects")
Signed-off-by: Ivan Babrou &lt;ivan@cloudflare.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Fangrui Song &lt;maskray@google.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: James Clark &lt;james.clark@arm.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: kernel-team@cloudflare.com
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lore.kernel.org/lkml/20220701182046.12589-1-ivan@cloudflare.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
