<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/tools/perf, branch v5.0</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.0</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.0'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-02-05T13:31:08Z</updated>
<entry>
<title>perf script python: Add Python3 support to tests/attr.py</title>
<updated>2019-02-05T13:31:08Z</updated>
<author>
<name>Tony Jones</name>
<email>tonyj@suse.de</email>
</author>
<published>2019-01-24T00:52:29Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8f2f350cbdb2c2fbff654cb778139144b48a59ba'/>
<id>urn:sha1:8f2f350cbdb2c2fbff654cb778139144b48a59ba</id>
<content type='text'>
Support both Python 2 and Python 3 in tests/attr.py

The use of "except as" syntax implies the minimum supported Python2 version is
now v2.6

Committer testing:

  $ make -C tools/perf PYTHON3=python install-bin

Before:

  # perf test attr
  16: Setup struct perf_event_attr                          : FAILED!
  48: Synthesize attr update                                : Ok
  [root@quaco ~]# perf test -v attr
  16: Setup struct perf_event_attr                          :
  --- start ---
  test child forked, pid 3121
    File "/home/acme/libexec/perf-core/tests/attr.py", line 324
      except Unsup, obj:
                ^
  SyntaxError: invalid syntax
  test child finished with -1
  ---- end ----
  Setup struct perf_event_attr: FAILED!
  48: Synthesize attr update                                :
  --- start ---
  test child forked, pid 3124
  test child finished with 0
  ---- end ----
  Synthesize attr update: Ok
  #

After:

   # perf test attr
  16: Setup struct perf_event_attr                          : Ok
  48: Synthesize attr update                                : Ok
  #

Signed-off-by: Tony Jones &lt;tonyj@suse.de&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Cc: Seeteena Thoufeek &lt;s1seetee@linux.vnet.ibm.com&gt;
Link: http://lkml.kernel.org/r/20190124005229.16146-7-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf trace: Support multiple "vfs_getname" probes</title>
<updated>2019-02-04T18:50:38Z</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-01-29T14:12:34Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6ab3bc240ade47a0f52bc16d97edd9accbe0024e'/>
<id>urn:sha1:6ab3bc240ade47a0f52bc16d97edd9accbe0024e</id>
<content type='text'>
With a suitably defined "probe:vfs_getname" probe, 'perf trace' can
"beautify" its output, so syscalls like open() or openat() can print the
"filename" argument instead of just its hex address, like:

  $ perf trace -e open -- touch /dev/null
  [...]
       0.590 ( 0.014 ms): touch/18063 open(filename: /dev/null, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3
  [...]

The output without such beautifier looks like:

     0.529 ( 0.011 ms): touch/18075 open(filename: 0xc78cf288, flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3

However, when the vfs_getname probe expands to multiple probes and it is
not the first one that is hit, the beautifier fails, as following:

     0.326 ( 0.010 ms): touch/18072 open(filename: , flags: CREAT|NOCTTY|NONBLOCK|WRONLY, mode: IRUGO|IWUGO) = 3

Fix it by hooking into all the expanded probes (inlines), now, for instance:

  [root@quaco ~]# perf probe -l
    probe:vfs_getname    (on getname_flags:73@fs/namei.c with pathname)
    probe:vfs_getname_1  (on getname_flags:73@fs/namei.c with pathname)
  [root@quaco ~]# perf trace -e open* sleep 1
       0.010 ( 0.005 ms): sleep/5588 openat(dfd: CWD, filename: /etc/ld.so.cache, flags: RDONLY|CLOEXEC)   = 3
       0.029 ( 0.006 ms): sleep/5588 openat(dfd: CWD, filename: /lib64/libc.so.6, flags: RDONLY|CLOEXEC)   = 3
       0.194 ( 0.008 ms): sleep/5588 openat(dfd: CWD, filename: /usr/lib/locale/locale-archive, flags: RDONLY|CLOEXEC) = 3
  [root@quaco ~]#

Works, further verified with:

  [root@quaco ~]# perf test vfs
  65: Use vfs_getname probe to get syscall args filenames   : Ok
  66: Add vfs_getname probe to get syscall args filenames   : Ok
  67: Check open filename arg using perf trace + vfs_getname: Ok
  [root@quaco ~]#

Reported-by: Michael Petlan &lt;mpetlan@redhat.com&gt;
Tested-by: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/n/tip-mv8kolk17xla1smvmp3qabv1@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf symbols: Filter out hidden symbols from labels</title>
<updated>2019-02-04T18:50:38Z</updated>
<author>
<name>Jiri Olsa</name>
<email>jolsa@redhat.com</email>
</author>
<published>2019-01-28T13:35:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=59a17706915fe5ea6f711e1f92d4fb706bce07fe'/>
<id>urn:sha1:59a17706915fe5ea6f711e1f92d4fb706bce07fe</id>
<content type='text'>
When perf is built with the annobin plugin (RHEL8 build) extra symbols
are added to its binary:

  # nm perf | grep annobin | head -10
  0000000000241100 t .annobin_annotate.c
  0000000000326490 t .annobin_annotate.c
  0000000000249255 t .annobin_annotate.c_end
  00000000003283a8 t .annobin_annotate.c_end
  00000000001bce18 t .annobin_annotate.c_end.hot
  00000000001bce18 t .annobin_annotate.c_end.hot
  00000000001bc3e2 t .annobin_annotate.c_end.unlikely
  00000000001bc400 t .annobin_annotate.c_end.unlikely
  00000000001bce18 t .annobin_annotate.c.hot
  00000000001bce18 t .annobin_annotate.c.hot
  ...

Those symbols have no use for report or annotation and should be
skipped.  Moreover they interfere with the DWARF unwind test on the PPC
arch, where they are mixed with checked symbols and then the test fails:

  # perf test dwarf -v
  59: Test dwarf unwind                                     :
  --- start ---
  test child forked, pid 8515
  unwind: .annobin_dwarf_unwind.c:ip = 0x10dba40dc (0x2740dc)
  ...
  got: .annobin_dwarf_unwind.c 0x10dba40dc, expecting test__arch_unwind_sample
  unwind: failed with 'no error'

The annobin symbols are defined as NOTYPE/LOCAL/HIDDEN:

  # readelf -s ./perf | grep annobin | head -1
    40: 00000000001bce4f     0 NOTYPE  LOCAL  HIDDEN    13 .annobin_init.c

They can still pass the check for the label symbol. Adding check for
HIDDEN and INTERNAL (as suggested by Nick below) visibility and filter
out such symbols.

&gt;   Just to be awkward, if you are going to ignore STV_HIDDEN
&gt;   symbols then you should probably also ignore STV_INTERNAL ones
&gt;   as well...  Annobin does not generate them, but you never know,
&gt;   one day some other tool might create some.

Signed-off-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nick Clifton &lt;nickc@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20190128133526.GD15461@krava
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf symbols: Add fallback definitions for GELF_ST_VISIBILITY()</title>
<updated>2019-02-04T18:50:37Z</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-02-04T18:48:03Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=843cf70ed29a7fb51f1e796c1d6e1ba3620250ac'/>
<id>urn:sha1:843cf70ed29a7fb51f1e796c1d6e1ba3620250ac</id>
<content type='text'>
Those aren't present in Alpine Linux 3.4 to edge, so provide fallback
defines to get the next patch building there keeping the build
bisectable.

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: Michael Petlan &lt;mpetlan@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Nick Clifton &lt;nickc@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/n/tip-03cg3gya2ju4ba2x6ibb9fuz@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf clang: Do not use 'return std::move(something)'</title>
<updated>2019-02-04T14:32:34Z</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@redhat.com</email>
</author>
<published>2019-02-04T14:04:20Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d34cecfb6b2bdc35713180ba4fcfd912a2f3e9bf'/>
<id>urn:sha1:d34cecfb6b2bdc35713180ba4fcfd912a2f3e9bf</id>
<content type='text'>
It prevents copy elision, generating this warning when building with
fedora:rawhide's clang:

  clang version 7.0.1 (Fedora 7.0.1-2.fc30)
  Target: x86_64-unknown-linux-gnu
  Thread model: posix
  InstalledDir: /usr/bin
  Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/9
  Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/9
  Selected GCC installation: /usr/bin/../lib/gcc/x86_64-redhat-linux/9
  Candidate multilib: .;@m64
  Candidate multilib: 32;@m32
  Selected multilib: .;@m64

  $ make -C tools/perf CC=clang LIBCLANGLLVM=1
  &lt;SNIP&gt;
  util/c++/clang.cpp: In function 'std::unique_ptr&lt;llvm::SmallVectorImpl&lt;char&gt; &gt; perf::getBPFObjectFromModule(llvm::Module*)':
  util/c++/clang.cpp:163:18: error: moving a local object in a return statement prevents copy elision [-Werror=pessimizing-move]
    163 |  return std::move(Buffer);
        |         ~~~~~~~~~^~~~~~~~
  util/c++/clang.cpp:163:18: note: remove 'std::move' call
  cc1plus: all warnings being treated as errors
  &lt;SNIP&gt;

References:

  http://www.cplusplus.com/forum/general/186411/#msg908572
  https://en.cppreference.com/w/cpp/language/return#Notes
  https://en.cppreference.com/w/cpp/language/copy_elision

Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Luis Cláudio Gonçalves &lt;lclaudio@redhat.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Wang Nan &lt;wangnan0@huawei.com&gt;
Link: https://lkml.kernel.org/n/tip-lehqf5x5q96l0o8myhb6blz6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf mem/c2c: Fix perf_mem_events to support powerpc</title>
<updated>2019-02-04T14:32:14Z</updated>
<author>
<name>Ravi Bangoria</name>
<email>ravi.bangoria@linux.ibm.com</email>
</author>
<published>2019-01-29T13:24:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f0fabf9c897327abd39018aefb5029aff8c7e133'/>
<id>urn:sha1:f0fabf9c897327abd39018aefb5029aff8c7e133</id>
<content type='text'>
PowerPC hardware does not have a builtin latency filter (--ldlat) for
the "mem-load" event and perf_mem_events by default includes
"/ldlat=30/" which is causing a failure on PowerPC. Refactor the code to
support "perf mem/c2c" on PowerPC.

This patch depends on kernel side changes done my Madhavan:
https://lists.ozlabs.org/pipermail/linuxppc-dev/2018-December/182596.html

Signed-off-by: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Dick Fowles &lt;fowles@inreach.com&gt;
Cc: Don Zickus &lt;dzickus@redhat.com&gt;
Cc: Joe Mario &lt;jmario@redhat.com&gt;
Cc: Madhavan Srinivasan &lt;maddy@linux.vnet.ibm.com&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lkml.kernel.org/r/20190129132412.771-1-ravi.bangoria@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf tests evsel-tp-sched: Fix bitwise operator</title>
<updated>2019-02-04T14:32:14Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavo@embeddedor.com</email>
</author>
<published>2019-01-22T23:34:39Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=489338a717a0dfbbd5a3fabccf172b78f0ac9015'/>
<id>urn:sha1:489338a717a0dfbbd5a3fabccf172b78f0ac9015</id>
<content type='text'>
Notice that the use of the bitwise OR operator '|' always leads to true
in this particular case, which seems a bit suspicious due to the context
in which this expression is being used.

Fix this by using bitwise AND operator '&amp;' instead.

This bug was detected with the help of Coccinelle.

Signed-off-by: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Acked-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Fixes: 6a6cd11d4e57 ("perf test: Add test for the sched tracepoint format fields")
Link: http://lkml.kernel.org/r/20190122233439.GA5868@embeddedor
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2019-02-03T16:59:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-02-03T16:59:51Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=58f6d4287af751a75b24f4d32e6e1b077146ab18'/>
<id>urn:sha1:58f6d4287af751a75b24f4d32e6e1b077146ab18</id>
<content type='text'>
Pull perf fixes from Thomas Gleixner:
 "A pile of perf updates:

   - Fix broken sanity check in the /proc/sys/kernel/perf_cpu_time_max_percent
     write handler

   - Cure a perf script crash which caused by an unitinialized data
     structure

   - Highlight the hottest instruction in perf top and not a random one

   - Cure yet another clang issue when building perf python

   - Handle topology entries with no CPU correctly in the tools

   - Handle perf data which contains both tracepoints and performance
     counter entries correctly.

   - Add a missing NULL pointer check in perf ordered_events_free()"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf script: Fix crash when processing recorded stat data
  perf top: Fix wrong hottest instruction highlighted
  perf tools: Handle TOPOLOGY headers with no CPU
  perf python: Remove -fstack-clash-protection when building with some clang versions
  perf core: Fix perf_proc_update_handler() bug
  perf script: Fix crash with printing mixed trace point and other events
  perf ordered_events: Fix crash in ordered_events__free
</content>
</entry>
<entry>
<title>perf script: Fix crash when processing recorded stat data</title>
<updated>2019-01-21T14:29:07Z</updated>
<author>
<name>Tony Jones</name>
<email>tonyj@suse.de</email>
</author>
<published>2019-01-20T19:14:14Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8bf8c6da53c2265aea365a1de6038f118f522113'/>
<id>urn:sha1:8bf8c6da53c2265aea365a1de6038f118f522113</id>
<content type='text'>
While updating perf to work with Python3 and Python2 I noticed that the
stat-cpi script was dumping core.

$ perf  stat -e cycles,instructions record -o /tmp/perf.data /bin/false

 Performance counter stats for '/bin/false':

           802,148      cycles

           604,622      instructions                                                       802,148      cycles
           604,622      instructions

       0.001445842 seconds time elapsed

$ perf script -i /tmp/perf.data -s scripts/python/stat-cpi.py
Segmentation fault (core dumped)
...
...
    rblist=rblist@entry=0xb2a200 &lt;rt_stat&gt;,
    new_entry=new_entry@entry=0x7ffcb755c310) at util/rblist.c:33
    ctx=&lt;optimized out&gt;, type=&lt;optimized out&gt;, create=&lt;optimized out&gt;,
    cpu=&lt;optimized out&gt;, evsel=&lt;optimized out&gt;) at util/stat-shadow.c:118
    ctx=&lt;optimized out&gt;, type=&lt;optimized out&gt;, st=&lt;optimized out&gt;)
    at util/stat-shadow.c:196
    count=count@entry=727442, cpu=cpu@entry=0, st=0xb2a200 &lt;rt_stat&gt;)
    at util/stat-shadow.c:239
    config=config@entry=0xafeb40 &lt;stat_config&gt;,
    counter=counter@entry=0x133c6e0) at util/stat.c:372
...
...

The issue is that since 1fcd03946b52 perf_stat__update_shadow_stats now calls
update_runtime_stat passing rt_stat rather than calling update_stats but
perf_stat__init_shadow_stats has never been called to initialize rt_stat in
the script path processing recorded stat data.

Since I can't see any reason why perf_stat__init_shadow_stats() is presently
initialized like it is in builtin-script.c::perf_sample__fprint_metric()
[4bd1bef8bba2f] I'm proposing it instead be initialized once in __cmd_script

Committer testing:

After applying the patch:

  # perf script -i /tmp/perf.data -s tools/perf/scripts/python/stat-cpi.py
       0.001970: cpu -1, thread -1 -&gt; cpi 1.709079 (1075684/629394)
  #

No segfault.

Signed-off-by: Tony Jones &lt;tonyj@suse.de&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Tested-by: Ravi Bangoria &lt;ravi.bangoria@linux.ibm.com&gt;
Cc: Andi Kleen &lt;ak@linux.intel.com&gt;
Cc: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Fixes: 1fcd03946b52 ("perf stat: Update per-thread shadow stats")
Link: http://lkml.kernel.org/r/20190120191414.12925-1-tonyj@suse.de
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
<entry>
<title>perf top: Fix wrong hottest instruction highlighted</title>
<updated>2019-01-21T14:29:07Z</updated>
<author>
<name>He Kuang</name>
<email>hekuang@huawei.com</email>
</author>
<published>2019-01-20T16:05:22Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=da06d568386877809532e8ec678f4a5e300f0951'/>
<id>urn:sha1:da06d568386877809532e8ec678f4a5e300f0951</id>
<content type='text'>
The annotation line percentage is compared and inserted into the rbtree,
but the percent field of 'struct annotation_data' is an array, the
comparison result between them is the address difference.

This patch compares the right slot of percent array according to
opts-&gt;percent_type and makes things right.

The problem can be reproduced by pressing 'H' in perf top annotation view.
It should highlight the instruction line which has the highest sampling
percentage.

Signed-off-by: He Kuang &lt;hekuang@huawei.com&gt;
Reviewed-by: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/20190120160523.4391-1-hekuang@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
</content>
</entry>
</feed>
