path: root/tools/perf/util/Build
Age  Commit message  Author  Lines
2026-02-03  perf kvm stat: Remove use of the arch directory  Ian Rogers  -1/+2
`perf kvm stat` supports record and report options. Because the code lives in the arch directory, a report for a different machine type cannot be supported. Move the kvm-stat code out of the arch directory and into util/kvm-stat-arch following the pattern of perf-regs and dwarf-regs. Avoid duplicate symbols by renaming functions to have the architecture name within them. For global variables, wrap them in an architecture specific function. The architecture to use with `perf kvm stat` is selected by EM_HOST, i.e. no different than before the change. Later the ELF machine can be determined from the session or a header feature (i.e. EM_HOST at the time of the record). The HAVE_KVM_STAT_SUPPORT build logic and #define are now redundant, so remove them across the Makefiles and the build. Opportunistically constify architectural structs and arrays.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
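The renaming scheme in the message above (architecture name in each function, globals wrapped in an architecture specific function, selection by ELF machine) can be pictured with a minimal sketch. Every identifier below is hypothetical and none is taken from the perf sources; the table contents are placeholders, and the two machine numbers simply mirror the values in <elf.h>.

```c
#include <assert.h>
#include <stddef.h>

/* ELF machine numbers; the values mirror <elf.h> and are repeated here only
 * to keep the sketch self-contained. */
enum {
	SKETCH_EM_X86_64 = 62,
	SKETCH_EM_AARCH64 = 183,
};

/* Hypothetical stand-in for a per-architecture table entry. */
struct sketch_exit_reason {
	unsigned int code;
	const char *name;
};

/* The architecture name is part of the function name, so both accessors can
 * be linked into one binary; the const table is hidden behind the accessor. */
static const struct sketch_exit_reason *sketch_x86_exit_reasons(size_t *nr)
{
	static const struct sketch_exit_reason reasons[] = {
		{ 0, "example-x86-reason-a" },
		{ 1, "example-x86-reason-b" },
	};

	*nr = sizeof(reasons) / sizeof(reasons[0]);
	return reasons;
}

static const struct sketch_exit_reason *sketch_arm64_exit_reasons(size_t *nr)
{
	static const struct sketch_exit_reason reasons[] = {
		{ 0, "example-arm64-reason-a" },
	};

	*nr = sizeof(reasons) / sizeof(reasons[0]);
	return reasons;
}

/* Pick the table at run time from an ELF machine value instead of at build
 * time from the host architecture. Unknown machines yield NULL and *nr == 0. */
const struct sketch_exit_reason *sketch_exit_reasons(unsigned int e_machine, size_t *nr)
{
	assert(nr != NULL);

	switch (e_machine) {
	case SKETCH_EM_X86_64:
		return sketch_x86_exit_reasons(nr);
	case SKETCH_EM_AARCH64:
		return sketch_arm64_exit_reasons(nr);
	default:
		*nr = 0;
		return NULL;
	}
}
```

With this shape, passing the host's machine value reproduces the current behaviour, and a value read from a recorded session could be passed instead once that information is available.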
2026-02-03  perf capstone: Support for dlopen-ing libcapstone.so  Ian Rogers  -1/+1
If perf is built with LIBCAPSTONE_DLOPEN=1, support dlopen-ing libcapstone.so and then calling the necessary functions by looking them up using dlsym. The types come from capstone.h which means the libcapstone feature check needs to pass, and NO_CAPSTONE=1 hasn't been defined. This will cause the definition of HAVE_LIBCAPSTONE_SUPPORT. Earlier versions of this code tried to declare the necessary capstone.h constants and structs, but they weren't stable and caused breakages across libcapstone releases. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
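The dlopen/dlsym approach can be sketched as follows. This is an illustration under stated assumptions, not the code added by the patch: the function name `sketch_capstone_version` is hypothetical, the library is opened under the plain name "libcapstone.so", and the function pointer type is written out by hand here, whereas the commit takes its types from capstone.h. `cs_version` is used because it is a small capstone entry point with a simple signature.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Signature of cs_version(), written out by hand so the sketch does not need
 * capstone.h; the real code gets its types from that header. */
typedef unsigned int (*sketch_cs_version_fn)(int *major, int *minor);

/*
 * Hypothetical helper: returns 0 and fills major/minor when the library could
 * be loaded and the symbol resolved, -1 otherwise. Nothing links against
 * libcapstone at build time.
 */
int sketch_capstone_version(int *major, int *minor)
{
	sketch_cs_version_fn fn;
	void *handle;

	assert(major != NULL && minor != NULL);

	handle = dlopen("libcapstone.so", RTLD_LAZY | RTLD_LOCAL);
	if (!handle)
		return -1;	/* library not installed: report failure */

	fn = (sketch_cs_version_fn)dlsym(handle, "cs_version");
	if (!fn) {
		dlclose(handle);
		return -1;	/* library present but symbol missing */
	}

	fn(major, minor);
	dlclose(handle);
	return 0;
}
```

On older C libraries the program may need to be linked with -ldl. A caller treats -1 the same way as a build without capstone support.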
2026-01-23  perf disasm: Don't include C files from the arch directory  Ian Rogers  -0/+1
Move the arch instructions.c files into appropriately named files in annotate-arch in the util directory. Don't #include to compile the code, switch to building the files and fix up the #includes accordingly. Move powerpc specific disasm code out of disasm.c and into annotate-powerpc.c. Declarations and static removed as appropriate for the code to compile as separate compilation units. The e_machine and e_flags set up is moved to the disasm.c architectures array so that later patches can sort by them. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Guo Ren <guoren@kernel.org> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Julia Lawall <Julia.Lawall@inria.fr> Cc: Justin Stitt <justinstitt@google.com> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergei Trofimovich <slyich@gmail.com> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Suchit Karunakaran <suchitkarunakaran@gmail.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Tianyou Li <tianyou.li@intel.com> Cc: Will Deacon <will@kernel.org> Cc: Zecheng Li <zecheng@google.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-20  perf build: Remove NO_LIBDW_DWARF_UNWIND option  Ian Rogers  -2/+1
Libdw unwinding support is present for every architecture that has a perf_regs.h - perf registers are needed for the initial frame to unwind. Elfutils also supports SPARC, ARC and m68k but there is no support in the Linux kernel for perf registers on these architectures. As the perf supported DWARF unwinding architectures are a subset of the elfutils ones, remove NO_LIBDW_DWARF_UNWIND as there isn't a case of elfutils lacking the support needed for perf.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-20  perf dwarf-regs: Add util/dwarf-regs-arch for consistency with perf-regs  Ian Rogers  -3/+1
perf_regs.h has cross architecture functions for operating with the differing perf register constants. dwarf-regs.h is similar but for cross architecture dwarf notions of registers. For consistency move the arch parts of dwarf-regs out of util and into its own directory. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Chun-Tse Shao <ctshao@google.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Guo Ren <guoren@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Wielaard <mark@klomp.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <pjw@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Sergei Trofimovich <slyich@gmail.com> Cc: Shimin Guo <shimin.guo@skydio.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Will Deacon <will@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-20  perf unwind-libdw: Fix a cross-arch unwinding bug  Shimin Guo  -0/+1
The set_initial_registers field of Dwfl_Thread_Callbacks needs to be set according to the arch of the stack samples being analyzed, not the arch that perf itself is built for. Currently perf fails to unwind stack samples collected from archs different from that of the host perf is running on. This patch moves the arch-specific implementations of set_initial_registers from tools/perf/arch to tools/perf/util/unwind-libdw-arch, similar to the way the perf-regs-arch folder contains arch-specific functions related to registers, and chooses the implementation based on the arch of the data being processed.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Shimin Guo <shimin.guo@skydio.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Haibo Xu <haibo1.xu@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Wielaard <mark@klomp.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sergei Trofimovich <slyich@gmail.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-14  perf help: Move common_cmds into builtin-help  Ian Rogers  -14/+0
There's a lot of infrastructure for generating a relatively simple array used by one function. Move the array into the function and remove the supporting build logic. At the same time opportunistically const-ify the array. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13  perf util: Remove SHA-1 code  Eric Biggers  -1/+0
Now that the SHA-1 code is no longer used, remove it. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Fangrui Song <maskray@sourceware.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Pablo Galindo <pablogsal@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-13  perf util: Add BLAKE2s support  Eric Biggers  -0/+1
Add BLAKE2s support to the perf utility library. The code is borrowed from the kernel. This will replace the use of SHA-1 in genelf.c. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Fangrui Song <maskray@sourceware.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jason A. Donenfeld <Jason@zx2c4.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Pablo Galindo <pablogsal@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2026-01-12  perf addr2line: Add a libdw implementation  Ian Rogers  -0/+1
Add an implementation of addr2line that uses libdw. Other addr2line implementations are slow, particularly in the case of forking addr2line. Add an implementation that caches the libdw information in the dso and uses it to find the file and line number information. Inline information is supported, but because cu_walk_functions_at visits the leaf function last, add an inline_list__append_tail to reverse the list's order.

Committer testing:

  # perf probe -x ~/bin/perf libdw__addr2line
  Added new event:
    probe_perf:libdw_addr2line (on libdw__addr2line in /home/acme/bin/perf)

  You can now use it in all perf tools, such as:

          perf record -e probe_perf:libdw_addr2line -aR sleep 1

  #
  # perf stat -e probe_perf:libdw_addr2line perf report -f --dso perf --stdio -s srcfile,srcline
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 4K of event 'cpu/cycles/Pu'
  # Event count (approx.): 5535180842
  #
  # Overhead  Source File   Source:Line
  # ........  ............  ...............
  #
      99.04%  inlineloop.c  inlineloop.c:21
       0.46%  inlineloop.c  inlineloop.c:20
  #
  # (Tip: For tracepoint events, try: perf report -s trace_fields)

   Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':

                  44      probe_perf:libdw_addr2line

         0.037260744 seconds time elapsed

         0.025299000 seconds user
         0.011918000 seconds sys

  #

Adding probes to the other addr2line implementations (llvm__addr2line, libbfd__addr2line and cmd__addr2line) I noticed some fallbacks to the llvm one:

   Performance counter stats for 'perf report -f --dso perf --stdio -s srcfile,srcline':

                  44      probe_perf:libdw_addr2line
                  23      probe_perf:llvm_addr2line
                   0      probe_perf:libbfd_addr2line
                   0      probe_perf:cmd_addr2line

Something to investigate further, but at least we don't fall back to the cmd based one :-)

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Tony Jones <tonyj@suse.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-11-13  perf build: Remove NO_AUXTRACE build option  Ian Rogers  -12/+12
The NO_AUXTRACE build option was used when the __get_cpuid feature test failed or if it was provided on the command line. The option no longer avoids a dependency on a library and so having the option is just adding complexity to the code base. Remove the option CONFIG_AUXTRACE from Build files and HAVE_AUXTRACE_SUPPORT by assuming it is always defined. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-10-06  perf srcline: Fallback between addr2line implementations  Ian Rogers  -0/+1
Factor the addr2line function implementation into separate source files (addr2line.[ch]) and rename the addr2line function cmd__addr2line. In srcline replace the ifdef-ed addr2line implementations with one that first tries the llvm__addr2line implementation, then the deprecated libbfd__addr2line function and on failure uses cmd__addr2line. If HAVE_LIBLLVM_SUPPORT is enabled the llvm__addr2line will execute against the libLLVM.so it is linked against. If HAVE_LIBLLVM_DYNAMIC is enabled then libperf-llvm.so (that links against libLLVM.so) will be dlopened. If the dlopen succeeds then the behavior should match HAVE_LIBLLVM_SUPPORT. On failure cmd__addr2line is used. The dlopen is only tried once. If HAVE_LIBLLVM_DYNAMIC isn't enabled then llvm__addr2line immediately fails and cmd__addr2line is used. Clean up the dso__free_a2l logic, which is only needed in the non-LLVM version and moved to addr2line.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. 
David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
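The fallback order described in this message (llvm first, then libbfd, then the forked command) and the "dlopen is only tried once" rule can be sketched as below. All names are hypothetical stand-ins rather than the perf functions, the result structure is invented for the example, and the backends are simulated: the first two always fail and the last one always succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result of a lookup; the real code fills richer structures. */
struct sketch_srcline {
	const char *file;
	unsigned int line;
};

/* Shared signature for this sketch: 0 on success, -1 on failure. */
typedef int (*sketch_a2l_fn)(unsigned long addr, struct sketch_srcline *res);

static int sketch_llvm_load_count;	/* counts simulated dlopen() attempts */

/* Stand-in for the llvm backend: load once, remember the outcome. */
static int sketch_llvm_a2l(unsigned long addr, struct sketch_srcline *res)
{
	static bool tried, available;

	(void)addr;
	(void)res;
	if (!tried) {
		tried = true;
		sketch_llvm_load_count++;
		available = false;	/* simulate a failed dlopen() */
	}
	return available ? 0 : -1;
}

/* Stand-in for the deprecated libbfd backend: simulated as not built in. */
static int sketch_libbfd_a2l(unsigned long addr, struct sketch_srcline *res)
{
	(void)addr;
	(void)res;
	return -1;
}

/* Stand-in for the forked-command backend: the last resort, always answers. */
static int sketch_cmd_a2l(unsigned long addr, struct sketch_srcline *res)
{
	(void)addr;
	res->file = "example.c";
	res->line = 42;
	return 0;
}

/* Try each implementation in order and stop at the first success. */
int sketch_addr2line(unsigned long addr, struct sketch_srcline *res)
{
	static const sketch_a2l_fn impls[] = {
		sketch_llvm_a2l, sketch_libbfd_a2l, sketch_cmd_a2l,
	};
	size_t i;

	assert(res != NULL);
	for (i = 0; i < sizeof(impls) / sizeof(impls[0]); i++) {
		if (impls[i](addr, res) == 0)
			return 0;
	}
	return -1;
}

int sketch_llvm_load_attempts(void)
{
	return sketch_llvm_load_count;
}
```

Because the chain is decided at run time, the caller needs no ifdefs; a backend that is unavailable simply reports failure and the next one is tried.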
2025-10-03  perf namespaces: Avoid get_current_dir_name dependency  Ian Rogers  -1/+0
get_current_dir_name is a GNU extension not supported on, for example, Android. There is only one use of it so let's just switch to getcwd to avoid build and other complexity. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
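A minimal sketch of the portable alternative follows, assuming a caller-provided buffer. The helper name `sketch_current_dir` is hypothetical and the actual change in perf may be shaped differently; `getcwd` itself is the POSIX call named in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the current working directory into buf.
 * Returns 0 on success, -1 if getcwd() fails (for example when the buffer is
 * too small). Unlike get_current_dir_name(), nothing is allocated here.
 */
int sketch_current_dir(char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	return getcwd(buf, size) ? 0 : -1;
}
```

A caller would typically pass a PATH_MAX sized array and check the return value before using the buffer.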
2025-10-02  perf libbfd: Move libbfd functionality to its own file  Ian Rogers  -1/+1
Move symbolization and srcline libbfd dependencies to a separate libbfd.c. This mirrors moving llvm and capstone code. While this code is deprecated as it is part of BUILD_NONDISTRO license incompatible code, moving the code to its own file minimizes disruption in the main files. disasm_bpf.c is moved to libbfd.c also except for symbol__disassemble_bpf_image which is currently more of a placeholder function rather than something that provides disassembly support. demangle-cxx.cpp code isn't migrated as it is very limited. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02  perf llvm: Move llvm functionality into its own file  Ian Rogers  -0/+1
LLVM disassembly support was in disasm.c and addr2line support in srcline.c. Move support out of these files into llvm.[ch] and remove LLVM includes from those files. As disassembly routines can fail, make failure the only option without HAVE_LIBLLVM_SUPPORT. For simplicity's sake, duplicate the read_symbol utility function. The intent with moving LLVM support into a single file is that dynamic support, using dlopen for libllvm, can be added in later patches. This can potentially always succeed or fail, so relying on ifdefs isn't sufficient. Using dlopen is a useful option to minimize the perf tools dependencies and potentially size. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-02  perf capstone: Move capstone functionality into its own file  Ian Rogers  -0/+1
Capstone disassembly support was split between disasm.c and print_insn.c. Move support out of these files into capstone.[ch] and remove include capstone/capstone.h from those files. As disassembly routines can fail, make failure the only option without HAVE_LIBCAPSTONE_SUPPORT. For simplicity's sake, duplicate the read_symbol utility function. The intent with moving capstone support into a single file is that dynamic support, using dlopen for libcapstone, can be added in later patches. This can potentially always succeed or fail, so relying on ifdefs isn't sufficient. Using dlopen is a useful option to minimize the perf tools dependencies and potentially size. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
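The "make failure the only option" idea can be sketched like this: the entry point always exists, and without the feature macro it is a stub that only fails, so callers need no ifdefs. The function and enum names are hypothetical; only HAVE_LIBCAPSTONE_SUPPORT comes from the message.

```c
#include <assert.h>
#include <stddef.h>

#ifdef HAVE_LIBCAPSTONE_SUPPORT
/* Declared only: a real build would implement this on top of the library. */
int sketch_capstone_disasm(const unsigned char *code, size_t len);
#else
/* Feature not built in: the same symbol exists but can only fail. */
int sketch_capstone_disasm(const unsigned char *code, size_t len)
{
	(void)code;
	(void)len;
	return -1;
}
#endif

enum sketch_backend { SKETCH_BACKEND_CAPSTONE, SKETCH_BACKEND_OTHER };

/* The caller is unconditional: a failure just means "use something else". */
enum sketch_backend sketch_disassemble(const unsigned char *code, size_t len)
{
	assert(code != NULL);

	if (sketch_capstone_disasm(code, len) == 0)
		return SKETCH_BACKEND_CAPSTONE;
	return SKETCH_BACKEND_OTHER;
}
```

The same calling code keeps working if the stub is later replaced by a version that tries dlopen at run time, since that version can also either succeed or fail.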
2025-10-01  perf powerpc: Process auxtrace events and display in 'perf report -D'  Athira Rajeev  -0/+1
Add VPA DTL PMU auxtrace process function for "perf report -D". The auxtrace event processing functions are defined in file "util/powerpc-vpadtl.c". Data structures used include "struct powerpc_vpadtl_queue" and "struct powerpc_vpadtl", which store the auxtrace buffers in a queue. Different PERF_RECORD_XXX are generated during recording. PERF_RECORD_AUXTRACE_INFO is processed first since it is of type perf_user_event_type and perf session event delivery calls perf_session__process_user_event() first. Define function powerpc_vpadtl_process_auxtrace_info() to handle the processing of PERF_RECORD_AUXTRACE_INFO records. In this function, initialize the aux buffer queues using auxtrace_queues__init(). Set up the required infrastructure for aux data processing. The data is collected per CPU and an auxtrace_queue is created for each CPU.

Define powerpc_vpadtl_process_event() function to process PERF_RECORD_AUXTRACE records. In this, add the event to the queue using auxtrace_queues__add_event() and process the buffer in powerpc_vpadtl_dump_event(). The first entry in the buffer, with timebase as zero, has the boot timebase and frequency. The remaining data is in the format of "struct powerpc_vpadtl_entry". Define the translation for dispatch_reasons and preempt_reasons, and report it when dump trace is invoked via powerpc_vpadtl_dump().

Sample output:

  ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.300 MB perf.data ]

  ./perf report -D

  0 0 0x39b10 [0x30]: PERF_RECORD_AUXTRACE size: 0x690 offset: 0 ref: 0 idx: 0 tid: -1 cpu: 0
  .
  . ... VPA DTL PMU data: size 1680 bytes, entries is 35
  . 00000000: boot_tb: 21349649546353231, tb_freq: 512000000
  . 00000030: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:7064, ready_to_enqueue_time:187, waiting_to_ready_time:6611773
  . 00000060: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:146, ready_to_enqueue_time:0, waiting_to_ready_time:15359437
  . 00000090: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:4868, ready_to_enqueue_time:232, waiting_to_ready_time:5100709
  . 000000c0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:179, ready_to_enqueue_time:0, waiting_to_ready_time:30714243
  . 000000f0: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:197, ready_to_enqueue_time:0, waiting_to_ready_time:15350648
  . 00000120: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:213, ready_to_enqueue_time:0, waiting_to_ready_time:15353446
  . 00000150: dispatch_reason:priv doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:212, ready_to_enqueue_time:0, waiting_to_ready_time:15355126
  . 00000180: dispatch_reason:decrementer interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:6368, ready_to_enqueue_time:164, waiting_to_ready_time:5104665

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-by: Tejas Manhas <tejas05@linux.ibm.com>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Aboorva Devarajan <aboorvad@linux.ibm.com>
Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com>
Cc: Hari Bathini <hbathini@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
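The numbers in the sample dump are consistent with fixed-size records: the offsets advance by 0x30 bytes, and the 0x690 (1680) byte buffer holds 35 entries including the leading boot_tb/tb_freq one. The short sketch below only reproduces that arithmetic. The 0x30 stride is inferred from the dump offsets and is not quoted from the definition of struct powerpc_vpadtl_entry, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Record stride inferred from the dump offsets (0x00, 0x30, 0x60, ...). */
#define SKETCH_DTL_ENTRY_SIZE 0x30u

/* Number of whole records in an auxtrace buffer of buf_size bytes. */
size_t sketch_dtl_nr_entries(size_t buf_size)
{
	return buf_size / SKETCH_DTL_ENTRY_SIZE;
}

/* Byte offset of record idx; idx 0 is the boot_tb/tb_freq record. */
size_t sketch_dtl_entry_offset(size_t buf_size, size_t idx)
{
	assert(idx < sketch_dtl_nr_entries(buf_size));
	return idx * SKETCH_DTL_ENTRY_SIZE;
}
```

For the buffer shown above, `sketch_dtl_nr_entries(0x690)` gives 35, matching "entries is 35" in the dump.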
2025-07-26  perf tp_pmu: Factor existing tracepoint logic to new file  Ian Rogers  -0/+1
Start the creation of a tracepoint PMU abstraction. Tracepoint events don't follow the regular sysfs perf conventions. Eventually the new PMU abstraction will bridge the gap so tracepoint events look more like regular perf ones. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Link: https://lore.kernel.org/r/20250725185202.68671-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-30  perf build: Specify shellcheck should use bash  Collin Funk  -1/+1
When someone has a global shellcheckrc file, for example at ~/.config/shellcheckrc, with the directive 'shell=sh', building perf will fail with many shellcheck errors like:

  In tests/shell/base_probe/test_adding_kernel.sh line 294:
  (( TEST_RESULT += $? ))
  ^---------------------^ SC3006 (warning): In POSIX sh, standalone ((..)) is undefined.

  For more information:
    https://www.shellcheck.net/wiki/SC3006 -- In POSIX sh, standalone ((..)) is...
  make[5]: *** [tests/Build:91: tests/shell/base_probe/test_adding_kernel.sh.shellcheck_log] Error 1

Passing the '-s bash' option ensures that it runs correctly regardless of a developer's global configuration. This patch adds '-s bash' and other options to the SHELLCHECK variable in Makefile.perf and makes use of the variable consistently.

Signed-off-by: Collin Funk <collin.funk1@gmail.com>
Link: https://lore.kernel.org/r/63491dbc8439edf2e949d80e264b9d22332fea61.1751082075.git.collin.funk1@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-26  perf util: add a basic SHA-1 implementation  Eric Biggers  -0/+1
SHA-1 can be written in fewer than 100 lines of code. Just add a basic SHA-1 implementation so that there's no need to use an external library or try to pull in the kernel's SHA-1 implementation. The kernel's SHA-1 implementation is not really intended to be pulled into userspace programs in the way that it was proposed to do so for perf (https://lore.kernel.org/r/20250521225307.743726-3-yuzhuo@google.com/), and it's also likely to undergo some refactoring in the future. There's no need to tie userspace tools to it. Include a test for sha1() in the util test suite. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250625202311.23244-3-ebiggers@kernel.org Signed-off-by: Namhyung Kim <namhyung@kernel.org>
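To illustrate the size claim, here is a compact one-shot SHA-1 written for this page. It is not the implementation the commit added and the name `sketch_sha1` is hypothetical; it follows the standard algorithm (big-endian message schedule, 80 rounds, 0x80 padding followed by the 64-bit bit length).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t sketch_rol32(uint32_t v, unsigned int n)
{
	return (v << n) | (v >> (32 - n));
}

/* Process one 64-byte block, updating the five-word state h. */
static void sketch_sha1_block(uint32_t h[5], const uint8_t *p)
{
	uint32_t w[80], a = h[0], b = h[1], c = h[2], d = h[3], e = h[4];
	int i;

	for (i = 0; i < 16; i++)
		w[i] = ((uint32_t)p[4 * i] << 24) | ((uint32_t)p[4 * i + 1] << 16) |
		       ((uint32_t)p[4 * i + 2] << 8) | (uint32_t)p[4 * i + 3];
	for (; i < 80; i++)
		w[i] = sketch_rol32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

	for (i = 0; i < 80; i++) {
		uint32_t f, k, t;

		if (i < 20) {
			f = (b & c) | (~b & d);
			k = 0x5a827999;
		} else if (i < 40) {
			f = b ^ c ^ d;
			k = 0x6ed9eba1;
		} else if (i < 60) {
			f = (b & c) | (b & d) | (c & d);
			k = 0x8f1bbcdc;
		} else {
			f = b ^ c ^ d;
			k = 0xca62c1d6;
		}
		t = sketch_rol32(a, 5) + f + e + k + w[i];
		e = d;
		d = c;
		c = sketch_rol32(b, 30);
		b = a;
		a = t;
	}
	h[0] += a;
	h[1] += b;
	h[2] += c;
	h[3] += d;
	h[4] += e;
}

/* One-shot SHA-1 of data[0..len), written to out as 20 big-endian bytes. */
void sketch_sha1(const void *data, size_t len, uint8_t out[20])
{
	uint32_t h[5] = { 0x67452301, 0xefcdab89, 0x98badcfe, 0x10325476, 0xc3d2e1f0 };
	const uint8_t *p = data;
	uint8_t tail[128] = { 0 };
	uint64_t bits = (uint64_t)len * 8;
	size_t rem = len % 64, tail_len, i;

	assert(data != NULL || len == 0);

	for (i = 0; i + 64 <= len; i += 64)
		sketch_sha1_block(h, p + i);

	/* Padding: leftover bytes, 0x80, zeros, then the bit length (big-endian). */
	if (rem)
		memcpy(tail, p + len - rem, rem);
	tail[rem] = 0x80;
	tail_len = rem < 56 ? 64 : 128;
	for (i = 0; i < 8; i++)
		tail[tail_len - 1 - i] = (uint8_t)(bits >> (8 * i));
	for (i = 0; i < tail_len; i += 64)
		sketch_sha1_block(h, tail + i);

	for (i = 0; i < 5; i++) {
		out[4 * i] = (uint8_t)(h[i] >> 24);
		out[4 * i + 1] = (uint8_t)(h[i] >> 16);
		out[4 * i + 2] = (uint8_t)(h[i] >> 8);
		out[4 * i + 3] = (uint8_t)h[i];
	}
}
```

For the three-byte input "abc" this produces the well-known test vector a9993e364706816aba3e25717850c26c9cd0d89d.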
2025-06-26  perf trace: Split BPF skel code to util/bpf_trace_augment.c  Namhyung Kim  -0/+1
And make builtin-trace.c less conditional. Dummy functions will be called when BUILD_BPF_SKEL=0 is used. This makes builtin-trace.c slightly smaller and simpler by removing the skeleton and its helpers. The conditional guard of trace__init_syscalls_bpf_prog_array_maps() is changed from HAVE_BPF_SKEL to HAVE_LIBBPF_SUPPORT as it doesn't have a skeleton in the code directly. A dummy function is also added so that it can be called unconditionally. The function will succeed only if both conditions are true. Do not include trace_augment.h from the BPF code, and move the definition of TRACE_AUG_MAX_BUF to the BPF code directly.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250623225721.21553-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-25  perf drm_pmu: Add a tool like PMU to expose DRM information  Ian Rogers  -0/+1
DRM clients expose information through usage stats as documented in Documentation/gpu/drm-usage-stats.rst (available online at https://docs.kernel.org/gpu/drm-usage-stats.html). Add a tool like PMU, similar to the hwmon PMU, that exposes DRM information. For example on a tigerlake laptop: ``` $ perf list drm List of pre-defined events (to be used in -e or -M): drm: drm-active-stolen-system0 [Total memory active in one or more engines. Unit: drm_i915] drm-active-system0 [Total memory active in one or more engines. Unit: drm_i915] drm-engine-capacity-video [Engine capacity. Unit: drm_i915] drm-engine-copy [Utilization in ns. Unit: drm_i915] drm-engine-render [Utilization in ns. Unit: drm_i915] drm-engine-video [Utilization in ns. Unit: drm_i915] drm-engine-video-enhance [Utilization in ns. Unit: drm_i915] drm-purgeable-stolen-system0 [Size of resident and purgeable memory bufers. Unit: drm_i915] drm-purgeable-system0 [Size of resident and purgeable memory bufers. Unit: drm_i915] drm-resident-stolen-system0 [Size of resident memory bufers. Unit: drm_i915] drm-resident-system0 [Size of resident memory bufers. Unit: drm_i915] drm-shared-stolen-system0 [Size of shared memory bufers. Unit: drm_i915] drm-shared-system0 [Size of shared memory bufers. Unit: drm_i915] drm-total-stolen-system0 [Size of shared and private memory. Unit: drm_i915] drm-total-system0 [Size of shared and private memory. 
Unit: drm_i915] ``` System wide data can be gathered: ``` $ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 1.000904910,0,bytes,drm-active-stolen-system0,1,100.00,, 1.000904910,0,bytes,drm-active-system0,1,100.00,, 1.000904910,36,capacity,drm-engine-capacity-video,1,100.00,, 1.000904910,0,ns,drm-engine-copy,1,100.00,, 1.000904910,1472970566175,ns,drm-engine-render,1,100.00,, 1.000904910,0,ns,drm-engine-video,1,100.00,, 1.000904910,0,ns,drm-engine-video-enhance,1,100.00,, 1.000904910,0,bytes,drm-purgeable-stolen-system0,1,100.00,, 1.000904910,38199296,bytes,drm-purgeable-system0,1,100.00,, 1.000904910,0,bytes,drm-resident-stolen-system0,1,100.00,, 1.000904910,4643196928,bytes,drm-resident-system0,1,100.00,, 1.000904910,0,bytes,drm-shared-stolen-system0,1,100.00,, 1.000904910,1886871552,bytes,drm-shared-system0,1,100.00,, 1.000904910,0,bytes,drm-total-stolen-system0,1,100.00,, 1.000904910,4643196928,bytes,drm-total-system0,1,100.00,, 2.264426839,0,bytes,drm-active-stolen-system0,1,100.00,, ``` Or for a particular process: ``` $ perf stat -x, -I 1000 -e drm-active-stolen-system0,drm-active-system0,drm-engine-capacity-video,drm-engine-copy,drm-engine-render,drm-engine-video,drm-engine-video-enhance,drm-purgeable-stolen-system0,drm-purgeable-system0,drm-resident-stolen-system0,drm-resident-system0,drm-shared-stolen-system0,drm-shared-system0,drm-total-stolen-system0,drm-total-system0 -p 200027 1.001040274,0,bytes,drm-active-stolen-system0,6,100.00,, 1.001040274,0,bytes,drm-active-system0,6,100.00,, 1.001040274,12,capacity,drm-engine-capacity-video,6,100.00,, 1.001040274,0,ns,drm-engine-copy,6,100.00,, 1.001040274,1542300,ns,drm-engine-render,6,100.00,, 
1.001040274,0,ns,drm-engine-video,6,100.00,, 1.001040274,0,ns,drm-engine-video-enhance,6,100.00,, 1.001040274,0,bytes,drm-purgeable-stolen-system0,6,100.00,, 1.001040274,13516800,bytes,drm-purgeable-system0,6,100.00,, 1.001040274,0,bytes,drm-resident-stolen-system0,6,100.00,, 1.001040274,27746304,bytes,drm-resident-system0,6,100.00,, 1.001040274,0,bytes,drm-shared-stolen-system0,6,100.00,, 1.001040274,0,bytes,drm-shared-system0,6,100.00,, 1.001040274,0,bytes,drm-total-stolen-system0,6,100.00,, 1.001040274,27746304,bytes,drm-total-system0,6,100.00,, 2.016629075,0,bytes,drm-active-stolen-system0,6,100.00,, ``` As with the hwmon PMU, high numbered PMU types are used to encode multiple possible "DRM" PMUs. The appropriate fdinfo is found by scanning /proc and filtering which fdinfos to read with stat. To avoid some unnecessary scanning, events not starting with "drm-" are ignored. The patch builds on commit 57e13264dcea ("perf pmus: Restructure pmu_read_sysfs to scan fewer PMUs") and later so that /proc will be scanned only if full wildcarding is being done, the PMU starts with "drm_" or the event starts with "drm-". That is, there should be little to no cost in this PMU unless DRM events are requested. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250624231837.179536-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-05-09perf symbol-elf: Integrate rust-v0 demanglingIan Rogers-1/+4
Use the demangle-rust-v0 APIs to see if symbol is Rust mangled and demangle if so. The API requires a pre-allocated output buffer, some estimation and retrying are added for this. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Ariel Ben-Yehuda <ariel.byd@gmail.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Bill Wendling <morbo@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Daniel Xu <dxu@dxuuu.xyz> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Gary Guo <gary@garyguo.net> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Cc: Trevor Gross <tmgross@umich.edu> Link: https://lore.kernel.org/r/20250430004128.474388-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
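The estimate-and-retry pattern mentioned above can be sketched roughly as follows. This is only an illustration: `toy_demangle()` is a hypothetical stand-in that fails when the caller's buffer is too small, not the real demangle-rust-v0 API, and the doubling policy and retry limit are assumptions rather than what perf does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a demangler that writes into a caller-provided
 * buffer and fails when the buffer is too small. The "demangled" form here
 * is just "<input>::<input>" so the output is longer than the input.
 */
static int toy_demangle(const char *mangled, char *out, size_t out_len)
{
	int needed = snprintf(out, out_len, "%s::%s", mangled, mangled);

	return (needed < 0 || (size_t)needed >= out_len) ? -1 : 0;
}

/* Estimate an output size, then retry with a doubled buffer on failure. */
static char *demangle_with_retry(const char *mangled)
{
	size_t len;
	char *buf = NULL;

	assert(mangled);
	len = strlen(mangled) + 1;	/* initial (optimistic) estimate */

	for (int attempt = 0; attempt < 8; attempt++) {
		char *tmp = realloc(buf, len);

		if (!tmp)
			break;
		buf = tmp;
		if (toy_demangle(mangled, buf, len) == 0)
			return buf;	/* caller frees */
		len *= 2;
	}
	free(buf);
	return NULL;
}
```

Calling `demangle_with_retry("abc")` in this sketch fails twice (buffers of 4 and 8 bytes) before succeeding with 16 bytes.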
2025-04-28perf trace: Implement syscall summary in BPFNamhyung Kim-0/+4
When -s/--summary option is used, it doesn't need (augmented) arguments of syscalls. Let's skip the augmentation and load another small BPF program to collect the statistics in the kernel instead of copying the data to the ring-buffer to calculate the stats in userspace. This will be much more light-weight than the existing approach and remove any lost events. Let's add a new option --bpf-summary to control this behavior. I cannot make it default because there's no way to get e_machine in the BPF which is needed for detecting different ABIs like 32-bit compat mode. No functional changes intended except for no more LOST events. :)

  $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1

  Summary of events:

  total, 6194 events

  syscall            calls errors    total       min       avg       max  stddev
                                    (msec)    (msec)    (msec)    (msec)     (%)
  --------------- -------- ------ -------- --------- --------- ---------  ------
  epoll_wait           561      0 4530.843     0.000     8.076   520.941  18.75%
  futex                693     45 4317.231     0.000     6.230   500.077  21.98%
  poll                 300      0 1040.109     0.000     3.467   120.928  17.02%
  clock_nanosleep        1      0 1000.172  1000.172  1000.172  1000.172   0.00%
  ppoll                360      0  872.386     0.001     2.423   253.275  41.91%
  epoll_pwait           14      0  384.349     0.001    27.453   380.002  98.79%
  pselect6              14      0  108.130     7.198     7.724     8.206   0.85%
  nanosleep             39      0   43.378     0.069     1.112    10.084  44.23%
  ...

Reviewed-by: Howard Chu <howardchu95@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org [ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25perf intel-tpebs: Cleanup headerIan Rogers-1/+1
Remove arch conditional compilation. Arch conditional compilation belongs in the arch/ directory. Tidy header guards to match other files. Remove unneeded includes and switch to forward declarations when necessary. Reviewed-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Torgue <alexandre.torgue@foss.st.com> Cc: Andreas Färber <afaerber@suse.de> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Link: https://lore.kernel.org/r/20250414174134.3095492-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-03-24perf build: Add pylint build testsIan Rogers-0/+12
If PYLINT=1 is passed to the build then run pylint over python code in perf. Unlike shellcheck this isn't default on as there are currently too many errors. An example of an error: ``` ************* Module setup util/setup.py:19:0: C0301: Line too long (127/100) (line-too-long) util/setup.py:20:0: C0301: Line too long (138/100) (line-too-long) util/setup.py:63:0: C0301: Line too long (106/100) (line-too-long) util/setup.py:1:0: C0114: Missing module docstring (missing-module-docstring) util/setup.py:24:4: W0622: Redefining built-in 'vars' (redefined-builtin) util/setup.py:11:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name) util/setup.py:13:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name) util/setup.py:15:34: R1732: Consider using 'with' for resource-allocating operations (consider-using-with) util/setup.py:18:0: C0116: Missing function or method docstring (missing-function-docstring) util/setup.py:19:16: R1732: Consider using 'with' for resource-allocating operations (consider-using-with) util/setup.py:44:0: C0413: Import "from setuptools import setup, Extension" should be placed at the top of the module (wrong-import-position) util/setup.py:46:0: C0413: Import "from setuptools.command.build_ext import build_ext as _build_ext" should be placed at the top of the module (wrong-import-position) util/setup.py:47:0: C0413: Import "from setuptools.command.install_lib import install_lib as _install_lib" should be placed at the top of the module (wrong-import-position) util/setup.py:49:0: C0115: Missing class docstring (missing-class-docstring) util/setup.py:49:0: C0103: Class name "build_ext" doesn't conform to PascalCase naming style (invalid-name) util/setup.py:52:8: W0201: Attribute 'build_lib' defined outside __init__ (attribute-defined-outside-init) util/setup.py:53:8: W0201: Attribute 'build_temp' defined outside __init__ (attribute-defined-outside-init) util/setup.py:55:0: 
C0115: Missing class docstring (missing-class-docstring) util/setup.py:55:0: C0103: Class name "install_lib" doesn't conform to PascalCase naming style (invalid-name) util/setup.py:58:8: W0201: Attribute 'build_dir' defined outside __init__ (attribute-defined-outside-init) *----------------------------------------------------------------- Your code has been rated at 6.67/10 (previous run: 6.51/10, +0.16) make[4]: *** [util/Build:442: util/setup.py.pylint_log] Error 1 ``` Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250311213628.569562-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24perf build: Add mypy build testsIan Rogers-0/+13
If MYPY=1 is passed to the build then run mypy over python code in perf. Unlike shellcheck this isn't default on as there are currently too many errors. An example of an error: ``` util/setup.py:8: error: Item "None" of "str | None" has no attribute "split" [union-attr] util/setup.py:15: error: Item "None" of "IO[bytes] | None" has no attribute "readline" [union-attr] util/setup.py:15: error: List item 0 has incompatible type "str | None"; expected "str | bytes | PathLike[str] | PathLike[bytes]" [list-item] util/setup.py:16: error: Unsupported left operand type for + ("None") [operator] util/setup.py:16: note: Left operand is of type "str | None" util/setup.py:74: error: Unsupported left operand type for + ("None") [operator] util/setup.py:74: note: Left operand is of type "str | None" Found 5 errors in 1 file (checked 1 source file) make[4]: *** [util/Build:430: util/setup.py.mypy_log] Error 1 ``` Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250311213628.569562-4-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24perf build: Rename TEST_LOGS to SHELL_TEST_LOGSIan Rogers-3/+3
Rename TEST_LOGS to SHELL_TEST_LOGS as later changes will add more kinds of test logs. Minor comment tweak in Makefile.perf as more than just test shell tests are checked. Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250311213628.569562-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12perf sample: Make user_regs and intr_regs optionalIan Rogers-0/+1
The struct dump_regs contains 512 bytes of cache_regs, meaning the two values in perf_sample contribute 1088 bytes of its total 1384-byte size. Initializing this much memory has a cost reported by Tavian Barnes <tavianator@tavianator.com> as about 2.5% when running `perf script --itrace=i0`: https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/ Adrian Hunter <adrian.hunter@intel.com> replied that the zero initialization was necessary and couldn't simply be removed. This patch aims to strike a middle ground of still zeroing the perf_sample, but removing 79% of its size by making user_regs and intr_regs optional pointers to zalloc-ed memory. To support the allocation, accessors are created for user_regs and intr_regs. To support correct cleanup, perf_sample__init and perf_sample__exit functions are created and added throughout the code base. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250113194345.1537821-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
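As a rough illustration of the optional-pointer approach described above, the sketch below allocates the register block lazily through an accessor and pairs an init function with an exit function. All `toy_` names and the struct layouts are hypothetical simplifications, not perf's actual definitions; `calloc()` stands in for zalloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for perf's structures. */
struct toy_regs {
	uint64_t abi;
	uint64_t mask;
	uint64_t cache_regs[64];	/* 64 * 8 = 512 bytes */
};

struct toy_sample {
	uint64_t ip;
	struct toy_regs *user_regs;	/* NULL until first requested */
	struct toy_regs *intr_regs;	/* NULL until first requested */
};

/* Cheap to zero: the large register blocks are no longer embedded. */
static void toy_sample__init(struct toy_sample *sample)
{
	assert(sample);
	sample->ip = 0;
	sample->user_regs = NULL;
	sample->intr_regs = NULL;
}

/* Accessor: allocate zeroed storage on first use; NULL on failure. */
static struct toy_regs *toy_sample__user_regs(struct toy_sample *sample)
{
	if (!sample->user_regs)
		sample->user_regs = calloc(1, sizeof(*sample->user_regs));
	return sample->user_regs;
}

/* Release whatever the accessors allocated. */
static void toy_sample__exit(struct toy_sample *sample)
{
	free(sample->user_regs);
	free(sample->intr_regs);
	sample->user_regs = NULL;
	sample->intr_regs = NULL;
}
```

Samples that never touch the registers pay only for two pointer stores, which is the trade-off the commit message describes; the cost is that every user must now call the exit function.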
2024-12-18perf lock: Move common lock contention code to new fileIan Rogers-0/+1
Avoid references from util code to builtin-lock that require python stubs. Move the functions and related variables to util/lock-contention.c. Add max_stack_depth parameter to match_callstack_filter to avoid sharing a global variable. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-16-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf x86: Define arch_fetch_insn in NO_AUXTRACE buildsIan Rogers-1/+1
archinsn.c containing arch_fetch_insn was only enabled with CONFIG_AUXTRACE, but this meant that a NO_AUXTRACE build on x86 would use the empty weak version of arch_fetch_insn - weak symbols are a frequent source of errors like this and are outside of the C specification. Change it so that archinsn.c is always built on x86 and make the weak symbol empty version of arch_fetch_insn a strong one guarded by ifdefs. arch_fetch_insn on x86 depends on insn_decode which is a function included then built into intel-pt-insn-decoder.c. intel-pt-insn-decoder.c isn't built in a NO_AUXTRACE=1 build. Separate the insn_decode function from intel-pt-insn-decoder.c by just directly compiling the relevant file. Guard this compilation to be for either always on x86 (because of the use in arch_fetch_insn) or when auxtrace is enabled. Apply the CFLAGS overrides as necessary, reducing the amount of code where warnings are disabled. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-13-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf kvm: Move functions used in util out of builtinIan Rogers-0/+1
The util library code is used by the python module but doesn't have access to the builtin files. Make a util/kvm-stat.c to match the kvm-stat.h file that declares the functions and move the functions there. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf trace-event: Always build trace-event-info.cIan Rogers-1/+1
trace-event-info.c has no libtraceevent dependencies, always build it and use it in builtin-record and perf_event_attr printing. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Zixian Cai <fzczx123@gmail.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241118225345.889810-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf btf: Make the sigtrap test helper to find a member by name widely availableArnaldo Carvalho de Melo-0/+1
By introducing a tools/perf/util/btf.c to collect utilities not yet available via libbpf, the first being a way to find a member by name once we get the type_id for the struct. Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-09perf dwarf-regs: Move powerpc dwarf-regs out of archIan Rogers-0/+1
Move arch/powerpc/util/dwarf-regs.c to util/dwarf-regs-powerpc.c and compile in unconditionally. get_arch_regstr is redundant when EM_NONE is treated as EM_HOST so remove and update dwarf-regs.c conditions. Make get_powerpc_regs unconditionally available when libdw is. Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241108234606.429459-14-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09perf dwarf-regs: Move csky dwarf-regs out of archIan Rogers-0/+1
Move arch/csky/util/dwarf-regs.c to util/dwarf-regs-csky.c and compile in unconditionally. To avoid get_arch_regstr being duplicated, rename to get_csky_regstr and add to get_dwarf_regstr switch. Update #ifdefs to allow ABI V1 and V2 tables at the same time. Determine the table from the ELF flags. Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241108234606.429459-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09perf dwarf-regs: Move x86 dwarf-regs out of archIan Rogers-0/+1
Move arch/x86/util/dwarf-regs.c to util/dwarf-regs-x86.c and compile in unconditionally. To avoid get_arch_regnum being duplicated, rename to get_x86_regnum and add to get_dwarf_regnum switch. For get_arch_regstr, this was unused on x86 unless the machine type was EM_NONE. Map that case to EM_HOST and remove get_arch_regstr from dwarf-regs-x86.c. Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241108234606.429459-8-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09perf hwmon_pmu: Add hwmon filename parserIan Rogers-0/+1
hwmon filenames have a specific encoding that will be used to give a config value. The encoding is described in: Documentation/hwmon/sysfs-interface.rst Add a function to parse the filename into constituent enums/ints that will then be amenable to config encoding. Note, things are done this way to allow mapping names to config and back without the use of hash/dynamic lookup tables. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Yoshihiro Furudera <fj5100bi@fujitsu.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Junhao He <hejunhao3@huawei.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: James Clark <james.clark@linaro.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> [namhyung: add #include <linux/string.h> for strlcpy()] Link: https://lore.kernel.org/r/20241109003759.473460-3-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
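To make the filename encoding concrete, here is a minimal sketch of splitting a name of the form `<type><number>_<item>` (for example `temp1_input`) into an enum, an integer and an item string. The enum, the three-entry type table and the function name are hypothetical and far smaller than what the hwmon sysfs interface defines; the actual perf parser may be structured differently.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical enum; only a few hwmon types are listed for illustration. */
enum toy_hwmon_type { TOY_FAN, TOY_IN, TOY_TEMP, TOY_TYPE_MAX };

/* A static table lets names map to enum values and back without hashing. */
static const char *const toy_type_names[TOY_TYPE_MAX] = {
	[TOY_FAN] = "fan", [TOY_IN] = "in", [TOY_TEMP] = "temp",
};

/*
 * Split "<type><number>_<item>", e.g. "temp1_input", into its parts.
 * Returns 0 on success, -1 if the name does not match the pattern.
 */
static int toy_parse_hwmon_filename(const char *name, enum toy_hwmon_type *type,
				    int *number, const char **item)
{
	assert(name && type && number && item);

	for (int t = 0; t < TOY_TYPE_MAX; t++) {
		size_t len = strlen(toy_type_names[t]);
		char *end;
		long n;

		/* The type prefix must be followed directly by a digit. */
		if (strncmp(name, toy_type_names[t], len) != 0 ||
		    !isdigit((unsigned char)name[len]))
			continue;

		n = strtol(name + len, &end, 10);
		if (*end != '_' || end[1] == '\0')
			return -1;

		*type = (enum toy_hwmon_type)t;
		*number = (int)n;
		*item = end + 1;	/* points into the caller's string */
		return 0;
	}
	return -1;
}
```

Names without a numbered type prefix, such as `name`, are rejected by this sketch.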
2024-10-18perf build: Rename CONFIG_DWARF to CONFIG_LIBDWIan Rogers-6/+6
In Makefile.config for unwinding the name dwarf implies either libunwind or libdw. Make it clearer that CONFIG_DWARF is really just defined when libdw is present by renaming to CONFIG_LIBDW. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Anup Patel <anup@brainfault.org> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: David S. Miller <davem@davemloft.net> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Shenlin Liang <liangshenlin@eswincomputing.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Guilherme Amadio <amadio@gentoo.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Alexander Lobakin <aleksander.lobakin@intel.com> Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Chen Pei <cp0613@linux.alibaba.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Aditya Gupta <adityag@linux.ibm.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-riscv@lists.infradead.org Cc: Bibo Mao <maobibo@loongson.cn> Cc: John Garry <john.g.garry@oracle.com> Cc: Atish Patra <atishp@rivosinc.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: linux-csky@vger.kernel.org Link: https://lore.kernel.org/r/20241017001354.56973-12-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10perf tool_pmu: Factor tool events into their own PMUIan Rogers-0/+1
Rather than treat tool events as a special kind of event, create a tool only PMU where the events/aliases match the existing duration_time, user_time and system_time events. Remove special parsing and printing support for the tool events, but add function calls for when PMU functions are called on a tool_pmu. Move the tool PMU code in evsel into tool_pmu.c to better encapsulate the tool event behavior in that file. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20241002032016.333748-5-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-03perf report: Support LLVM for addr2line()Steinar H. Gunderson-0/+1
In addition to the existing support for libbfd and calling out to an external addr2line command, add support for using libllvm directly. This is both faster than libbfd, and can be enabled in distro builds (the LLVM license has an explicit provision for GPLv2 compatibility). Thus, it is set as the primary choice if available. As an example, running 'perf report' on a medium-size profile with DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with libbfd, 153 seconds with external llvm-addr2line, and I got tired and aborted the test after waiting for 55 minutes with external bfd addr2line (which is the default for perf as compiled by distributions today). Evidently, for this case, the bfd addr2line process needs 18 seconds (on a 5.2 GHz Zen 3) to load the .debug ELF in question, hits the 1-second timeout and gets killed during initialization, getting restarted anew every time. Having an in-process addr2line makes this much more robust. As future extensions, libllvm can be used in many other places where we currently use libbfd or other libraries: - Symbol enumeration (in particular, for PE binaries). - Demangling (including non-Itanium demangling, e.g. Microsoft or Rust). - Disassembling (perf annotate). However, these are much less pressing; most people don't profile PE binaries, and perf has non-bfd paths for ELF. The same with demangling; the default _cxa_demangle path works fine for most users, and while bfd objdump can be slow on large binaries, it is possible to use --objdump=llvm-objdump to get the speed benefits. (It appears LLVM-based demangling is very simple, should we want that.) Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not correctly detected using feature_check, and thus was not tested. 
Committer notes: Added the name and a __maybe_unused to address: 1 13.50 almalinux:8 : FAIL gcc version 8.5.0 20210514 (Red Hat 8.5.0-22) (GCC) util/srcline.c: In function 'dso__free_a2l': util/srcline.c:184:20: error: parameter name omitted void dso__free_a2l(struct dso *) ^~~~~~~~~~~~ make[3]: *** [/git/perf-6.11.0-rc3/tools/build/Makefile.build:158: util] Error 2 Signed-off-by: Steinar H. Gunderson <sesse@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20240803152008.2818485-1-sesse@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-28perf bpf-filter: Add build dependency to header filesNamhyung Kim-2/+2
The flex and bison files need to be recompiled when one of these header files is changed:
* util/bpf-filter.h
* util/bpf_skel/sample-filter.h
Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240826221045.1202305-3-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-20perf cap: Tidy up and improve capability testingIan Rogers-1/+1
Remove dependence on libcap. libcap is only used to query whether a capability is supported, which is just 1 capget system call. If the capget system call fails, fall back on root permission checking. Previously if libcap fails then the permission is assumed not present which may be pessimistic/wrong. Add a used_root out argument to perf_cap__capable to say whether the fall back root check was used. This allows the correct error message, "root" vs "users with the CAP_PERFMON or CAP_SYS_ADMIN capability", to be selected. Tidy uses of perf_cap__capable so that tests aren't repeated if capget isn't supported. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240806220614.831914-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
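The capget-with-root-fallback check described above can be sketched as follows. This is a minimal illustration, not the code in perf itself: the function name `sketch_cap_capable` is hypothetical, and the sketch assumes a Linux system with the `<linux/capability.h>` UAPI header available.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/capability.h>
#include <stdbool.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical sketch: report whether the calling process holds capability
 * `cap` in its effective set. *used_root is set to true when capget failed
 * and the answer came from the root fallback instead.
 */
bool sketch_cap_capable(int cap, bool *used_root)
{
	struct __user_cap_header_struct header = {
		.version = _LINUX_CAPABILITY_VERSION_3,
		.pid = 0, /* 0 selects the calling thread */
	};
	struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

	assert(used_root != NULL);
	assert(cap >= 0 && cap <= CAP_LAST_CAP);

	memset(data, 0, sizeof(data));
	*used_root = false;

	/* A single capget system call replaces the libcap query. */
	if (syscall(SYS_capget, &header, data) == 0)
		return (data[CAP_TO_INDEX(cap)].effective & CAP_TO_MASK(cap)) != 0;

	/* capget failed: fall back to a plain root check. */
	*used_root = true;
	return geteuid() == 0;
}
```

A caller could inspect `used_root` after the call to choose between the "root" wording and the capability wording of an error message.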
2024-08-13perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.Weilin Wang-0/+1
When a retire_latency value is used in a metric formula, evsel forks a 'perf record' process with the "-e" and "-W" options. 'perf record' collects the required retire_latency values in parallel while 'perf stat' is collecting counting values. At the point that 'perf stat' stops counting, evsel stops 'perf record' by sending a SIGTERM signal to the 'perf record' process. The sampled data is then processed to get the retire latency value. Another thread is required to synchronize between 'perf stat' and 'perf record' when data is passed through a pipe. The retire_latency evsel is not opened for 'perf stat', so no counter is wasted on it. This commit includes code suggested by Namhyung to adjust the reading size for groups that include retire_latency evsels. In the current :R parsing implementation, the parser recognizes events with the retire_latency modifier and inserts them into the evlist like a normal event. Ideally, we need to avoid counting these events. In this commit, when a retire_latency evsel is read, the retire latency value processed from the sampled data is set as the count value. This sampled retire latency value is used for metric calculation and the final event count printout. No special metric calculation or event printout code is required for retire_latency events. 
Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Weilin Wang <weilin.wang@intel.com> Acked-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Caleb Biggers <caleb.biggers@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Perry Taylor <perry.taylor@intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Samantha Alt <samantha.alt@intel.com> Link: https://lore.kernel.org/r/20240720062102.444578-4-weilin.wang@intel.com [ Squashed the 3rd and 4th commit in the series to keep it building patch by patch ] [ Constified the 'struct perf_tool' pointer in process_sample_event() ] [ Use perf_tool__init(&tool, false) to address a segfault I reported and Ian/Weilin diagnosed ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
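The fork-then-SIGTERM lifecycle described in this commit can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the evsel code: the helper names `sketch_start_helper` and `sketch_stop_helper` are hypothetical, and the pipe and the synchronizing thread mentioned in the message are left out.

```c
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical sketch: fork a child and exec the given command in it.
 * Returns the child's pid in the parent, or -1 if fork failed.
 */
pid_t sketch_start_helper(char *const argv[])
{
	pid_t pid;

	assert(argv != NULL && argv[0] != NULL);

	pid = fork();
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127); /* only reached if exec failed */
	}
	return pid;
}

/*
 * Hypothetical sketch: ask the child to stop with SIGTERM and reap it.
 * Returns the wait status, or -1 on error.
 */
int sketch_stop_helper(pid_t pid)
{
	int status;

	if (kill(pid, SIGTERM) < 0)
		return -1;
	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return status;
}
```

In the commit the child is 'perf record' and the parent is 'perf stat'; any long-running command can stand in for the child when trying the sketch out.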
2024-08-12perf tool: Move fill defaults into tool.cIan Rogers-0/+1
The aim here is to eventually make perf_tool__fill_defaults() an init function so that the tools struct is more const. Create a tool.c to go along with tool.h. Move perf_tool__fill_defaults() out of session.c into tool.c along with the default stub values. Add perf_tool__compressed_is_stub() for a test in perf_session__process_user_event(). perf_session__process_compressed_event() is only used from being default initialized so migrate into tool.c. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20240812204720.631678-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
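The fill-defaults pattern this commit moves into tool.c can be illustrated with a small stand-in. The sketch below is an assumption-laden simplification: `struct sketch_tool`, its two callbacks, and all function names are hypothetical and much smaller than the real 'struct perf_tool'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for a tool callback table. */
struct sketch_tool {
	int (*sample)(struct sketch_tool *tool, const void *event);
	int (*comm)(struct sketch_tool *tool, const void *event);
};

/* Default stub: accept the event and do nothing. */
static int sketch_event_stub(struct sketch_tool *tool, const void *event)
{
	(void)tool;
	(void)event;
	return 0;
}

/* Install the stub for every callback the caller left unset. */
void sketch_tool__fill_defaults(struct sketch_tool *tool)
{
	assert(tool != NULL);

	if (tool->sample == NULL)
		tool->sample = sketch_event_stub;
	if (tool->comm == NULL)
		tool->comm = sketch_event_stub;
}

/* Lets other files test for the default without seeing the static stub. */
bool sketch_tool__sample_is_stub(const struct sketch_tool *tool)
{
	return tool->sample == sketch_event_stub;
}
```

Keeping the stubs static and exposing only an is-stub query is one way a separate file can hide its defaults while still letting callers ask whether a default is in place, which is the role the message gives to perf_tool__compressed_is_stub().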
2024-08-01perf bpf: Move BPF disassembly routines to separate file to avoid clash with capstone bpf headersArnaldo Carvalho de Melo-0/+1
There is a clash between the libbpf and capstone libraries that ends up with: In file included from /usr/include/capstone/capstone.h:325, from util/disasm.c:1513: /usr/include/capstone/bpf.h:94:14: error: ‘bpf_insn’ defined as wrong kind of tag 94 | typedef enum bpf_insn { So far we're just trying to avoid this by not having both headers included in the same .c or .h file; do it one more time by moving the BPF disassembly routines from util/disasm.c to util/disasm_bpf.c. This is only being hit when building with BUILD_NONDISTRO=1, i.e. building with binutils-devel, which isn't in the default build due to a licensing clash. We need to reimplement what is now isolated in util/disasm_bpf.c using some other library to have the BPF annotation feature, which is now only available with BUILD_NONDISTRO=1. Fixes: 6d17edc113de1e21 ("perf annotate: Use libcapstone to disassemble") Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/lkml/ZqpUSKPxMwaQKORr@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-06-26perf util: Make util its own libraryIan Rogers-197/+197
Make the util directory into its own library. This is done to avoid compiling code twice, once for the perf tool and once for the perf python module. For convenience: arch/common.c scripts/perl/Perf-Trace-Util/Context.c scripts/python/Perf-Trace-Util/Context.c are made part of this library. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@arm.com> Cc: Suzuki K Poulose <suzuki.poulose@arm.com> Cc: Kees Cook <keescook@chromium.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Nick Terrell <terrelln@fb.com> Cc: Gary Guo <gary@garyguo.net> Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Wedson Almeida Filho <wedsonaf@gmail.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andrei Vagin <avagin@google.com> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Guo Ren <guoren@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Will Deacon <will@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: John Garry <john.g.garry@oracle.com> Cc: Benno Lossin <benno.lossin@proton.me> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Andreas Hindborg <a.hindborg@samsung.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
2024-05-07perf mem-info: Move mem-info out of mem-events and symbolIan Rogers-0/+1
Move mem-info to its own header rather than having it split between mem-events and symbol. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12perf util: Add shellcheck to generate-cmdlist.shIan Rogers-0/+14
Add shellcheck to generate-cmdlist.sh to avoid basic shell script mistakes. Reviewed-by: James Clark <james.clark@arm.com> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/r/20240409023216.2342032-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf annotate: Split out util/disasm.cNamhyung Kim-0/+1
The util/annotate.c file has both disassembly and sample annotation related code. Factor out the disasm part so that it can be handled more easily. No functional changes intended. Committer notes: Add missing includes of env.h, util.h, bpf-event.h and bpf-util.h to disasm.c, to fix things like: util/disasm.c: In function ‘symbol__disassemble_bpf’: util/disasm.c:1203:9: error: implicit declaration of function ‘perf_exe’ [-Werror=implicit-function-declaration] 1203 | perf_exe(tpath, sizeof(tpath)); | ^~~~~~~~ Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-4-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>