<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/trace/ftrace.c, branch v5.15</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.15</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.15'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2021-10-29T13:52:23Z</updated>
<entry>
<title>ftrace: Fix kernel-doc formatting issues</title>
<updated>2021-10-29T13:52:23Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-10-29T13:52:23Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6130722f1114cac4f3ec858763ad8995d5a2e586'/>
<id>urn:sha1:6130722f1114cac4f3ec858763ad8995d5a2e586</id>
<content type='text'>
Some functions had kernel-doc that used a comma instead of a hash to
separate the function name from the one line description.

Also, the "ftrace_is_dead()" had an incomplete description.

Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Have all levels of checks prevent recursion</title>
<updated>2021-10-18T22:12:09Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-10-18T19:44:12Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ed65df63a39a3f6ed04f7258de8b6789e5021c18'/>
<id>urn:sha1:ed65df63a39a3f6ed04f7258de8b6789e5021c18</id>
<content type='text'>
While writing an email explaining the "bit = 0" logic for a discussion on
making ftrace_test_recursion_trylock() disable preemption, I discovered a
path that makes the "not do the logic if bit is zero" unsafe.

The recursion logic is done in hot paths like the function tracer. Thus,
any code executed causes noticeable overhead. Thus, tricks are done to try
to limit the amount of code executed. This included the recursion testing
logic.

Having recursion testing is important, as there are many paths that can
end up in an infinite recursion cycle when tracing every function in the
kernel. Thus protection is needed to prevent that from happening.

Because it is OK to recurse due to different running context levels (e.g.
an interrupt preempts a trace, and then a trace occurs in the interrupt
handler), a set of bits are used to know which context one is in (normal,
softirq, irq and NMI). If a recursion occurs in the same level, it is
prevented*.

Then there are infrastructure levels of recursion as well. When more than
one callback is attached to the same function to trace, it calls a loop
function to iterate over all the callbacks. Both the callbacks and the
loop function have recursion protection. The callbacks use the
"ftrace_test_recursion_trylock()" which has a "function" set of context
bits to test, and the loop function calls the internal
trace_test_and_set_recursion() directly, with an "internal" set of bits.

If an architecture does not implement all the features supported by ftrace
then the callbacks are never called directly, and the loop function is
called instead, which will implement the features of ftrace.

Since both the loop function and the callbacks do recursion protection, it
was seemed unnecessary to do it in both locations. Thus, a trick was made
to have the internal set of recursion bits at a more significant bit
location than the function bits. Then, if any of the higher bits were set,
the logic of the function bits could be skipped, as any new recursion
would first have to go through the loop function.

This is true for architectures that do not support all the ftrace
features, because all functions being traced must first go through the
loop function before going to the callbacks. But this is not true for
architectures that support all the ftrace features. That's because the
loop function could be called due to two callbacks attached to the same
function, but then a recursion function inside the callback could be
called that does not share any other callback, and it will be called
directly.

i.e.

 traced_function_1: [ more than one callback tracing it ]
   call loop_func

 loop_func:
   trace_recursion set internal bit
   call callback

 callback:
   trace_recursion [ skipped because internal bit is set, return 0 ]
   call traced_function_2

 traced_function_2: [ only traced by above callback ]
   call callback

 callback:
   trace_recursion [ skipped because internal bit is set, return 0 ]
   call traced_function_2

 [ wash, rinse, repeat, BOOM! out of shampoo! ]

Thus, the "bit == 0 skip" trick is not safe, unless the loop function is
call for all functions.

Since we want to encourage architectures to implement all ftrace features,
having them slow down due to this extra logic may encourage the
maintainers to update to the latest ftrace features. And because this
logic is only safe for them, remove it completely.

 [*] There is on layer of recursion that is allowed, and that is to allow
     for the transition between interrupt context (normal -&gt; softirq -&gt;
     irq -&gt; NMI), because a trace may occur before the context update is
     visible to the trace recursion logic.

Link: https://lore.kernel.org/all/609b565a-ed6e-a1da-f025-166691b5d994@linux.alibaba.com/
Link: https://lkml.kernel.org/r/20211018154412.09fcad3c@gandalf.local.home

Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Petr Mladek &lt;pmladek@suse.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "James E.J. Bottomley" &lt;James.Bottomley@hansenpartnership.com&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Benjamin Herrenschmidt &lt;benh@kernel.crashing.org&gt;
Cc: Paul Mackerras &lt;paulus@samba.org&gt;
Cc: Paul Walmsley &lt;paul.walmsley@sifive.com&gt;
Cc: Palmer Dabbelt &lt;palmer@dabbelt.com&gt;
Cc: Albert Ou &lt;aou@eecs.berkeley.edu&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Josh Poimboeuf &lt;jpoimboe@redhat.com&gt;
Cc: Jiri Kosina &lt;jikos@kernel.org&gt;
Cc: Miroslav Benes &lt;mbenes@suse.cz&gt;
Cc: Joe Lawrence &lt;joe.lawrence@redhat.com&gt;
Cc: Colin Ian King &lt;colin.king@canonical.com&gt;
Cc: Masami Hiramatsu &lt;mhiramat@kernel.org&gt;
Cc: "Peter Zijlstra (Intel)" &lt;peterz@infradead.org&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Jisheng Zhang &lt;jszhang@kernel.org&gt;
Cc: =?utf-8?b?546L6LSH?= &lt;yun.wang@linux.alibaba.com&gt;
Cc: Guo Ren &lt;guoren@kernel.org&gt;
Cc: stable@vger.kernel.org
Fixes: edc15cafcbfa3 ("tracing: Avoid unnecessary multiple recursion checks")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Introduce ftrace_need_init_nop()</title>
<updated>2021-08-03T12:31:40Z</updated>
<author>
<name>Ilya Leoshkevich</name>
<email>iii@linux.ibm.com</email>
</author>
<published>2021-07-28T21:25:45Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=67ccddf86621b18dbffe56f11a106774ee8f44bd'/>
<id>urn:sha1:67ccddf86621b18dbffe56f11a106774ee8f44bd</id>
<content type='text'>
Implementing live patching on s390 requires each function's prologue to
contain a very special kind of nop, which gcc and clang don't generate.
However, the current code assumes that if CC_USING_NOP_MCOUNT is
defined, then whatever the compiler generates is good enough.

Move the CC_USING_NOP_MCOUNT check into the new ftrace_need_init_nop()
macro, that the architectures can override.

An alternative solution is to disable using -mnop-mcount in the
Makefile, however, this makes the build logic (even) more complicated
and forces the arch-specific code to deal with the useless __fentry__
symbol.

Signed-off-by: Ilya Leoshkevich &lt;iii@linux.ibm.com&gt;
Reviewed-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Link: https://lore.kernel.org/r/20210728212546.128248-2-iii@linux.ibm.com
Signed-off-by: Heiko Carstens &lt;hca@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>ftrace: Remove redundant initialization of variable ret</title>
<updated>2021-07-23T12:46:02Z</updated>
<author>
<name>Colin Ian King</name>
<email>colin.king@canonical.com</email>
</author>
<published>2021-07-21T12:09:15Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3b1a8f457fcf105924c72e99f1191834837c978d'/>
<id>urn:sha1:3b1a8f457fcf105924c72e99f1191834837c978d</id>
<content type='text'>
The variable ret is being initialized with a value that is never
read, it is being updated later on. The assignment is redundant and
can be removed.

Link: https://lkml.kernel.org/r/20210721120915.122278-1-colin.king@canonical.com

Addresses-Coverity: ("Unused value")
Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Avoid synchronize_rcu_tasks_rude() call when not necessary</title>
<updated>2021-07-23T12:45:53Z</updated>
<author>
<name>Nicolas Saenz Julienne</name>
<email>nsaenzju@redhat.com</email>
</author>
<published>2021-07-21T11:47:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=68e83498cb4fad31963b5c76a71e80b824bc316e'/>
<id>urn:sha1:68e83498cb4fad31963b5c76a71e80b824bc316e</id>
<content type='text'>
synchronize_rcu_tasks_rude() triggers IPIs and forces rescheduling on
all CPUs. It is a costly operation and, when targeting nohz_full CPUs,
very disrupting (hence the name). So avoid calling it when 'old_hash'
doesn't need to be freed.

Link: https://lkml.kernel.org/r/20210721114726.1545103-1-nsaenzju@redhat.com

Signed-off-by: Nicolas Saenz Julienne &lt;nsaenzju@redhat.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Use list_move instead of list_del/list_add</title>
<updated>2021-07-08T17:02:58Z</updated>
<author>
<name>Baokun Li</name>
<email>libaokun1@huawei.com</email>
</author>
<published>2021-06-08T03:11:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=3ecda64475bccdfdcbfd5b9b7e4bf639d8b233da'/>
<id>urn:sha1:3ecda64475bccdfdcbfd5b9b7e4bf639d8b233da</id>
<content type='text'>
Using list_move() instead of list_del() + list_add().

Link: https://lkml.kernel.org/r/20210608031108.2820996-1-libaokun1@huawei.com

Reported-by: Hulk Robot &lt;hulkci@huawei.com&gt;
Signed-off-by: Baokun Li &lt;libaokun1@huawei.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Do not blindly read the ip address in ftrace_bug()</title>
<updated>2021-06-08T20:44:00Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-06-08T01:39:08Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6c14133d2d3f768e0a35128faac8aa6ed4815051'/>
<id>urn:sha1:6c14133d2d3f768e0a35128faac8aa6ed4815051</id>
<content type='text'>
It was reported that a bug on arm64 caused a bad ip address to be used for
updating into a nop in ftrace_init(), but the error path (rightfully)
returned -EINVAL and not -EFAULT, as the bug caused more than one error to
occur. But because -EINVAL was returned, the ftrace_bug() tried to report
what was at the location of the ip address, and read it directly. This
caused the machine to panic, as the ip was not pointing to a valid memory
address.

Instead, read the ip address with copy_from_kernel_nofault() to safely
access the memory, and if it faults, report that the address faulted,
otherwise report what was in that location.

Link: https://lore.kernel.org/lkml/20210607032329.28671-1-mark-pk.tsai@mediatek.com/

Cc: stable@vger.kernel.org
Fixes: 05736a427f7e1 ("ftrace: warn on failure to disable mcount callers")
Reported-by: Mark-PK Tsai &lt;mark-pk.tsai@mediatek.com&gt;
Tested-by: Mark-PK Tsai &lt;mark-pk.tsai@mediatek.com&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'trace-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2021-05-06T17:03:38Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-05-06T17:03:38Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7ec901b6fa9ce5be3fc53d6216cb9e83ea0cf1da'/>
<id>urn:sha1:7ec901b6fa9ce5be3fc53d6216cb9e83ea0cf1da</id>
<content type='text'>
Pull tracing fix from Steven Rostedt:
 "Fix probes written to the set_ftrace_filter file

  Now that there's a library that accesses the tracefs file system
  (libtracefs), the way the files are interacted with is slightly
  different than the command line. For instance, the write() system call
  is used directly instead of an echo. This exposes some old bugs.

  If a probe is written to "set_ftrace_filter" without any white space
  after it, it will be ignored. This is because the write expects that a
  string written to it that does not end with white spaces thinks there
  is more to come. But if the file is closed, the release function needs
  to finish it. The "set_ftrace_filter" release function handles the
  filter part of the "set_ftrace_filter" file, but did not handle the
  probe part"

* tag 'trace-v5.13-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ftrace: Handle commands when closing set_ftrace_filter file
</content>
</entry>
<entry>
<title>ftrace: Handle commands when closing set_ftrace_filter file</title>
<updated>2021-05-05T14:38:24Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2021-05-05T14:38:24Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8c9af478c06bb1ab1422f90d8ecbc53defd44bc3'/>
<id>urn:sha1:8c9af478c06bb1ab1422f90d8ecbc53defd44bc3</id>
<content type='text'>
 # echo switch_mm:traceoff &gt; /sys/kernel/tracing/set_ftrace_filter

will cause switch_mm to stop tracing by the traceoff command.

 # echo -n switch_mm:traceoff &gt; /sys/kernel/tracing/set_ftrace_filter

does nothing.

The reason is that the parsing in the write function only processes
commands if it finished parsing (there is white space written after the
command). That's to handle:

 write(fd, "switch_mm:", 10);
 write(fd, "traceoff", 8);

cases, where the command is broken over multiple writes.

The problem is if the file descriptor is closed, then the write call is
not processed, and the command needs to be processed in the release code.
The release code can handle matching of functions, but does not handle
commands.

Cc: stable@vger.kernel.org
Fixes: eda1e32855656 ("tracing: handle broken names in ftrace filter")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2021-05-03T18:19:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2021-05-03T18:19:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=9b1f61d5d73d550a20dd79b9a17b6bb05a8f9307'/>
<id>urn:sha1:9b1f61d5d73d550a20dd79b9a17b6bb05a8f9307</id>
<content type='text'>
Pull tracing updates from Steven Rostedt:
 "New feature:

   - A new "func-no-repeats" option in tracefs/options directory.

     When set the function tracer will detect if the current function
     being traced is the same as the previous one, and instead of
     recording it, it will keep track of the number of times that the
     function is repeated in a row. And when another function is
     recorded, it will write a new event that shows the function that
     repeated, the number of times it repeated and the time stamp of
     when the last repeated function occurred.

  Enhancements:

   - In order to implement the above "func-no-repeats" option, the ring
     buffer timestamp can now give the accurate timestamp of the event
     as it is being recorded, instead of having to record an absolute
     timestamp for all events. This helps the histogram code which no
     longer needs to waste ring buffer space.

   - New validation logic to make sure all trace events that access
     dereferenced pointers do so in a safe way, and will warn otherwise.

  Fixes:

   - No longer limit the PIDs of tasks that are recorded for
     "saved_cmdlines" to PID_MAX_DEFAULT (32768), as systemd now allows
     for a much larger range. This caused the mapping of PIDs to the
     task names to be dropped for all tasks with a PID greater than
     32768.

   - Change trace_clock_global() to never block. This caused a deadlock.

  Clean ups:

   - Typos, prototype fixes, and removing of duplicate or unused code.

   - Better management of ftrace_page allocations"

* tag 'trace-v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (32 commits)
  tracing: Restructure trace_clock_global() to never block
  tracing: Map all PIDs to command lines
  ftrace: Reuse the output of the function tracer for func_repeats
  tracing: Add "func_no_repeats" option for function tracing
  tracing: Unify the logic for function tracing options
  tracing: Add method for recording "func_repeats" events
  tracing: Add "last_func_repeats" to struct trace_array
  tracing: Define new ftrace event "func_repeats"
  tracing: Define static void trace_print_time()
  ftrace: Simplify the calculation of page number for ftrace_page-&gt;records some more
  ftrace: Store the order of pages allocated in ftrace_page
  tracing: Remove unused argument from "ring_buffer_time_stamp()
  tracing: Remove duplicate struct declaration in trace_events.h
  tracing: Update create_system_filter() kernel-doc comment
  tracing: A minor cleanup for create_system_filter()
  kernel: trace: Mundane typo fixes in the file trace_events_filter.c
  tracing: Fix various typos in comments
  scripts/recordmcount.pl: Make vim and emacs indent the same
  scripts/recordmcount.pl: Make indent spacing consistent
  tracing: Add a verifier to check string pointers for trace events
  ...
</content>
</entry>
</feed>
