<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/events/core.c, branch v6.19</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v6.19</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v6.19'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2026-01-30T22:06:07Z</updated>
<entry>
<title>perf: sched: Fix perf crash with new is_user_task() helper</title>
<updated>2026-01-30T22:06:07Z</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2026-01-29T15:28:21Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=76ed27608f7dd235b727ebbb12163438c2fbb617'/>
<id>urn:sha1:76ed27608f7dd235b727ebbb12163438c2fbb617</id>
<content type='text'>
In order to do a user space stacktrace the current task needs to be a user
task that has executed in user space. It use to be possible to test if a
task is a user task or not by simply checking the task_struct mm field. If
it was non NULL, it was a user task and if not it was a kernel task.

But things have changed over time, and some kernel tasks now have their
own mm field.

An idea was made to instead test PF_KTHREAD and two functions were used to
wrap this check in case it became more complex to test if a task was a
user task or not[1]. But this was rejected and the C code simply checked
the PF_KTHREAD directly.

It was later found that not all kernel threads set PF_KTHREAD. The io-uring
helpers instead set PF_USER_WORKER and this needed to be added as well.

But checking the flags is still not enough. There's a very small window
when a task exits that it frees its mm field and it is set back to NULL.
If perf were to trigger at this moment, the flags test would say its a
user space task but when perf would read the mm field it would crash with
at NULL pointer dereference.

Now there are flags that can be used to test if a task is exiting, but
they are set in areas that perf may still want to profile the user space
task (to see where it exited). The only real test is to check both the
flags and the mm field.

Instead of making this modification in every location, create a new
is_user_task() helper function that does all the tests needed to know if
it is safe to read the user space memory or not.

[1] https://lore.kernel.org/all/20250425204120.639530125@goodmis.org/

Fixes: 90942f9fac05 ("perf: Use current-&gt;flags &amp; PF_KTHREAD|PF_USER_WORKER instead of current-&gt;mm == NULL")
Closes: https://lore.kernel.org/all/0d877e6f-41a7-4724-875d-0b0a27b8a545@roeck-us.net/
Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Signed-off-by: Steven Rostedt (Google) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20260129102821.46484722@gandalf.local.home
</content>
</entry>
<entry>
<title>perf: Fix refcount warning on event-&gt;mmap_count increment</title>
<updated>2026-01-21T15:28:58Z</updated>
<author>
<name>Will Rosenberg</name>
<email>whrosenb@asu.edu</email>
</author>
<published>2026-01-19T18:49:56Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d06bf78e55d5159c1b00072e606ab924ffbbad35'/>
<id>urn:sha1:d06bf78e55d5159c1b00072e606ab924ffbbad35</id>
<content type='text'>
When calling refcount_inc(&amp;event-&gt;mmap_count) inside perf_mmap_rb(), the
following warning is triggered:

        refcount_t: addition on 0; use-after-free.
        WARNING: lib/refcount.c:25

PoC:

    struct perf_event_attr attr = {0};
    int fd = syscall(__NR_perf_event_open, &amp;attr, 0, -1, -1, 0);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    int victim = syscall(__NR_perf_event_open, &amp;attr, 0, -1, fd,
                         PERF_FLAG_FD_OUTPUT);
    mmap(NULL, 0x3000, PROT_READ | PROT_WRITE, MAP_SHARED, victim, 0);

This occurs when creating a group member event with the flag
PERF_FLAG_FD_OUTPUT. The group leader should be mmap-ed and then mmap-ing
the event triggers the warning.

Since the event has copied the output_event in perf_event_set_output(),
event-&gt;rb is set. As a result, perf_mmap_rb() calls
refcount_inc(&amp;event-&gt;mmap_count) when event-&gt;mmap_count = 0.

Disallow the case when event-&gt;mmap_count = 0. This also prevents two
events from updating the same user_page.

Fixes: 448f97fba901 ("perf: Convert mmap() refcounts to refcount_t")
Suggested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Will Rosenberg &lt;whrosenb@asu.edu&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20260119184956.801238-1-whrosenb@asu.edu
</content>
</entry>
<entry>
<title>Merge tag 'perf-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2026-01-11T16:55:27Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2026-01-11T16:55:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=fe948326e95dcc7b1efb6b76e92b5b919f56250c'/>
<id>urn:sha1:fe948326e95dcc7b1efb6b76e92b5b919f56250c</id>
<content type='text'>
Pull perf event fix from Ingo Molnar:
 "Fix perf swevent hrtimer deinit regression"

* tag 'perf-urgent-2026-01-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf: Ensure swevent hrtimer is properly destroyed
</content>
</entry>
<entry>
<title>treewide: Update email address</title>
<updated>2026-01-11T16:09:11Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@kernel.org</email>
</author>
<published>2026-01-11T15:53:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=2e4b28c48f88ce9e263957b1d944cf5349952f88'/>
<id>urn:sha1:2e4b28c48f88ce9e263957b1d944cf5349952f88</id>
<content type='text'>
In a vain attempt to consolidate the email zoo switch everything to the
kernel.org account.

Signed-off-by: Thomas Gleixner &lt;tglx@kernel.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>perf: Ensure swevent hrtimer is properly destroyed</title>
<updated>2026-01-05T07:55:54Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-12-20T13:14:41Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=ff5860f5088e9076ebcccf05a6ca709d5935cfa9'/>
<id>urn:sha1:ff5860f5088e9076ebcccf05a6ca709d5935cfa9</id>
<content type='text'>
With the change to hrtimer_try_to_cancel() in
perf_swevent_cancel_hrtimer() it appears possible for the hrtimer to
still be active by the time the event gets freed.

Make sure the event does a full hrtimer_cancel() on the free path by
installing a perf_event::destroy handler.

Fixes: eb3182ef0405 ("perf/core: Fix system hang caused by cpu-clock usage")
Reported-by: CyberUnicorns &lt;a101e_iotvul@163.com&gt;
Tested-by: CyberUnicorns &lt;a101e_iotvul@163.com&gt;
Debugged-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix missing read event generation on task exit</title>
<updated>2025-12-09T11:22:25Z</updated>
<author>
<name>Thaumy Cheng</name>
<email>thaumy.love@gmail.com</email>
</author>
<published>2025-12-09T04:16:00Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c418d8b4d7a43a86b82ee39cb52ece3034383530'/>
<id>urn:sha1:c418d8b4d7a43a86b82ee39cb52ece3034383530</id>
<content type='text'>
For events with inherit_stat enabled, a "read" event will be generated
to collect per task event counts on task exit.

The call chain is as follows:

do_exit
  -&gt; perf_event_exit_task
    -&gt; perf_event_exit_task_context
      -&gt; perf_event_exit_event
        -&gt; perf_remove_from_context
          -&gt; perf_child_detach
            -&gt; sync_child_event
              -&gt; perf_event_read_event

However, the child event context detaches the task too early in
perf_event_exit_task_context, which causes sync_child_event to never
generate the read event in this case, since child_event-&gt;ctx-&gt;task is
always set to TASK_TOMBSTONE. Fix that by moving context lock section
backward to ensure ctx-&gt;task is not set to TASK_TOMBSTONE before
generating the read event.

Because perf_event_free_task calls perf_event_exit_task_context with
exit = false to tear down all child events from the context, and the
task never lived, accessing the task PID can lead to a use-after-free.

To fix that, let sync_child_event read task from argument and move the
call to the only place it should be triggered to avoid the effect of
setting ctx-&gt;task to TASK_TOMESTONE, and add a task parameter to
perf_event_exit_event to trigger the sync_child_event properly when
needed.

This bug can be reproduced by running "perf record -s" and attaching to
any program that generates perf events in its child tasks. If we check
the result with "perf report -T", the last line of the report will leave
an empty table like "# PID  TID", which is expected to contain the
per-task event counts by design.

Fixes: ef54c1a476ae ("perf: Rework perf_event_exit_event()")
Signed-off-by: Thaumy Cheng &lt;thaumy.love@gmail.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: linux-perf-users@vger.kernel.org
Link: https://patch.msgid.link/20251209041600.963586-1-thaumy.love@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2025-12-02T04:42:01Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2025-12-02T04:42:01Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=6c26fbe8c9d3e932dce6afe2505b19b4b261cae9'/>
<id>urn:sha1:6c26fbe8c9d3e932dce6afe2505b19b4b261cae9</id>
<content type='text'>
Pull performance events updates from Ingo Molnar:
 "Callchain support:

   - Add support for deferred user-space stack unwinding for perf,
     enabled on x86. (Peter Zijlstra, Steven Rostedt)

   - unwind_user/x86: Enable frame pointer unwinding on x86 (Josh
     Poimboeuf)

  x86 PMU support and infrastructure:

   - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)

   - x86/insn,uprobes,alternative: Unify insn_is_nop() (Peter Zijlstra)

  Intel PMU driver:

   - Large series to prepare for and implement architectural PEBS
     support for Intel platforms such as Clearwater Forest (CWF) and
     Panther Lake (PTL). (Dapeng Mi, Kan Liang)

   - Check dynamic constraints (Kan Liang)

   - Optimize PEBS extended config (Peter Zijlstra)

   - cstates:
      - Remove PC3 support from LunarLake (Zhang Rui)
      - Add Pantherlake support (Zhang Rui)
      - Clearwater Forest support (Zide Chen)

  AMD PMU driver:

   - x86/amd: Check event before enable to avoid GPF (George Kennedy)

  Fixes and cleanups:

   - task_work: Fix NMI race condition (Peter Zijlstra)

   - perf/x86: Fix NULL event access and potential PEBS record loss
     (Dapeng Mi)

   - Misc other fixes and cleanups (Dapeng Mi, Ingo Molnar, Peter
     Zijlstra)"

* tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  perf/x86/intel: Optimize PEBS extended config
  perf/x86/intel: Check PEBS dyn_constraints
  perf/x86/intel: Add a check for dynamic constraints
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86: Fix NULL event access and potential PEBS record loss
  perf/x86: Remove redundant is_x86_event() prototype
  entry,unwind/deferred: Fix unwind_reset_info() placement
  unwind_user/x86: Fix arch=um build
  perf: Support deferred user unwind
  unwind_user/x86: Teach FP unwind about start of function
  ...
</content>
</entry>
<entry>
<title>perf: Fix 0 count issue of cpu-clock</title>
<updated>2025-11-20T09:42:12Z</updated>
<author>
<name>Dapeng Mi</name>
<email>dapeng1.mi@linux.intel.com</email>
</author>
<published>2025-11-12T08:05:26Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f1f96511b1c4c33e53f05909dd267878e0643a9a'/>
<id>urn:sha1:f1f96511b1c4c33e53f05909dd267878e0643a9a</id>
<content type='text'>
Currently cpu-clock event always returns 0 count, e.g.,

perf stat -e cpu-clock -- sleep 1

 Performance counter stats for 'sleep 1':
                 0      cpu-clock                        #    0.000 CPUs utilized
       1.002308394 seconds time elapsed

The root cause is the commit 'bc4394e5e79c ("perf: Fix the throttle
 error of some clock events")' adds PERF_EF_UPDATE flag check before
calling cpu_clock_event_update() to update the count, however the
PERF_EF_UPDATE flag is never set when the cpu-clock event is stopped in
counting mode (pmu-&gt;dev() -&gt; cpu_clock_event_del() -&gt;
cpu_clock_event_stop()). This leads to the cpu-clock event count is
never updated.

To fix this issue, force to set PERF_EF_UPDATE flag for cpu-clock event
just like what task-clock does.

Fixes: bc4394e5e79c ("perf: Fix the throttle error of some clock events")
Signed-off-by: Dapeng Mi &lt;dapeng1.mi@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Ian Rogers &lt;irogers@google.com&gt;
Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://patch.msgid.link/20251112080526.3971392-1-dapeng1.mi@linux.intel.com
</content>
</entry>
<entry>
<title>perf/core: Fix system hang caused by cpu-clock usage</title>
<updated>2025-11-03T10:04:19Z</updated>
<author>
<name>Dapeng Mi</name>
<email>dapeng1.mi@linux.intel.com</email>
</author>
<published>2025-10-15T05:18:28Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=eb3182ef0405ff2f6668fd3e5ff9883f60ce8801'/>
<id>urn:sha1:eb3182ef0405ff2f6668fd3e5ff9883f60ce8801</id>
<content type='text'>
cpu-clock usage by the async-profiler tool can trigger a system hang,
which got bisected back to the following commit by Octavia Togami:

  18dbcbfabfff ("perf: Fix the POLL_HUP delivery breakage") causes this issue

The root cause of the hang is that cpu-clock is a special type of SW
event which relies on hrtimers. The __perf_event_overflow() callback
is invoked from the hrtimer handler for cpu-clock events, and
__perf_event_overflow() tries to call cpu_clock_event_stop()
to stop the event, which calls htimer_cancel() to cancel the hrtimer.

But that's a recursion into the hrtimer code from a hrtimer handler,
which (unsurprisingly) deadlocks.

To fix this bug, use hrtimer_try_to_cancel() instead, and set
the PERF_HES_STOPPED flag, which causes perf_swevent_hrtimer()
to stop the event once it sees the PERF_HES_STOPPED flag.

[ mingo: Fixed the comments and improved the changelog. ]

Closes: https://lore.kernel.org/all/CAHPNGSQpXEopYreir+uDDEbtXTBvBvi8c6fYXJvceqtgTPao3Q@mail.gmail.com/
Fixes: 18dbcbfabfff ("perf: Fix the POLL_HUP delivery breakage")
Reported-by: Octavia Togami &lt;octavia.togami@gmail.com&gt;
Suggested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Dapeng Mi &lt;dapeng1.mi@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Octavia Togami &lt;octavia.togami@gmail.com&gt;
Cc: stable@vger.kernel.org
Link: https://github.com/lucko/spark/issues/530
Link: https://patch.msgid.link/20251015051828.12809-1-dapeng1.mi@linux.intel.com
</content>
</entry>
<entry>
<title>perf: Support deferred user unwind</title>
<updated>2025-10-29T09:29:58Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-10-23T13:17:05Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=c69993ecdd4dfde2b7da08b022052a33b203da07'/>
<id>urn:sha1:c69993ecdd4dfde2b7da08b022052a33b203da07</id>
<content type='text'>
Add support for deferred userspace unwind to perf.

Where perf currently relies on in-place stack unwinding; from NMI
context and all that. This moves the userspace part of the unwind to
right before the return-to-userspace.

This has two distinct benefits, the biggest is that it moves the
unwind to a faultable context. It becomes possible to fault in debug
info (.eh_frame, SFrame etc.) that might not otherwise be readily
available. And secondly, it de-duplicates the user callchain where
multiple samples happen during the same kernel entry.

To facilitate this the perf interface is extended with a new record
type:

  PERF_RECORD_CALLCHAIN_DEFERRED

and two new attribute flags:

  perf_event_attr::defer_callchain - to request the user unwind be deferred
  perf_event_attr::defer_output    - to request PERF_RECORD_CALLCHAIN_DEFERRED records

The existing PERF_RECORD_SAMPLE callchain section gets a new
context type:

  PERF_CONTEXT_USER_DEFERRED

After which will come a single entry, denoting the 'cookie' of the
deferred callchain that should be attached here, matching the 'cookie'
field of the above mentioned PERF_RECORD_CALLCHAIN_DEFERRED.

The 'defer_callchain' flag is expected on all events with
PERF_SAMPLE_CALLCHAIN. The 'defer_output' flag is expect on the event
responsible for collecting side-band events (like mmap, comm etc.).
Setting 'defer_output' on multiple events will get you duplicated
PERF_RECORD_CALLCHAIN_DEFERRED records.

Based on earlier patches by Josh and Steven.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://patch.msgid.link/20251023150002.GR4067720@noisy.programming.kicks-ass.net
</content>
</entry>
</feed>
