<feed xmlns='http://www.w3.org/2005/Atom'>
<title>linux/kernel/events/core.c, branch v5.4</title>
<subtitle>Mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
</subtitle>
<id>https://git.shady.money/linux/atom?h=v5.4</id>
<link rel='self' href='https://git.shady.money/linux/atom?h=v5.4'/>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/'/>
<updated>2019-11-13T07:16:44Z</updated>
<entry>
<title>perf/core: Fix missing static inline on perf_cgroup_switch()</title>
<updated>2019-11-13T07:16:44Z</updated>
<author>
<name>Ben Dooks (Codethink)</name>
<email>ben.dooks@codethink.co.uk</email>
</author>
<published>2019-11-06T13:25:27Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=d00dbd29814236ad128ff9517e8f7af6b6ef4ba0'/>
<id>urn:sha1:d00dbd29814236ad128ff9517e8f7af6b6ef4ba0</id>
<content type='text'>
It looks like a "static inline" has been missed in front
of the empty definition of perf_cgroup_switch() under
certain configurations.

Fixes the following sparse warning:

  kernel/events/core.c:1035:1: warning: symbol 'perf_cgroup_switch' was not declared. Should it be static?

Signed-off-by: Ben Dooks (Codethink) &lt;ben.dooks@codethink.co.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Link: https://lkml.kernel.org/r/20191106132527.19977-1-ben.dooks@codethink.co.uk
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Consistently fail fork on allocation failures</title>
<updated>2019-11-13T07:16:43Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-11-05T07:57:02Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=697d877849d4b34ab58d7078d6930bad0ef6fc66'/>
<id>urn:sha1:697d877849d4b34ab58d7078d6930bad0ef6fc66</id>
<content type='text'>
Commit:

  313ccb9615948 ("perf: Allocate context task_ctx_data for child event")

makes the inherit path skip over the current event in case of task_ctx_data
allocation failure. This, however, is inconsistent with allocation failures
in perf_event_alloc(), which would abort the fork.

Correct this by returning an error code on task_ctx_data allocation
failure and failing the fork in that case.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Link: https://lkml.kernel.org/r/20191105075702.60319-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Disallow aux_output for kernel events</title>
<updated>2019-11-13T07:16:42Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-10-30T13:47:31Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=dce5affb94eb54edfff17727a6240a6a5d998666'/>
<id>urn:sha1:dce5affb94eb54edfff17727a6240a6a5d998666</id>
<content type='text'>
Commit

  ab43762ef0109 ("perf: Allow normal events to output AUX data")

added 'aux_output' bit to the attribute structure, which relies on AUX
events and grouping, neither of which is supported for the kernel events.
This notwithstanding, attempts have been made to use it in the kernel
code, suggesting the necessity of an explicit hard -EINVAL.

Fix this by rejecting attributes with aux_output set for kernel events.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Link: https://lkml.kernel.org/r/20191030134731.5437-3-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Reattach a misplaced comment</title>
<updated>2019-11-13T07:16:41Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-10-30T13:47:30Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f25d8ba9e1b204b90fbf55970ea6e68955006068'/>
<id>urn:sha1:f25d8ba9e1b204b90fbf55970ea6e68955006068</id>
<content type='text'>
A comment is in a wrong place in perf_event_create_kernel_counter().
Fix that.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Link: https://lkml.kernel.org/r/20191030134731.5437-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Fix the aux_output group inheritance fix</title>
<updated>2019-11-13T07:16:40Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-11-01T15:12:48Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=00496fe5e09e8c8bb115540e7e3470553cd07a5c'/>
<id>urn:sha1:00496fe5e09e8c8bb115540e7e3470553cd07a5c</id>
<content type='text'>
Commit

  f733c6b508bc ("perf/core: Fix inheritance of aux_output groups")

adds a NULL pointer dereference in case inherit_group() races with
perf_release(), which causes the below crash:

 &gt; BUG: kernel NULL pointer dereference, address: 000000000000010b
 &gt; #PF: supervisor read access in kernel mode
 &gt; #PF: error_code(0x0000) - not-present page
 &gt; PGD 3b203b067 P4D 3b203b067 PUD 3b2040067 PMD 0
 &gt; Oops: 0000 [#1] SMP KASAN
 &gt; CPU: 0 PID: 315 Comm: exclusive-group Tainted: G B 5.4.0-rc3-00181-g72e1839403cb-dirty #878
 &gt; RIP: 0010:perf_get_aux_event+0x86/0x270
 &gt; Call Trace:
 &gt;  ? __perf_read_group_add+0x3b0/0x3b0
 &gt;  ? __kasan_check_write+0x14/0x20
 &gt;  ? __perf_event_init_context+0x154/0x170
 &gt;  inherit_task_group.isra.0.part.0+0x14b/0x170
 &gt;  perf_event_init_task+0x296/0x4b0

Fix this by skipping over events that are getting closed, in the
inheritance path.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Fixes: f733c6b508bc ("perf/core: Fix inheritance of aux_output groups")
Link: https://lkml.kernel.org/r/20191101151248.47327-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Disallow uncore-cgroup events</title>
<updated>2019-11-13T07:16:39Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-11-06T11:51:04Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=09f4e8f05d85bfc98fe9227e988a7c1b3ec416ec'/>
<id>urn:sha1:09f4e8f05d85bfc98fe9227e988a7c1b3ec416ec</id>
<content type='text'>
While discussing uncore event scheduling, I noticed we do not in fact
seem to dis-allow making uncore-cgroup events. Such events make no
sense what so ever because the cgroup is a CPU local state where
uncore counts across a number of CPUs.

Disallow them.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Start rejecting the syscall with attr.__reserved_2 set</title>
<updated>2019-10-28T10:01:59Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-10-25T12:16:36Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=8c7e975667fbc3b7c816119dd56104739899f125'/>
<id>urn:sha1:8c7e975667fbc3b7c816119dd56104739899f125</id>
<content type='text'>
Commit:

  1a5941312414c ("perf: Add wakeup watermark control to the AUX area")

added attr.__reserved_2 padding, but forgot to add an ABI check to reject
attributes with this field set. Fix that.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: adrian.hunter@intel.com
Cc: mathieu.poirier@linaro.org
Link: https://lkml.kernel.org/r/20191025121636.75182-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Fix AUX output stopping</title>
<updated>2019-10-22T12:39:37Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-10-22T07:39:40Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=f3a519e4add93b7b31a6616f0b09635ff2e6a159'/>
<id>urn:sha1:f3a519e4add93b7b31a6616f0b09635ff2e6a159</id>
<content type='text'>
Commit:

  8a58ddae2379 ("perf/core: Fix exclusive events' grouping")

allows CAP_EXCLUSIVE events to be grouped with other events. Since all
of those also happen to be AUX events (which is not the case the other
way around, because arch/s390), this changes the rules for stopping the
output: the AUX event may not be on its PMU's context any more, if it's
grouped with a HW event, in which case it will be on that HW event's
context instead. If that's the case, munmap() of the AUX buffer can't
find and stop the AUX event, potentially leaving the last reference with
the atomic context, which will then end up freeing the AUX buffer. This
will then trip warnings:

Fix this by using the context's PMU context when looking for events
to stop, instead of the event's PMU context.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20191022073940.61814-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Fix tracking of auxiliary trace buffer allocation</title>
<updated>2019-10-21T09:31:24Z</updated>
<author>
<name>Thomas Richter</name>
<email>tmricht@linux.ibm.com</email>
</author>
<published>2019-10-21T08:33:54Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=5e6c3c7b1ec217c1c4c95d9148182302b9969b97'/>
<id>urn:sha1:5e6c3c7b1ec217c1c4c95d9148182302b9969b97</id>
<content type='text'>
The following commit from the v5.4 merge window:

  d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()")

... breaks auxiliary trace buffer tracking.

If I run command 'perf record -e rbd000' to record samples and saving
them in the **auxiliary** trace buffer then the value of 'locked_vm' becomes
negative after all trace buffers have been allocated and released:

During allocation the values increase:

  [52.250027] perf_mmap user-&gt;locked_vm:0x87 pinned_vm:0x0 ret:0
  [52.250115] perf_mmap user-&gt;locked_vm:0x107 pinned_vm:0x0 ret:0
  [52.250251] perf_mmap user-&gt;locked_vm:0x188 pinned_vm:0x0 ret:0
  [52.250326] perf_mmap user-&gt;locked_vm:0x208 pinned_vm:0x0 ret:0
  [52.250441] perf_mmap user-&gt;locked_vm:0x289 pinned_vm:0x0 ret:0
  [52.250498] perf_mmap user-&gt;locked_vm:0x309 pinned_vm:0x0 ret:0
  [52.250613] perf_mmap user-&gt;locked_vm:0x38a pinned_vm:0x0 ret:0
  [52.250715] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x2 ret:0
  [52.250834] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x83 ret:0
  [52.250915] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x103 ret:0
  [52.251061] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x184 ret:0
  [52.251146] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x204 ret:0
  [52.251299] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x285 ret:0
  [52.251383] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x305 ret:0
  [52.251544] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x386 ret:0
  [52.251634] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x406 ret:0
  [52.253018] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x487 ret:0
  [52.253197] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x508 ret:0
  [52.253374] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x589 ret:0
  [52.253550] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x60a ret:0
  [52.253726] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x68b ret:0
  [52.253903] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x70c ret:0
  [52.254084] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x78d ret:0
  [52.254263] perf_mmap user-&gt;locked_vm:0x408 pinned_vm:0x80e ret:0

The value of user-&gt;locked_vm increases to a limit then the memory
is tracked by pinned_vm.

During deallocation the size is subtracted from pinned_vm until
it hits a limit. Then a larger value is subtracted from locked_vm
leading to a large number (because of type unsigned):

  [64.267797] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x78d
  [64.267826] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x70c
  [64.267848] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x68b
  [64.267869] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x60a
  [64.267891] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x589
  [64.267911] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x508
  [64.267933] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x487
  [64.267952] perf_mmap_close mmap_user-&gt;locked_vm:0x408 pinned_vm:0x406
  [64.268883] perf_mmap_close mmap_user-&gt;locked_vm:0x307 pinned_vm:0x406
  [64.269117] perf_mmap_close mmap_user-&gt;locked_vm:0x206 pinned_vm:0x406
  [64.269433] perf_mmap_close mmap_user-&gt;locked_vm:0x105 pinned_vm:0x406
  [64.269536] perf_mmap_close mmap_user-&gt;locked_vm:0x4 pinned_vm:0x404
  [64.269797] perf_mmap_close mmap_user-&gt;locked_vm:0xffffffffffffff84 pinned_vm:0x303
  [64.270105] perf_mmap_close mmap_user-&gt;locked_vm:0xffffffffffffff04 pinned_vm:0x202
  [64.270374] perf_mmap_close mmap_user-&gt;locked_vm:0xfffffffffffffe84 pinned_vm:0x101
  [64.270628] perf_mmap_close mmap_user-&gt;locked_vm:0xfffffffffffffe04 pinned_vm:0x0

This value sticks for the user until system is rebooted, causing
follow-on system calls using locked_vm resource limit to fail.

Note: There is no issue using the normal trace buffer.

In fact the issue is in perf_mmap_close(). During allocation auxiliary
trace buffer memory is either traced as 'extra' and added to 'pinned_vm'
or trace as 'user_extra' and added to 'locked_vm'. This applies for
normal trace buffers and auxiliary trace buffer.

However in function perf_mmap_close() all auxiliary trace buffer is
subtraced from 'locked_vm' and never from 'pinned_vm'. This breaks the
ballance.

Signed-off-by: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: acme@kernel.org
Cc: gor@linux.ibm.com
Cc: hechaol@fb.com
Cc: heiko.carstens@de.ibm.com
Cc: linux-perf-users@vger.kernel.org
Cc: songliubraving@fb.com
Fixes: d44248a41337 ("perf/core: Rework memory accounting in perf_mmap()")
Link: https://lkml.kernel.org/r/20191021083354.67868-1-tmricht@linux.ibm.com
[ Minor readability edits. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix corner case in perf_rotate_context()</title>
<updated>2019-10-09T10:44:13Z</updated>
<author>
<name>Song Liu</name>
<email>songliubraving@fb.com</email>
</author>
<published>2019-10-08T16:59:49Z</published>
<link rel='alternate' type='text/html' href='https://git.shady.money/linux/commit/?id=7fa343b7fdc4f351de4e3f28d5c285937dd1f42f'/>
<id>urn:sha1:7fa343b7fdc4f351de4e3f28d5c285937dd1f42f</id>
<content type='text'>
In perf_rotate_context(), when the first cpu flexible event fail to
schedule, cpu_rotate is 1, while cpu_event is NULL. Since cpu_event is
NULL, perf_rotate_context will _NOT_ call cpu_ctx_sched_out(), thus
cpuctx-&gt;ctx.is_active will have EVENT_FLEXIBLE set. Then, the next
perf_event_sched_in() will skip all cpu flexible events because of the
EVENT_FLEXIBLE bit.

In the next call of perf_rotate_context(), cpu_rotate stays 1, and
cpu_event stays NULL, so this process repeats. The end result is, flexible
events on this cpu will not be scheduled (until another event being added
to the cpuctx).

Here is an easy repro of this issue. On Intel CPUs, where ref-cycles
could only use one counter, run one pinned event for ref-cycles, one
flexible event for ref-cycles, and one flexible event for cycles. The
flexible ref-cycles is never scheduled, which is expected. However,
because of this issue, the cycles event is never scheduled either.

 $ perf stat -e ref-cycles:D,ref-cycles,cycles -C 5 -I 1000

           time             counts unit events
    1.000152973         15,412,480      ref-cycles:D
    1.000152973      &lt;not counted&gt;      ref-cycles     (0.00%)
    1.000152973      &lt;not counted&gt;      cycles         (0.00%)
    2.000486957         18,263,120      ref-cycles:D
    2.000486957      &lt;not counted&gt;      ref-cycles     (0.00%)
    2.000486957      &lt;not counted&gt;      cycles         (0.00%)

To fix this, when the flexible_active list is empty, try rotate the
first event in the flexible_groups. Also, rename ctx_first_active() to
ctx_event_to_rotate(), which is more accurate.

Signed-off-by: Song Liu &lt;songliubraving@fb.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;kernel-team@fb.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: 8d5bce0c37fa ("perf/core: Optimize perf_rotate_context() event scheduling")
Link: https://lkml.kernel.org/r/20191008165949.920548-1-songliubraving@fb.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
