OSDN Git Service

perf/core: Fix cgroup events tracking
authorChengming Zhou <zhouchengming@bytedance.com>
Wed, 7 Dec 2022 12:40:23 +0000 (20:40 +0800)
committerPeter Zijlstra <peterz@infradead.org>
Tue, 27 Dec 2022 11:44:00 +0000 (12:44 +0100)
commitf841b682baef90ee144df8b12e2c76aa460717c1
tree65fbbce2fc0705878db55a0b9d202722505a41d4
parente2d371484653ac83b970d3ebcf343383f39f8b6b
perf/core: Fix cgroup events tracking

We encounter perf warnings when using cgroup events like:

  cd /sys/fs/cgroup
  mkdir test
  perf stat -e cycles -a -G test

Which then triggers:

  WARNING: CPU: 0 PID: 690 at kernel/events/core.c:849 perf_cgroup_switch+0xb2/0xc0
  Call Trace:
   <TASK>
   __schedule+0x4ae/0x9f0
   ? _raw_spin_unlock_irqrestore+0x23/0x40
   ? __cond_resched+0x18/0x20
   preempt_schedule_common+0x2d/0x70
   __cond_resched+0x18/0x20
   wait_for_completion+0x2f/0x160
   ? cpu_stop_queue_work+0x9e/0x130
   affine_move_task+0x18a/0x4f0

  WARNING: CPU: 0 PID: 690 at kernel/events/core.c:829 ctx_sched_in+0x1cf/0x1e0
  Call Trace:
   <TASK>
   ? ctx_sched_out+0xb7/0x1b0
   perf_cgroup_switch+0x88/0xc0
   __schedule+0x4ae/0x9f0
   ? _raw_spin_unlock_irqrestore+0x23/0x40
   ? __cond_resched+0x18/0x20
   preempt_schedule_common+0x2d/0x70
   __cond_resched+0x18/0x20
   wait_for_completion+0x2f/0x160
   ? cpu_stop_queue_work+0x9e/0x130
   affine_move_task+0x18a/0x4f0

The above two warnings are not complete here since I remove other
unimportant information. The problem is caused by the perf cgroup
events tracking:

  CPU0 CPU1
  perf_event_open()
    perf_event_alloc()
      account_event()
account_event_cpu()
  atomic_inc(perf_cgroup_events)
  __perf_event_task_sched_out()
    if (atomic_read(perf_cgroup_events))
      perf_cgroup_switch()
// kernel/events/core.c:849
WARN_ON_ONCE(cpuctx->ctx.nr_cgroups == 0)
if (READ_ONCE(cpuctx->cgrp) == cgrp) // false
  return
perf_ctx_lock()
ctx_sched_out()
cpuctx->cgrp = cgrp
ctx_sched_in()
  perf_cgroup_set_timestamp()
    // kernel/events/core.c:829
    WARN_ON_ONCE(!ctx->nr_cgroups)
perf_ctx_unlock()
    perf_install_in_context()
      cpu_function_call()
  __perf_install_in_context()
    add_event_to_ctx()
      list_add_event()
perf_cgroup_event_enable()
  ctx->nr_cgroups++
  cpuctx->cgrp = X

We can see from above that we wrongly use percpu atomic perf_cgroup_events
to check if we need to perf_cgroup_switch(), which should only be used
when we know this CPU has cgroup events enabled.

The commit bd2756811766 ("perf: Rewrite core context handling") change
to have only one context per-CPU, so we can just use cpuctx->cgrp to
check if this CPU has cgroup events enabled.

So percpu atomic perf_cgroup_events is not needed.

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lkml.kernel.org/r/20221207124023.66252-1-zhouchengming@bytedance.com
kernel/events/core.c