tools/perf/Documentation/perf-bench.txt

perf-bench(1)
=============

NAME
----
perf-bench - General framework for benchmark suites

SYNOPSIS
--------
[verse]
'perf bench' [<common options>] <subsystem> <suite> [<options>]

DESCRIPTION
-----------
The 'perf bench' command is a general framework for benchmark suites.

COMMON OPTIONS
--------------
-r::
--repeat=::
Specify the number of times to repeat the run (default 10).

-f::
--format=::
Specify the output format style.
Currently available format styles are:

'default'::
Default style. This is mainly for human reading.
---------------------
% perf bench sched pipe                      # with no style specified
(executing 1000000 pipe operations between two tasks)
        Total time:5.855 sec
                5.855061 usecs/op
                170792 ops/sec
---------------------

'simple'::
This simple style is friendly for automated
processing by scripts.
---------------------
% perf bench --format=simple sched pipe      # simple style specified
5.988
---------------------

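Because the 'simple' style prints a bare number, a script can read it without pattern matching. The Python sketch below shows one way to do that. `parse_simple_output` and `run_simple` are hypothetical helper names, not part of perf, and the sketch assumes that each non-empty output line is a single decimal number like the `5.988` in the example above.

```python
import subprocess


def parse_simple_output(text):
    """Return every numeric line of 'simple'-style output as a float.

    Assumption: each non-empty line holds one decimal number, as in the
    sample output above. Lines that do not parse are ignored.
    """
    values = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            values.append(float(line))
        except ValueError:
            # Not a bare number (for example a banner line); skip it.
            continue
    return values


def run_simple(subsystem, suite):
    """Run 'perf bench --format=simple <subsystem> <suite>' and parse it.

    Requires perf to be installed; it is not called when this file loads.
    """
    cmd = ["perf", "bench", "--format=simple", subsystem, suite]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_simple_output(result.stdout)


# Parse the sample output shown above, without invoking perf.
print(parse_simple_output("5.988\n"))
```

On a machine with perf installed, `run_simple("sched", "pipe")` would run the same command as the example and return the parsed values.
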
SUBSYSTEM
---------

'sched'::
        Scheduler and IPC mechanisms.

'syscall'::
        System call performance (throughput).

'mem'::
        Memory access performance.

'numa'::
        NUMA scheduling and MM benchmarks.

'futex'::
        Futex stressing benchmarks.

'epoll'::
        Eventpoll (epoll) stressing benchmarks.

'internals'::
        Benchmark internal perf functionality.

'uprobe'::
        Benchmark overhead of uprobe + BPF.

'all'::
        All benchmark subsystems.

SUITES FOR 'sched'
~~~~~~~~~~~~~~~~~~
*messaging*::
Suite for evaluating performance of scheduler and IPC mechanisms.
Based on hackbench by Rusty Russell.

Options of *messaging*
^^^^^^^^^^^^^^^^^^^^^^
-p::
--pipe::
Use pipe() instead of socketpair().

-t::
--thread::
Use multiple threads instead of multiple processes.

-g::
--group=::
Specify the number of groups.

-l::
--nr_loops=::
Specify the number of loops.

Example of *messaging*
^^^^^^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched messaging                 # run with default options
(20 sender and receiver processes per group)
(10 groups == 400 processes run)

      Total time:0.308 sec

% perf bench sched messaging -t -g 20        # be multi-thread, with 20 groups
(20 sender and receiver threads per group)
(20 groups == 800 threads run)

      Total time:0.582 sec
---------------------

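The task counts printed in the *messaging* examples follow from the group count. The Python sketch below reproduces that arithmetic. `total_tasks` is a hypothetical helper, not part of perf, and it assumes (based on the totals in the example output) that each group consists of 20 senders plus 20 receivers.

```python
def total_tasks(groups, senders_per_group=20, receivers_per_group=20):
    """Number of processes (or threads, with -t) the messaging suite starts.

    Assumption: every group has 20 senders and 20 receivers, which matches
    the "10 groups == 400 processes run" line in the example output.
    """
    return groups * (senders_per_group + receivers_per_group)


print(total_tasks(10))  # default run: 400
print(total_tasks(20))  # with -g 20: 800
```
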
*pipe*::
Suite for the pipe() system call.
Based on pipe-test-1m.c by Ingo Molnar.

Options of *pipe*
^^^^^^^^^^^^^^^^^
-l::
--loop=::
Specify the number of loops.

Example of *pipe*
^^^^^^^^^^^^^^^^^

---------------------
% perf bench sched pipe
(executing 1000000 pipe operations between two tasks)

        Total time:8.091 sec
                8.091833 usecs/op
                123581 ops/sec

% perf bench sched pipe -l 1000              # loop 1000 times
(executing 1000 pipe operations between two tasks)

        Total time:0.016 sec
                16.948000 usecs/op
                59004 ops/sec
---------------------

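The three figures in the default output are related: usecs/op is the total time divided by the number of operations, and ops/sec is the number of operations divided by the total time. The Python sketch below recomputes them for the two *pipe* examples. `pipe_metrics` is a hypothetical helper, not part of perf, and truncating ops/sec to a whole number is an assumption that happens to match the sample figures.

```python
def pipe_metrics(total_seconds, operations):
    """Derive (usecs/op, ops/sec) from a total run time and an operation count."""
    usecs_per_op = total_seconds * 1_000_000 / operations
    # Assumption: ops/sec is truncated, not rounded, to a whole number.
    ops_per_sec = int(operations / total_seconds)
    return usecs_per_op, ops_per_sec


# 1000000 operations in 8.091833 seconds (first example above).
usecs, ops = pipe_metrics(8.091833, 1_000_000)
print(f"{usecs:.6f} usecs/op, {ops} ops/sec")

# 1000 operations in 0.016948 seconds (second example, run with -l 1000).
usecs, ops = pipe_metrics(0.016948, 1000)
print(f"{usecs:.6f} usecs/op, {ops} ops/sec")
```

The second example shows why short runs are noisy: with only 1000 loops the per-operation cost is roughly twice that of the million-loop run.
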
SUITES FOR 'syscall'
~~~~~~~~~~~~~~~~~~~~
*basic*::
Suite for evaluating core system call throughput (reports both usecs/op and ops/sec).
It uses a single thread that repeatedly calls getppid(2), a simple syscall whose result
is not cached by glibc.

SUITES FOR 'mem'
~~~~~~~~~~~~~~~~
*memcpy*::
Suite for evaluating performance of simple memory copy in various ways.

Options of *memcpy*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify the size of memory to copy (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify the function used to copy (default: default).
The available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-movsq and x86-64-movsb are supported.

-l::
--nr_loops::
Repeat the memcpy invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of the gettimeofday syscall.

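The size argument combines a number with one of the units listed above. To illustrate how such a string maps to a byte count, here is a Python sketch. `parse_size` is a hypothetical helper, not perf's own parser, and it assumes binary multiples (1KB = 1024 bytes), which the option description does not state.

```python
import re

# Assumption: units are powers of 1024; the documentation only lists the names.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as '1MB' or '64kb' to a number of bytes."""
    match = re.fullmatch(r"\s*(\d+)\s*([A-Za-z]+)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    try:
        multiplier = _UNITS[unit.upper()]  # units are case insensitive
    except KeyError:
        raise ValueError(f"unknown unit in size: {text!r}") from None
    return int(number) * multiplier


print(parse_size("1MB"))   # the documented default size
print(parse_size("64kb"))  # lower-case units are accepted too
```
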
*memset*::
Suite for evaluating performance of simple memory set in various ways.

Options of *memset*
^^^^^^^^^^^^^^^^^^^
-s::
--size::
Specify the size of memory to set (default: 1MB).
Available units are B, KB, MB, GB and TB (case insensitive).

-f::
--function::
Specify the function used to set (default: default).
The available functions depend on the architecture.
On x86-64, x86-64-unrolled, x86-64-stosq and x86-64-stosb are supported.

-l::
--nr_loops::
Repeat the memset invocation this number of times.

-c::
--cycles::
Use perf's cpu-cycles event instead of the gettimeofday syscall.

SUITES FOR 'numa'
~~~~~~~~~~~~~~~~~
*mem*::
Suite for evaluating NUMA workloads.

SUITES FOR 'futex'
~~~~~~~~~~~~~~~~~~
*hash*::
Suite for evaluating hash tables.

*wake*::
Suite for evaluating wake calls.

*wake-parallel*::
Suite for evaluating parallel wake calls.

*requeue*::
Suite for evaluating requeue calls.

*lock-pi*::
Suite for evaluating futex lock_pi calls.

SUITES FOR 'epoll'
~~~~~~~~~~~~~~~~~~
*wait*::
Suite for evaluating concurrent epoll_wait calls.

*ctl*::
Suite for evaluating multiple epoll_ctl calls.

SUITES FOR 'internals'
~~~~~~~~~~~~~~~~~~~~~~
*synthesize*::
Suite for evaluating perf's event synthesis performance.

SEE ALSO
--------
linkperf:perf[1]