OSDN Git Service

android-x86/kernel.git
8 years agoperf scripts python: Add new compaction-times script
Tony Jones [Mon, 17 Aug 2015 19:48:52 +0000 (12:48 -0700)]
perf scripts python: Add new compaction-times script

This patch creates a new script (compaction-times) to report time
spent in mm compaction. It is possible to report times in nanoseconds
(default) or microseconds (-u).

The option -p will break down results by process id, -pv will further
decompose by each compaction entry/exit.

For each compaction entry/exit what is reported is controlled by the
options:

  -t   report only timing
  -m   report migration stats
  -ms  report migration scanner stats
  -fs  report free scanner stats

The default is to report all.

Entries may be further filtered by pid, pid-range or comm (regex).

The script is useful when analysing workloads that compact memory. The
most common example will be THP allocations on systems with a lot of
uptime that has fragmented memory.

This is an example of using the script to analyse a thpscale from
mmtests which deliberately fragments memory and allocates THP in 4
separate threads

  # Recording step, one of the following;
  $ perf record -e 'compaction:mm_compaction_*' ./workload
  # or:
  $ perf script record compaction-times

  # Reporting: basic
  total: 2444505743ns migration: moved=357738 failed=39275
  free_scanner: scanned=2705578 isolated=387875
  migration_scanner: scanned=414426 isolated=397013

  # Reporting: Per task stall times
  $ perf script report compaction-times -- -t -p
  total: 2444505743ns
  6384[thpscale]: 740800017ns
  6385[thpscale]: 274119512ns
  6386[thpscale]: 832961337ns
  6383[thpscale]: 596624877ns

  # Reporting: Per-compaction attempts for task 6385
  $ perf script report compaction-times -- -m -pv 6385
  total: 274119512ns migration: moved=14893 failed=24285
  6385[thpscale]: 274119512ns migration: moved=14893 failed=24285
  6385[thpscale].1: 3033277ns migration: moved=511 failed=1
  6385[thpscale].2: 9592094ns migration: moved=1524 failed=12
  6385[thpscale].3: 2495587ns migration: moved=512 failed=0
  6385[thpscale].4: 2561766ns migration: moved=512 failed=0
  6385[thpscale].5: 2523521ns migration: moved=512 failed=0
  ..... output continues ...

Changes since v1:
- report stats for isolate_migratepages and isolate_freepages
  (Vlastimil Babka)
- refactor code to achieve above
- add help text
- output to stdout/stderr explicitly

Signed-off-by: Tony Jones <tonyj@suse.com>
Cc: Mel Gorman <mgorman@suse.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Link: http://lkml.kernel.org/r/1439840932-8933-1-git-send-email-tonyj@suse.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib traceeveent: Allow for negative numbers in print format
Steven Rostedt [Thu, 27 Aug 2015 13:46:01 +0000 (09:46 -0400)]
tools lib traceeveent: Allow for negative numbers in print format

It was reported that "%-8s" does not parse well when used in the printk
format. The '-' is what is throwing it off. Allow that to be included.

Reporter note:

Example before:

  transhuge-stres-10730 [004]  5897.713989: mm_compaction_finished: node=0
  zone=>-<8s order=-2119871790 ret=

Example after:

  transhuge-stres-4235  [000]   453.149280: mm_compaction_finished: node=0
  zone=ffffffff81815d7a order=9 ret=

(I will send patches to fix the string handling in the tracepoints so
it's on par with in-kernel printing via trace_pipe:)

  transhuge-stres-10921 [007] ...1  6307.140205: mm_compaction_finished: node=0
  zone=Normal   order=9 ret=partial

Reported-by: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Tested-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/20150827094601.46518bcc@gandalf.local.home
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf script: Add --[no-]-demangle/--[no-]-demangle-kernel
Mark Drayton [Wed, 26 Aug 2015 19:18:15 +0000 (12:18 -0700)]
perf script: Add --[no-]-demangle/--[no-]-demangle-kernel

Sometimes when post-processing output from `perf script` one does not
want to demangle C++ symbol names. Add an option to allow this.

Also add --[no-]demangle-kernel to be consistent with top/report/probe.

Signed-off-by: Mark Drayton <mbd@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1440616695-32340-1-git-send-email-scientist@fb.com
Signed-off-by: Yannick Brosseau <scientist@fb.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Fri, 28 Aug 2015 06:22:02 +0000 (08:22 +0200)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Add support for using several Intel PT features (CYC, MTC packets), the
    relevant documentation was updated: tools/perf/Documentation/intel-pt.txt,
    briefly describing those packets, its purposes, how to configure them in
    the event config terms and relevant external documentation for further
    reading. (Adrian Hunter)

  - Introduce support for probing at an absolute address, for user and kernel
    'perf probe's, useful when one have the symbol maps on a developer machine
    but not on an embedded system. (Wang Nan)

  - Fix 'perf probe' list results when a symbol can't be found or the
    address is zero and when an offset is provided without a function (Wang Nan)

  - Do not print '0x (null)' in uprobes when offset is zero (Wang Nan)

  - Clear the progress bar at the end of a ordered_events flush, fixing
    an UI artifact when, after ordering the events the screen doesn't get
    completely redraw, for instance, when an error window covers just the
    center of the screen and waits for user input. (Arnaldo Carvalho de Melo)

  - Fix 'annotate' segfault by resetting the dso find_symbol cache when removing
    symbols. (Arnaldo Carvalho de Melo)

Infrastructure changes:

  - Allow duplicate objects in the object list, just like it is possible to have
    things like this, in the kernel:

      drivers/Makefile:obj-$(CONFIG_PCI)        += usb/
      drivers/Makefile:obj-$(CONFIG_USB_GADGET) += usb/

    (Jiri Olsa)

  - Fix Intel PT 'instructions' sample period. (Adrian Hunter)

  - Prevent segfault when reading probe point with absolute address. (Wang Nan)

Build fixes:

  - Fix tarball build broken by pt/bts. (Adrian Hunter)

  - Remove export.h from MANIFEST, fixing the perf tarball make target. (Jiri Olsa)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agotracing/uprobes: Do not print '0x (null)' when offset is 0
Wang Nan [Wed, 26 Aug 2015 10:57:46 +0000 (10:57 +0000)]
tracing/uprobes: Do not print '0x (null)' when offset is 0

When manually added uprobe point with zero address, 'uprobe_events'
output '(null)' instead of 0x00000000:

  # echo p:probe_libc/abs_0 /path/to/lib.bin:0x0 arg1=%ax > \
            /sys/kernel/debug/tracing/uprobe_events

  # cat /sys/kernel/debug/tracing/uprobe_events
    p:probe_libc/abs_0 /path/to/lib.bin:0x          (null) arg1=%ax

 This patch fixes this behavior:

  # cat /sys/kernel/debug/tracing/uprobe_events
  p:probe_libc/abs_0 /path/to/lib.bin:0x0000000000000000

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440586666-235233-8-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf probe: Support probing at absolute address
Wang Nan [Wed, 26 Aug 2015 10:57:45 +0000 (10:57 +0000)]
perf probe: Support probing at absolute address

It should be useful to allow 'perf probe' probe at absolute offset of a
target. For example, when (u)probing at a instruction of a shared object
in a embedded system where debuginfo is not avaliable but we know the
offset of that instruction by manually digging.

This patch enables following perf probe command syntax:

  # perf probe 0xffffffff811e6615

And

  # perf probe /lib/x86_64-linux-gnu/libc-2.19.so 0xeb860

In the above example, we don't need a anchor symbol, so it is possible
to compute absolute addresses using other methods and then use 'perf
probe' to create the probing points.

v1 -> v2:
  Drop the leading '+' in cmdline;
  Allow uprobing at offset 0x0;
  Improve 'perf probe -l' result when uprobe at area without debuginfo.

v2 -> v3:
  Split bugfix to a separated patch.

Test result:

  # perf probe 0xffffffff8119d175 %ax
  # perf probe sys_write %ax
  # perf probe /lib64/libc-2.18.so 0x0 %ax
  # perf probe /lib64/libc-2.18.so 0x5 %ax
  # perf probe /lib64/libc-2.18.so 0xd8e40 %ax
  # perf probe /lib64/libc-2.18.so __write %ax
  # perf probe /lib64/libc-2.18.so 0xd8e49 %ax
  # cat /sys/kernel/debug/tracing/uprobe_events

  p:probe_libc/abs_0 /lib64/libc-2.18.so:0x          (null) arg1=%ax
  p:probe_libc/abs_5 /lib64/libc-2.18.so:0x0000000000000005 arg1=%ax
  p:probe_libc/abs_d8e40 /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax
  p:probe_libc/__write /lib64/libc-2.18.so:0x00000000000d8e40 arg1=%ax
  p:probe_libc/abs_d8e49 /lib64/libc-2.18.so:0x00000000000d8e49 arg1=%ax

  # cat /sys/kernel/debug/tracing/kprobe_events

  p:probe/abs_ffffffff8119d175 0xffffffff8119d175 arg1=%ax
  p:probe/sys_write _text+1692016 arg1=%ax

  # perf probe -l

  Failed to find debug information for address 5
    probe:abs_ffffffff8119d175 (on sys_write+5 with arg1)
    probe:sys_write      (on sys_write with arg1)
    probe_libc:__write   (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1)
    probe_libc:abs_0     (on 0x0 in /lib64/libc-2.18.so with arg1)
    probe_libc:abs_5     (on 0x5 in /lib64/libc-2.18.so with arg1)
    probe_libc:abs_d8e40 (on @unix/syscall-template.S:81 in /lib64/libc-2.18.so with arg1)
    probe_libc:abs_d8e49 (on __GI___libc_write+9 in /lib64/libc-2.18.so with arg1)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440586666-235233-7-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf probe: Fix error reported when offset without function
Wang Nan [Wed, 26 Aug 2015 10:57:44 +0000 (10:57 +0000)]
perf probe: Fix error reported when offset without function

This patch fixes a bug that, when offset is provided but function is
lost, parse_perf_probe_point() will give a "" string as function name,
so the checking code at the end of parse_perf_probe_point() become
useless.  For example:

  # perf probe +0x1234
  Failed to find symbol  in kernel
    Error: Failed to add events.

After this patch:

  # perf probe +0x1234
  Semantic error :Offset requires an entry function.
    Error: Command Parse Error.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440586666-235233-6-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf probe: Fix list result when address is zero
Wang Nan [Wed, 26 Aug 2015 10:57:43 +0000 (10:57 +0000)]
perf probe: Fix list result when address is zero

When manually added uprobe point with zero address, 'perf probe -l'
reports error. For example:

  # echo p:probe_libc/abs_0 /path/to/lib.bin:0x0 arg1=%ax > \
           /sys/kernel/debug/tracing/uprobe_events

  # perf probe -l
  Error: Failed to show event list.

Probing at 0x0 is possible and useful when lib.bin is not a normal
shared object but is manually mapped. However, in this case kernel
report:

  # cat /sys/kernel/debug/tracing/uprobe_events
  p:probe_libc/abs_0 /path/to/lib.bin:0x          (null) arg1=%ax

This patch supports the above kernel output.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440586666-235233-5-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf probe: Fix list result when symbol can't be found
Wang Nan [Wed, 26 Aug 2015 10:57:42 +0000 (10:57 +0000)]
perf probe: Fix list result when symbol can't be found

'perf probe -l' reports error if it is unable find symbol through
address. Here is an example.

  # echo 'p:probe_libc/abs_5 /lib64/libc.so.6:0x5' >
          /sys/kernel/debug/tracing/uprobe_events
  # cat /sys/kernel/debug/tracing/uprobe_events
   p:probe_libc/abs_5 /lib64/libc.so.6:0x0000000000000005
  # perf probe -l
    Error: Failed to show event list

Also, this situation triggers a logical inconsistency in
convert_to_perf_probe_point() that, it returns ENOMEM but actually it
never try strdup().

This patch removes !tp->module && !is_kprobe condition, so it always
uses address to build function name if symbol not found.

Test result:

  # perf probe -l
    probe_libc:abs_5     (on 0x5 in /lib64/libc.so.6)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440586666-235233-4-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools build: Allow duplicate objects in the object list
Jiri Olsa [Wed, 26 Aug 2015 13:01:03 +0000 (15:01 +0200)]
tools build: Allow duplicate objects in the object list

It's sometimes useful to specify the object affiliation to multiple
config options like:

  libperf-$(CONFIG_X86) += tsc.o
  libperf-$(CONFIG_AUXTRACE) += tsc.o

while the object itself is linked only once. Adding the support for this
and ignoring duplicate objects in the object list.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20150826130103.GF22670@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Remove export.h from MANIFEST
Jiri Olsa [Wed, 26 Aug 2015 08:07:50 +0000 (10:07 +0200)]
perf tools: Remove export.h from MANIFEST

We don't carry an export.h wrapper anymore, remove it from the MANIFEST
file to avoid breaking the make perf-tar targets.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20150826080750.GD22670@krava.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf probe: Prevent segfault when reading probe point with absolute address
Wang Nan [Tue, 25 Aug 2015 13:27:35 +0000 (13:27 +0000)]
perf probe: Prevent segfault when reading probe point with absolute address

'perf probe -l' panic if there is a manually inserted probing point with
absolute address. For example:

  # echo 'p:probe/abs_ffffffff811e6615 0xffffffff811e6615' > /sys/kernel/debug/tracing/kprobe_events
  # perf probe -l
  Segmentation fault (core dumped)

This patch fix this problem by considering the situation that
"tp->symbol == NULL" in find_perf_probe_point_from_dwarf() and
find_perf_probe_point_from_map().

After this patch:

  # perf probe -l
  probe:abs_ffffffff811e6615 (on SyS_write+5@fs/read_write.c)

And when debug info is missing:

  # rm -rf ~/.debug
  # mv /lib/modules/4.2.0-rc1+/build/vmlinux /lib/modules/4.2.0-rc1+/build/vmlinux.bak
  # perf probe -l
  probe:abs_ffffffff811e6615 (on sys_write+5)

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440509256-193590-1-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Update Intel PT documentation
Adrian Hunter [Fri, 17 Jul 2015 16:34:00 +0000 (19:34 +0300)]
perf tools: Update Intel PT documentation

Update Intel PT documentation to describe new features.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-26-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support for decoding TRACESTOP packets
Adrian Hunter [Fri, 17 Jul 2015 16:33:59 +0000 (19:33 +0300)]
perf tools: Add Intel PT support for decoding TRACESTOP packets

A TRACESTOP packet is produced when an Intel PT trace enters a defined
region of the address space at which point the tracing stops.

This patch just adds decoder support.

Support for specifying TRACESTOP regions is left until later.

For details refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-25-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support for using CYC packets
Adrian Hunter [Fri, 17 Jul 2015 16:33:58 +0000 (19:33 +0300)]
perf tools: Add Intel PT support for using CYC packets

CYC packets are a new Intel PT feature.

CYC packets provide even finer grain timestamp information than MTC and
TSC packets.  A CYC packet contains the number of CPU cycles since the
last CYC packet. Unlike MTC and TSC packets, CYC packets are only sent
when another packet is also sent.

Support for this feature is indicated by:

/sys/bus/event_source/devices/intel_pt/caps/psb_cyc

which contains "1" if the feature is supported and "0" otherwise.

CYC packets can be requested using a PMU config term e.g. perf record -e
intel_pt/cyc/u sleep 1

The frequency of CYC packets can also be specified.  e.g. perf record -e
intel_pt/cyc,cyc_thresh=2/u sleep 1

CYC packets are not requested by default.

Valid cyc_thresh values are given by:

/sys/bus/event_source/devices/intel_pt/caps/cycle_thresholds

which contains a hexadecimal value, the bits of which represent valid
values e.g. bit 2 set means value 2 is valid.

The value represents the minimum number of CPU cycles that must have
passed before a CYC packet can be sent.  The number of CPU cycles is:

    2 ^ (value - 1)

e.g. value 4 means 8 CPU cycles must pass before a CYC packet can be
sent.  Note a CYC packet is still only sent when another packet is sent,
not at, e.g. every 8 CPU cycles.

If an invalid value is entered, the error message will give a list of
valid values e.g.

    $ perf record -e intel_pt/cyc,cyc_thresh=15/u uname
    Invalid cyc_thresh for intel_pt. Valid values are: 0-12

tools/perf/Documentation/intel-pt.txt is updated in a later patch as
there are a number of new features being added.

For more information refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-24-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support for decoding CYC packets
Adrian Hunter [Fri, 17 Jul 2015 16:33:57 +0000 (19:33 +0300)]
perf tools: Add Intel PT support for decoding CYC packets

CYC packets provide even finer grain timestamp information than MTC and
TSC packets.  A CYC packet contains the number of CPU cycles since the
last CYC packet.

This patch just adds decoder support.  The CPU frequency can be related
to TSC using the Maximum Non-Turbo Ratio in combination with the CBR
(core-to-bus ratio) packet.  However more accuracy is achieved by simply
interpolating the number of cycles between other timing packets like MTC
or TSC.  This patch takes the latter approach.

Support for a default value and validation of values is provided by a
later patch. Also documentation is updated in a separate patch.

For details refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-23-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support for using MTC packets
Adrian Hunter [Fri, 17 Jul 2015 16:33:56 +0000 (19:33 +0300)]
perf tools: Add Intel PT support for using MTC packets

MTC packets are a new Intel PT feature.

MTC packets provide finer grain timestamp information than TSC packets.

Support for this feature is indicated by:

  /sys/bus/event_source/devices/intel_pt/caps/mtc

which contains "1" if the feature is supported and "0" otherwise.

MTC packets can be requested using a PMU config term e.g. perf record -e
intel_pt/mtc/u sleep 1

The frequency of MTC packets can also be specified.  e.g. perf record -e
intel_pt/mtc,mtc_period=2/u sleep 1

The default value is 3 or the nearest lower value that is supported.  0
is always supported.

Valid values are given by:

/sys/bus/event_source/devices/intel_pt/caps/mtc_periods

which contains a hexadecimal value, the bits of which represent valid
values e.g. bit 2 set means value 2 is valid.

The value is converted to the MTC frequency as:

CTC-frequency / (2 ^ value)

e.g. value 3 means one eighth of CTC-frequency

Where CTC is the hardware crystal clock, the frequency of which can be
related to TSC via values provided in cpuid leaf 0x15.

If an invalid value is entered, the error message will give a list of
valid values e.g.

$ perf record -e intel_pt/mtc_period=15/u uname
Invalid mtc_period for intel_pt. Valid values are: 0,3,6,9

tools/perf/Documentation/intel-pt.txt is updated in a later patch as
there are a number of new features being added.

For more information refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-22-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support for decoding MTC packets
Adrian Hunter [Fri, 17 Jul 2015 16:33:55 +0000 (19:33 +0300)]
perf tools: Add Intel PT support for decoding MTC packets

MTC packets provide finer grain timestamp information than TSC packets.
MTC packets record time using the hardware crystal clock (CTC) which is
related to TSC packets using a TMA packet.

This patch just adds decoder support.

Support for a default value and validation of values is provided by a
later patch. Also documentation is updated in a separate patch.

For details refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-21-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Pass Intel PT information for decoding MTC and CYC
Adrian Hunter [Fri, 17 Jul 2015 16:33:54 +0000 (19:33 +0300)]
perf tools: Pass Intel PT information for decoding MTC and CYC

Record additional information in the AUXTRACE_INFO event in preparation
for decoding MTC and CYC packets.  Pass the information to the decoder.

The AUXTRACE_INFO record can be extended by using the size to indicate
the presence of new members.

The additional information includes PMU config bit positions and the TSC
to CTC (hardware crystal clock) ratio needed to decode MTC packets.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-20-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add new Intel PT packet definitions
Adrian Hunter [Fri, 17 Jul 2015 16:33:53 +0000 (19:33 +0300)]
perf tools: Add new Intel PT packet definitions

New features have been added to Intel PT which include a number of new
packet definitions.

This patch adds packet definitions for new packets: TMA, MTC, CYC, VMCS,
TRACESTOP and MNT.  Also another bit in PIP is defined.

This patch only adds support for the definitions. Later patches add
support for decoding TMA, MTC, CYC and TRACESTOP which is where those
packets are explained.

VMCS and the newly defined bit in PIP are used with virtualization which
is not supported yet.  MNT is a maintenance packet which the decoder
should ignore.

For details, refer to the June 2015 or later Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-19-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support for PSB periods
Adrian Hunter [Fri, 17 Jul 2015 16:33:52 +0000 (19:33 +0300)]
perf tools: Add Intel PT support for PSB periods

The PSB packet is a synchronization packet that provides a starting
point for decoding or recovery from errors.

This patch adds support for a new Intel PT feature that allows the
frequency of PSB packets to be specified.

Support for this feature is indicated by
/sys/bus/event_source/devices/intel_pt/caps/psb_cyc which contains "1"
if the feature is supported and "0" otherwise.

The PSB period can be specified as a PMU config term e.g. perf record -e
intel_pt/psb_period=2/u sleep 1

The default value is 3 or the nearest lower value that is supported.  0
is always supported.

Valid values are given by:

/sys/bus/event_source/devices/intel_pt/caps/psb_periods

which contains a hexadecimal value, the bits of which represent valid
values e.g. bit 2 set means value 2 is valid.

The value is converted to the approximate number of trace bytes between
PSB packets as:

2 ^ (value + 11)

e.g. value 3 means 16KiB bytes between PSBs

If an invalid value is entered, the error message will give a list of
valid values e.g.

$ perf record -e intel_pt/psb_period=15/u uname
Invalid psb_period for intel_pt. Valid values are: 0-5

tools/perf/Documentation/intel-pt.txt is updated in a later patch as
there are a number of new features being added.

For more information about PSB periods refer to the Intel 64 and IA-32
Architectures SDM Chapter 36 Intel Processor Trace from June 2015 or
later.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Fix Intel PT 'instructions' sample period
Adrian Hunter [Fri, 17 Jul 2015 16:33:48 +0000 (19:33 +0300)]
perf tools: Fix Intel PT 'instructions' sample period

The period on synthesized 'instructions' samples was being set to a
fixed value, whereas the correct value is the number of instructions
since the last sample, which is a value that the decoder can provide.
So do it that way.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-14-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf ordered_events: Clear the progress bar at the end of a flush
Arnaldo Carvalho de Melo [Mon, 24 Aug 2015 20:16:22 +0000 (17:16 -0300)]
perf ordered_events: Clear the progress bar at the end of a flush

We were depending on the next screen operation after a flush() being
one that would redraw the whole screen so that the progress bar would
be overwritten, when that didn't happen a screen artifact of, say, a
error dialog window would be overlaid on top of the progress bar, fix
it by calling ui_browser__finish(), that now has a TUI implementation.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-el0fyw6duemnx62lydjzhs8c@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf ui tui progress: Implement the ui_progress_ops->finish() method
Arnaldo Carvalho de Melo [Mon, 24 Aug 2015 19:18:26 +0000 (16:18 -0300)]
perf ui tui progress: Implement the ui_progress_ops->finish() method

So that we can erase the progress bar after we're done with it, avoiding
things like:

-------------------------------------------------------------------

          ┌─Error:──────────────────────────────────────────────────────┐
          │Can't annotate unmapped_area_topdown:                        │
          │                                                             │
          │No vmlinux file with build id a826726b5ddacfab1f0bade868f1a79
          │was found in the path.                                       │
          │                                                             │
          │Note that annotation using /proc/kcore requires CAP_SYS_RAWIO│
┌Processin│                                                             │──┐
│         │Please use:                                                  │  │
└─────────│                                                             │──┘
          │  perf buildid-cache -vu vmlinux                             │
          │                                                             │
          │or:                                                          │
          │                                                             │
          │  --vmlinux vmlinux                                          │
          │                                                             │
          │                                                             │
          │Press any key...                                             │
          └─────────────────────────────────────────────────────────────┘

Can't annotate unmapped_area_topdown:
-------------------------------------------------------------------

I.e. that finished progress bar behind the error window. It is not a
problem when we end up redrawing the whole screen, but its ugly when
we present such error windows, provide a TUI method so that code like
the above may avoid this situation, as will be done with the annotation
code in the next cset.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-qvktnojzwwe37pweging058t@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf annotate: Reset the dso find_symbol cache when removing symbols
Arnaldo Carvalho de Melo [Mon, 24 Aug 2015 16:33:14 +0000 (13:33 -0300)]
perf annotate: Reset the dso find_symbol cache when removing symbols

The 'annotate' tool does some filtering in the entries in a DSO but
forgot to reset the cache done in dso__find_symbol(), cauxing a SEGV:

  [root@zoo ~]# perf annotate netlink_poll
  perf: Segmentation fault
  -------- backtrace --------
  perf[0x526ceb]
  /lib64/libc.so.6(+0x34960)[0x7faedfbe0960]
  perf(rb_erase+0x223)[0x499d63]
  perf[0x4213e9]
  perf[0x4bc123]
  perf[0x4bc621]
  perf[0x4bf26b]
  perf[0x4bc855]
  perf(perf_session__process_events+0x340)[0x4bddc0]
  perf(cmd_annotate+0x6bb)[0x421b5b]
  perf[0x479063]
  perf(main+0x60a)[0x42098a]
  /lib64/libc.so.6(__libc_start_main+0xf0)[0x7faedfbcbfe0]
  perf[0x420aa9]
  [0x0]
  [root@zoo ~]#

Fix it by reseting the find cache when removing symbols.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Fixes: b685ac22b436 ("perf symbols: Add front end cache for DSO symbol lookup")
Link: http://lkml.kernel.org/n/tip-b2y9x46y0t8yem1ive41zqyp@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Fix tarball build broken by pt/bts
Adrian Hunter [Fri, 21 Aug 2015 19:05:58 +0000 (22:05 +0300)]
perf tools: Fix tarball build broken by pt/bts

Fix some include paths and add missing inat_types.h.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/55D77696.60102@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Sat, 22 Aug 2015 06:45:46 +0000 (08:45 +0200)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

User visible changes:

  - Fix segfault using 'perf script --show-mmap-events', affects
    only current perf/core. (Adrian Hunter)

  - /proc/kcore requires CAP_SYS_RAWIO message too noisy, make it
    debug only. (Adrian Hunter)

  - Fix Intel PT timestamp handling. (Adrian Hunter)

  - Add Intel BTS support, with a call-graph script to show it and
    PT in use in a GUI using 'perf script' python scripting with
    postgresql and Qt. (Adrian Hunter)

  - Add checks for returned EVENT_ERROR type in libtraceevent, fixing
    a bug that surfaced on arm64 systems. (Dean Nelson)

  - Fallback to using kallsyms when libdw fails to handle a vmlinux file,
    that can happen, for instance, when perf is statically linked and
    then libdw fails to load libebl_{arch}.so. (Wang Nan)

Infrastructure changes:

  - Initialize reference counts in map__clone(). (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf probe: Try to use symbol table if searching debug info failed
Wang Nan [Fri, 21 Aug 2015 10:09:02 +0000 (10:09 +0000)]
perf probe: Try to use symbol table if searching debug info failed

A problem can occur in a statically linked perf when vmlinux can be found:

 # perf probe --add sys_epoll_pwait
 probe-definition(0): sys_epoll_pwait
 symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null)
 0 arguments
 Looking at the vmlinux_path (7 entries long)
 Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols
 Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux
 Try to find probe point from debuginfo.
 Symbol sys_epoll_pwait address found : ffffffff8122bd40
 Matched function: SyS_epoll_pwait
 Failed to get call frame on 0xffffffff8122bd40
 An error occurred in debuginfo analysis (-2).
   Error: Failed to add events. Reason: No such file or directory (Code: -2)

The reason is caused by libdw that, if libdw is statically linked, it
can't load libebl_{arch}.so reliable.

In this case it is still possible to get the address from
/proc/kalksyms.  However, perf tries that only when libdw returns
-EBADF.

This patch gives it another chance to utilize symbol table, even if
libdw returns an error code other than -EBADF.

After applying this patch:

 # perf probe -nv --add sys_epoll_pwait
 probe-definition(0): sys_epoll_pwait
 symbol:sys_epoll_pwait file:(null) line:0 offset:0 return:0 lazy:(null)
 0 arguments
 Looking at the vmlinux_path (7 entries long)
 Using /lib/modules/4.2.0-rc1+/build/vmlinux for symbols
 Open Debuginfo file: /lib/modules/4.2.0-rc1+/build/vmlinux
 Try to find probe point from debuginfo.
 Symbol sys_epoll_pwait address found : ffffffff8122bd40
 Matched function: SyS_epoll_pwait
 Failed to get call frame on 0xffffffff8122bd40
 An error occurred in debuginfo analysis (-2).
 Trying to use symbols.
 Opening /sys/kernel/debug/tracing/kprobe_events write=1
 Added new event:
 Writing event: p:probe/sys_epoll_pwait _text+2276672
   probe:sys_epoll_pwait (on sys_epoll_pwait)

 You can now use it in all perf tools, such as:

  perf record -e probe:sys_epoll_pwait -aR sleep 1

Although libdw returns an error (Failed to get call frame), perf tries
symbol table and finally gets correct address.

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Alexei Starovoitov <ast@plumgrid.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Ahern <dsahern@gmail.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kaixu Xia <xiakaixu@huawei.com>
Cc: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Zefan Li <lizefan@huawei.com>
Cc: pi3orama@163.com
Link: http://lkml.kernel.org/r/1440151770-129878-2-git-send-email-wangnan0@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Initialize reference counts in map__clone()
Arnaldo Carvalho de Melo [Tue, 18 Aug 2015 18:19:50 +0000 (15:19 -0300)]
perf tools: Initialize reference counts in map__clone()

Map clone was written before we introduced reference counts for
maps and dsos, so all that was needed was just a copy and then we
would insert it into the new map_groups instance.

Fix it by, after copying, initializing the map->refcnt, grabbing
a struct dso refcount and resetting pointers that may be used
to determine if a map, when deleted, is in a rb_tree.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pd4mr80o5b9gvk50iineacec@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add example call-graph script
Adrian Hunter [Fri, 17 Jul 2015 16:33:45 +0000 (19:33 +0300)]
perf tools: Add example call-graph script

Add a script to produce a call-graph from data exported to a postgresql
database and derived from a processor trace event like intel_pt or intel_bts.

Refer to comments in the scripts call-graph-from-postgresql.py and
export-to-postgresql.py for more details on how to set up the environment,
install the required packages, etc.

Committer note:

From the scripts, for convenience while reading 'git log':

  An example of using this script with Intel PT:

  $ perf record -e intel_pt//u ls
  $ perf script -s ~/libexec/perf-core/scripts/python/export-to-postgresql.py pt_example branches calls
  2015-05-29 12:49:23.464364 Creating database...
  2015-05-29 12:49:26.281717 Writing to intermediate files...
  2015-05-29 12:49:27.190383 Copying to database...
  2015-05-29 12:49:28.140451 Removing intermediate files...
  2015-05-29 12:49:28.147451 Adding primary keys
  2015-05-29 12:49:28.655683 Adding foreign keys
  2015-05-29 12:49:29.365350 Done
  $ python tools/perf/scripts/python/call-graph-from-postgresql.py pt_example
  # The result is a GUI window with a tree representing a context-sensitive
  # call-graph.  Expanding a couple of levels of the tree and adjusting column
  # widths to suit will display something like:

                                         Call Graph: pt_example
  Call Path                        |Object     |Count|Time(ns)|Time(%)|Branch Count|Branch Count(%)
  v- ls
     v- 2638:2638
         v- _start                  ld-2.19.so    1   10074071  100.0        211135          100.0
           |- unknown               unknown       1      13198    0.1             1            0.0
           >- _dl_start             ld-2.19.so    1    1400980   13.9         19637            9.3
           >- _d_linit_internal     ld-2.19.so    1     448152    4.4         11094            5.3
           v-__libc_start_main@plt  ls            1    8211741   81.5        180397           85.4
              >- _dl_fixup          ld-2.19.so    1       7607    0.1           108            0.1
              >- __cxa_atexit       libc-2.19.so  1      11737    0.1            10            0.0
              >- __libc_csu_init    ls            1      10354    0.1            10            0.0
              |- _setjmp            libc-2.19.so  1          0    0.0             4            0.0
              v- main               ls            1    8182043   99.6        180254           99.9

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-11-git-send-email-adrian.hunter@intel.com
[ Added 'python-pyside qt-postgresql' to the yum cmdline installing required packages ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Put itrace options into an asciidoc include
Adrian Hunter [Fri, 17 Jul 2015 16:33:44 +0000 (19:33 +0300)]
perf tools: Put itrace options into an asciidoc include

perf script, report and inject all have the same itrace options. Put
them into an asciidoc include file.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-10-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel BTS support
Adrian Hunter [Fri, 17 Jul 2015 16:33:43 +0000 (19:33 +0300)]
perf tools: Add Intel BTS support

Intel BTS support fits within the new auxtrace infrastructure.  Recording is
supporting by identifying the Intel BTS PMU, parsing options and setting up
events.

Decoding is supported by queuing up trace data by thread and then decoding
synchronously delivering synthesized event samples into the session processing
for tools to consume.

Committer note:

E.g:

  [root@felicio ~]# perf record --per-thread -e intel_bts// ls
  anaconda-ks.cfg  apctest.output  bin  kernel-rt-3.10.0-298.rt56.171.el7.x86_64.rpm  libexec  lock_page.bpf.c  perf.data  perf.data.old
  [ perf record: Woken up 3 times to write data ]
  [ perf record: Captured and wrote 4.367 MB perf.data ]
  [root@felicio ~]# perf evlist -v
  intel_bts//: type: 6, size: 112, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
  dummy:u: type: 1, size: 112, config: 0x9, { sample_period, sample_freq }: 1, sample_type: IP|TID|IDENTIFIER, read_format: ID, disabled: 1, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, enable_on_exec: 1, task: 1, sample_id_all: 1, mmap2: 1, comm_exec: 1
  [root@felicio ~]# perf script # the navigate in the pager to some interesting place:
    ls 1843 1 branches: ffffffff810a60cb flush_signal_handlers ([kernel.kallsyms]) => ffffffff8121a522 setup_new_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8121a529 setup_new_exec ([kernel.kallsyms]) => ffffffff8122fa30 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa5d do_close_on_exec ([kernel.kallsyms]) => ffffffff81767ae0 _raw_spin_lock ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff81767af4 _raw_spin_lock ([kernel.kallsyms]) => ffffffff8122fa62 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fa8e do_close_on_exec ([kernel.kallsyms]) => ffffffff8122faf0 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122faf7 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fa8b do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fac9 do_close_on_exec ([kernel.kallsyms]) => ffffffff8122fad2 do_close_on_exec ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8122fadd do_close_on_exec ([kernel.kallsyms]) => ffffffff8120fc80 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fcaf filp_close ([kernel.kallsyms]) => ffffffff8120fcb6 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fcc2 filp_close ([kernel.kallsyms]) => ffffffff812547f0 dnotify_flush ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff81254823 dnotify_flush ([kernel.kallsyms]) => ffffffff8120fcc7 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fccd filp_close ([kernel.kallsyms]) => ffffffff81261790 locks_remove_posix ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff812617a3 locks_remove_posix ([kernel.kallsyms]) => ffffffff812617b9 locks_remove_posix ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff812617b9 locks_remove_posix ([kernel.kallsyms]) => ffffffff8120fcd2 filp_close ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8120fcd5 filp_close ([kernel.kallsyms]) => ffffffff812142c0 fput ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff812142d6 fput ([kernel.kallsyms]) => ffffffff812142df fput ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff8121430c fput ([kernel.kallsyms]) => ffffffff810b6580 task_work_add ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810b65ad task_work_add ([kernel.kallsyms]) => ffffffff810b65b1 task_work_add ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810b65c1 task_work_add ([kernel.kallsyms]) => ffffffff810bc710 kick_process ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810bc725 kick_process ([kernel.kallsyms]) => ffffffff810bc742 kick_process ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810bc742 kick_process ([kernel.kallsyms]) => ffffffff810b65c6 task_work_add ([kernel.kallsyms])
    ls 1843 1 branches: ffffffff810b65c9 task_work_add ([kernel.kallsyms]) => ffffffff81214311 fput ([kernel.kallsyms])

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-9-git-send-email-adrian.hunter@intel.com
[ Merged sample->time fix for bug found after first round of testing on slightly older kernel ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agotools lib traceevent: Add checks for returned EVENT_ERROR type
Dean Nelson [Thu, 20 Aug 2015 15:16:32 +0000 (11:16 -0400)]
tools lib traceevent: Add checks for returned EVENT_ERROR type

Running the following perf-stat command on an arm64 system produces the
following result...

  [root@aarch64 ~]# perf stat -e kmem:mm_page_alloc -a sleep 1
    Warning: [kmem:mm_page_alloc] function sizeof not defined
    Warning: Error: expected type 4 but read 0
  Segmentation fault
  [root@aarch64 ~]#

The second warning was a result of the first warning not stopping
processing after it detected the issue.

That is, code that found the issue reported the first problem, but
because it did not exit out of the functions smoothly, it caused the
other warning to appear and not only that, it later caused the SIGSEGV.

Signed-off-by: Dean Nelson <dnelson@redhat.com>
Reviewed-by: Steven Rostedt <rostedt@goodmis.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150820151632.13927.13791.email-sent-by-dnelson@teal
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Fix Intel PT timestamp handling
Adrian Hunter [Thu, 20 Aug 2015 08:51:32 +0000 (11:51 +0300)]
perf tools: Fix Intel PT timestamp handling

Events that don't sample the timestamp have a timestamp value of -1.

Intel PT processing wasn't taking that into account.

This is particularly noticeable with Intel BTS because timestamps are
not requested by default.

Then, if the conversion of -1 to TSC results in a small number, the
processing is unaffected.

However if the conversion results in a big number, then the data is
processed prematurely before relevant sideband data like mmap events,
which in turn results in samples with unknown dsos.

Commiter note:

Since BTS wasn't upstream, I split the patch to fold the BTS part with
the patch introducing it, to avoid having this bug in the commit
history. PT was already upstream, so this patch contains that part.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1440060692-5585-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisy
Adrian Hunter [Thu, 20 Aug 2015 10:07:40 +0000 (13:07 +0300)]
perf tools: /proc/kcore requires CAP_SYS_RAWIO message too noisy

The "/proc/kcore requires CAP_SYS_RAWIO" message comes up all the time
for 'perf script' if vmlinux is not found and the user isn't root, even
when the kernel is not being traced and even though the message is only
really relevant for annotation.

Change it to pr_debug and instead put a note in the message displayed if
annotation is not possible.

Also, the file being accessed might not be /proc/kcore.  Tools can be
directed to a different location using the --kallsyms option in which
case kcore is expected to be in the same directory.  Adjust the message
so it is not misleading in that case.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Zhang <zhlcindy@linux.vnet.ibm.com>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/1440065260-8802-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf script: Fix segfault using --show-mmap-events
Adrian Hunter [Thu, 20 Aug 2015 08:26:45 +0000 (11:26 +0300)]
perf script: Fix segfault using --show-mmap-events

Patch "perf script: Don't assume evsel position of tracking events"
changed 'perf script' to use 'perf_evlist__id2evsel()'. That results
in a segfault if there is more than 1 event and there are
synthesized mmap events e.g.

$ perf record -e cycles,instructions -p$$ sleep 1
$ perf script --show-mmap-events
Segmentation fault (core dumped)

That happens because these synthesized events have an 'id' of zero
which does not match any 'evsel'.

Currently, these synthesized events use the sample type of the first
evsel.

Change 'perf_evlist__id2evsel()' to reflect that which also makes
it consistent with 'perf_evlist__event2evsel()'.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 06b234ec26fd ("perf script: Don't assume evsel position of tracking events")
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1440059205-1765-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf/x86/msr: Fix the MSR driver build
Ingo Molnar [Fri, 21 Aug 2015 06:14:46 +0000 (08:14 +0200)]
perf/x86/msr: Fix the MSR driver build

The new MSR PMU driver made use of rdtsc() which does not exist (yet) in
this tree:

  arch/x86/kernel/cpu/perf_event_msr.c:91:3: error: implicit declaration of function 'rdtsc'

Use the old rdtscll() primitive for now.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-core-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 20 Aug 2015 09:49:26 +0000 (11:49 +0200)]
Merge tag 'perf-core-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/core

Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:

  - Support Intel PT in several tools, enabling the use of the processor trace
    feature introduced in Intel Broadwell processors: (Adrian Hunter)

 # dmesg | grep Performance
 # [0.188477] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
 # perf record -e intel_pt//u -a sleep 1
 [ perf record: Woken up 1 times to write data ]
 [ perf record: Captured and wrote 0.216 MB perf.data ]
 # perf script # then navigate in the tool output to some area, like this one:
 184 1030 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba661440 dl_main (/usr/lib64/ld-2.17.so)
 185 1457 dl_main (/usr/lib64/ld-2.17.so) => 7f21ba669f10 _dl_new_object (/usr/lib64/ld-2.17.so)
 186 9f37 _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba677b90 strlen (/usr/lib64/ld-2.17.so)
 187 7ba3 strlen (/usr/lib64/ld-2.17.so) => 7f21ba677c75 strlen (/usr/lib64/ld-2.17.so)
 188 7c78 strlen (/usr/lib64/ld-2.17.so) => 7f21ba669f3c _dl_new_object (/usr/lib64/ld-2.17.so)
 189 9f8a _dl_new_object (/usr/lib64/ld-2.17.so) => 7f21ba65fab0 calloc@plt (/usr/lib64/ld-2.17.so)
 190 fab0 calloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e70 calloc (/usr/lib64/ld-2.17.so)
 191 5e87 calloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa90 malloc@plt (/usr/lib64/ld-2.17.so)
 192 fa90 malloc@plt (/usr/lib64/ld-2.17.so) => 7f21ba675e60 malloc (/usr/lib64/ld-2.17.so)
 193 5e68 malloc (/usr/lib64/ld-2.17.so) => 7f21ba65fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so)
 194 fa80 __libc_memalign@plt (/usr/lib64/ld-2.17.so) => 7f21ba675d50 __libc_memalign (/usr/lib64/ld-2.17.so)
 195 5d63 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e20 __libc_memalign (/usr/lib64/ld-2.17.so)
 196 5e40 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675d73 __libc_memalign (/usr/lib64/ld-2.17.so)
 197 5d97 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675e18 __libc_memalign (/usr/lib64/ld-2.17.so)
 198 5e1e __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba675df9 __libc_memalign (/usr/lib64/ld-2.17.so)
 199 5e10 __libc_memalign (/usr/lib64/ld-2.17.so) => 7f21ba669f8f _dl_new_object (/usr/lib64/ld-2.17.so)
 200 9fc2 _dl_new_object (/usr/lib64/ld-2.17.so) =>  7f21ba678e70 memcpy (/usr/lib64/ld-2.17.so)
 201 8e8c memcpy (/usr/lib64/ld-2.17.so) => 7f21ba678ea0 memcpy (/usr/lib64/ld-2.17.so)

  - Fix annotation of vdso (Adrian Hunter)

  - Fix DWARF callchains in 'perf script' (Jiri Olsa)

  - Fix adding probes in kernel syscalls and listing which variables can be
    collected at kernel syscall function lines (Masami Hiramatsu)

Build Fixes:

  - Fix 32-bit compilation error in util/annotate.c (Adrian Hunter)

  - Support static linking with libdw on Fedora 22 (Andi Kleen)

Infrastructure changes:

  - Add a helper function to probe whether cpu-wide tracing is possible (Adrian Hunter)

  - Move vfs_getname storage to per thread area in 'perf trace' (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge branch 'perf/urgent' into perf/core, to pick up fixes before adding more changes
Ingo Molnar [Thu, 20 Aug 2015 09:48:56 +0000 (11:48 +0200)]
Merge branch 'perf/urgent' into perf/core, to pick up fixes before adding more changes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoMerge tag 'perf-urgent-for-mingo' of git://git.kernel.org/pub/scm/linux/kernel/git...
Ingo Molnar [Thu, 20 Aug 2015 09:47:14 +0000 (11:47 +0200)]
Merge tag 'perf-urgent-for-mingo' of git://git./linux/kernel/git/acme/linux into perf/urgent

Pull perf/urgent fixes from Arnaldo Carvalho de Melo:

  - Fix buildid processing done at the end of a 'perf record' session, a
    problem that happened in workloads involving lots of small short-lived
    processes.  That code was not asking the perf_session layer to order
    the events.

    Make the code more robust to handle some of the problems with such
    out-of-order events and fix 'perf record' to ask for ordered events
    on systems where we have perf_event_attr.sample_id_all.  (Adrian Hunter)

  - Show backtrace when handling a SIGSEGV in 'perf top --stdio' (Arnaldo Carvalho de Melo)

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
8 years agoperf top: Show backtrace when handling a SIGSEGV on --stdio mode
Arnaldo Carvalho de Melo [Wed, 19 Aug 2015 18:16:08 +0000 (15:16 -0300)]
perf top: Show backtrace when handling a SIGSEGV on --stdio mode

It was just freezing instead of informing about the SEGV, fix it and
also print a backtrace, just like in the TUI mode and in 'perf trace'.

Tested by provoking a NULL deref when pressing 'z':

     0.31%  libc-2.20.so     [.] malloc_consolidate
     0.31%  ld-2.20.so       [.] _dl_relocate_object
     0.28%  cc1              [.] ht_lookup
     0.28%  cc1              [.] ira_init_register_move_cost
  perf: Segmentation fault
  Obtained 7 stack frames.
  perf(dump_stack+0x32) [0x4d69f2]
  perf(sighandler_dump_stack+0x29) [0x4d6a89]
  /lib64/libc.so.6(+0x34960) [0x7f5064333960]
  perf() [0x438790]
  /lib64/libpthread.so.0(+0x752a) [0x7f50663dd52a]
  /lib64/libc.so.6(clone+0x6d) [0x7f50643ff22d]
  #

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-pewrpzqd29rgmhu2wkk7fhww@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Fix buildid processing
Adrian Hunter [Wed, 19 Aug 2015 14:29:21 +0000 (17:29 +0300)]
perf tools: Fix buildid processing

After recording, 'perf record' post-processes the data to determine
which buildids are needed.

That processing must process the data in time order, if possible,
because otherwise dependent events, like forks and mmaps, will not make
sense.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1439994561-27436-4-git-send-email-adrian.hunter@intel.com
[ Moved the sample_id_add to after trying to open the events, use pr_warning ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Make fork event processing more resilient
Adrian Hunter [Wed, 19 Aug 2015 14:29:20 +0000 (17:29 +0300)]
perf tools: Make fork event processing more resilient

When processing a fork event, the tools lookup the parent thread by its
tid.  In a couple of cases, it is possible for that thread to have the
wrong pid.

That can happen if the data is being processed out of order, or if the
(fork) event that would have removed the erroneous thread was lost.

Assume the latter case, print a dump message, remove the erroneous
thread, create a new one with the correct pid, and keep going.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/1439994561-27436-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Avoid deadlock when map_groups are broken
Adrian Hunter [Wed, 19 Aug 2015 14:29:19 +0000 (17:29 +0300)]
perf tools: Avoid deadlock when map_groups are broken

Attempting to clone map groups onto themselves will deadlock.

It only happens because of other bugs, but the code should protect
itself anyway.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/1439994561-27436-2-git-send-email-adrian.hunter@intel.com
[ Use pr_debug() instead of dump_fprintf() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoMerge tag 'dmaengine-fix-4.2-rc8' of git://git.infradead.org/users/vkoul/slave-dma
Linus Torvalds [Tue, 18 Aug 2015 19:17:36 +0000 (12:17 -0700)]
Merge tag 'dmaengine-fix-4.2-rc8' of git://git.infradead.org/users/vkoul/slave-dma

Pull dmaengine fix from Vinod Koul:
 "We recently found issue with dma_request_slave_channel() API causing
  privatecnt value to go bad.  This is fixed by balancing the privatecnt"

* tag 'dmaengine-fix-4.2-rc8' of git://git.infradead.org/users/vkoul/slave-dma:
  dmaengine: fix balance of privatecnt inc/dec operations

8 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Tue, 18 Aug 2015 14:55:05 +0000 (07:55 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "These came in late last week, I wanted to look over the mst one before
  forwarding, but it seems good.

  Just three i915 and one MST fix"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Commit planes on each crtc separately.
  drm/i915: calculate primary visibility changes instead of calling from set_config
  drm/i915: Only dither on 6bpc panels
  drm/dp/mst: Remove port after removing connector.

8 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Linus Torvalds [Mon, 17 Aug 2015 23:26:30 +0000 (16:26 -0700)]
Merge tag 'for-linus' of git://git./linux/kernel/git/dledford/rdma

Pull rdma bugfix from Doug Ledford:
 "Bugfix in iw_cxgb4"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
  iw_cxgb4: gracefully handle unknown CQE status errors

8 years agoMerge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj...
Linus Torvalds [Mon, 17 Aug 2015 23:20:45 +0000 (16:20 -0700)]
Merge branch 'for-4.2-fixes' of git://git./linux/kernel/git/tj/libata

Pull libata fixes from Tejun Heo:
 "Three minor device-specific fixes and revert of NCQ autosense added
  during this -rc1.

  It turned out that NCQ autosense as currently implemented interferes
  with the usual error handling behavior.  It will be revisited in the
  near future"

* 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/libata:
  ata: ahci_brcmstb: Fix misuse of IS_ENABLED
  sata_sx4: Check return code from pdc20621_i2c_read()
  Revert "libata: Implement NCQ autosense"
  Revert "libata: Implement support for sense data reporting"
  Revert "libata-eh: Set 'information' field for autosense"
  ata: ahci_brcmstb: Fix warnings with CONFIG_PM_SLEEP=n

8 years agoMerge branch 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj...
Linus Torvalds [Mon, 17 Aug 2015 23:15:26 +0000 (16:15 -0700)]
Merge branch 'for-4.2-fixes' of git://git./linux/kernel/git/tj/cgroup

Pull cgroup fix from Tejun Heo:
 "A fix for a subtle bug introduced back during 3.17 cycle which
  interferes with setting configurations under specific conditions"

* 'for-4.2-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cpuset: use trialcs->mems_allowed as a temp variable

8 years agodmaengine: fix balance of privatecnt inc/dec operations
Robert Baldyga [Fri, 7 Aug 2015 10:26:47 +0000 (12:26 +0200)]
dmaengine: fix balance of privatecnt inc/dec operations

This patch increments privatecnt value and set DMA_PRIVATE in device
caps in dma_request_slave_channel() function. This is needed to keep
privatecnt increment/decrement balance.

As function dma_release_channel() decrements privatecnt counter, we need
to increment it when channel is requested. Otherwise privatecnt drops
into negatives after few dma_release_channel() calls.

Reported-by: Krzysztof Kozlowski <k.kozlowski@samsung.com>
Signed-off-by: Robert Baldyga <r.baldyga@samsung.com>
Signed-off-by: Vinod Koul <vinod.koul@intel.com>
8 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Linus Torvalds [Mon, 17 Aug 2015 14:57:46 +0000 (07:57 -0700)]
Merge git://git./linux/kernel/git/herbert/crypto-2.6

Pull crypto fixes from Herbert Xu:
 "This fixes the following issues:

   - a regression caused by the conversion of IPsec ESP to the new AEAD
     interface: ESN with authencesn no longer works because it relied on
     the AD input SG list having a specific layout which is no longer
     the case.  In linux-next authencesn is fixed properly and no longer
     assumes anything about the SG list format.  While for this release
     a minimal fix is applied to authencesn so that it works with the
     new linear layout.

   - fix memory corruption caused by bogus index in the caam hash code.

   - fix powerpc nx SHA hashing which could cause module load failures
     if module signature verification is enabled"

* git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6:
  crypto: caam - fix memory corruption in ahash_final_ctx
  crypto: nx - respect sg limit bounds when building sg lists for SHA
  crypto: authencesn - Fix breakage with new ESP code

8 years agoperf tools: Take Intel PT into use
Adrian Hunter [Fri, 17 Jul 2015 16:33:42 +0000 (19:33 +0300)]
perf tools: Take Intel PT into use

To record an AUX area, the weak function auxtrace_record__init() must be
implemented.

Equally to decode an AUX area, the AUX area tracing type must be added
to the perf_event__process_auxtrace_info() function.

This patch makes those two changes plus hooks up default config for the
intel_pt PMU.  Also some brief documentation is provided for using the
tools with intel_pt.

Commiter note:

E.g:

  [root@perf4 ~]# dmesg
  451 [0.405807] Performance Events: PEBS fmt2+, 16-deep LBR, Broadwell events, full-width counters, Intel PMU driver.
  [root@perf4 ~]# perf --version
  perf version 4.1.g53874a
  [root@perf4 ~]#  perf record -e intel_pt//u -a sleep 10
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.383 MB perf.data ]
  [root@perf4 ~]# perf evlist
  intel_pt//u
  sched:sched_switch
  dummy:u
  [root@perf4 ~]# perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 0  of event 'intel_pt//u'
  # Event count (approx.): 0
  #
  # Overhead  Command  Shared Object  Symbol
  # ........  .......  .............  ......
  #

  # Samples: 393  of event 'sched:sched_switch'
  # Event count (approx.): 393
  #
  # Overhead  Command         Shared Object     Symbol
  # ........  ..............  ................  ..............
    49.62%  swapper         [kernel.vmlinux]  [k] __schedule
    10.69%  rcu_sched       [kernel.vmlinux]  [k] __schedule
     6.62%  rcuos/0         [kernel.vmlinux]  [k] __schedule
     5.60%  kworker/0:1     [kernel.vmlinux]  [k] __schedule
     3.56%  rcuos/3         [kernel.vmlinux]  [k] __schedule
     3.05%  kworker/u384:2  [kernel.vmlinux]  [k] __schedule
     2.54%  kworker/2:0     [kernel.vmlinux]  [k] __schedule
     2.54%  tuned           [kernel.vmlinux]  [k] __schedule
  <SNIP>
  # Samples: 0  of event 'dummy:u'
  # Event count (approx.): 0
  #
  # Overhead  Command  Shared Object  Symbol
  # ........  .......  .............  ......

  # Samples: 28  of event 'instructions:u'
  # Event count (approx.): 5030172
  #
  # Overhead  Command     Shared Object        Symbol
  # ........  ..........  ...................  ................................
  #
    21.43%  tuned       libpython2.7.so.1.0  [.] PyEval_EvalFrameEx
                 |
                 ---PyEval_EvalFrameEx
                    |
                    |--83.33%-- PyEval_EvalCodeEx
                    |          PyEval_EvalFrameEx
                    |          |
                    |          |--60.00%-- PyEval_EvalCodeEx
                    |          |          PyEval_EvalFrameEx
                    |          |          PyEval_EvalFrameEx
                    |          |
                    |           --40.00%-- PyEval_EvalFrameEx
                    |
                     --16.67%-- PyEval_EvalFrameEx
                               PyEval_EvalCodeEx
                               PyEval_EvalFrameEx
                               PyEval_EvalCodeEx
                               PyEval_EvalFrameEx
                               PyEval_EvalFrameEx

    14.29%  tuned       libpython2.7.so.1.0  [.] _PyType_Lookup
                 |
                 ---_PyType_Lookup
                    _PyObject_GenericGetAttrWithDict
                    PyEval_EvalFrameEx
                    PyEval_EvalCodeEx
                    PyEval_EvalFrameEx
                    PyEval_EvalCodeEx
                    PyEval_EvalFrameEx
                    |
                    |--75.00%-- PyEval_EvalFrameEx
                    |
                     --25.00%-- PyEval_EvalCodeEx
                               PyEval_EvalFrameEx
                               PyEval_EvalFrameEx

     3.57%  irqbalance  irqbalance           [.] 0x0000000000004038
            |
            ---0x4038
               0x4761
               0x4761
               0x4761
               0x49f1
               0x2295

     3.57%  irqbalance  libc-2.17.so         [.] __GI_____strtoull_l_internal
            |
            ---__GI_____strtoull_l_internal
               0x6f49
               0x229a

     3.57%  irqbalance  libc-2.17.so         [.] __strchrnul
            |
            ---__strchrnul
               vfprintf
               __vsprintf_chk
               __sprintf_chk
               0x2724
               0x4038
               0x2331

     3.57%  irqbalance  libc-2.17.so         [.] __strstr_sse42
            |
            ---__strstr_sse42
               0x71e0
               0x229f

  # And now to some userspace ftrace on uninstrumented binaries 8-) :
  # Hand edited to make it a bit more compact, replacing /home/acme/bin/perf
  # with /bin/perf:

  [root@perf4 ~]# perf script
     perf 8921 [3] 7.310889: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310889: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310889: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310889: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310889: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310889: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.310889: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310889: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.310890: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310890: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310890: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310890: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.310890: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310890: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.310893: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310893: 1 branches:u:       4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310893: 1 branches:u:       4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310893: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310893: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.310893: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310893: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.310956: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310956: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310956: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310956: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310956: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310956: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.310956: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310956: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.310961: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310961: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310961: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310961: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.310961: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310961: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.310968: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310968: 1 branches:u:       4816a8 perf_evlist__enable (/bin/perf) => 4815f8 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310968: 1 branches:u:       4815fe perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310968: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.310968: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.310968: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.310968: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.311040: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311040: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311040: 1 branches:u:       481630 perf_evlist__enable (/bin/perf) => 4816d8 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311040: 1 branches:u:       4816de perf_evlist__enable (/bin/perf) => 48164f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311040: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311040: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.311040: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.311040: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.311046: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311046: 1 branches:u:       481694 perf_evlist__enable (/bin/perf) => 481614 perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311046: 1 branches:u:       481652 perf_evlist__enable (/bin/perf) => 48165f perf_evlist__enable (/bin/perf)
     perf 8921 [3] 7.311046: 1 branches:u:       481684 perf_evlist__enable (/bin/perf) => 41d250 ioctl@plt (/bin/perf)
     perf 8921 [3] 7.311046: 1 branches:u:       41d250 ioctl@plt (/bin/perf) => 7fcecadbf250 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.311046: 1 branches:u: 7fcecadbf255 __GI___ioctl (/usr/lib64/libc-2.17.so) => 0 [unknown] ([unknown])
     perf 8921 [3] 7.311050: 1 branches:u:            0 [unknown] ([unknown]) => 7fcecadbf257 __GI___ioctl (/usr/lib64/libc-2.17.so)
     perf 8921 [3] 7.311050: 1 branches:u: 7fcecadbf25f __GI___ioctl (/usr/lib64/libc-2.17.so) => 481689 perf_evlist__enable (/bin/perf)
:

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT support
Adrian Hunter [Fri, 17 Jul 2015 16:33:41 +0000 (19:33 +0300)]
perf tools: Add Intel PT support

Add support for Intel Processor Trace.

Intel PT support fits within the new auxtrace infrastructure.  Recording
is supporting by identifying the Intel PT PMU, parsing options and
setting up events.

Decoding is supported by queuing up trace data by cpu or thread and then
decoding synchronously delivering synthesized event samples into the
session processing for tools to consume.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT decoder
Adrian Hunter [Fri, 17 Jul 2015 16:33:40 +0000 (19:33 +0300)]
perf tools: Add Intel PT decoder

Add support for decoding an Intel Processor Trace.

Intel PT trace data must be 'decoded' which involves walking the object
code and matching the trace data packets.

The decoder requests a buffer of binary data via a get_trace()
call-back, which it decodes using instruction information which it gets
via another call-back walk_insn().

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-6-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT log
Adrian Hunter [Fri, 17 Jul 2015 16:33:39 +0000 (19:33 +0300)]
perf tools: Add Intel PT log

Add a facility to log Intel Processor Trace decoding.  The log is
intended for debugging purposes only.

The log file name is "intel_pt.log" and is opened in the current
directory.  The log contains a record of all packets and instructions
decoded and can get very large (10 MB would be a small one).

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT instruction decoder
Adrian Hunter [Thu, 13 Aug 2015 07:14:55 +0000 (10:14 +0300)]
perf tools: Add Intel PT instruction decoder

Add support for decoding instructions for Intel Processor Trace.  The
kernel x86 instruction decoder is copied for this.

This essentially provides intel_pt_get_insn() which takes a binary
buffer, uses the kernel's x86 instruction decoder to get details of the
instruction and then categorizes it for consumption by an Intel PT
decoder.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1439450095-30122-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add Intel PT packet decoder
Adrian Hunter [Fri, 17 Jul 2015 16:33:37 +0000 (19:33 +0300)]
perf tools: Add Intel PT packet decoder

Add support for decoding Intel Processor Trace packets.

This essentially provides intel_pt_get_packet() which takes a buffer of
binary data and returns the decoded packet.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf auxtrace: Add Intel PT as an AUX area tracing type
Adrian Hunter [Fri, 17 Jul 2015 16:33:36 +0000 (19:33 +0300)]
perf auxtrace: Add Intel PT as an AUX area tracing type

Add the Intel Processor Trace type constant PERF_AUXTRACE_INTEL_PT.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1437150840-31811-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf tools: Add a helper function to probe whether cpu-wide tracing is possible
Adrian Hunter [Thu, 13 Aug 2015 09:40:56 +0000 (12:40 +0300)]
perf tools: Add a helper function to probe whether cpu-wide tracing is possible

Add a helper function to probe whether cpu-wide tracing is possible.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1439458857-30636-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf symbols: Fix annotation of vdso
Adrian Hunter [Fri, 14 Aug 2015 12:50:06 +0000 (15:50 +0300)]
perf symbols: Fix annotation of vdso

Older kernels attempt to prelink vdso to its virtual address.  To permit
annotation using objdump, the map__rip_2objdump() calculation must
result in that same address which we can infer from the start and offset
of the text section.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Link: http://lkml.kernel.org/r/1439556606-11297-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf annotate: Fix 32-bit compilation error in util/annotate.c
Adrian Hunter [Fri, 14 Aug 2015 07:11:34 +0000 (10:11 +0300)]
perf annotate: Fix 32-bit compilation error in util/annotate.c

Fix the following 32-bit compilation errors:

  util/annotate.c: In function ‘addr_map_symbol__account_cycles’:
  util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 4 has type ‘u64’ [-Werror=format=]
    pr_debug2("BB with bad start: addr %lx start %lx sym %lx saddr %lx\n",
      ^
  util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 5 has type ‘u64’ [-Werror=format=]
  util/annotate.c:643:3: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 6 has type ‘u64’ [-Werror=format=]

These were introduced by the patch:

"perf report: Add infrastructure for a cycles histogram"

Also change the 'saddr' variable from 'unsigned long' to 'u64'
noting that theoretically we could be processing data captured
on a 64-bit machine but processing it on a 32-bit machine.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Fixes: d4957633bf9d ("perf report: Add infrastructure for a cycles histogram")
Link: http://lkml.kernel.org/r/1439536294-18241-1-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoperf script: Initialize callchain_param.record_mode
Jiri Olsa [Thu, 13 Aug 2015 07:17:24 +0000 (09:17 +0200)]
perf script: Initialize callchain_param.record_mode

Milian Wolff reported non functional DWARF unwind under perf script. The
reason is that perf script does not properly configure
callchain_param.record_mode, which is needed by unwind code.

Stealing the code from report and leaving the place for more
initialization code in a hope we could merge it with
report__setup_sample_type one day.

Reported-by: Milian Wolff <mail@milianw.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Milian Wolff <milian.wolff@kdab.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20150813071724.GA21322@krava.brq.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoLinux 4.2-rc7
Linus Torvalds [Sun, 16 Aug 2015 23:34:13 +0000 (16:34 -0700)]
Linux 4.2-rc7

8 years agoMerge tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm...
Linus Torvalds [Sun, 16 Aug 2015 22:44:33 +0000 (15:44 -0700)]
Merge tag 'armsoc-for-linus' of git://git./linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "A smallish batch of fixes, a little more than expected this late, but
  all fixes are contained to their platforms and seem reasonably low
  risk:

   - a somewhat large SMP fix for ux500 that still seemed warranted to
     include here
   - OMAP DT fixes for pbias regulator specification that broke due to
     some DT reshuffling
   - PCIe IRQ routing bugfix for i.MX
   - networking fixes for keystone
   - runtime PM for OMAP GPMC
   - a couple of error path bug fixes for exynos"

* tag 'armsoc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc:
  ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
  ARM: dts: keystone: fix the clock node for mdio
  memory: omap-gpmc: Don't try to save uninitialized GPMC context
  ARM: imx6: correct i.MX6 PCIe interrupt routing
  ARM: ux500: add an SMP enablement type and move cpu nodes
  ARM: dts: dra7: Fix broken pbias device creation
  ARM: dts: OMAP5: Fix broken pbias device creation
  ARM: dts: OMAP4: Fix broken pbias device creation
  ARM: dts: omap243x: Fix broken pbias device creation
  ARM: EXYNOS: fix double of_node_put() on error path
  ARM: EXYNOS: Fix potentian kfree() of ro memory

8 years agoMerge branch 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus
Linus Torvalds [Sun, 16 Aug 2015 22:39:31 +0000 (15:39 -0700)]
Merge branch 'upstream' of git://git.linux-mips.org/ralf/upstream-linus

Pull MIPS bugfix from Ralf Baechle:
 "Only a single MIPS fix - the math when invoking syscall_trace_enter
  was wrong"

* 'upstream' of git://git.linux-mips.org/pub/scm/ralf/upstream-linus:
  MIPS: Fix seccomp syscall argument for MIPS64

8 years agoMerge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 16 Aug 2015 22:11:25 +0000 (15:11 -0700)]
Merge branch 'x86-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Merge x86 fixes from Ingo Molnar:
 "Two followup fixes related to the previous LDT fix"

Also applied a further FPU emulation fix from Andy Lutomirski to the
branch before actually merging it.

* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
  x86/ldt: Further fix FPU emulation
  x86/ldt: Correct FPU emulation access to LDT
  x86/ldt: Correct LDT access in single stepping logic

8 years agox86/ldt: Further fix FPU emulation
Andy Lutomirski [Fri, 14 Aug 2015 22:02:55 +0000 (15:02 -0700)]
x86/ldt: Further fix FPU emulation

The previous fix confused a selector with a segment prefix.  Fix it.

Compile-tested only.

Cc: stable@vger.kernel.org
Cc: Juergen Gross <jgross@suse.com>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Fixes: 4809146b86c3 ("x86/ldt: Correct FPU emulation access to LDT")
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agofs/fuse: fix ioctl type confusion
Jann Horn [Sun, 16 Aug 2015 18:27:01 +0000 (20:27 +0200)]
fs/fuse: fix ioctl type confusion

fuse_dev_ioctl() performed fuse_get_dev() on a user-supplied fd,
leading to a type confusion issue. Fix it by checking file->f_op.

Signed-off-by: Jann Horn <jann@thejh.net>
Acked-by: Miklos Szeredi <miklos@szeredi.hu>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'keystone-dts-late-fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel...
Olof Johansson [Sun, 16 Aug 2015 19:29:57 +0000 (21:29 +0200)]
Merge tag 'keystone-dts-late-fixes-v2' of git://git./linux/kernel/git/ssantosh/linux-keystone into fixes

ARM: Couple of Keysyone MDIO DTS fixes for 4.2-rc6+

These are necessary to get the NIC card working on all Keystone
EVMs. Couple of boards are broken without these two fixes.

* tag 'keystone-dts-late-fixes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/ssantosh/linux-keystone:
  ARM: dts: keystone: Fix the mdio bindings by moving it to soc specific file
  ARM: dts: keystone: fix the clock node for mdio

Signed-off-by: Olof Johansson <olof@lixom.net>
8 years agoMIPS: Fix seccomp syscall argument for MIPS64
Markos Chandras [Thu, 13 Aug 2015 07:47:59 +0000 (08:47 +0100)]
MIPS: Fix seccomp syscall argument for MIPS64

Commit 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
fixed indirect system calls on O32 but it also introduced a bug for MIPS64
where it erroneously modified the v0 (syscall) register with the assumption
that the sycall offset hasn't been taken into consideration. This breaks
seccomp on MIPS64 n64 and n32 ABIs. We fix this by replacing the addition
with a move instruction.

Fixes: 4c21b8fd8f14 ("MIPS: seccomp: Handle indirect system calls (o32)")
Cc: <stable@vger.kernel.org> # 3.15+
Reviewed-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
Cc: linux-mips@linux-mips.org
Patchwork: https://patchwork.linux-mips.org/patch/10951/
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>
8 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 15 Aug 2015 20:54:53 +0000 (13:54 -0700)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "This has two libfc fixes for bugs causing rare crashes, one iscsi fix
  for a potential hang on shutdown, and a fix for an I/O blocksize issue
  which caused a regression"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  sd: Fix maximum I/O size for BLOCK_PC requests
  libfc: Fix fc_fcp_cleanup_each_cmd()
  libfc: Fix fc_exch_recv_req() error path
  libiscsi: Fix host busy blocking during connection teardown

8 years agoMerge tag 'topic/drm-fixes-2015-08-14' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Sat, 15 Aug 2015 04:52:12 +0000 (14:52 +1000)]
Merge tag 'topic/drm-fixes-2015-08-14' of git://anongit.freedesktop.org/drm-intel into drm-next

single MST fixes from Maarten.

* tag 'topic/drm-fixes-2015-08-14' of git://anongit.freedesktop.org/drm-intel:
  drm/dp/mst: Remove port after removing connector.

8 years agoMerge tag 'drm-intel-fixes-2015-08-14' of git://anongit.freedesktop.org/drm-intel...
Dave Airlie [Sat, 15 Aug 2015 04:51:31 +0000 (14:51 +1000)]
Merge tag 'drm-intel-fixes-2015-08-14' of git://anongit.freedesktop.org/drm-intel into drm-next

three display fixes for Intel.

* tag 'drm-intel-fixes-2015-08-14' of git://anongit.freedesktop.org/drm-intel:
  drm/i915: Commit planes on each crtc separately.
  drm/i915: calculate primary visibility changes instead of calling from set_config
  drm/i915: Only dither on 6bpc panels

8 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Sat, 15 Aug 2015 00:27:52 +0000 (17:27 -0700)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fixes from Paolo Bonzini:
 "Just two very small & simple patches"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: x86: Use adjustment in guest cycles when handling MSR_IA32_TSC_ADJUST
  KVM: x86: zero IDT limit on entry to SMM

8 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 15 Aug 2015 00:05:26 +0000 (17:05 -0700)]
Merge branch 'akpm' (patches from Andrew)

Merge fixes from Andrew Morton:
 "11 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  Update maintainers for DRM STI driver
  mm: cma: mark cma_bitmap_maxno() inline in header
  zram: fix pool name truncation
  memory-hotplug: fix wrong edge when hot add a new node
  .mailmap: Andrey Ryabinin has moved
  ipc/sem.c: update/correct memory barriers
  mm/hwpoison: fix panic due to split huge zero page
  ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
  ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
  mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
  mm/hwpoison: fix page refcount of unknown non LRU page

8 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 14 Aug 2015 23:10:04 +0000 (16:10 -0700)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clock fix from Stephen Boyd:
 "A one-liner for a regression found in the PXA clock driver"

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: pxa: pxa3xx: fix CKEN register access

8 years agoUpdate maintainers for DRM STI driver
Benjamin Gaignard [Fri, 14 Aug 2015 22:35:24 +0000 (15:35 -0700)]
Update maintainers for DRM STI driver

Add Vincent Abriou and myself as maintainers.

Signed-off-by: Benjamin Gaignard <benjamin.gaignard@linaro.org>
Cc: Vincent Abriou <vincent.abriou@st.com>
Cc: Dave Airlie <airlied@linux.ie>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm: cma: mark cma_bitmap_maxno() inline in header
Gregory Fong [Fri, 14 Aug 2015 22:35:21 +0000 (15:35 -0700)]
mm: cma: mark cma_bitmap_maxno() inline in header

cma_bitmap_maxno() was marked as static and not static inline, which can
cause warnings about this function not being used if this file is included
in a file that does not call that function, and violates the conventions
used elsewhere.  The two options are to move the function implementation
back to mm/cma.c or make it inline here, and it's simple enough for the
latter to make sense.

Signed-off-by: Gregory Fong <gregory.0xf0@gmail.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agozram: fix pool name truncation
Sergey Senozhatsky [Fri, 14 Aug 2015 22:35:19 +0000 (15:35 -0700)]
zram: fix pool name truncation

zram_meta_alloc() constructs a pool name for zs_create_pool() call as

    snprintf(pool_name, sizeof(pool_name), "zram%d", device_id);

However, it defines pool name buffer to be only 8 bytes long (minus
trailing zero), which means that we can have only 1000 pool names: zram0
-- zram999.

With CONFIG_ZSMALLOC_STAT enabled an attempt to create a device zram1000
can fail if device zram100 already exists, because snprintf() will
truncate new pool name to zram100 and pass it debugfs_create_dir(),
causing:

  debugfs dir <zram100> creation failed
  zram: Error creating memory pool

... and so on.

Fix it by passing zram->disk->disk_name to zram_meta_alloc() instead of
divice_id.  We construct zram%d name earlier and keep it as a ->disk_name,
no need to snprintf() it again.

Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Minchan Kim <minchan@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomemory-hotplug: fix wrong edge when hot add a new node
Xishi Qiu [Fri, 14 Aug 2015 22:35:16 +0000 (15:35 -0700)]
memory-hotplug: fix wrong edge when hot add a new node

When we add a new node, the edge of memory may be wrong.

e.g. system has 4 nodes, and node3 is movable, node3 mem:[24G-32G],

1. hotremove the node3,
2. then hotadd node3 with a part of memory, mem:[26G-30G],
3. call hotadd_new_pgdat()
        free_area_init_node()
                get_pfn_range_for_nid()
4. it will return wrong start_pfn and end_pfn, because we have not
update the memblock.

This patch also fixes a BUG_ON during hot-addition, please see
http://marc.info/?l=linux-kernel&m=142961156129456&w=2

Signed-off-by: Xishi Qiu <qiuxishi@huawei.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Tang Chen <tangchen@cn.fujitsu.com>
Cc: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years ago.mailmap: Andrey Ryabinin has moved
Andrey Ryabinin [Fri, 14 Aug 2015 22:35:13 +0000 (15:35 -0700)]
.mailmap: Andrey Ryabinin has moved

Update my email address.

Signed-off-by: Andrey Ryabinin <ryabinin.a.a@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc/sem.c: update/correct memory barriers
Manfred Spraul [Fri, 14 Aug 2015 22:35:10 +0000 (15:35 -0700)]
ipc/sem.c: update/correct memory barriers

sem_lock() did not properly pair memory barriers:

!spin_is_locked() and spin_unlock_wait() are both only control barriers.
The code needs an acquire barrier, otherwise the cpu might perform read
operations before the lock test.

As no primitive exists inside <include/spinlock.h> and since it seems
noone wants another primitive, the code creates a local primitive within
ipc/sem.c.

With regards to -stable:

The change of sem_wait_array() is a bugfix, the change to sem_lock() is a
nop (just a preprocessor redefinition to improve the readability).  The
bugfix is necessary for all kernels that use sem_wait_array() (i.e.:
starting from 3.10).

Signed-off-by: Manfred Spraul <manfred@colorfullife.com>
Reported-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Cc: Kirill Tkhai <ktkhai@parallels.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: <stable@vger.kernel.org> [3.10+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/hwpoison: fix panic due to split huge zero page
Wanpeng Li [Fri, 14 Aug 2015 22:35:08 +0000 (15:35 -0700)]
mm/hwpoison: fix panic due to split huge zero page

Bug:

  ------------[ cut here ]------------
  kernel BUG at mm/huge_memory.c:1957!
  invalid opcode: 0000 [#1] SMP
  Modules linked in: snd_hda_codec_hdmi i915 rpcsec_gss_krb5 snd_hda_codec_realtek snd_hda_codec_generic nfsv4 dns_re
  CPU: 2 PID: 2576 Comm: test_huge Not tainted 4.2.0-rc5-mm1+ #27
  Hardware name: Dell Inc. OptiPlex 7020/0F5C5X, BIOS A03 01/08/2015
  task: ffff880204e3d600 ti: ffff8800db16c000 task.ti: ffff8800db16c000
  RIP: split_huge_page_to_list+0xdb/0x120
  Call Trace:
    memory_failure+0x32e/0x7c0
    madvise_hwpoison+0x8b/0x160
    SyS_madvise+0x40/0x240
    ? do_page_fault+0x37/0x90
    entry_SYSCALL_64_fastpath+0x12/0x71
  Code: ff f0 41 ff 4c 24 30 74 0d 31 c0 48 83 c4 08 5b 41 5c 41 5d c9 c3 4c 89 e7 e8 e2 58 fd ff 48 83 c4 08 31 c0
  RIP  split_huge_page_to_list+0xdb/0x120
   RSP <ffff8800db16fde8>
  ---[ end trace aee7ce0df8e44076 ]---

Testcase:

    #define _GNU_SOURCE
    #include <stdlib.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/types.h>
    #include <errno.h>
    #include <string.h>

    #define MB 1024*1024

    int main(void)
    {
            char *mem;

            posix_memalign((void **)&mem, 2 * MB, 200 * MB);

            madvise(mem, 200 * MB, MADV_HWPOISON);

            free(mem);

            return 0;
    }

Huge zero page is allocated if page fault w/o FAULT_FLAG_WRITE flag.
The get_user_pages_fast() which called in madvise_hwpoison() will get
huge zero page if the page is not allocated before.  Huge zero page is a
tranparent huge page, however, it is not an anonymous page.
memory_failure will split the huge zero page and trigger
BUG_ON(is_huge_zero_page(page));

After commit 98ed2b0052e6 ("mm/memory-failure: give up error handling
for non-tail-refcounted thp"), memory_failure will not catch non anon
thp from madvise_hwpoison path and this bug occur.

Fix it by catching non anon thp in memory_failure in order to not split
huge zero page in madvise_hwpoison path.

After this patch:

  Injecting memory failure for page 0x202800 at 0x7fd8ae800000
  MCE: 0x202800: non anonymous thp
  [...]

[akpm@linux-foundation.org: remove second split, per Wanpeng]
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()
Herton R. Krzesinski [Fri, 14 Aug 2015 22:35:05 +0000 (15:35 -0700)]
ipc,sem: remove uneeded sem_undo_list lock usage in exit_sem()

After we acquire the sma->sem_perm lock in exit_sem(), we are protected
against a racing IPC_RMID operation.  Also at that point, we are the last
user of sem_undo_list.  Therefore it isn't required that we acquire or use
ulp->lock.

Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rafael Aquini <aquini@redhat.com>
CC: Aristeu Rozanski <aris@redhat.com>
Cc: David Jeffery <djeffery@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits
Herton R. Krzesinski [Fri, 14 Aug 2015 22:35:02 +0000 (15:35 -0700)]
ipc,sem: fix use after free on IPC_RMID after a task using same semaphore set exits

The current semaphore code allows a potential use after free: in
exit_sem we may free the task's sem_undo_list while there is still
another task looping through the same semaphore set and cleaning the
sem_undo list at freeary function (the task called IPC_RMID for the same
semaphore set).

For example, with a test program [1] running which keeps forking a lot
of processes (which then do a semop call with SEM_UNDO flag), and with
the parent right after removing the semaphore set with IPC_RMID, and a
kernel built with CONFIG_SLAB, CONFIG_SLAB_DEBUG and
CONFIG_DEBUG_SPINLOCK, you can easily see something like the following
in the kernel log:

   Slab corruption (Not tainted): kmalloc-64 start=ffff88003b45c1c0, len=64
   000: 6b 6b 6b 6b 6b 6b 6b 6b 00 6b 6b 6b 6b 6b 6b 6b  kkkkkkkk.kkkkkkk
   010: ff ff ff ff 6b 6b 6b 6b ff ff ff ff ff ff ff ff  ....kkkk........
   Prev obj: start=ffff88003b45c180, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff c0 fb 01 37 00 88 ff ff  ...........7....
   Next obj: start=ffff88003b45c200, len=64
   000: 00 00 00 00 ad 4e ad de ff ff ff ff 5a 5a 5a 5a  .....N......ZZZZ
   010: ff ff ff ff ff ff ff ff 68 29 a7 3c 00 88 ff ff  ........h).<....
   BUG: spinlock wrong CPU on CPU#2, test/18028
   general protection fault: 0000 [#1] SMP
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 2 PID: 18028 Comm: test Not tainted 4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: spin_dump+0x53/0xc0
   Call Trace:
     spin_bug+0x30/0x40
     do_raw_spin_unlock+0x71/0xa0
     _raw_spin_unlock+0xe/0x10
     freeary+0x82/0x2a0
     ? _raw_spin_lock+0xe/0x10
     semctl_down.clone.0+0xce/0x160
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     SyS_semctl+0x236/0x2c0
     ? syscall_trace_leave+0xde/0x130
     entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 8b 80 88 03 00 00 48 8d 88 60 05 00 00 48 c7 c7 a0 2c a4 81 31 c0 65 8b 15 eb 40 f3 7e e8 08 31 68 00 4d 85 e4 44 8b 4b 08 74 5e <45> 8b 84 24 88 03 00 00 49 8d 8c 24 60 05 00 00 8b 53 04 48 89
   RIP  [<ffffffff810d6053>] spin_dump+0x53/0xc0
    RSP <ffff88003750fd68>
   ---[ end trace 783ebb76612867a0 ]---
   NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [test:18053]
   Modules linked in: 8021q mrp garp stp llc nf_conntrack_ipv4 nf_defrag_ipv4 ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables binfmt_misc ppdev input_leds joydev parport_pc parport floppy serio_raw virtio_balloon virtio_rng virtio_console virtio_net iosf_mbi crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcspkr qxl ttm drm_kms_helper drm snd_hda_codec_generic i2c_piix4 snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore crc32c_intel virtio_pci virtio_ring virtio pata_acpi ata_generic [last unloaded: speedstep_lib]
   CPU: 3 PID: 18053 Comm: test Tainted: G      D         4.2.0-rc5+ #1
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.8.1-20150318_183358- 04/01/2014
   RIP: native_read_tsc+0x0/0x20
   Call Trace:
     ? delay_tsc+0x40/0x70
     __delay+0xf/0x20
     do_raw_spin_lock+0x96/0x140
     _raw_spin_lock+0xe/0x10
     sem_lock_and_putref+0x11/0x70
     SYSC_semtimedop+0x7bf/0x960
     ? handle_mm_fault+0xbf6/0x1880
     ? dequeue_task_fair+0x79/0x4a0
     ? __do_page_fault+0x19a/0x430
     ? kfree_debugcheck+0x16/0x40
     ? __do_page_fault+0x19a/0x430
     ? __audit_syscall_entry+0xa8/0x100
     ? do_audit_syscall_entry+0x66/0x70
     ? syscall_trace_enter_phase1+0x139/0x160
     SyS_semtimedop+0xe/0x10
     SyS_semop+0x10/0x20
     entry_SYSCALL_64_fastpath+0x12/0x71
   Code: 47 10 83 e8 01 85 c0 89 47 10 75 08 65 48 89 3d 1f 74 ff 7e c9 c3 0f 1f 44 00 00 55 48 89 e5 e8 87 17 04 00 66 90 c9 c3 0f 1f 00 <55> 48 89 e5 0f 31 89 c1 48 89 d0 48 c1 e0 20 89 c9 48 09 c8 c9
   Kernel panic - not syncing: softlockup: hung tasks

I wasn't able to trigger any badness on a recent kernel without the
proper config debugs enabled, however I have softlockup reports on some
kernel versions, in the semaphore code, which are similar as above (the
scenario is seen on some servers running IBM DB2 which uses semaphore
syscalls).

The patch here fixes the race against freeary, by acquiring or waiting
on the sem_undo_list lock as necessary (exit_sem can race with freeary,
while freeary sets un->semid to -1 and removes the same sem_undo from
list_proc or when it removes the last sem_undo).

After the patch I'm unable to reproduce the problem using the test case
[1].

[1] Test case used below:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/sem.h>
    #include <sys/wait.h>
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <errno.h>

    #define NSEM 1
    #define NSET 5

    int sid[NSET];

    void thread()
    {
            struct sembuf op;
            int s;
            uid_t pid = getuid();

            s = rand() % NSET;
            op.sem_num = pid % NSEM;
            op.sem_op = 1;
            op.sem_flg = SEM_UNDO;

            semop(sid[s], &op, 1);
            exit(EXIT_SUCCESS);
    }

    void create_set()
    {
            int i, j;
            pid_t p;
            union {
                    int val;
                    struct semid_ds *buf;
                    unsigned short int *array;
                    struct seminfo *__buf;
            } un;

            /* Create and initialize semaphore set */
            for (i = 0; i < NSET; i++) {
                    sid[i] = semget(IPC_PRIVATE , NSEM, 0644 | IPC_CREAT);
                    if (sid[i] < 0) {
                            perror("semget");
                            exit(EXIT_FAILURE);
                    }
            }
            un.val = 0;
            for (i = 0; i < NSET; i++) {
                    for (j = 0; j < NSEM; j++) {
                            if (semctl(sid[i], j, SETVAL, un) < 0)
                                    perror("semctl");
                    }
            }

            /* Launch threads that operate on semaphore set */
            for (i = 0; i < NSEM * NSET * NSET; i++) {
                    p = fork();
                    if (p < 0)
                            perror("fork");
                    if (p == 0)
                            thread();
            }

            /* Free semaphore set */
            for (i = 0; i < NSET; i++) {
                    if (semctl(sid[i], NSEM, IPC_RMID))
                            perror("IPC_RMID");
            }

            /* Wait for forked processes to exit */
            while (wait(NULL)) {
                    if (errno == ECHILD)
                            break;
            };
    }

    int main(int argc, char **argv)
    {
            pid_t p;

            srand(time(NULL));

            while (1) {
                    p = fork();
                    if (p < 0) {
                            perror("fork");
                            exit(EXIT_FAILURE);
                    }
                    if (p == 0) {
                            create_set();
                            goto end;
                    }

                    /* Wait for forked processes to exit */
                    while (wait(NULL)) {
                            if (errno == ECHILD)
                                    break;
                    };
            }
    end:
            return 0;
    }

[akpm@linux-foundation.org: use normal comment layout]
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Acked-by: Manfred Spraul <manfred@colorfullife.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Rafael Aquini <aquini@redhat.com>
CC: Aristeu Rozanski <aris@redhat.com>
Cc: David Jeffery <djeffery@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held
Wanpeng Li [Fri, 14 Aug 2015 22:34:59 +0000 (15:34 -0700)]
mm/hwpoison: fix fail isolate hugetlbfs page w/ refcount held

Hugetlbfs pages will get a refcount in get_any_page() or
madvise_hwpoison() if soft offlining through madvise.  The refcount which
is held by the soft offline path should be released if we fail to isolate
hugetlbfs pages.

Fix it by reducing the refcount for both isolation success and failure.

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [3.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agomm/hwpoison: fix page refcount of unknown non LRU page
Wanpeng Li [Fri, 14 Aug 2015 22:34:56 +0000 (15:34 -0700)]
mm/hwpoison: fix page refcount of unknown non LRU page

After trying to drain pages from pagevec/pageset, we try to get reference
count of the page again, however, the reference count of the page is not
reduced if the page is still not on LRU list.

Fix it by adding the put_page() to drop the page reference which is from
__get_any_page().

Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: <stable@vger.kernel.org> [3.9+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 Aug 2015 18:06:43 +0000 (11:06 -0700)]
Merge branch 'timers-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull timer fix from Ingo Molnar:
 "A single clocksource driver suspend/resume fix"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clockevents/drivers/sh_cmt: Only perform clocksource suspend/resume if enabled

8 years agoMerge branch 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 Aug 2015 17:57:16 +0000 (10:57 -0700)]
Merge branch 'perf-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull perf fixes from Ingo Molnar:
 "Misc fixes: PMU driver corner cases, tooling fixes, and an 'AUX'
  (Intel PT) race related core fix"

* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/cqm: Do not access cpu_data() from CPU_UP_PREPARE handler
  perf/x86/intel: Fix memory leak on hot-plug allocation fail
  perf: Fix PERF_EVENT_IOC_PERIOD migration race
  perf: Fix double-free of the AUX buffer
  perf: Fix fasync handling on inherited events
  perf tools: Fix test build error when bindir contains double slash
  perf stat: Fix transaction lenght metrics
  perf: Fix running time accounting

8 years agoMerge branch 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Fri, 14 Aug 2015 17:45:23 +0000 (10:45 -0700)]
Merge branch 'locking-urgent-for-linus' of git://git./linux/kernel/git/tip/tip

Pull locking fix from Ingo Molnar:
 "A single fix for a locking self-test crash"

* 'locking-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  locking/pvqspinlock: Fix kernel panic in locking-selftest

8 years agoMerge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 14 Aug 2015 17:39:32 +0000 (10:39 -0700)]
Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Back from holidays, found these in the cracks: one nouveau revert, one
  vmwgfx locking fix and a bunch of exynos fixes"

* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
  Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"
  drm/vmwgfx: Fix execbuf locking issues
  drm/exynos/fimc: fix runtime pm support
  drm/exynos/mixer: always update INT_EN cache
  drm/exynos/mixer: correct vsync configuration sequence
  drm/exynos/mixer: fix interrupt clearing
  drm/exynos/hdmi: fix edid memory leak
  drm/exynos: gsc: fix wrong bitwise operation for swap detection

8 years agoperf trace: Move vfs_getname storage to per thread area
Arnaldo Carvalho de Melo [Fri, 14 Aug 2015 16:16:27 +0000 (13:16 -0300)]
perf trace: Move vfs_getname storage to per thread area

We were storing the vfs_getname payload (i.e. ptr->string) into
the trace wide storage area (struct trace), so that we could use the
last payload when setting up the fd->pathname per thread tables, oops,
not a good idea for multi cpu tracing sessions...

Fix it by moving it to the per thread area (struct thread_trace).

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: David Ahern <dsahern@gmail.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/n/tip-3j05ttqyaem7kh7oubvr1keo@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
8 years agoRevert "drm/nouveau/fifo/gk104: kick channels when deactivating them"
Alexandre Courbot [Wed, 12 Aug 2015 04:17:38 +0000 (13:17 +0900)]
Revert "drm/nouveau/fifo/gk104: kick channels when deactivating them"

This reverts commit 1addc1264852

This commit seems to cause crashes in gk104_fifo_intr_runlist() by
returning 0xbad0da00 when register 0x2a00 is read. Since this commit was
intended for GM20B which is not completely supported yet, let's revert
it for the time being.

Reported-by: Eric Biggers <ebiggers3@gmail.com>
Signed-off-by: Alexandre Courbot <acourbot@nvidia.com>
Tested-by: Afzal Mohammed <afzal.mohd.ma@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agodrm/vmwgfx: Fix execbuf locking issues
Thomas Hellstrom [Wed, 12 Aug 2015 05:31:17 +0000 (22:31 -0700)]
drm/vmwgfx: Fix execbuf locking issues

This addresses two issues that cause problems with viewperf maya-03 in
situation with memory pressure.

The first issue causes attempts to unreserve buffers if batched
reservation fails due to, for example, a signal pending. While previously
the ttm_eu api was resistant against this type of error, it is no longer
and the lockdep code will complain about attempting to unreserve buffers
that are not reserved. The issue is resolved by avoid calling
ttm_eu_backoff_reservation in the buffer reserve error path.

The second issue is that the binding_mutex may be held when user-space
fence objects are created and hence during memory reclaims. This may cause
recursive attempts to grab the binding mutex. The issue is resolved by not
holding the binding mutex across fence creation and submission.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
Reviewed-by: Sinclair Yeh <syeh@vmware.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
8 years agoMerge branch 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git...
Dave Airlie [Thu, 13 Aug 2015 23:47:07 +0000 (09:47 +1000)]
Merge branch 'exynos-drm-fixes' of git://git./linux/kernel/git/daeinki/drm-exynos into drm-fixes

   This pull request fixes memory leak and some issues related to
   mixer and gscaler driver issues.

* 'exynos-drm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/daeinki/drm-exynos:
  drm/exynos/fimc: fix runtime pm support
  drm/exynos/mixer: always update INT_EN cache
  drm/exynos/mixer: correct vsync configuration sequence
  drm/exynos/mixer: fix interrupt clearing
  drm/exynos/hdmi: fix edid memory leak
  drm/exynos: gsc: fix wrong bitwise operation for swap detection

8 years agoMerge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm
Linus Torvalds [Thu, 13 Aug 2015 23:34:56 +0000 (16:34 -0700)]
Merge branch 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm

Pull ARM fixes from Russell King:
 "Another few small ARM fixes, mostly addressing some VDSO issues"

* 'fixes' of git://ftp.arm.linux.org.uk/~rmk/linux-arm:
  ARM: 8410/1: VDSO: fix coarse clock monotonicity regression
  ARM: 8409/1: Mark ret_fast_syscall as a function
  ARM: 8408/1: Fix the secondary_startup function in Big Endian case
  ARM: 8405/1: VDSO: fix regression with toolchains lacking ld.bfd executable

8 years agox86: fix error handling for 32-bit compat out-of-range system call numbers
Linus Torvalds [Thu, 13 Aug 2015 23:19:44 +0000 (16:19 -0700)]
x86: fix error handling for 32-bit compat out-of-range system call numbers

Commit 3f5159a9221f ("x86/asm/entry/32: Update -ENOSYS handling to match
the 64-bit logic") broke the ENOSYS handling for the 32-bit compat case.
The proper error return value was never loaded into %rax, except if
things just happened to go through the audit paths, which ended up
reloading the return value.

This moves the loading or %rax into the normal system call path, just to
make sure the error case triggers it.  It's kind of sad, since it adds a
useless instruction to reload the register to the fast path, but it's
not like that single load from the stack is going to be noticeable.

Reported-by: David Drysdale <drysdale@google.com>
Tested-by: Kees Cook <keescook@chromium.org>
Acked-by: Andy Lutomirski <luto@amacapital.net>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
8 years agoMerge tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device...
Linus Torvalds [Thu, 13 Aug 2015 20:52:46 +0000 (13:52 -0700)]
Merge tag 'dm-4.2-fixes-5' of git://git./linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mike Snitzer:

 - two stable fixes for corruption seen in a snapshot of thinp metadata;
   metadata snapshots aren't widely used but help provide a consistent
   view of the metadata associated with an active thin-pool.

 - a dm-cache fix for the 4.2 "default" policy switch from "mq" to "smq"

* tag 'dm-4.2-fixes-5' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm cache policy smq: move 'dm-cache-default' module alias to SMQ
  dm btree: add ref counting ops for the leaves of top level btrees
  dm thin metadata: delete btrees when releasing metadata snapshot

8 years agoMerge branch 'for-linus' of git://git.kernel.dk/linux-block
Linus Torvalds [Thu, 13 Aug 2015 20:44:32 +0000 (13:44 -0700)]
Merge branch 'for-linus' of git://git.kernel.dk/linux-block

Pull xen block driver fixes from Jens Axboe:
 "A few small bug fixes for xen-blk{front,back} that have been sitting
  over my vacation"

* 'for-linus' of git://git.kernel.dk/linux-block:
  xen-blkback: replace work_pending with work_busy in purge_persistent_gnt()
  xen-blkfront: don't add indirect pages to list when !feature_persistent
  xen-blkfront: introduce blkfront_gather_backend_features()

8 years agoMerge tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 13 Aug 2015 20:36:22 +0000 (13:36 -0700)]
Merge tag 'for-linus-4.2-rc6-tag' of git://git./linux/kernel/git/xen/tip

Pull xen bug fixes from David Vrabel:

 - revert a fix from 4.2-rc5 that was causing lots of WARNING spam.

 - fix a memory leak affecting backends in HVM guests.

 - fix PV domU hang with certain configurations.

* tag 'for-linus-4.2-rc6-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
  xen/xenbus: Don't leak memory when unmapping the ring on HVM backend
  Revert "xen/events/fifo: Handle linked events when closing a port"
  x86/xen: build "Xen PV" APIC driver for domU as well