OSDN Git Service

tomoyo/tomoyo-test1.git
2 years agoperf arm64: Inject missing frames when using 'perf record --call-graph=fp'
Alexandre Truong [Fri, 17 Dec 2021 15:45:20 +0000 (15:45 +0000)]
perf arm64: Inject missing frames when using 'perf record --call-graph=fp'

When unwinding using frame pointers on ARM64, the return address of the
current function may not have been pushed into the stack when a function
was interrupted, which makes perf show an incorrect call graph to the
user.

Consider the following example program:

  void leaf() {
      /* long computation */
  }

  void parent() {
      // (1)
      leaf();
      // (2)
  }

  ... could be compiled into (using gcc -fno-inline -fno-omit-frame-pointer):

  leaf:
      /* long computation */
      nop
      ret
  parent:
      // (1)
      stp     x29, x30, [sp, -16]!
      mov     x29, sp
      bl      parent
      nop
      ldp     x29, x30, [sp], 16
      // (2)
      ret

If the program is interrupted at (1), (2), or any point in "leaf:", the
call graph will skip the callers of the current function. We can unwind
using the dwarf info and check if the return addr is the same as the LR
register, and inject the missing frame into the call graph.

Before this patch, the above example shows the following call-graph when
recording using "--call-graph fp" mode in ARM64:

  # Children      Self  Command   Shared Object     Symbol
  # ........  ........  ........  ................  ......................
  #
      99.86%    99.86%  program3  program3          [.] leaf
       |
       ---_start
          __libc_start_main
          main
          leaf

As can be seen, the "parent" function is missing. This is specially
problematic in "leaf" because for leaf functions the compiler may always
omit pushing the return addr into the stack. After this patch, it shows
the correct graph:

  # Children      Self  Command   Shared Object     Symbol
  # ........  ........  ........  ................  ......................
  #
      99.86%    99.86%  program3  program3          [.] leaf
       |
       ---_start
          __libc_start_main
          main
          parent
          leaf

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-7-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
[ Rename machine__normalize_is() to machine__normalized_is(), as suggested by James Clark ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf tools: Refactor SMPL_REG macro in perf_regs.h
German Gomez [Fri, 17 Dec 2021 15:45:19 +0000 (15:45 +0000)]
perf tools: Refactor SMPL_REG macro in perf_regs.h

Refactor the SAMPL_REG macro so that it can be used in a followup commit
to obtain the masks for ARM64 registers.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-6-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf callchain: Enable dwarf_callchain_users on arm64
Alexandre Truong [Fri, 17 Dec 2021 15:45:18 +0000 (15:45 +0000)]
perf callchain: Enable dwarf_callchain_users on arm64

Enable dwarf_callchain_users on arm64 which will be needed to do a
DWARF unwind in order to get the caller of the leaf frame.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-5-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf script: Use callchain_param_setup() instead of open coded equivalent
Alexandre Truong [Fri, 17 Dec 2021 15:45:17 +0000 (15:45 +0000)]
perf script: Use callchain_param_setup() instead of open coded equivalent

Refactoring script__setup_sample_type() by using callchain_param_setup()
to replace the duplicate code for callchain parameter setting up.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-4-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf machine: Add a mechanism to inject stack frames
Alexandre Truong [Fri, 17 Dec 2021 15:45:16 +0000 (15:45 +0000)]
perf machine: Add a mechanism to inject stack frames

Add a mechanism for platforms to inject stack frames for the leaf
frame caller if there is enough information to determine a frame
is missing from dwarf or other post processing mechanisms.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-3-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf tools: Record ARM64 LR register automatically
Alexandre Truong [Fri, 17 Dec 2021 15:45:15 +0000 (15:45 +0000)]
perf tools: Record ARM64 LR register automatically

On ARM64, automatically record the link register if the frame pointer
mode is on. It will be used to do a dwarf unwind to find the caller of
the leaf frame if the frame pointer was omitted.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Alexandre Truong <alexandre.truong@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211217154521.80603-2-german.gomez@arm.com
Signed-off-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf test: Use 3 digits for test numbering now we can have more tests
Carsten Haitzler [Wed, 15 Dec 2021 16:03:54 +0000 (16:03 +0000)]
perf test: Use 3 digits for test numbering now we can have more tests

This is in preparation for adding more tests that will need the test
number to be 3 digts so they align nicely in the output.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: http://lore.kernel.org/lkml/20211215160403.69264-3-carsten.haitzler@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf arm-spe: Synthesize SPE instruction events
German Gomez [Thu, 16 Dec 2021 15:24:04 +0000 (15:24 +0000)]
perf arm-spe: Synthesize SPE instruction events

Synthesize instruction events for every ARM SPE record.

Arm SPE implements a hardware-based sample period, and perf implements a
software-based one. Add a warning message to inform the user of this.

Signed-off-by: German Gomez <german.gomez@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211216152404.52474-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf test: Test 73 Sig_trap fails on s390
Thomas Richter [Thu, 16 Dec 2021 15:14:54 +0000 (16:14 +0100)]
perf test: Test 73 Sig_trap fails on s390

In Linux next commit 5504f67944484495 ("perf test sigtrap: Add basic
stress test for sigtrap handling") introduced the new test which uses
breakpoint events.  These events are not supported on s390 and PowerPC
and always fail:

  # perf test -F 73
  73: Sigtrap                                                         : FAILED!
  #

Fix it the same way as in the breakpoint tests in file
tests/bp_account.c where these type of tests are skipped on s390 and
PowerPC platforms.

With this patch skip this test on both platforms.

Output after:

  # perf test -F 73
  73: Sigtrap
  #

Fixes: 5504f67944484495 ("perf test sigtrap: Add basic stress test for sigtrap handling")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Marco Elver <elver@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20211216151454.752066-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf ftrace: Implement cpu and task filters in BPF
Namhyung Kim [Wed, 15 Dec 2021 18:51:54 +0000 (10:51 -0800)]
perf ftrace: Implement cpu and task filters in BPF

Honor cpu and task options to set up filters (by pid or tid) in the
BPF program.  For example, the following command will show latency of
the mutex_lock for process 2570.

  # perf ftrace latency -b -T mutex_lock -p 2570 sleep 3
  #   DURATION     |      COUNT | GRAPH                          |
       0 - 1    us |        675 | ############################## |
       1 - 2    us |          9 |                                |
       2 - 4    us |          0 |                                |
       4 - 8    us |          0 |                                |
       8 - 16   us |          0 |                                |
      16 - 32   us |          0 |                                |
      32 - 64   us |          0 |                                |
      64 - 128  us |          0 |                                |
     128 - 256  us |          0 |                                |
     256 - 512  us |          0 |                                |
     512 - 1024 us |          0 |                                |
       1 - 2    ms |          0 |                                |
       2 - 4    ms |          0 |                                |
       4 - 8    ms |          0 |                                |
       8 - 16   ms |          0 |                                |
      16 - 32   ms |          0 |                                |
      32 - 64   ms |          0 |                                |
      64 - 128  ms |          0 |                                |
     128 - 256  ms |          0 |                                |
     256 - 512  ms |          0 |                                |
     512 - 1024 ms |          0 |                                |
       1 - ...   s |          0 |                                |

Committer testing:

Looking at faults on a firefox process:

  # strace -e bpf perf ftrace latency -b -p 1674378 -T __handle_mm_fault
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffee1fee740, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 128) = -1 EINVAL (Invalid argument)
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0 \3\0\0 \3\0\0\306\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1790, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7ffee1fee570, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffee1fee3c0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=36, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 6
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 7
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=12, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=32, btf_vmlinux_value_type_id=0}, 128) = 8
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=8, key=0x7ffee1fee580, value=0x7f01d940a000, flags=BPF_ANY}, 128) = 0
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=42, insns=0x1871f30, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x18746a0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1874550, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 9
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=99, insns=0x18769b0, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x188a640, func_info_cnt=1, line_info_rec_size=16, line_info=0x188a660, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 10
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=6, key=0x7ffee1fee8e0, value=0x7ffee1fee8df, flags=BPF_ANY}, 128) = 0
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7ffee1fee3c0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 12
  bpf(BPF_LINK_CREATE, {link_create={prog_fd=12, target_fd=-1, attach_type=0x29 /* BPF_??? */, flags=0}}, 128) = -1 EINVAL (Invalid argument)
  ^Cstrace: Process 1702285 detached
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |        109 | #################                              |
       1 - 2    us |        127 | ###################                            |
       2 - 4    us |         36 | #####                                          |
       4 - 8    us |         20 | ###                                            |
       8 - 16   us |          2 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - 2    ms |          0 |                                                |
       2 - 4    ms |          0 |                                                |
       4 - 8    ms |          0 |                                                |
       8 - 16   ms |          0 |                                                |
      16 - 32   ms |          0 |                                                |
      32 - 64   ms |          0 |                                                |
      64 - 128  ms |          0 |                                                |
     128 - 256  ms |          0 |                                                |
     256 - 512  ms |          0 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...   s |          0 |                                                |

  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211215185154.360314-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf ftrace: Add -b/--use-bpf option for latency subcommand
Namhyung Kim [Wed, 15 Dec 2021 18:51:53 +0000 (10:51 -0800)]
perf ftrace: Add -b/--use-bpf option for latency subcommand

The -b/--use-bpf option is to use BPF to get latency info of kernel
functions.  It'd have better performance impact and I observed that
latency of same function is smaller than before when using BPF.

Committer testing:

  # strace -e bpf perf ftrace latency -b -T __handle_mm_fault -a sleep 1
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914e00, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 128) = -1 EINVAL (Invalid argument)
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\350\2\0\0\350\2\0\0\353\2\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1515, btf_log_size=0, btf_log_level=0}, 128) = 3
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7fff51914c30, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 7
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0}, 128) = 8
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=30, btf_vmlinux_value_type_id=0}, 128) = 9
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=9, key=0x7fff51914c40, value=0x7f6e99be2000, flags=BPF_ANY}, 128) = 0
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=18, insns=0x11e4160, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc50, func_info_cnt=1, line_info_rec_size=16, line_info=0x11e04c0, line_info_cnt=9, attach_btf_id=0, attach_prog_fd=0}, 128) = 10
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=99, insns=0x11ded70, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 14, 16), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x11dfc70, func_info_cnt=1, line_info_rec_size=16, line_info=0x11f6e10, line_info_cnt=20, attach_btf_id=0, attach_prog_fd=0}, 128) = 11
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7fff51914a80, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0}, 128) = 13
  bpf(BPF_LINK_CREATE, {link_create={prog_fd=13, target_fd=-1, attach_type=0x29 /* BPF_??? */, flags=0}}, 128) = -1 EINVAL (Invalid argument)
  --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1699992, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7fff51914f84, value=0x11f6fa0, flags=BPF_ANY}, 128) = 0
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |         52 | ###################                            |
       1 - 2    us |         36 | #############                                  |
       2 - 4    us |         24 | #########                                      |
       4 - 8    us |          7 | ##                                             |
       8 - 16   us |          1 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - 2    ms |          0 |                                                |
       2 - 4    ms |          0 |                                                |
       4 - 8    ms |          0 |                                                |
       8 - 16   ms |          0 |                                                |
      16 - 32   ms |          0 |                                                |
      32 - 64   ms |          0 |                                                |
      64 - 128  ms |          0 |                                                |
     128 - 256  ms |          0 |                                                |
     256 - 512  ms |          0 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...   s |          0 |                                                |
  +++ exited with 0 +++
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211215185154.360314-5-namhyung@kernel.org
[ Add missing util/cpumap.h include and removed unused 'fd' variable ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf ftrace: Add 'latency' subcommand
Namhyung Kim [Wed, 15 Dec 2021 18:51:52 +0000 (10:51 -0800)]
perf ftrace: Add 'latency' subcommand

The perf ftrace latency is to get a histogram of function execution
time.  Users should give a function name using -T option.

This is implemented using function_graph tracer with the given
function only.  And it parses the output to extract the time.

  $ sudo perf ftrace latency -a -T mutex_lock sleep 1
  #   DURATION     |      COUNT | GRAPH                          |
       0 - 1    us |       4596 | ########################       |
       1 - 2    us |       1680 | #########                      |
       2 - 4    us |       1106 | #####                          |
       4 - 8    us |        546 | ##                             |
       8 - 16   us |        562 | ###                            |
      16 - 32   us |          1 |                                |
      32 - 64   us |          0 |                                |
      64 - 128  us |          0 |                                |
     128 - 256  us |          0 |                                |
     256 - 512  us |          0 |                                |
     512 - 1024 us |          0 |                                |
       1 - 2    ms |          0 |                                |
       2 - 4    ms |          0 |                                |
       4 - 8    ms |          0 |                                |
       8 - 16   ms |          0 |                                |
      16 - 32   ms |          0 |                                |
      32 - 64   ms |          0 |                                |
      64 - 128  ms |          0 |                                |
     128 - 256  ms |          0 |                                |
     256 - 512  ms |          0 |                                |
     512 - 1024 ms |          0 |                                |
       1 - ...   s |          0 |                                |

Committer testing:

Latency for the __handle_mm_fault kernel function, system wide for 1
second, see how one can go from the usual 'perf ftrace' output, now the
same as for the 'perf ftrace trace' subcommand, to the new 'perf ftrace
latency' subcommand:

  # perf ftrace -T __handle_mm_fault -a sleep 1 | wc -l
  709
  # perf ftrace -T __handle_mm_fault -a sleep 1 | wc -l
  510
  # perf ftrace -T __handle_mm_fault -a sleep 1 | head -20
  # tracer: function
  #
  # entries-in-buffer/entries-written: 0/0   #P:32
  #
  #           TASK-PID     CPU#     TIMESTAMP  FUNCTION
  #              | |         |         |         |
         perf-exec-1685104 [007]  90638.894613: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894620: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894622: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894635: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894688: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894702: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894714: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894728: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894740: __handle_mm_fault <-handle_mm_fault
         perf-exec-1685104 [007]  90638.894751: __handle_mm_fault <-handle_mm_fault
             sleep-1685104 [007]  90638.894962: __handle_mm_fault <-handle_mm_fault
             sleep-1685104 [007]  90638.894977: __handle_mm_fault <-handle_mm_fault
             sleep-1685104 [007]  90638.894983: __handle_mm_fault <-handle_mm_fault
             sleep-1685104 [007]  90638.894995: __handle_mm_fault <-handle_mm_fault
  # perf ftrace latency -T __handle_mm_fault -a sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |        125 | ######                                         |
       1 - 2    us |        249 | #############                                  |
       2 - 4    us |        455 | ########################                       |
       4 - 8    us |         37 | #                                              |
       8 - 16   us |          0 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - 2    ms |          0 |                                                |
       2 - 4    ms |          0 |                                                |
       4 - 8    ms |          0 |                                                |
       8 - 16   ms |          0 |                                                |
      16 - 32   ms |          0 |                                                |
      32 - 64   ms |          0 |                                                |
      64 - 128  ms |          0 |                                                |
     128 - 256  ms |          0 |                                                |
     256 - 512  ms |          0 |                                                |
     512 - 1024 ms |          0 |                                                |
       1 - ...   s |          0 |                                                |
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211215185154.360314-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf ftrace: Move out common code from __cmd_ftrace
Namhyung Kim [Wed, 15 Dec 2021 18:51:51 +0000 (10:51 -0800)]
perf ftrace: Move out common code from __cmd_ftrace

The signal setup code and evlist__prepare_workload() can be used for
other subcommands.  Let's move them out of the __cmd_ftrace().  Then
it doesn't need to pass argc and argv.

On the other hand, select_tracer() is specific to the 'trace'
subcommand so it'd better moving it into the __cmd_ftrace().

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211215185154.360314-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf ftrace: Add 'trace' subcommand
Namhyung Kim [Wed, 15 Dec 2021 18:51:50 +0000 (10:51 -0800)]
perf ftrace: Add 'trace' subcommand

This is a preparation to add more sub-commands for ftrace.  The
'trace' subcommand does the same thing when no subcommand is given.

Committer testing:

The previous mode, i.e. no subcommand and the new 'perf ftrace trace'
are equivalent:

  # perf ftrace -G check_preempt_curr sleep 0.00001
  # tracer: function_graph
  #
  # CPU  DURATION                  FUNCTION CALLS
  # |     |   |                     |   |   |   |
   25)               |  check_preempt_curr() {
   25)               |    resched_curr() {
   25)               |      native_smp_send_reschedule() {
   25)               |        default_send_IPI_single_phys() {
   25)   0.110 us    |          __default_send_IPI_dest_field();
   25)   0.490 us    |        }
   25)   0.640 us    |      }
   25)   0.850 us    |    }
   25)   2.060 us    |  }
  # perf ftrace trace -G check_preempt_curr sleep 0.00001
  # tracer: function_graph
  #
  # CPU  DURATION                  FUNCTION CALLS
  # |     |   |                     |   |   |   |
   10)               |  check_preempt_curr() {
   10)               |    resched_curr() {
   10)               |      native_smp_send_reschedule() {
   10)               |        default_send_IPI_single_phys() {
   10)   0.080 us    |          __default_send_IPI_dest_field();
   10)   0.460 us    |        }
   10)   0.610 us    |      }
   10)   0.830 us    |    }
   10)   2.020 us    |  }
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211215185154.360314-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf arch: Support register names from all archs
German Gomez [Tue, 7 Dec 2021 18:06:52 +0000 (18:06 +0000)]
perf arch: Support register names from all archs

When reading a perf.data file with register values, there is a mismatch
between the names and the values of the registers because the tool is
built using only the register names from the local architecture.

Reading a perf.data file that was recorded on ARM64, gives the following
erroneous output on an X86 machine:

  # perf report -i perf_arm64.data -D
  [...]
  24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
  ... user regs: mask 0x1ffffffff ABI 64-bit
  .... AX    0x0000ffffd1515817
  .... BX    0x0000ffffd1515480
  .... CX    0x0000aaaadabf6c80
  .... DX    0x000000000000002e
  .... SI    0x0000000040100401
  .... DI    0x0040600200000080
  .... BP    0x0000ffffd1510e10
  .... SP    0x0000000000000000
  .... IP    0x00000000000000dd
  .... FLAGS 0x0000ffffd1510cd0
  .... CS    0x0000000000000000
  .... SS    0x0000000000000030
  .... DS    0x0000ffffa569a208
  .... ES    0x0000000000000000
  .... FS    0x0000000000000000
  .... GS    0x0000000000000000
  .... R8    0x0000aaaad3de9650
  .... R9    0x0000ffffa57397f0
  .... R10   0x0000000000000001
  .... R11   0x0000ffffa57fd000
  .... R12   0x0000ffffd1515817
  .... R13   0x0000ffffd1515480
  .... R14   0x0000aaaadabf6c80
  .... R15   0x0000000000000000
  .... unknown 0x0000000000000001
  .... unknown 0x0000000000000000
  .... unknown 0x0000000000000000
  .... unknown 0x0000000000000000
  .... unknown 0x0000000000000000
  .... unknown 0x0000ffffd1510d90
  .... unknown 0x0000ffffa5739b90
  .... unknown 0x0000ffffd1510d80
  .... XMM0  0x0000ffffa57392c8
   ... thread: perf-exec:43239
   ...... dso: [kernel.kallsyms]

As can be seen, the register names correspond to X86 registers, even
though the perf.data file was recorded on an ARM64 system. After this
patch, the output of the command displays the correct register names:

  # perf report -i perf_arm64.data -D
  [...]
  24661932634451 0x698 [0x21d0]: PERF_RECORD_SAMPLE(IP, 0x1): 43239/43239: 0xffffc5be8f100f98 period: 1 addr: 0
  ... user regs: mask 0x1ffffffff ABI 64-bit
  .... x0    0x0000ffffd1515817
  .... x1    0x0000ffffd1515480
  .... x2    0x0000aaaadabf6c80
  .... x3    0x000000000000002e
  .... x4    0x0000000040100401
  .... x5    0x0040600200000080
  .... x6    0x0000ffffd1510e10
  .... x7    0x0000000000000000
  .... x8    0x00000000000000dd
  .... x9    0x0000ffffd1510cd0
  .... x10   0x0000000000000000
  .... x11   0x0000000000000030
  .... x12   0x0000ffffa569a208
  .... x13   0x0000000000000000
  .... x14   0x0000000000000000
  .... x15   0x0000000000000000
  .... x16   0x0000aaaad3de9650
  .... x17   0x0000ffffa57397f0
  .... x18   0x0000000000000001
  .... x19   0x0000ffffa57fd000
  .... x20   0x0000ffffd1515817
  .... x21   0x0000ffffd1515480
  .... x22   0x0000aaaadabf6c80
  .... x23   0x0000000000000000
  .... x24   0x0000000000000001
  .... x25   0x0000000000000000
  .... x26   0x0000000000000000
  .... x27   0x0000000000000000
  .... x28   0x0000000000000000
  .... x29   0x0000ffffd1510d90
  .... lr    0x0000ffffa5739b90
  .... sp    0x0000ffffd1510d80
  .... pc    0x0000ffffa57392c8
   ... thread: perf-exec:43239
   ...... dso: [kernel.kallsyms]

Tester comments:

Athira reports:

"Looks good to me. Tested this patchset in powerpc by capturing regs in
powerpc and doing perf report to read the data from x86."

Reported-by: Alexandre Truong <alexandre.truong@arm.com>
Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211207180653.1147374-4-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf arm64: Rename perf_event_arm_regs for ARM64 registers
German Gomez [Tue, 7 Dec 2021 18:06:51 +0000 (18:06 +0000)]
perf arm64: Rename perf_event_arm_regs for ARM64 registers

The registers for ARM and ARM64 are enumerated using two enums that have
the same name. In order to be able to import both headers, the name of
one can be replaced using the C preprocessor like so:

  #define perf_event_arm_regs perf_event_arm64_regs
  #include <asm/perf_regs.h>
  #undef perf_event_arm_regs

This patch updates all imports of ARM64's perf_regs.h in order to
prevent the naming collision.

Reviewed-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211207180653.1147374-3-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf namespaces: Add helper nsinfo__is_in_root_namespace()
Leo Yan [Sun, 12 Dec 2021 13:47:20 +0000 (21:47 +0800)]
perf namespaces: Add helper nsinfo__is_in_root_namespace()

Refactors code for gathering PID infos, it creates the function
nsinfo__get_nspid() to parse process 'status' node in folder '/proc'.

Base on the refactoring, this patch introduces a new helper
nsinfo__is_in_root_namespace(), it returns true when the caller runs in
the root PID namespace.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonatan Goldschmidt <yonatan.goldschmidt@granulate.io>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211212134721.1721245-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agolibperf tests: Fix a spelling mistake "Runnnig" -> "Running"
Colin Ian King [Sun, 12 Dec 2021 22:21:22 +0000 (22:21 +0000)]
libperf tests: Fix a spelling mistake "Runnnig" -> "Running"

There is a spelling mistake in a __T_VERBOSE message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: http://lore.kernel.org/lkml/20211212222122.478537-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf bpf-loader: Use IS_ERR_OR_NULL() to clean code and fix check
Miaoqian Lin [Sun, 12 Dec 2021 13:56:09 +0000 (13:56 +0000)]
perf bpf-loader: Use IS_ERR_OR_NULL() to clean code and fix check

Use IS_ERR_OR_NULL() to make the code cleaner.
Also if the priv is NULL, it's improper to call PTR_ERR(priv).

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: unlisted-recipients
Link: http://lore.kernel.org/lkml/20211212135613.20000-1-linmq006@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf cs-etm: Remove duplicate and incorrect aux size checks
James Clark [Wed, 8 Dec 2021 11:54:35 +0000 (11:54 +0000)]
perf cs-etm: Remove duplicate and incorrect aux size checks

There are two checks, one is for size when running without admin, but
this one is covered by the driver and reported on in more detail here
(builtin-record.c):

  pr_err("Permission error mapping pages.\n"
         "Consider increasing "
         "/proc/sys/kernel/perf_event_mlock_kb,\n"
         "or try again with a smaller value of -m/--mmap_pages.\n"
         "(current value: %u,%u)\n",

This had the effect of artificially limiting the aux buffer size to a
value smaller than what was allowed because perf_event_mlock_kb wasn't
taken into account.

The second is to check for a power of two, but this is covered here
(evlist.c):

  pr_info("rounding mmap pages size to %s (%lu pages)\n",
          buf, pages);

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211208115435.610101-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf vendor events: Rename arm64 arch std event files
Andrew Kilroy [Fri, 10 Dec 2021 12:37:05 +0000 (12:37 +0000)]
perf vendor events: Rename arm64 arch std event files

A previous commit adds pmu events into the files

  armv8-common-and-microarch.json
  armv8-recommended.json

that are actually specified in an armv9 reference supplement, not armv8.
As such, naming the files with the armv8 prefix seems artificial.

This patch renames the files to reflect that these two files are for
arch std events regardless of whether they are defined in armv8 or
armv9.

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211210123706.7490-3-andrew.kilroy@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf vendor events: For the Arm Neoverse N2
Andrew Kilroy [Fri, 10 Dec 2021 12:37:04 +0000 (12:37 +0000)]
perf vendor events: For the Arm Neoverse N2

Updates the common and microarch json file to add counters available in
the Arm Neoverse N2 chip, but should also apply to other ArmV8 and ArmV9
cpus.  Specified in ArmV8 architecture reference manual

  https://developer.arm.com/documentation/ddi0487/gb/?lang=en

Some of the counters added to armv8-common-and-microarch.json are
specified in the ArmV9 architecture reference manual supplement
(issue A.a):

  https://developer.arm.com/documentation/ddi0608/aa

The additional ArmV9 counters are

  TRB_WRAP
  TRCEXTOUT0
  TRCEXTOUT1
  TRCEXTOUT2
  TRCEXTOUT3
  CTI_TRIGOUT4
  CTI_TRIGOUT5
  CTI_TRIGOUT6
  CTI_TRIGOUT7

This patch also adds files in pmu-events/arch/arm64/arm/neoverse-n2 for
perf list to output the counter names in categories.

Counters on the Neoverse N2 are stated in its reference manual:

  https://developer.arm.com/documentation/102099/0000

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20211210123706.7490-2-andrew.kilroy@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf dlfilter: Drop unused variable
Salvatore Bonaccorso [Tue, 23 Nov 2021 21:18:21 +0000 (22:18 +0100)]
perf dlfilter: Drop unused variable

Compiling tools/perf/dlfilters/dlfilter-test-api-v0.c result in:

checking for stdlib.h... dlfilters/dlfilter-test-api-v0.c: In function â€˜filter_event’:
dlfilters/dlfilter-test-api-v0.c:311:29: warning: unused variable â€˜d’ [-Wunused-variable]
  311 |         struct filter_data *d = data;
      |

So remove the  variable now.

Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Salvatore Bonaccorso <carnil@debian.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211123211821.132924-1-carnil@debian.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT
Namhyung Kim [Wed, 1 Dec 2021 22:08:55 +0000 (14:08 -0800)]
perf arm-spe: Add SPE total latency as PERF_SAMPLE_WEIGHT

Use total latency info in the SPE counter packet as sample weight so
that we can see it in local_weight and (global) weight sort keys.

Maybe we can use PERF_SAMPLE_WEIGHT_STRUCT to support ins_lat as well
but I'm not sure which latency it matches.  So just adding total latency
first.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211201220855.1260688-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf bench: Use unbuffered output when pipe/tee'ing to a file
Sohaib Mohamed [Fri, 19 Nov 2021 06:14:08 +0000 (08:14 +0200)]
perf bench: Use unbuffered output when pipe/tee'ing to a file

The output of 'perf bench' gets buffered when I pipe it to a file or to
tee, in such a way that I can see it only at the end.

E.g.

  $ perf bench internals synthesize -t
  < output comes out fine after each test run >

  $ perf bench internals synthesize -t | tee file.txt
  < output comes out only at the end of all tests >

This patch resolves this issue for 'bench' and 'test' subcommands.

See, also:

  $ perf bench mem all | tee file.txt
  $ perf bench sched all | tee file.txt
  $ perf bench internals all -t | tee file.txt
  $ perf bench internals all | tee file.txt

Committer testing:

It really gets staggered, i.e. outputs in bursts, when the buffer fills
up and has to be drained to make up space for more output.

Suggested-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211119061409.78004-1-sohaib.amhmd@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoMerge remote-tracking branch 'torvalds/master' into perf/core
Arnaldo Carvalho de Melo [Thu, 16 Dec 2021 15:12:36 +0000 (12:12 -0300)]
Merge remote-tracking branch 'torvalds/master' into perf/core

To pick up fixes.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoMerge tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client
Linus Torvalds [Wed, 15 Dec 2021 19:06:41 +0000 (11:06 -0800)]
Merge tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "An SGID directory handling fix (marked for stable), a metrics
  accounting fix and two fixups to appease static checkers"

* tag 'ceph-for-5.16-rc6' of git://github.com/ceph/ceph-client:
  ceph: fix up non-directory creation in SGID directories
  ceph: initialize pathlen variable in reconnect_caps_cb
  ceph: initialize i_size variable in ceph_sync_read
  ceph: fix duplicate increment of opened_inodes metric

2 years agoMerge tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Linus Torvalds [Wed, 15 Dec 2021 18:52:01 +0000 (10:52 -0800)]
Merge tag 's390-5.16-5' of git://git./linux/kernel/git/s390/linux

Pull s390 fixes from Heiko Carstens:

 - Add missing handling of R_390_PLT32DBL relocation type in
   arch_kexec_apply_relocations_add(). Clang and the upcoming gcc 11.3
   generate such relocation entries, which our relocation code silently
   ignores, and which finally will result in an endless loop within the
   purgatory code in case of kexec.

 - Add proper handling of errors and print error messages when applying
   relocations

 - Fix duplicate tracking of irq nesting level in entry code

 - Let recordmcount.pl also look for jgnop mnemonic. Starting with
   binutils 2.37 objdump emits a jgnop mnemonic instead of brcl, which
   breaks mcount location detection. This is only a problem if used with
   compilers older than gcc 9, since with gcc 9 and newer compilers
   recordmcount.pl is not used anymore.

 - Remove preempt_disable()/preempt_enable() pair in
   kprobe_ftrace_handler() which was done for all architectures except
   for s390.

 - Update defconfig

* tag 's390-5.16-5' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  recordmcount.pl: look for jgnop instruction as well as bcrl on s390
  s390/entry: fix duplicate tracking of irq nesting level
  s390: enable switchdev support in defconfig
  s390/kexec: handle R_390_PLT32DBL rela in arch_kexec_apply_relocations_add()
  s390/ftrace: remove preempt_disable()/preempt_enable() pair
  s390/kexec_file: fix error handling when applying relocations
  s390/kexec_file: print some more error messages

2 years agoMerge tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Wed, 15 Dec 2021 18:46:03 +0000 (10:46 -0800)]
Merge tag 'hyperv-fixes-signed-20211214' of git://git./linux/kernel/git/hyperv/linux

Pull hyperv fix from Wei Liu:
 "Build fix from Randy Dunlap"

* tag 'hyperv-fixes-signed-20211214' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux:
  hv: utils: add PTP_1588_CLOCK to Kconfig to fix build

2 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Mon, 13 Dec 2021 22:34:22 +0000 (14:34 -0800)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Misc virtio and vdpa bugfixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa: Consider device id larger than 31
  virtio/vsock: fix the transport to work with VMADDR_CID_ANY
  virtio_ring: Fix querying of maximum DMA mapping size for virtio device
  virtio: always enter drivers/virtio/
  vduse: check that offset is within bounds in get_config()
  vdpa: check that offsets are within bounds
  vduse: fix memory corruption in vduse_dev_ioctl()

2 years agoPCI: mt7621: Convert driver into 'bool'
Sergio Paracuellos [Wed, 1 Dec 2021 21:34:02 +0000 (22:34 +0100)]
PCI: mt7621: Convert driver into 'bool'

The driver is not ready yet to be compiled as a module since it depends
on some symbols not exported on MIPS.  We have the following current
problems:

  Building mips:allmodconfig ... failed
  --------------
  Error log:
  ERROR: modpost: missing MODULE_LICENSE() in drivers/pci/controller/pcie-mt7621.o
  ERROR: modpost: "mips_cm_unlock_other" [drivers/pci/controller/pcie-mt7621.ko] undefined!
  ERROR: modpost: "mips_cpc_base" [drivers/pci/controller/pcie-mt7621.ko] undefined!
  ERROR: modpost: "mips_cm_lock_other" [drivers/pci/controller/pcie-mt7621.ko] undefined!
  ERROR: modpost: "mips_cm_is64" [drivers/pci/controller/pcie-mt7621.ko] undefined!
  ERROR: modpost: "mips_gcr_base" [drivers/pci/controller/pcie-mt7621.ko] undefined!

Temporarily move from 'tristate' to 'bool' until a better solution is
ready.

Also RALINK is redundant because SOC_MT7621 already depends on it.
Hence, simplify condition.

Fixes: 2bdd5238e756 ("PCI: mt7621: Add MediaTek MT7621 PCIe host controller driver").
Signed-off-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Reviewed-and-Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agofget: clarify and improve __fget_files() implementation
Linus Torvalds [Fri, 10 Dec 2021 22:00:15 +0000 (14:00 -0800)]
fget: clarify and improve __fget_files() implementation

Commit 054aa8d439b9 ("fget: check that the fd still exists after getting
a ref to it") fixed a race with getting a reference to a file just as it
was being closed.  It was a fairly minimal patch, and I didn't think
re-checking the file pointer lookup would be a measurable overhead,
since it was all right there and cached.

But I was wrong, as pointed out by the kernel test robot.

The 'poll2' case of the will-it-scale.per_thread_ops benchmark regressed
quite noticeably.  Admittedly it seems to be a very artificial test:
doing "poll()" system calls on regular files in a very tight loop in
multiple threads.

That means that basically all the time is spent just looking up file
descriptors without ever doing anything useful with them (not that doing
'poll()' on a regular file is useful to begin with).  And as a result it
shows the extra "re-check fd" cost as a sore thumb.

Happily, the regression is fixable by just writing the code to loook up
the fd to be better and clearer.  There's still a cost to verify the
file pointer, but now it's basically in the noise even for that
benchmark that does nothing else - and the code is more understandable
and has better comments too.

[ Side note: this patch is also a classic case of one that looks very
  messy with the default greedy Myers diff - it's much more legible with
  either the patience of histogram diff algorithm ]

Link: https://lore.kernel.org/lkml/20211210053743.GA36420@xsang-OptiPlex-9020/
Link: https://lore.kernel.org/lkml/20211213083154.GA20853@linux.intel.com/
Reported-by: kernel test robot <oliver.sang@intel.com>
Tested-by: Carel Si <beibei.si@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoLinux 5.16-rc5 v5.16-rc5
Linus Torvalds [Sun, 12 Dec 2021 22:53:01 +0000 (14:53 -0800)]
Linux 5.16-rc5

2 years agoMerge tag 'usb-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Sun, 12 Dec 2021 18:20:57 +0000 (10:20 -0800)]
Merge tag 'usb-5.16-rc5' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes from Greg KH:
 "Here are some small USB fixes for 5.16-rc5.  They include:

   - gadget driver fixes for reported issues

   - xhci fixes for reported problems.

   - config endpoint parsing fixes for where we got bitfields wrong

  Most of these have been in linux-next, the remaining few were not, but
  got lots of local testing in my systems and in some cloud testing
  infrastructures"

* tag 'usb-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  usb: core: config: using bit mask instead of individual bits
  usb: core: config: fix validation of wMaxPacketValue entries
  USB: gadget: zero allocate endpoint 0 buffers
  USB: gadget: detect too-big endpoint 0 requests
  xhci: avoid race between disable slot command and host runtime suspend
  xhci: Remove CONFIG_USB_DEFAULT_PERSIST to prevent xHCI from runtime suspending
  Revert "usb: dwc3: dwc3-qcom: Enable tx-fifo-resize property by default"

2 years agoMerge tag 'char-misc-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Sun, 12 Dec 2021 18:16:34 +0000 (10:16 -0800)]
Merge tag 'char-misc-5.16-rc5' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are a bunch of small char/misc and other driver subsystem fixes.

  Included in here are:

   - iio driver fixes for reported problems

   - phy driver fixes for a number of reported problems

   - mhi resume bugfix for broken hardware

   - nvmem driver fix

   - rtsx driver fix for irq issues

   - fastrpc packet parsing fix

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (33 commits)
  bus: mhi: core: Add support for forced PM resume
  iio: trigger: stm32-timer: fix MODULE_ALIAS
  misc: rtsx: Avoid mangling IRQ during runtime PM
  nvmem: eeprom: at25: fix FRAM byte_len
  misc: fastrpc: fix improper packet size calculation
  MAINTAINERS: add maintainer for Qualcomm FastRPC driver
  bus: mhi: pci_generic: Fix device recovery failed issue
  iio: adc: stm32: fix null pointer on defer_probe error
  phy: HiSilicon: Fix copy and paste bug in error handling
  dt-bindings: phy: zynqmp-psgtr: fix USB phy name
  phy: ti: omap-usb2: Fix the kernel-doc style
  phy: qualcomm: ipq806x-usb: Fix kernel-doc style
  iio: at91-sama5d2: Fix incorrect sign extension
  iio: adc: axp20x_adc: fix charging current reporting on AXP22x
  iio: gyro: adxrs290: fix data signedness
  phy: ti: tusb1210: Fix the kernel-doc warn
  phy: qualcomm: usb-hsic: Fix the kernel-doc warn
  phy: qualcomm: qmp: Add missing struct documentation
  phy: mvebu-cp110-utmi: Fix kernel-doc warns
  iio: ad7768-1: Call iio_trigger_notify_done() on error
  ...

2 years agoMerge tag 'timers-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Dec 2021 18:07:50 +0000 (10:07 -0800)]
Merge tag 'timers-urgent-2021-12-12' of git://git./linux/kernel/git/tip/tip

Pull timer fixes from Thomas Gleixner:
 "Two fixes for clock chip drivers:

   - A regression fix for the Designware APB timer. A recent change to
     the error checking code transformed the error condition wrongly so
     it turned into a fail if good condition.

   - Fix a clang build fail of the ARM architected timer driver"

* tag 'timers-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  clocksource/drivers/arm_arch_timer: Force inlining of erratum_set_next_event_generic()
  clocksource/drivers/dw_apb_timer_of: Fix probe failure

2 years agoMerge tag 'irq-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sun, 12 Dec 2021 17:53:12 +0000 (09:53 -0800)]
Merge tag 'irq-urgent-2021-12-12' of git://git./linux/kernel/git/tip/tip

Pull irq fixes from Thomas Gleixner:
 "A set of interrupt chip driver fixes:

   - Fix the multi vector MSI allocation on Armada 370XP

   - Do interrupt acknowledgement correctly in the aspeed-scu driver

   - Make the IPR register offset correct in the NVIC driver

   - Make redistribution table flushing correct by issueing a SYNC
     command to ensure that the invalidation command has been executed

   - Plug a device tree node reference leak in the bcm7210-l2 driver

   - Trivial fixes in the MIPS GIC and the Apple AIC drivers"

* tag 'irq-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  irqchip/irq-bcm7120-l2: Add put_device() after of_find_device_by_node()
  irqchip/irq-gic-v3-its.c: Force synchronisation when issuing INVALL
  irqchip/apple-aic: Mark aic_init_smp() as __init
  irqchip: nvic: Fix offset for Interrupt Priority Offsets
  irqchip/mips-gic: Use bitfield helpers
  irqchip/aspeed-scu: Replace update_bits with write_bits.
  irqchip/armada-370-xp: Fix support for Multi-MSI interrupts
  irqchip/armada-370-xp: Fix return value of armada_370_xp_msi_alloc()

2 years agorecordmcount.pl: look for jgnop instruction as well as bcrl on s390
Jerome Marchand [Fri, 10 Dec 2021 09:38:27 +0000 (10:38 +0100)]
recordmcount.pl: look for jgnop instruction as well as bcrl on s390

On s390, recordmcount.pl is looking for "bcrl 0,<xxx>" instructions in
the objdump -d outpout. However since binutils 2.37, objdump -d
display "jgnop <xxx>" for the same instruction. Update the
mcount_regex so that it accepts both.

Signed-off-by: Jerome Marchand <jmarchan@redhat.com>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/20211210093827.1623286-1-jmarchan@redhat.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2 years agos390/entry: fix duplicate tracking of irq nesting level
Sven Schnelle [Mon, 6 Dec 2021 10:50:16 +0000 (11:50 +0100)]
s390/entry: fix duplicate tracking of irq nesting level

In the current code, when exiting from idle, rcu_irq_enter() is
called twice during irq entry:

irq_entry_enter()-> rcu_irq_enter()
irq_enter() -> rcu_irq_enter()

This may lead to wrong results from rcu_is_cpu_rrupt_from_idle()
because of a wrong dynticks nmi nesting count. Fix this by only
calling irq_enter_rcu().

Cc: <stable@vger.kernel.org> # 5.12+
Reported-by: Mark Rutland <mark.rutland@arm.com>
Fixes: 56e62a737028 ("s390: convert to generic entry")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2 years agoMerge tag 'sched-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sun, 12 Dec 2021 17:38:04 +0000 (09:38 -0800)]
Merge tag 'sched-urgent-2021-12-12' of git://git./linux/kernel/git/tip/tip

Pull scheduler fix from Thomas Gleixner:
 "A single fix for the x86 scheduler topology:

  Using cluster topology on hybrid CPUs, e.g. Alder Lake, biases the
  scheduler towards the ATOM cluster as that has more total capacity.
  Use selection based on CPU priority instead"

* tag 'sched-urgent-2021-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched,x86: Don't use cluster topology for x86 hybrid CPUs

2 years agoMerge tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux
Linus Torvalds [Sun, 12 Dec 2021 17:32:49 +0000 (09:32 -0800)]
Merge tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux

Pull csky from Guo Ren:
 "Only one fix for csky: fix fpu config macro"

* tag 'csky-for-linus-5.16-rc5' of git://github.com/c-sky/csky-linux:
  csky: fix typo of fpu config macro

2 years agousb: core: config: using bit mask instead of individual bits
Pavel Hofman [Fri, 10 Dec 2021 08:52:19 +0000 (09:52 +0100)]
usb: core: config: using bit mask instead of individual bits

Using standard USB_EP_MAXP_MULT_MASK instead of individual bits for
extracting multiple-transactions bits from wMaxPacketSize value.

Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Link: https://lore.kernel.org/r/20211210085219.16796-2-pavel.hofman@ivitera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agousb: core: config: fix validation of wMaxPacketValue entries
Pavel Hofman [Fri, 10 Dec 2021 08:52:18 +0000 (09:52 +0100)]
usb: core: config: fix validation of wMaxPacketValue entries

The checks performed by commit aed9d65ac327 ("USB: validate
wMaxPacketValue entries in endpoint descriptors") require that initial
value of the maxp variable contains both maximum packet size bits
(10..0) and multiple-transactions bits (12..11). However, the existing
code assings only the maximum packet size bits. This patch assigns all
bits of wMaxPacketSize to the variable.

Fixes: aed9d65ac327 ("USB: validate wMaxPacketValue entries in endpoint descriptors")
Cc: stable <stable@vger.kernel.org>
Acked-by: Alan Stern <stern@rowland.harvard.edu>
Signed-off-by: Pavel Hofman <pavel.hofman@ivitera.com>
Link: https://lore.kernel.org/r/20211210085219.16796-1-pavel.hofman@ivitera.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoUSB: gadget: zero allocate endpoint 0 buffers
Greg Kroah-Hartman [Thu, 9 Dec 2021 18:02:15 +0000 (19:02 +0100)]
USB: gadget: zero allocate endpoint 0 buffers

Under some conditions, USB gadget devices can show allocated buffer
contents to a host.  Fix this up by zero-allocating them so that any
extra data will all just be zeros.

Reported-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Tested-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoUSB: gadget: detect too-big endpoint 0 requests
Greg Kroah-Hartman [Thu, 9 Dec 2021 17:59:27 +0000 (18:59 +0100)]
USB: gadget: detect too-big endpoint 0 requests

Sometimes USB hosts can ask for buffers that are too large from endpoint
0, which should not be allowed.  If this happens for OUT requests, stall
the endpoint, but for IN requests, trim the request size to the endpoint
buffer size.

Co-developed-by: Szymon Heidrich <szymon.heidrich@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sun, 12 Dec 2021 00:28:27 +0000 (16:28 -0800)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Four fixes, all in drivers.

  Three are small and obvious, the qedi one is a bit larger but also
  pretty obvious"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: qla2xxx: Format log strings only if needed
  scsi: scsi_debug: Fix buffer size of REPORT ZONES command
  scsi: qedi: Fix cmd_cleanup_cmpl counter mismatch issue
  scsi: pm80xx: Do not call scsi_remove_host() in pm8001_alloc()

2 years agoMerge tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Linus Torvalds [Sun, 12 Dec 2021 00:21:06 +0000 (16:21 -0800)]
Merge tag 'xfs-5.16-fixes-3' of git://git./fs/xfs/xfs-linux

Pull xfs fix from Darrick Wong:
 "This fixes a race between a readonly remount process and other
  processes that hold a file IOLOCK on files that previously experienced
  copy on write, that could result in severe filesystem corruption if
  the filesystem is then remounted rw.

  I think this is fairly rare (since the only reliable reproducer I have
  that fits the second criteria is the experimental xfs_scrub program),
  but the race is clear, so we still need to fix this.

  Summary:

   - Fix a data corruption vector that can result from the ro remount
     process failing to clear all speculative preallocations from files
     and the rw remount process not noticing the incomplete cleanup"

* tag 'xfs-5.16-fixes-3' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: remove all COW fork extents when remounting readonly

2 years agoMerge branch 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis...
Linus Torvalds [Sun, 12 Dec 2021 00:14:17 +0000 (16:14 -0800)]
Merge branch 'for-5.16-fixes' of git://git./linux/kernel/git/dennis/percpu

Pull percpu fixes from Dennis Zhou:
 "This contains a fix for SMP && !MMU archs for percpu which has been
  tested by arm and sh. It seems in the past they have gotten away with
  it due to mapping of vm functions to km functions, but this fell apart
  a few releases ago and was just reported recently.

  The other is just a minor dependency clean up.

  I think queued up right now by Andrew is a fix in percpu that papers
  of what seems to be a bug in hotplug for a special situation with
  memoryless nodes. Michal Hocko is digging into it further"

* 'for-5.16-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/dennis/percpu:
  percpu_ref: Replace kernel.h with the necessary inclusions
  percpu: km: ensure it is used with NOMMU (either UP or SMP)

2 years agoMerge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm...
Linus Torvalds [Sat, 11 Dec 2021 21:28:02 +0000 (13:28 -0800)]
Merge tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git./linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Prevent out-of-bounds access to per sample registers.

 - Fix NULL vs IS_ERR_OR_NULL() checking on the python binding.

 - Intel PT fixes, half of those are one-liners:
      - Fix some PGE (packet generation enable/control flow packets) usage.
      - Fix sync state when a PSB (synchronization) packet is found.
      - Fix intel_pt_fup_event() assumptions about setting state type.
      - Fix state setting when receiving overflow (OVF) packet.
      - Fix next 'err' value, walking trace.
      - Fix missing 'instruction' events with 'q' option.
      - Fix error timestamp setting on the decoder error path.

* tag 'perf-tools-fixes-for-v5.16-2021-12-11' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf python: Fix NULL vs IS_ERR_OR_NULL() checking
  perf intel-pt: Fix error timestamp setting on the decoder error path
  perf intel-pt: Fix missing 'instruction' events with 'q' option
  perf intel-pt: Fix next 'err' value, walking trace
  perf intel-pt: Fix state setting when receiving overflow (OVF) packet
  perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
  perf intel-pt: Fix sync state when a PSB (synchronization) packet is found
  perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
  perf tools: Prevent out-of-bounds access to registers

2 years agoMerge tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 11 Dec 2021 17:25:07 +0000 (09:25 -0800)]
Merge tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:
 "A few block fixes that should go into this release:

   - NVMe pull request:
        - set ana_log_size to 0 after freeing ana_log_buf (Hou Tao)
        - show subsys nqn for duplicate cntlids (Keith Busch)
        - disable namespace access for unsupported metadata (Keith
          Busch)
        - report write pointer for a full zone as zone start + zone len
          (Niklas Cassel)
        - fix use after free when disconnecting a reconnecting ctrl
          (Ruozhu Li)
        - fix a list corruption in nvmet-tcp (Sagi Grimberg)

   - Fix for a regression on DIO single bio async IO (Pavel)

   - ioprio seteuid fix (Davidlohr)

   - mtd fix that subsequently got reverted as it was broken, will get
     re-done and submitted for the next round

   - Two MD fixes via Song (Markus, zhangyue)"

* tag 'block-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
  Revert "mtd_blkdevs: don't scan partitions for plain mtdblock"
  block: fix ioprio_get(IOPRIO_WHO_PGRP) vs setuid(2)
  md: fix double free of mddev->private in autorun_array()
  md: fix update super 1.0 on rdev size change
  nvmet-tcp: fix possible list corruption for unexpected command failure
  block: fix single bio async DIO error handling
  nvme: fix use after free when disconnecting a reconnecting ctrl
  nvme-multipath: set ana_log_size to 0 after free ana_log_buf
  mtd_blkdevs: don't scan partitions for plain mtdblock
  nvme: report write pointer for a full zone as zone start + zone len
  nvme: disable namespace access for unsupported metadata
  nvme: show subsys nqn for duplicate cntlids

2 years agoMerge tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 11 Dec 2021 17:19:44 +0000 (09:19 -0800)]
Merge tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block

Pull io_uring fixes from Jens Axboe:
 "A few fixes that are all bound for stable:

   - Two syzbot reports for io-wq that turned out to be separate fixes,
     but ultimately very closely related

   - io_uring task_work running on cancelations"

* tag 'io_uring-5.16-2021-12-10' of git://git.kernel.dk/linux-block:
  io-wq: check for wq exit after adding new worker task_work
  io_uring: ensure task_work gets run as part of cancelations
  io-wq: remove spurious bit clear on task_work addition

2 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Sat, 11 Dec 2021 17:09:59 +0000 (09:09 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "Two more I2C driver bugfixes"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: mpc: Use atomic read and fix break condition
  i2c: virtio: fix completion handling

2 years agoMerge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Sat, 11 Dec 2021 17:06:08 +0000 (09:06 -0800)]
Merge tag 'clk-fixes-for-linus' of git://git./linux/kernel/git/clk/linux

Pull clk driver fixes from Stephen Boyd:

 - Fix qcom mux logic to look at the proper parent table member. Luckily
   this clk type isn't very common.

 - Don't kill clks on qcom systems that use Trion PLLs that are enabled
   out of the bootloader. We will simply skip programming the PLL rate
   if it's already done.

 - Use the proper clk_ops for the qcom sm6125 ICE clks.

 - Use module_platform_driver() in i.MX as it can be a module.

 - Fix a UAF in the versatile clk driver on an error path.

* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
  clk: versatile: clk-icst: use after free on error path
  clk: qcom: sm6125-gcc: Swap ops of ice and apps on sdcc1
  clk: imx: use module_platform_driver
  clk: qcom: clk-alpha-pll: Don't reconfigure running Trion
  clk: qcom: regmap-mux: fix parent clock lookup

2 years agoMerge tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 11 Dec 2021 16:58:04 +0000 (08:58 -0800)]
Merge tag 'devicetree-fixes-for-5.16-2' of git://git./linux/kernel/git/robh/linux

Pull devicetree fixes from Rob Herring:

 - Revert schema checks on %.dtb targets. This was problematic for some
   external build tools.

 - A few DT binding example fixes

 - Add back dropped 'enet-phy-lane-no-swap' Ethernet PHY property

 - Drop erroneous if/then schema in nxp,imx7-mipi-csi2

 - Add a quirk to fix some interrupt controllers use of 'interrupt-map'

* tag 'devicetree-fixes-for-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  Revert "kbuild: Enable DT schema checks for %.dtb targets"
  dt-bindings: bq25980: Fixup the example
  dt-bindings: input: gpio-keys: Fix interrupts in example
  dt-bindings: net: Reintroduce PHY no lane swap binding
  dt-bindings: media: nxp,imx7-mipi-csi2: Drop bad if/then schema
  of/irq: Add a quirk for controllers with their own definition of interrupt-map
  dt-bindings: iio: adc: exynos-adc: Fix node name in example

2 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Sat, 11 Dec 2021 16:46:52 +0000 (08:46 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "21 patches.

  Subsystems affected by this patch series: MAINTAINERS, mailmap, and mm
  (mlock, pagecache, damon, slub, memcg, hugetlb, and pagecache)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (21 commits)
  mm: bdi: initialize bdi_min_ratio when bdi is unregistered
  hugetlbfs: fix issue of preallocation of gigantic pages can't work
  mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
  mm/slub: fix endianness bug for alloc/free_traces attributes
  selftests/damon: split test cases
  selftests/damon: test debugfs file reads/writes with huge count
  selftests/damon: test wrong DAMOS condition ranges input
  selftests/damon: test DAMON enabling with empty target_ids case
  selftests/damon: skip test if DAMON is running
  mm/damon/vaddr-test: remove unnecessary variables
  mm/damon/vaddr-test: split a test function having >1024 bytes frame size
  mm/damon/vaddr: remove an unnecessary warning message
  mm/damon/core: remove unnecessary error messages
  mm/damon/dbgfs: remove an unnecessary error message
  mm/damon/core: use better timer mechanisms selection threshold
  mm/damon/core: fix fake load reports due to uninterruptible sleeps
  timers: implement usleep_idle_range()
  filemap: remove PageHWPoison check from next_uptodate_page()
  mailmap: update email address for Guo Ren
  MAINTAINERS: update kdump maintainers
  ...

2 years agoMerge tag 'timers-v5.16-rc4' of https://git.linaro.org/people/daniel.lezcano/linux...
Thomas Gleixner [Sat, 11 Dec 2021 12:56:30 +0000 (13:56 +0100)]
Merge tag 'timers-v5.16-rc4' of https://git.linaro.org/people/daniel.lezcano/linux into timers/urgent

Pull timer fixes from Daniel Lezcano:

  - Fix build error with clang and some kernel configuration on the
    arm64 architected timer by inlining the
    erratum_set_next_event_generic() function (Marc Zyngier)

  - Fix probe error on the dw_apb_timer_of driver by fixing the
    incorrect condition previously introduced (Alexey Sheplyakov)

Link: https://lore.kernel.org/r/429b796d-9395-4ca8-81f3-30911f80a9a9@linaro.org
2 years agoperf python: Fix NULL vs IS_ERR_OR_NULL() checking
Miaoqian Lin [Sat, 11 Dec 2021 05:38:53 +0000 (05:38 +0000)]
perf python: Fix NULL vs IS_ERR_OR_NULL() checking

The function trace_event__tp_format_id may return ERR_PTR(-ENOMEM).  Use
IS_ERR_OR_NULL to check tp_format.

Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: http://lore.kernel.org/lkml/20211211053856.19827-1-linmq006@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix error timestamp setting on the decoder error path
Adrian Hunter [Fri, 10 Dec 2021 16:23:03 +0000 (18:23 +0200)]
perf intel-pt: Fix error timestamp setting on the decoder error path

An error timestamp shows the last known timestamp for the queue, but this
is not updated on the error path. Fix by setting it.

Fixes: f4aa081949e7b6 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix missing 'instruction' events with 'q' option
Adrian Hunter [Fri, 10 Dec 2021 16:23:02 +0000 (18:23 +0200)]
perf intel-pt: Fix missing 'instruction' events with 'q' option

FUP packets contain IP information, which makes them also an 'instruction'
event in 'hop' mode i.e. the itrace 'q' option.  That wasn't happening, so
restructure the logic so that FUP events are added along with appropriate
'instruction' and 'branch' events.

Fixes: 7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix next 'err' value, walking trace
Adrian Hunter [Fri, 10 Dec 2021 16:23:01 +0000 (18:23 +0200)]
perf intel-pt: Fix next 'err' value, walking trace

Code after label 'next:' in intel_pt_walk_trace() assumes 'err' is zero,
but it may not be, if arrived at via a 'goto'. Ensure it is zero.

Fixes: 7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix state setting when receiving overflow (OVF) packet
Adrian Hunter [Fri, 10 Dec 2021 16:23:00 +0000 (18:23 +0200)]
perf intel-pt: Fix state setting when receiving overflow (OVF) packet

An overflow (OVF packet) is treated as an error because it represents a
loss of trace data, but there is no loss of synchronization, so the packet
state should be INTEL_PT_STATE_IN_SYNC not INTEL_PT_STATE_ERR_RESYNC.

To support that, some additional variables must be reset, and the FUP
packet that may follow OVF is treated as an FUP event.

Fixes: f4aa081949e7b6 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type
Adrian Hunter [Fri, 10 Dec 2021 16:22:59 +0000 (18:22 +0200)]
perf intel-pt: Fix intel_pt_fup_event() assumptions about setting state type

intel_pt_fup_event() assumes it can overwrite the state type if there has
been an FUP event, but this is an unnecessary and unexpected constraint on
callers.

Fix by touching only the state type flags that are affected by an FUP
event.

Fixes: a472e65fc490a ("perf intel-pt: Add decoder support for ptwrite and power event packets")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix sync state when a PSB (synchronization) packet is found
Adrian Hunter [Fri, 10 Dec 2021 16:22:58 +0000 (18:22 +0200)]
perf intel-pt: Fix sync state when a PSB (synchronization) packet is found

When syncing, it may be that branch packet generation is not enabled at
that point, in which case there will not immediately be a control-flow
packet, so some packets before a control flow packet turns up, get
ignored.  However, the decoder is in sync as soon as a PSB is found, so
the state should be set accordingly.

Fixes: f4aa081949e7b6 ("perf tools: Add Intel PT decoder")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage
Adrian Hunter [Fri, 10 Dec 2021 16:22:57 +0000 (18:22 +0200)]
perf intel-pt: Fix some PGE (packet generation enable/control flow packets) usage

Packet generation enable (PGE) refers to whether control flow (COFI)
packets are being produced.

PGE may be false even when branch-tracing is enabled, due to being
out-of-context, or outside a filter address range.  Fix some missing PGE
usage.

Fixes: 7c1b16ba0e26e6 ("perf intel-pt: Add support for decoding FUP/TIP only")
Fixes: 839598176b0554 ("perf intel-pt: Allow decoding with branch tracing disabled")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v5.15+
Link: https://lore.kernel.org/r/20211210162303.2288710-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoperf tools: Prevent out-of-bounds access to registers
German Gomez [Wed, 1 Dec 2021 12:33:29 +0000 (12:33 +0000)]
perf tools: Prevent out-of-bounds access to registers

The size of the cache of register values is arch-dependant
(PERF_REGS_MAX). This has the potential of causing an out-of-bounds
access in the function "perf_reg_value" if the local architecture
contains less registers than the one the perf.data file was recorded on.

Since the maximum number of registers is bound by the bitmask "u64
cache_mask", and the size of the cache when running under x86 systems is
64 already, fix the size to 64 and add a range-check to the function
"perf_reg_value" to prevent out-of-bounds access.

Reported-by: Alexandre Truong <alexandre.truong@arm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20211201123334.679131-2-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2 years agoMerge tag 'irqchip-fixes-5.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Thomas Gleixner [Sat, 11 Dec 2021 09:51:23 +0000 (10:51 +0100)]
Merge tag 'irqchip-fixes-5.16-2' of git://git./linux/kernel/git/maz/arm-platforms into irq/urgent

Pull irqchip fixes from Marc Zyngier:

 - Fix Armada-370-XP Multi-MSi allocation to be aligned on the allocation
   size, as required by the PCI spec

 - Fix aspeed-scu interrupt acknowledgement by directly writing to the
   register instead of a read-modify-write sequence

 - Use standard bitfirl helpers in the MIPS GIC driver instead of custom
   constructs

 - Fix the NVIC driver IPR register offset

 - Correctly drop the reference of the device node in the irq-bcm7120-l2
   driver

 - Fix the GICv3 ITS INVALL command by issueing a following SYNC command

 - Add a missing __init attribute to the init function of the Apple AIC
   driver

Link: https://lore.kernel.org/r/20211210133516.664497-1-maz@kernel.org
2 years agoMerge tag 'for-5.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sat, 11 Dec 2021 01:28:02 +0000 (17:28 -0800)]
Merge tag 'for-5.16-rc4-tag' of git://git./linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A few more regression fixes and stable patches, mostly one-liners.

  Regression fixes:

   - fix pointer/ERR_PTR mismatch returned from memdup_user

   - reset dedicated zoned mode relocation block group to avoid using it
     and filling it without any recourse

  Fixes:

   - handle a case to FITRIM range (also to make fstests/generic/260
     work)

   - fix warning when extent buffer state and pages get out of sync
     after an IO error

   - fix transaction abort when syncing due to missing mapping error set
     on metadata inode after inlining a compressed file

   - fix transaction abort due to tree-log and zoned mode interacting in
     an unexpected way

   - fix memory leak of additional extent data when qgroup reservation
     fails

   - do proper handling of slot search call when deleting root refs"

* tag 'for-5.16-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: replace the BUG_ON in btrfs_del_root_ref with proper error handling
  btrfs: zoned: clear data relocation bg on zone finish
  btrfs: free exchange changeset on failures
  btrfs: fix re-dirty process of tree-log nodes
  btrfs: call mapping_set_error() on btree inode with a write error
  btrfs: clear extent buffer uptodate when we fail to write it
  btrfs: fail if fstrim_range->start == U64_MAX
  btrfs: fix error pointer dereference in btrfs_ioctl_rm_dev_v2()

2 years agoMerge tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6
Linus Torvalds [Sat, 11 Dec 2021 01:24:57 +0000 (17:24 -0800)]
Merge tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs fixes from Steve French:
 "Two cifs/smb3 fixes - one for stable, the other fixes a recently
  reported NTLMSSP auth problem"

* tag '5.16-rc4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: fix ntlmssp auth when there is no key exchange
  cifs: Fix crash on unload of cifs_arc4.ko

2 years agoMerge tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux
Linus Torvalds [Sat, 11 Dec 2021 01:17:53 +0000 (17:17 -0800)]
Merge tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux

Pull nfsd fixes from Bruce Fields:
 "Fix a race on startup and another in the delegation code.

  The latter has been around for years, but I suspect recent changes may
  have widened the race window a little, so I'd like to go ahead and get
  it in"

* tag 'nfsd-5.16-2' of git://linux-nfs.org/~bfields/linux:
  nfsd: fix use-after-free due to delegation race
  nfsd: Fix nsfd startup race (again)

2 years agomm: bdi: initialize bdi_min_ratio when bdi is unregistered
Manjong Lee [Fri, 10 Dec 2021 22:47:11 +0000 (14:47 -0800)]
mm: bdi: initialize bdi_min_ratio when bdi is unregistered

Initialize min_ratio if it is set during bdi unregistration.  This can
prevent problems that may occur a when bdi is removed without resetting
min_ratio.

For example.
1) insert external sdcard
2) set external sdcard's min_ratio 70
3) remove external sdcard without setting min_ratio 0
4) insert external sdcard
5) set external sdcard's min_ratio 70 << error occur(can't set)

Because when an sdcard is removed, the present bdi_min_ratio value will
remain.  Currently, the only way to reset bdi_min_ratio is to reboot.

[akpm@linux-foundation.org: tweak comment and coding style]

Link: https://lkml.kernel.org/r/20211021161942.5983-1-mj0123.lee@samsung.com
Signed-off-by: Manjong Lee <mj0123.lee@samsung.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Changheun Lee <nanich.lee@samsung.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: <seunghwan.hyun@samsung.com>
Cc: <sookwan7.kim@samsung.com>
Cc: <yt0928.kim@samsung.com>
Cc: <junho89.kim@samsung.com>
Cc: <jisoo2146.oh@samsung.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agohugetlbfs: fix issue of preallocation of gigantic pages can't work
Zhenguo Yao [Fri, 10 Dec 2021 22:47:08 +0000 (14:47 -0800)]
hugetlbfs: fix issue of preallocation of gigantic pages can't work

Preallocation of gigantic pages can't work bacause of commit
b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter
to support node allocation").  When nid is NUMA_NO_NODE(-1),
alloc_bootmem_huge_page will always return without doing allocation.
Fix this by adding more check.

Link: https://lkml.kernel.org/r/20211129133803.15653-1-yaozhenguo1@gmail.com
Fixes: b5389086ad7b ("hugetlbfs: extend the definition of hugepages parameter to support node allocation")
Signed-off-by: Zhenguo Yao <yaozhenguo1@gmail.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Tested-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()
Waiman Long [Fri, 10 Dec 2021 22:47:05 +0000 (14:47 -0800)]
mm/memcg: relocate mod_objcg_mlstate(), get_obj_stock() and put_obj_stock()

All the calls to mod_objcg_mlstate(), get_obj_stock() and
put_obj_stock() are done by functions defined within the same "#ifdef
CONFIG_MEMCG_KMEM" compilation block.  When CONFIG_MEMCG_KMEM isn't
defined, the following compilation warnings will be issued [1] and [2].

  mm/memcontrol.c:785:20: warning: unused function 'mod_objcg_mlstate'
  mm/memcontrol.c:2113:33: warning: unused function 'get_obj_stock'

Fix these warning by moving those functions to under the same
CONFIG_MEMCG_KMEM compilation block.  There is no functional change.

[1] https://lore.kernel.org/lkml/202111272014.WOYNLUV6-lkp@intel.com/
[2] https://lore.kernel.org/lkml/202111280551.LXsWYt1T-lkp@intel.com/

Link: https://lkml.kernel.org/r/20211129161140.306488-1-longman@redhat.com
Fixes: 559271146efc ("mm/memcg: optimize user context object stock access")
Fixes: 68ac5b3c8db2 ("mm/memcg: cache vmstat data in percpu memcg_stock_pcp")
Signed-off-by: Waiman Long <longman@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Roman Gushchin <guro@fb.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/slub: fix endianness bug for alloc/free_traces attributes
Gerald Schaefer [Fri, 10 Dec 2021 22:47:02 +0000 (14:47 -0800)]
mm/slub: fix endianness bug for alloc/free_traces attributes

On big-endian s390, the alloc/free_traces attributes produce endless
output, because of always 0 idx in slab_debugfs_show().

idx is de-referenced from *v, which points to a loff_t value, with

    unsigned int idx = *(unsigned int *)v;

This will only give the upper 32 bits on big-endian, which remain 0.

Instead of only fixing this de-reference, during discussion it seemed
more appropriate to change the seq_ops so that they use an explicit
iterator in private loc_track struct.

This patch adds idx to loc_track, which will also fix the endianness
bug.

Link: https://lore.kernel.org/r/20211117193932.4049412-1-gerald.schaefer@linux.ibm.com
Link: https://lkml.kernel.org/r/20211126171848.17534-1-gerald.schaefer@linux.ibm.com
Fixes: 64dd68497be7 ("mm: slub: move sysfs slab alloc/free interfaces to debugfs")
Signed-off-by: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
Reported-by: Steffen Maier <maier@linux.ibm.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Faiyaz Mohammed <faiyazm@codeaurora.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoselftests/damon: split test cases
SeongJae Park [Fri, 10 Dec 2021 22:46:59 +0000 (14:46 -0800)]
selftests/damon: split test cases

Currently, the single test program, debugfs.sh, contains all test cases
for DAMON.  When one of the cases fails, finding which case is failed
from the test log is not so easy, and all remaining tests will be
skipped.  To improve the situation, this commit splits the single
program into small test programs having their own names.

Link: https://lkml.kernel.org/r/20211201150440.1088-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoselftests/damon: test debugfs file reads/writes with huge count
SeongJae Park [Fri, 10 Dec 2021 22:46:55 +0000 (14:46 -0800)]
selftests/damon: test debugfs file reads/writes with huge count

DAMON debugfs interface users were able to trigger warning by writing
some files with arbitrarily large 'count' parameter.  The issue is fixed
with commit db7a347b26fe ("mm/damon/dbgfs: use '__GFP_NOWARN' for
user-specified size buffer allocation").  This commit adds a test case
for the issue in DAMON selftests to avoid future regressions.

Link: https://lkml.kernel.org/r/20211201150440.1088-11-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoselftests/damon: test wrong DAMOS condition ranges input
SeongJae Park [Fri, 10 Dec 2021 22:46:52 +0000 (14:46 -0800)]
selftests/damon: test wrong DAMOS condition ranges input

A patch titled "mm/damon/schemes: add the validity judgment of
thresholds"[1] makes DAMON debugfs interface to validate DAMON scheme
inputs.  This commit adds a test case for the validation logic in DAMON
selftests.

[1] https://lore.kernel.org/linux-mm/d78360e52158d786fcbf20bc62c96785742e76d3.1637239568.git.xhao@linux.alibaba.com/

Link: https://lkml.kernel.org/r/20211201150440.1088-10-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoselftests/damon: test DAMON enabling with empty target_ids case
SeongJae Park [Fri, 10 Dec 2021 22:46:49 +0000 (14:46 -0800)]
selftests/damon: test DAMON enabling with empty target_ids case

DAMON debugfs didn't check empty targets when starting monitoring, and
the issue is fixed with commit b5ca3e83ddb0 ("mm/damon/dbgfs: add
adaptive_targets list check before enable monitor_on").  To avoid future
regression, this commit adds a test case for that in DAMON selftests.

Link: https://lkml.kernel.org/r/20211201150440.1088-9-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoselftests/damon: skip test if DAMON is running
SeongJae Park [Fri, 10 Dec 2021 22:46:46 +0000 (14:46 -0800)]
selftests/damon: skip test if DAMON is running

Testing the DAMON debugfs files while DAMON is running makes no sense,
as any write to the debugfs files will fail.  This commit makes the test
be skipped in this case.

Link: https://lkml.kernel.org/r/20211201150440.1088-8-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/vaddr-test: remove unnecessary variables
SeongJae Park [Fri, 10 Dec 2021 22:46:43 +0000 (14:46 -0800)]
mm/damon/vaddr-test: remove unnecessary variables

A couple of test functions in DAMON virtual address space monitoring
primitives implementation has unnecessary damon_ctx variables.  This
commit removes those.

Link: https://lkml.kernel.org/r/20211201150440.1088-7-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/vaddr-test: split a test function having >1024 bytes frame size
SeongJae Park [Fri, 10 Dec 2021 22:46:40 +0000 (14:46 -0800)]
mm/damon/vaddr-test: split a test function having >1024 bytes frame size

On some configuration[1], 'damon_test_split_evenly()' kunit test
function has >1024 bytes frame size, so below build warning is
triggered:

      CC      mm/damon/vaddr.o
    In file included from mm/damon/vaddr.c:672:
    mm/damon/vaddr-test.h: In function 'damon_test_split_evenly':
    mm/damon/vaddr-test.h:309:1: warning: the frame size of 1064 bytes is larger than 1024 bytes [-Wframe-larger-than=]
      309 | }
          | ^

This commit fixes the warning by separating the common logic in the
function.

[1] https://lore.kernel.org/linux-mm/202111182146.OV3C4uGr-lkp@intel.com/

Link: https://lkml.kernel.org/r/20211201150440.1088-6-sj@kernel.org
Fixes: 17ccae8bb5c9 ("mm/damon: add kunit tests")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/vaddr: remove an unnecessary warning message
SeongJae Park [Fri, 10 Dec 2021 22:46:37 +0000 (14:46 -0800)]
mm/damon/vaddr: remove an unnecessary warning message

The DAMON virtual address space monitoring primitive prints a warning
message for wrong DAMOS action.  However, it is not essential as the
code returns appropriate failure in the case.  This commit removes the
message to make the log clean.

Link: https://lkml.kernel.org/r/20211201150440.1088-5-sj@kernel.org
Fixes: 6dea8add4d28 ("mm/damon/vaddr: support DAMON-based Operation Schemes")
Signed-off-by: SeongJae Park <sj@kernel.org>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/core: remove unnecessary error messages
SeongJae Park [Fri, 10 Dec 2021 22:46:34 +0000 (14:46 -0800)]
mm/damon/core: remove unnecessary error messages

DAMON core prints error messages when damon_target object creation is
failed or wrong monitoring attributes are given.  Because appropriate
error code is returned for each case, the messages are not essential.
Also, because the code path can be triggered with user-specified input,
this could result in kernel log mistakenly being messy.  To avoid the
case, this commit removes the messages.

Link: https://lkml.kernel.org/r/20211201150440.1088-4-sj@kernel.org
Fixes: 4bc05954d007 ("mm/damon: implement a debugfs-based user space interface")
Fixes: b9a6ac4e4ede ("mm/damon: adaptively adjust regions")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: kernel test robot <lkp@intel.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/dbgfs: remove an unnecessary error message
SeongJae Park [Fri, 10 Dec 2021 22:46:31 +0000 (14:46 -0800)]
mm/damon/dbgfs: remove an unnecessary error message

When wrong scheme action is requested via the debugfs interface, DAMON
prints an error message.  Because the function returns error code, this
is not really needed.  Because the code path is triggered by the user
specified input, this can result in kernel log mistakenly being messy.
To avoid the case, this commit removes the message.

Link: https://lkml.kernel.org/r/20211201150440.1088-3-sj@kernel.org
Fixes: af122dd8f3c0 ("mm/damon/dbgfs: support DAMON-based Operation Schemes")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/core: use better timer mechanisms selection threshold
SeongJae Park [Fri, 10 Dec 2021 22:46:28 +0000 (14:46 -0800)]
mm/damon/core: use better timer mechanisms selection threshold

Patch series "mm/damon: Trivial fixups and improvements".

This patchset contains trivial fixups and improvements for DAMON and its
kunit/kselftest tests.

This patch (of 11):

DAMON is using hrtimer if requested sleep time is <=100ms, while the
suggested threshold[1] is <=20ms.  This commit applies the threshold.

[1] Documentation/timers/timers-howto.rst

Link: https://lkml.kernel.org/r/20211201150440.1088-2-sj@kernel.org
Fixes: ee801b7dd7822 ("mm/damon/schemes: activate schemes based on a watermarks mechanism")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomm/damon/core: fix fake load reports due to uninterruptible sleeps
SeongJae Park [Fri, 10 Dec 2021 22:46:25 +0000 (14:46 -0800)]
mm/damon/core: fix fake load reports due to uninterruptible sleeps

Because DAMON sleeps in uninterruptible mode, /proc/loadavg reports fake
load while DAMON is turned on, though it is doing nothing.  This can
confuse users[1].  To avoid the case, this commit makes DAMON sleeps in
idle mode.

[1] https://lore.kernel.org/all/11868371.O9o76ZdvQC@natalenko.name/

Link: https://lkml.kernel.org/r/20211126145015.15862-3-sj@kernel.org
Fixes: 2224d8485492 ("mm: introduce Data Access MONitor (DAMON)")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Signed-off-by: SeongJae Park <sj@kernel.org>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agotimers: implement usleep_idle_range()
SeongJae Park [Fri, 10 Dec 2021 22:46:22 +0000 (14:46 -0800)]
timers: implement usleep_idle_range()

Patch series "mm/damon: Fix fake /proc/loadavg reports", v3.

This patchset fixes DAMON's fake load report issue.  The first patch
makes yet another variant of usleep_range() for this fix, and the second
patch fixes the issue of DAMON by making it using the newly introduced
function.

This patch (of 2):

Some kernel threads such as DAMON could need to repeatedly sleep in
micro seconds level.  Because usleep_range() sleeps in uninterruptible
state, however, such threads would make /proc/loadavg reports fake load.

To help such cases, this commit implements a variant of usleep_range()
called usleep_idle_range().  It is same to usleep_range() but sets the
state of the current task as TASK_IDLE while sleeping.

Link: https://lkml.kernel.org/r/20211126145015.15862-1-sj@kernel.org
Link: https://lkml.kernel.org/r/20211126145015.15862-2-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Cc: John Stultz <john.stultz@linaro.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agofilemap: remove PageHWPoison check from next_uptodate_page()
Matthew Wilcox (Oracle) [Fri, 10 Dec 2021 22:46:18 +0000 (14:46 -0800)]
filemap: remove PageHWPoison check from next_uptodate_page()

Pages are individually marked as suffering from hardware poisoning.
Checking that the head page is not hardware poisoned doesn't make
sense; we might be after a subpage.  We check each page individually
before we use it, so this was an optimisation gone wrong.  It will
cause us to fall back to the slow path when there was no need to do
that

Link: https://lkml.kernel.org/r/20211120174429.2596303-1-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agomailmap: update email address for Guo Ren
Guo Ren [Fri, 10 Dec 2021 22:46:15 +0000 (14:46 -0800)]
mailmap: update email address for Guo Ren

The ren_guo@c-sky.com would be deprecated and use guoren@kernel.org as the
main email address.

Link: https://lkml.kernel.org/r/20211123022741.545541-1-guoren@kernel.org
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMAINTAINERS: update kdump maintainers
Dave Young [Fri, 10 Dec 2021 22:46:12 +0000 (14:46 -0800)]
MAINTAINERS: update kdump maintainers

Remove myself from kdump maintainers as I have no enough time to maintain
it now.  But I can review patches on demand though.

Link: https://lkml.kernel.org/r/YZyKilzKFsWJYdgn@dhcp-128-65.nay.redhat.com
Signed-off-by: Dave Young <dyoung@redhat.com>
Acked-by: Baoquan He <bhe@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoIncrease default MLOCK_LIMIT to 8 MiB
Drew DeVault [Fri, 10 Dec 2021 22:46:09 +0000 (14:46 -0800)]
Increase default MLOCK_LIMIT to 8 MiB

This limit has not been updated since 2008, when it was increased to 64
KiB at the request of GnuPG.  Until recently, the main use-cases for this
feature were (1) preventing sensitive memory from being swapped, as in
GnuPG's use-case; and (2) real-time use-cases.  In the first case, little
memory is called for, and in the second case, the user is generally in a
position to increase it if they need more.

The introduction of IOURING_REGISTER_BUFFERS adds a third use-case:
preparing fixed buffers for high-performance I/O.  This use-case will take
as much of this memory as it can get, but is still limited to 64 KiB by
default, which is very little.  This increases the limit to 8 MB, which
was chosen fairly arbitrarily as a more generous, but still conservative,
default value.

It is also possible to raise this limit in userspace.  This is easily
done, for example, in the use-case of a network daemon: systemd, for
instance, provides for this via LimitMEMLOCK in the service file; OpenRC
via the rc_ulimit variables.  However, there is no established userspace
facility for configuring this outside of daemons: end-user applications do
not presently have access to a convenient means of raising their limits.

The buck, as it were, stops with the kernel.  It's much easier to address
it here than it is to bring it to hundreds of distributions, and it can
only realistically be relied upon to be high-enough by end-user software
if it is more-or-less ubiquitous.  Most distros don't change this
particular rlimit from the kernel-supplied default value, so a change here
will easily provide that ubiquity.

Link: https://lkml.kernel.org/r/20211028080813.15966-1-sir@cmpwn.com
Signed-off-by: Drew DeVault <sir@cmpwn.com>
Acked-by: Jens Axboe <axboe@kernel.dk>
Acked-by: Cyril Hrubis <chrubis@suse.cz>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Andrew Dona-Couch <andrew@donacou.ch>
Cc: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2 years agoMerge tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Sat, 11 Dec 2021 01:02:46 +0000 (17:02 -0800)]
Merge tag 'thermal-5.16-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull thermal control fix from Rafael Wysocki:
 "Fix the definition of one of the Tiger Lake MMIO registers in the
  int340x thermal driver (Sumeet Pawnikar)"

* tag 'thermal-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal: int340x: Fix VCoRefLow MMIO bit offset for TGL

2 years agoMerge tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Fri, 10 Dec 2021 22:43:16 +0000 (14:43 -0800)]
Merge tag 'acpi-5.16-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull ACPI fix from Rafael Wysocki:
 "Create the output directory for the ACPI tools during build if it has
  not been present before and prevent the compilation from failing in
  that case (Chen Yu)"

* tag 'acpi-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI: tools: Fix compilation when output directory is not present

2 years agoMerge tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Linus Torvalds [Fri, 10 Dec 2021 22:36:06 +0000 (14:36 -0800)]
Merge tag 'pm-5.16-rc5' of git://git./linux/kernel/git/rafael/linux-pm

Pull power management fix from Rafael Wysocki:
 "Fix a kernedoc comment that doesn't match the behavior of the function
  documented by it"

* tag 'pm-5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  PM: runtime: Fix pm_runtime_active() kerneldoc comment

2 years agoMerge tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Fri, 10 Dec 2021 22:31:45 +0000 (14:31 -0800)]
Merge tag 'hwmon-for-v5.16-rc5' of git://git./linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - In the pwm-fan driver, ensure that the internal pwm state matches the
   state assumed by the pwm code.

 - Avoid EREMOTEIO errors in sht4 driver

 - In the nct6775 driver, make it explicit that the register value
   passed to nct6775_asuswmi_read() is an 8-bit value

 - Avoid WARNing in dell-smm driver removal after failing to create
   /proc/i8k

 - Stop using a plain integer as NULL pointer in corsair-psu driver

* tag 'hwmon-for-v5.16-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: (pwm-fan) Ensure the fan going on in .probe()
  hwmon: (sht4x) Fix EREMOTEIO errors
  hwmon: (nct6775) mask out bank number in nct6775_wmi_read_value()
  hwmon: (dell-smm) Fix warning on /proc/i8k creation error
  hwmon: (corsair-psu) fix plain integer used as NULL pointer

2 years agoMerge tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt...
Linus Torvalds [Fri, 10 Dec 2021 22:24:05 +0000 (14:24 -0800)]
Merge tag 'trace-v5.16-rc4' of git://git./linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "Tracing, ftrace and tracefs fixes:

   - Have tracefs honor the gid mount option

   - Have new files in tracefs inherit the parent ownership

   - Have direct_ops unregister when it has no more functions

   - Properly clean up the ops when unregistering multi direct ops

   - Add a sample module to test the multiple direct ops

   - Fix memory leak in error path of __create_synth_event()"

* tag 'trace-v5.16-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Fix possible memory leak in __create_synth_event() error path
  ftrace/samples: Add module to test multi direct modify interface
  ftrace: Add cleanup to unregister_ftrace_direct_multi
  ftrace: Use direct_ops hash in unregister_ftrace_direct
  tracefs: Set all files to the same group ownership as the mount option
  tracefs: Have new files inherit the ownership of their parent

2 years agoMerge tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebigg...
Linus Torvalds [Fri, 10 Dec 2021 22:15:39 +0000 (14:15 -0800)]
Merge tag 'aio-poll-for-linus' of git://git./linux/kernel/git/ebiggers/linux

Pull aio poll fixes from Eric Biggers:
 "Fix three bugs in aio poll, and one issue with POLLFREE more broadly:

   - aio poll didn't handle POLLFREE, causing a use-after-free.

   - aio poll could block while the file is ready.

   - aio poll called eventfd_signal() when it isn't allowed.

   - POLLFREE didn't handle multiple exclusive waiters correctly.

  This has been tested with the libaio test suite, as well as with test
  programs I wrote that reproduce the first two bugs. I am sending this
  pull request myself as no one seems to be maintaining this code"

* tag 'aio-poll-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  aio: Fix incorrect usage of eventfd_signal_allowed()
  aio: fix use-after-free due to missing POLLFREE handling
  aio: keep poll requests on waitqueue until completed
  signalfd: use wake_up_pollfree()
  binder: use wake_up_pollfree()
  wait: add wake_up_pollfree()

2 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 10 Dec 2021 22:09:12 +0000 (14:09 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "More x86 fixes:

   - Logic bugs in CR0 writes and Hyper-V hypercalls

   - Don't use Enlightened MSR Bitmap for L3

   - Remove user-triggerable WARN

  Plus a few selftest fixes and a regression test for the
  user-triggerable WARN"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  selftests: KVM: Add test to verify KVM doesn't explode on "bad" I/O
  KVM: x86: Don't WARN if userspace mucks with RCX during string I/O exit
  KVM: X86: Raise #GP when clearing CR0_PG in 64 bit mode
  selftests: KVM: avoid failures due to reserved HyperTransport region
  KVM: x86: Ignore sparse banks size for an "all CPUs", non-sparse IPI req
  KVM: x86: Wait for IPIs to be delivered when handling Hyper-V TLB flush hypercall
  KVM: x86: selftests: svm_int_ctl_test: fix intercept calculation
  KVM: nVMX: Don't use Enlightened MSR Bitmap for L3

2 years agoi2c: mpc: Use atomic read and fix break condition
Chris Packham [Tue, 7 Dec 2021 04:21:44 +0000 (17:21 +1300)]
i2c: mpc: Use atomic read and fix break condition

Maxime points out that the polling code in mpc_i2c_isr should use the
_atomic API because it is called in an irq context and that the
behaviour of the MCF bit is that it is 1 when the byte transfer is
complete. All of this means the original code was effectively a
udelay(100).

Fix this by using readb_poll_timeout_atomic() and removing the negation
of the break condition.

Fixes: 4a8ac5e45cda ("i2c: mpc: Poll for MCF")
Reported-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Tested-by: Maxime Bizon <mbizon@freebox.fr>
Signed-off-by: Wolfram Sang <wsa@kernel.org>
2 years agoio-wq: check for wq exit after adding new worker task_work
Jens Axboe [Fri, 10 Dec 2021 15:29:30 +0000 (08:29 -0700)]
io-wq: check for wq exit after adding new worker task_work

We check IO_WQ_BIT_EXIT before attempting to create a new worker, and
wq exit cancels pending work if we have any. But it's possible to have
a race between the two, where creation checks exit finding it not set,
but we're in the process of exiting. The exit side will cancel pending
creation task_work, but there's a gap where we add task_work after we've
canceled existing creations at exit time.

Fix this by checking the EXIT bit post adding the creation task_work.
If it's set, run the same cancelation that exit does.

Reported-and-tested-by: syzbot+b60c982cb0efc5e05a47@syzkaller.appspotmail.com
Reviewed-by: Hao Xu <haoxu@linux.alibaba.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2 years agoio_uring: ensure task_work gets run as part of cancelations
Jens Axboe [Thu, 9 Dec 2021 15:54:29 +0000 (08:54 -0700)]
io_uring: ensure task_work gets run as part of cancelations

If we successfully cancel a work item but that work item needs to be
processed through task_work, then we can be sleeping uninterruptibly
in io_uring_cancel_generic() and never process it. Hence we don't
make forward progress and we end up with an uninterruptible sleep
warning.

While in there, correct a comment that should be IFF, not IIF.

Reported-and-tested-by: syzbot+21e6887c0be14181206d@syzkaller.appspotmail.com
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>