OSDN Git Service

sagit-ice-cold/kernel_xiaomi_msm8998.git
Sultan Alsawaf [Sat, 20 Jul 2019 17:05:17 +0000 (10:05 -0700)]
treewide: Fix code issues detected using GCC 8

Use the latest version of GCC to take advantage of improved static
analysis. These issues appeared as warnings from the compiler.

Many of these fixes are for clearly incorrect code; compiler warnings
should not be taken lightly.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
[wight554: fix caf tree conflicts]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
Srinivas Girigowda [Thu, 3 Aug 2017 19:35:52 +0000 (12:35 -0700)]
cfg80211: Support backport of removing ieee80211

Bug: 62058353
Change-Id: Id8725947048bb4ba461dcb77b7b9023991a304be
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
wloot [Wed, 8 May 2019 03:11:00 +0000 (11:11 +0800)]
soc: qpnp-haptic: Fix indentation

drivers/soc/qcom/qpnp-haptic.c: In function ‘qpnp_haptic_probe’:
drivers/soc/qcom/qpnp-haptic.c:2785:2: error: this ‘if’ clause does not guard... [-Werror=misleading-indentation]
  if (!hap)
  ^~
drivers/soc/qcom/qpnp-haptic.c:2787:3: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
   hap->regmap = dev_get_regmap(pdev->dev.parent, NULL);
   ^~~
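For reference, a minimal sketch of the pattern GCC 8 flags here and the
indentation fix (a hypothetical simplification, not the actual driver code):

  /* before: the second statement only looks guarded by the if */
  if (!hap)
      return -ENODEV;
      hap->regmap = dev_get_regmap(pdev->dev.parent, NULL);

  /* after: indentation now matches the real control flow */
  if (!hap)
      return -ENODEV;
  hap->regmap = dev_get_regmap(pdev->dev.parent, NULL);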

wloot [Sat, 20 Apr 2019 12:56:26 +0000 (20:56 +0800)]
soc: qpnp-haptic: Import xiaomi changes

0ranko0P [Wed, 4 Dec 2019 15:10:48 +0000 (23:10 +0800)]
Revert "qpnp-haptic: expose vibrate function"

This reverts commit b95fe5ebab364968845590335e5226d1c2363d30.

Xiaonian Wang [Fri, 15 Apr 2016 11:34:12 +0000 (19:34 +0800)]
synaptics_dsx_force: do not reinit the device upon suspend/resume

There is no need to reinit the device when a suspend, resume, or even a
spontaneous reset is detected; touch will recover by itself.

CRs-Fixed: 1003951
Change-Id: Ifb5b134d0fbeb2f55f16af8806abb9c8e51c35e0

wloot [Thu, 11 Jul 2019 09:28:04 +0000 (17:28 +0800)]
ARM: dts: xiaomi: update from pie oss

wloot [Sat, 1 Jun 2019 11:32:59 +0000 (19:32 +0800)]
ARM: dts: xiaomi: Update sagit vibrator tweaks from stock pie dtb

Vol Zhdanov [Tue, 29 Jan 2019 16:17:55 +0000 (16:17 +0000)]
camera_v2: flash: import xiaomi changes

* needed for torch to function

Change-Id: I0a8186300630812698d055e8aa326edc370aed4e

Alex Naidis [Thu, 13 Jul 2017 10:47:51 +0000 (12:47 +0200)]
ARM: dts: xiaomi: Disable UART connector

This disables the uartblsp2dm1 device as it
is obsolete.

Change-Id: Icd1e0670b47d59e6c5f65c5d23bd8b1aebda90b6
Signed-off-by: Alex Naidis <alex.naidis@linux.com>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
[wight554: adapted for xiaomi msm8998 dts]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
Julian Liu [Mon, 16 Sep 2019 09:36:43 +0000 (17:36 +0800)]
power: Silence few logspam

Julian Liu [Fri, 6 Sep 2019 10:29:14 +0000 (18:29 +0800)]
power: nuke xiaomi screen on limiting

Julian Liu [Fri, 6 Sep 2019 08:37:35 +0000 (16:37 +0800)]
qpnp-fg-gen3: disable debug logs

Julian Liu [Fri, 6 Sep 2019 06:46:35 +0000 (14:46 +0800)]
power: re-apply some patchs

Julian Liu [Fri, 6 Sep 2019 06:34:06 +0000 (14:34 +0800)]
power: import xiaomi changes

wloot [Wed, 17 Apr 2019 05:45:52 +0000 (13:45 +0800)]
sound: usb-headset: Remove debug outputs

wloot [Thu, 2 May 2019 14:29:15 +0000 (22:29 +0800)]
dts: xiaomi: msm8998: Enable partial update for cmd panels

Sultan Alsawaf [Sun, 13 May 2018 21:20:02 +0000 (14:20 -0700)]
dts: xiaomi: msm8998: enable qcom,panel-allow-phy-poweroff

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Change-Id: Ib3a3b694f2cb76f4cec7f40bcb785e7dd25df568

wloot [Mon, 6 May 2019 03:16:03 +0000 (11:16 +0800)]
HACK: media: v4l2-ioctl.c: Remove a kernel WARN

* REVERTME: Need sagit pie oss

Qiwu Huang [Mon, 21 Jan 2019 05:28:31 +0000 (13:28 +0800)]
ASoC: msm: qdsp6v2: Fix sound break after 8.12.28 dev

Change-Id: Icfc7b5c62a28930fe5ffe56acd03c3c5d7743e79

wloot [Fri, 3 May 2019 14:56:11 +0000 (22:56 +0800)]
fs: proc: Import cpumaxfreq

wloot [Fri, 29 Mar 2019 11:19:43 +0000 (19:19 +0800)]
dts: qcom: xiaomi: Add support for SNfuse

* Import from Xiaomi oss
* Needed by serial_num driver

TheStrix [Thu, 1 Mar 2018 10:19:16 +0000 (15:49 +0530)]
drivers: soc: qcom: Import serial_num

Change-Id: Iaa40b75ac7494b32185868ca2994600501e38471

wloot [Fri, 8 Mar 2019 08:50:30 +0000 (16:50 +0800)]
dts: xiaomi: msm8998: Add missed vibrator properties back

Nathan Chancellor [Thu, 9 May 2019 11:48:25 +0000 (04:48 -0700)]
kbuild: Don't try to add '-fcatch-undefined-behavior' flag

This is no longer a valid option in clang; it was removed in 3.5, which
we don't support.

https://github.com/llvm/llvm-project/commit/cb3f812b6b9fab8f3b41414f24e90222170417b4

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Danny Lin [Sun, 1 Sep 2019 08:28:10 +0000 (01:28 -0700)]
vfs: Bump max inline dirent name size

Many names exceed 32 bytes, but fit in 96 bytes. This keeps the 64-byte
alignment and bumps the total dentry struct size to 256 bytes.

Inspired by: https://github.com/arter97/android_kernel_oneplus_sm8150/commit/1dd11b364e600

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
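A sketch of the knob being bumped (assuming the usual 64-bit dcache layout;
surrounding fields omitted):

  /* include/linux/dcache.h */
  #ifdef CONFIG_64BIT
  # define DNAME_INLINE_LEN 96    /* was 32; keeps the name 64-byte aligned */
  #endif

  struct dentry {
      /* ... */
      unsigned char d_iname[DNAME_INLINE_LEN];    /* small names stored inline */
  };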
Park Ju Hyung [Thu, 20 Jun 2019 18:53:36 +0000 (03:53 +0900)]
mm: backport speculative page fault

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
celtare21 [Tue, 27 Aug 2019 15:42:05 +0000 (15:42 +0000)]
clk: mdss: Separate 8996 platform from 8998

 * Also fix missing break.

Signed-off-by: celtare21 <celtare21@gmail.com>
Alex Naidis [Sun, 12 Feb 2017 16:04:45 +0000 (17:04 +0100)]
devfreq: Avoid competing with low-priority tasks

The work for load monitoring and reevaluation
can compete with low-priority tasks under high
sched pressure.

This patch makes the WQ a high priority WQ
to ensure that the work isn't delayed
under high sched pressure.

Signed-off-by: Alex Naidis <alex.naidis@linux.com>
Signed-off-by: Raphiel Rollerscaperers <raphielscape@outlook.com>
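A sketch of the workqueue change this describes (the exact flag set is an
assumption about this tree, not quoted from the patch):

  /* drivers/devfreq/devfreq.c */
  devfreq_wq = alloc_workqueue("devfreq_wq",
                               WQ_HIGHPRI | WQ_UNBOUND | WQ_FREEZABLE |
                               WQ_MEM_RECLAIM, 0);
  /* was: devfreq_wq = create_freezable_workqueue("devfreq_wq"); */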
Fieah Lim [Mon, 10 Sep 2018 21:47:25 +0000 (05:47 +0800)]
cpuidle: enter_state: Don't needlessly calculate diff time

Currently, ktime_us_delta() is invoked unconditionally to compute the
idle residency of the CPU, but it only makes sense to do that if a
valid idle state has been entered, so move the ktime_us_delta()
invocation after the entered_state >= 0 check.

While at it, merge two comment blocks in there into one and drop
a space between type casting of diff.

This patch has no functional changes.

Signed-off-by: Fieah Lim <kw@fieahl.im>
[ rjw: Changelog cleanup, comment format fix ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
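A minimal sketch of the reordering (a hypothetical simplification of
cpuidle_enter_state(); variable names assumed from context):

  if (entered_state >= 0) {
      /* compute residency only once we know a valid state was entered */
      s64 diff = ktime_us_delta(time_end, time_start);

      if (diff > INT_MAX)
          diff = INT_MAX;
      dev->last_residency = (int)diff;
      dev->states_usage[entered_state].time += dev->last_residency;
      dev->states_usage[entered_state].usage++;
  } else {
      dev->last_residency = 0;
  }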
Julian Liu [Wed, 28 Aug 2019 09:34:41 +0000 (17:34 +0800)]
PM / sleep: Skip OOM killer toggles

https://github.com/kdrag0n/proton_zf6/commit/f953b977fba811023907b4c34829df88f9ee6bb9

Tim Murray [Wed, 3 Jan 2018 23:50:10 +0000 (15:50 -0800)]
mm: skip swap readahead when process is exiting

It's possible for a user fault to be triggered during task exit that
results in swap readahead, which is not useful. Skip swap readahead
if the current process is exiting.

Change-Id: I5fad20ebdcc616af732254705726d395eb118cbe
Signed-off-by: Tim Murray <timmurray@google.com>
I know swap is not used with my kernel.
Just picking for future references.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
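A sketch of the check being added (its exact placement in the readahead path
is an assumption, not taken from the patch):

  /* mm/swap_state.c */
  struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
                                struct vm_area_struct *vma, unsigned long addr)
  {
      /* dying task: skip readahead and fetch only the faulting page */
      if (unlikely(current->flags & PF_EXITING))
          return read_swap_cache_async(entry, gfp_mask, vma, addr);
      /* ... normal readahead path ... */
  }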
Jan Beulich [Mon, 3 Sep 2018 12:09:34 +0000 (06:09 -0600)]
kallsyms: reduce size a little on 64-bit

Both kallsyms_num_syms and kallsyms_markers[] don't really need to use
unsigned long as their (base) types; unsigned int fully suffices.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
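A sketch of the type narrowing (declaration attributes simplified):

  /* kernel/kallsyms.c: unsigned int is plenty for both tables */
  extern const unsigned int kallsyms_num_syms __weak;
  extern const unsigned int kallsyms_markers[] __weak;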
nixiaoming [Thu, 24 May 2018 03:16:12 +0000 (11:16 +0800)]
scripts: Fixed printf format mismatch

scripts/kallsyms.c: function write_src:
"printf", the #1 format specifier "d" need arg type "int",
but the according arg "table_cnt" has type "unsigned int"

scripts/recordmcount.c: function do_file:
"fprintf", the #1 format specifier "d" need arg type "int",
but the according arg "(*w2)(ehdr->e_machine)" has type "unsigned int"

scripts/recordmcount.h: function find_secsym_ndx:
"fprintf", the #1 format specifier "d" need arg type "int",
but the according arg "txtndx" has type "unsigned int"

Signed-off-by: nixiaoming <nixiaoming@huawei.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
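A stand-alone illustration of the class of fix (hypothetical variable, same
idea as the three cases above):

  #include <stdio.h>

  int main(void)
  {
      unsigned int table_cnt = 42;    /* plays the role of the unsigned args above */
      printf("%u\n", table_cnt);      /* was "%d", which expects a signed int */
      return 0;
  }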
Mathias Krause [Sun, 30 Dec 2018 12:36:00 +0000 (13:36 +0100)]
kallsyms: lower alignment on ARM

As mentioned in the info pages of gas, the '.align' pseudo op's
interpretation of the alignment value is architecture specific.
It might either be a byte value or taken to the power of two.

On ARM it's actually the latter which leads to unnecessary large
alignments of 16 bytes for 32 bit builds or 256 bytes for 64 bit
builds.

Fix this by switching to '.balign' instead which is consistent
across all architectures.

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
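A sketch of the directive swap where the kallsyms tables are emitted (the
alignment value shown is illustrative):

  /* was: printf("\t.align 8\n");  on ARM/arm64 gas reads that as 2^8 bytes */
  printf("\t.balign 8\n");          /* .balign is a plain byte count everywhere */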
Danny Lin [Fri, 2 Aug 2019 07:33:26 +0000 (00:33 -0700)]
tcp: fastopen: Enable support for TCP fast open when serving

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Danny Lin [Mon, 8 Apr 2019 06:39:12 +0000 (23:39 -0700)]
uid_sys_stats: Remove dependency on the profiling subsystem

Now that we have a simple task exit notifier system to notify
uid_sys_stats when tasks exit independently of the profiling subsystem,
remove this unnecessary dependency.

Test: /proc/uid_cputime shows valid stats with profiling disabled
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Danny Lin [Mon, 8 Apr 2019 06:34:36 +0000 (23:34 -0700)]
profiling: Implement a simple task exit notifier when disabled

Some kernel code currently depends on the profiling subsystem solely for
its task exit notifier. The rest of the profiling subsystem is left
unused.

Add a simple task exit notifier implemented in kernel/exit.c to make
such code work properly even when profiling is disabled. This will allow
unnecessary profiling code to be disabled without breakage.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
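A minimal sketch of what such a notifier can look like (names are hypothetical,
not taken from the patch; it mirrors the profiling subsystem's chain):

  /* kernel/exit.c */
  static BLOCKING_NOTIFIER_HEAD(task_exit_notifier);

  int task_exit_register(struct notifier_block *nb)
  {
      return blocking_notifier_chain_register(&task_exit_notifier, nb);
  }

  void do_exit(long code)
  {
      /* ... */
      blocking_notifier_call_chain(&task_exit_notifier, 0, current);
      /* ... */
  }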
Masahiro Yamada [Thu, 14 Sep 2017 11:40:04 +0000 (20:40 +0900)]
arm64: remove unneeded copy to init_utsname()->machine

As you see in init/version.c, init_uts_ns.name.machine is initially
set to UTS_MACHINE.  There is no point to copy the same string.

I dug the git history to figure out why this line is here.  My best
guess is like this:

 - This line has been around here since the initial support of arm64
   by commit 9703d9d7f77c ("arm64: Kernel booting and initialisation").
   If ARCH (=arm64) and UTS_MACHINE (=aarch64) do not match,
   arch/$(ARCH)/Makefile is supposed to override UTS_MACHINE, but the
   initial version of arch/arm64/Makefile missed to do that.  Instead,
   the boot code copied "aarch64" to init_utsname()->machine.

 - Commit 94ed1f2cb5d4 ("arm64: setup: report ELF_PLATFORM as the
   machine for utsname") replaced "aarch64" with ELF_PLATFORM to
   make "uname" to reflect the endianness.

 - ELF_PLATFORM does not help to provide the UTS machine name to rpm
   target, so commit cfa88c79462d ("arm64: Set UTS_MACHINE in the
   Makefile") fixed it.  The commit simply replaced ELF_PLATFORM with
   UTS_MACHINE, but missed the fact the string copy itself is no longer
   needed.

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Jason A. Donenfeld [Sat, 23 Dec 2017 00:43:23 +0000 (01:43 +0100)]
arm64: support __int128 with clang

Commit fb8722735f50 ("arm64: support __int128 on gcc 5+") added support
for arm64 __int128 with gcc with a version-conditional, but neglected to
enable this for clang, which in fact appears to support aarch64 __int128.
This commit therefore enables it if the compiler is clang, using the
same type of makefile conditional used elsewhere in the tree.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Steven Rostedt (VMware) [Thu, 24 May 2018 22:49:46 +0000 (18:49 -0400)]
rcu: Speed up calling of RCU tasks callbacks

Joel Fernandes found that the synchronize_rcu_tasks() was taking a
significant amount of time. He demonstrated it with the following test:

 # cd /sys/kernel/tracing
 # while [ 1 ]; do x=1; done &
 # echo '__schedule_bug:traceon' > set_ftrace_filter
 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real 0m1.064s
user 0m0.000s
sys 0m0.004s

Where it takes a little over a second to perform the synchronize,
because there's a loop that waits 1 second at a time for tasks to get
through their quiescent points when there's a task that must be waited
for.

After discussion we came up with a simple way to wait for holdouts but
increase the time for each iteration of the loop but no more than a
full second.

With the new patch we have:

 # time echo '!__schedule_bug:traceon' > set_ftrace_filter;

real 0m0.131s
user 0m0.000s
sys 0m0.004s

Which drops it down to 13% of what the original wait time was.

Link: http://lkml.kernel.org/r/20180523063815.198302-2-joel@joelfernandes.org
Reported-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Suggested-by: Joel Fernandes (Google) <joel@joelfernandes.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
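A sketch of the backoff idea described above (simplified; the real loop also
rescans and trims the holdout list each pass):

  int fract = 10;    /* first pass waits HZ/10 instead of a full second */

  while (!list_empty(&rcu_tasks_holdouts)) {
      schedule_timeout_interruptible(HZ / fract);
      if (fract > 1)
          fract--;    /* back off toward one-second waits */
      /* ... check_holdout_task() on each remaining task ... */
  }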
Omar Sandoval [Wed, 9 May 2018 09:08:51 +0000 (02:08 -0700)]
BACKPORT: block: use ktime_get_ns() instead of sched_clock() for cfq

cfq and bfq have some internal fields that use sched_clock() which can
trivially use ktime_get_ns() instead. Their timestamp fields in struct
request can also use ktime_get_ns(), which resolves the 8 year old
comment added by commit 28f4197e5d47 ("block: disable preemption before
using sched_clock()").

Signed-off-by: Omar Sandoval <osandov@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(cherry-picked from 84c7afcebed913c93d50f116b046b7f0d8ec0cdc)
[nc: only backport cfq changes]
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Harsh Shandilya <msfjarvis@gmail.com>
kdrag0n [Sun, 2 Dec 2018 05:07:00 +0000 (21:07 -0800)]
arm64: only pass -mpc-relative-literal-loads if 843419 fix is enabled

Signed-off-by: kdrag0n <dragon@khronodragon.com>
Wang Han [Sat, 9 Jun 2018 05:36:57 +0000 (22:36 -0700)]
fs: sync: Avoid calling fdget without fdput

When adding fsync support, we check for fsync_enabled() in several
cases, but it appears that we should fdput() after every fdget(); the
current code just checks fsync_enabled and returns directly in some
cases after calling fdget().

Fix it by checking fsync_enabled first and moving the f initialization
after the check. Also remove some unnecessary fsync_enabled checks, as
this will be checked later in do_fsync().

Signed-off-by: Wang Han <416810799@qq.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: engstk <eng.stk@sapo.pt>
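A sketch of the ordering this fixes (fsync_enabled is the toggle introduced by
the commit below; the syscall body is simplified):

  /* fs/sync.c */
  static int do_fsync(unsigned int fd, int datasync)
  {
      struct fd f;
      int ret = -EBADF;

      if (!fsync_enabled)
          return 0;        /* bail out before fdget(), nothing to put back */

      f = fdget(fd);
      if (f.file) {
          ret = vfs_fsync(f.file, datasync);
          fdput(f);
      }
      return ret;
  }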
franciscofranco [Thu, 22 Nov 2012 07:45:56 +0000 (23:45 -0800)]
Added fsync on/off support.

Signed-off-by: Francisco Franco <franciscofranco.1990@gmail.com>
Signed-off-by: engstk <eng.stk@sapo.pt>
Masahiro Yamada [Tue, 3 Jul 2018 01:22:00 +0000 (10:22 +0900)]
arm64: add endianness option to LDFLAGS instead of LD

With the recent syntax extension, Kconfig is now able to evaluate the
compiler / toolchain capability.

However, accumulating flags to 'LD' is not compatible with the way
it works; 'LD' must be passed to Kconfig to call $(ld-option,...)
from Kconfig files.  If you tweak 'LD' in arch Makefile depending on
CONFIG_CPU_BIG_ENDIAN, this would end up with circular dependency
between Makefile and Kconfig.

Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Paul Kocialkowski [Mon, 2 Jul 2018 09:16:59 +0000 (11:16 +0200)]
arm64: Use aarch64elf and aarch64elfb emulation mode variants

The aarch64linux and aarch64linuxb emulation modes are not supported by
bare-metal toolchains and Linux using them forbids building the kernel
with these toolchains.

Since there is apparently no reason to target these emulation modes, the
more generic elf modes are used instead, allowing to build on bare-metal
toolchains as well as the already-supported ones.

Fixes: 3d6a7b99e3fa ("arm64: ensure the kernel is compiled for LP64")

Cc: stable@vger.kernel.org
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Paul Kocialkowski <contact@paulk.fr>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
ndesaulniers@google.com [Mon, 11 Feb 2019 19:30:06 +0000 (19:30 +0000)]
FROMLIST: BACKPORT: Makefile: lld: set -O2 linker flag when linking with LLD

For arm64:
0.34% size improvement with lld -O2 over lld for vmlinux.
3.3% size improvement with lld -O2 over lld for Image.lz4-dtb.

Link: https://github.com/ClangBuiltLinux/linux/issues/343
Suggested-by: Rui Ueyama <ruiu@google.com>
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
ndesaulniers@google.com [Mon, 11 Feb 2019 19:30:05 +0000 (19:30 +0000)]
FROMLIST: BACKPORT: Makefile: lld: tell clang to use lld

This is needed because clang doesn't select which linker to use based on
$LD but rather -fuse-ld={bfd,gold,lld,<absolute path to linker>}.  This
is problematic especially for cc-ldoption, which checks for linker flag
support via invoking the compiler, rather than the linker.

Select the linker via absolute path from $PATH via `which`. This allows
you to build with:

$ make LD=ld.lld
$ make LD=ld.lld-8
$ make LD=/path/to/ld.lld

Add -Qunused-arguments to KBUILD_CPPFLAGS when using LLD, as otherwise
Clang likes to complain about -fuse-lld= being unused when compiling but
not linking (-c) such as when cc-option is used.

Link: https://github.com/ClangBuiltLinux/linux/issues/342
Link: https://github.com/ClangBuiltLinux/linux/issues/366
Link: https://github.com/ClangBuiltLinux/linux/issues/357
Suggested-by: Nathan Chancellor <natechancellor@gmail.com>
Suggested-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Danny Lin [Wed, 7 Aug 2019 12:06:50 +0000 (15:06 +0300)]
kbuild: add ld-name macro

[wight554: stripped from kdrag0n/proton_bluecross@aba5259]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
Ard Biesheuvel [Mon, 3 Dec 2018 19:58:05 +0000 (20:58 +0100)]
arm64: relocatable: fix inconsistencies in linker script and options

readelf complains about the section layout of vmlinux when building
with CONFIG_RELOCATABLE=y (for KASLR):

  readelf: Warning: [21]: Link field (0) should index a symtab section.
  readelf: Warning: [21]: Info field (0) should index a relocatable section.

Also, it seems that our use of '-pie -shared' is contradictory, and
thus ambiguous. In general, the way KASLR is wired up at the moment
is highly tailored to how ld.bfd happens to implement (and conflate)
PIE executables and shared libraries, so given the current effort to
support other toolchains, let's fix some of these issues as well.

- Drop the -pie linker argument and just leave -shared. In ld.bfd,
  the differences between them are unclear (except for the ELF type
  of the produced image [0]) but lld chokes on seeing both at the
  same time.

- Rename the .rela output section to .rela.dyn, as is customary for
  shared libraries and PIE executables, so that it is not misidentified
  by readelf as a static relocation section (producing the warnings
  above).

- Pass the -z notext and -z norelro options to explicitly instruct the
  linker to permit text relocations, and to omit the RELRO program
  header (which requires a certain section layout that we don't adhere
  to in the kernel). These are the defaults for current versions of
  ld.bfd.

- Discard .eh_frame and .gnu.hash sections to avoid them from being
  emitted between .head.text and .text, screwing up the section layout.

These changes only affect the ELF image, and produce the same binary
image.

[0] b9dce7f1ba01 ("arm64: kernel: force ET_DYN ELF type for ...")

Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Smith <peter.smith@linaro.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
[wight554: fix conflicts in arch/arm64/kernel/vmlinux.lds.S]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
Nick Desaulniers [Fri, 27 Oct 2017 16:33:41 +0000 (09:33 -0700)]
arm64: prevent regressions in compressed kernel image size when upgrading to binutils 2.27

Upon upgrading to binutils 2.27, we found that our lz4 and gzip
compressed kernel images were significantly larger, resulting in 10ms
boot time regressions.

As noted by Rahul:
"aarch64 binaries uses RELA relocations, where each relocation entry
includes an addend value. This is similar to x86_64.  On x86_64, the
addend values are also stored at the relocation offset for relative
relocations. This is an optimization: in the case where code does not
need to be relocated, the loader can simply skip processing relative
relocations.  In binutils-2.25, both bfd and gold linkers did this for
x86_64, but only the gold linker did this for aarch64.  The kernel build
here is using the bfd linker, which stored zeroes at the relocation
offsets for relative relocations.  Since a set of zeroes compresses
better than a set of non-zero addend values, this behavior was resulting
in much better lz4 compression.

The bfd linker in binutils-2.27 is now storing the actual addend values
at the relocation offsets. The behavior is now consistent with what it
does for x86_64 and what gold linker does for both architectures.  The
change happened in this upstream commit:
https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=1f56df9d0d5ad89806c24e71f296576d82344613
Since a bunch of zeroes got replaced by non-zero addend values, we see
the side effect of lz4 compressed image being a bit bigger.

To get the old behavior from the bfd linker, "--no-apply-dynamic-relocs"
flag can be used:
$ LDFLAGS="--no-apply-dynamic-relocs" make
With this flag, the compressed image size is back to what it was with
binutils-2.25.

If the kernel is using ASLR, there aren't additional runtime costs to
--no-apply-dynamic-relocs, as the relocations will need to be applied
again anyway after the kernel is relocated to a random address.

If the kernel is not using ASLR, then presumably the current default
behavior of the linker is better. Since the static linker performed the
dynamic relocs, and the kernel is not moved to a different address at
load time, it can skip applying the relocations all over again."

Some measurements:

$ ld -v
GNU ld (binutils-2.25-f3d35cf6) 2.25.51.20141117
                    ^
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300652760 Oct 26 11:57 vmlinux
$ ls -l Image.lz4-dtb
-rw-r----- 1 ndesaulniers eng 16932627 Oct 26 11:57 Image.lz4-dtb

$ ld -v
GNU ld (binutils-2.27-53dd00a1) 2.27.0.20170315
                    ^
pre patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 11:43 vmlinux
$ ls -l Image.lz4-dtb
-rw-r----- 1 ndesaulniers eng 18159474 Oct 26 11:43 Image.lz4-dtb

post patch:
$ ls -l vmlinux
-rwxr-x--- 1 ndesaulniers eng 300376208 Oct 26 12:06 vmlinux
$ ls -l Image.lz4-dtb
-rw-r----- 1 ndesaulniers eng 16932466 Oct 26 12:06 Image.lz4-dtb

By Siqi's measurement w/ gzip:
binutils 2.27 with this patch (with --no-apply-dynamic-relocs):
Image 41535488
Image.gz 13404067

binutils 2.27 without this patch (without --no-apply-dynamic-relocs):
Image 41535488
Image.gz 14125516

Any compression scheme should be able to get better results from the
longer runs of zeros, not just GZIP and LZ4.

10ms boot time savings isn't anything to get excited about, but users of
arm64+compression+bfd-2.27 should not have to pay a penalty for no
runtime improvement.

Reported-by: Gopinath Elanchezhian <gelanchezhian@google.com>
Reported-by: Sindhuri Pentyala <spentyala@google.com>
Reported-by: Wei Wang <wvw@google.com>
Suggested-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Suggested-by: Rahul Chaudhry <rahulchaudhry@google.com>
Suggested-by: Siqi Lin <siqilin@google.com>
Suggested-by: Stephen Hines <srhines@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
[will: added comment to Makefile]
Signed-off-by: Will Deacon <will.deacon@arm.com>
Ard Biesheuvel [Thu, 20 Oct 2016 10:12:57 +0000 (11:12 +0100)]
arm64: kernel: force ET_DYN ELF type for CONFIG_RELOCATABLE=y

GNU ld used to set the ELF file type to ET_DYN for PIE executables, which
is the same file type used for shared libraries. However, this was changed
recently, and now PIE executables are emitted as ET_EXEC instead.

The distinction is only relevant for ELF loaders, and so there is little
reason to care about the difference when building the kernel, which is
why the change has gone unnoticed until now.

However, debuggers do use the ELF binary, and expect ET_EXEC type files
to appear in memory at the exact offset described in the ELF metadata.
This means source level debugging is no longer possible when KASLR is in
effect or when executing the stub.

So add the -shared LD option when building with CONFIG_RELOCATABLE=y. This
forces the ELF file type to be set to ET_DYN (which is what you get when
building with binutils 2.24 and earlier anyway), and has no other ill
effects.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Andrew Pinski [Mon, 18 Sep 2017 10:20:20 +0000 (11:20 +0100)]
arm64: ensure the kernel is compiled for LP64

The kernel needs to be compiled as a LP64 binary for ARM64, even when
using a compiler that defaults to code-generation for the ILP32 ABI.
Consequently, we need to explicitly pass '-mabi=lp64' (supported on
gcc-4.9 and newer).

Signed-off-by: Andrew Pinski <Andrew.Pinski@caviumnetworks.com>
Signed-off-by: Philipp Tomsich <philipp.tomsich@theobroma-systems.com>
Signed-off-by: Christoph Muellner <christoph.muellner@theobroma-systems.com>
Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
Reviewed-by: David Daney <ddaney@caviumnetworks.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Michal Marek [Tue, 30 Aug 2016 08:31:35 +0000 (10:31 +0200)]
arm64: Set UTS_MACHINE in the Makefile

The make rpm target depends on proper UTS_MACHINE definition.  Also, use
the variable in arch/arm64/kernel/setup.c, so that it's not accidentally
removed in the future.

Reported-and-tested-by: Fabian Vogt <fvogt@suse.com>
Signed-off-by: Michal Marek <mmarek@suse.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Alexey Dobriyan [Tue, 10 Apr 2018 23:34:45 +0000 (16:34 -0700)]
seq_file: allocate seq_file from kmem_cache

For fine-grained debugging and usercopy protection.

Link: http://lkml.kernel.org/r/20180310085027.GA17121@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Glauber Costa <glommer@gmail.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Park Ju Hyung [Tue, 9 Jul 2019 10:36:18 +0000 (19:36 +0900)]
selinux: reduce calls to context_struct_to_string()

context_struct_to_string() contains expensive kmalloc() calls.

In most cases, there's no purpose in calling context_struct_to_string()
on !CONFIG_AUDIT as logs won't be saved anyways.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
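A sketch of the kind of guard this implies (placement and error handling are
assumptions, not taken from the patch):

  /* security/selinux/ss/services.c */
  if (IS_ENABLED(CONFIG_AUDIT)) {
      /* the string is only ever consumed by audit logging */
      rc = context_struct_to_string(context, &str, &len);
      if (rc)
          goto out;
  }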
Danny Lin [Sun, 4 Aug 2019 03:40:30 +0000 (03:40 +0000)]
exec: Add node tampering blacklist function

We'll be adding checks to block writes from processes which tamper with
values that we control from within the kernel, especially ones that
userspace writes to for boosting. Add a central function to perform the
process check to reduce code duplication.

This blacklists the following processes which are known to tamper with
such values:
  - init
  - libperfmgr (power@1.3-servi and NodeLooperThrea)
  - perfd (perf@1.0-servic)
  - init.qcom.post_boot.sh (init.qcom.post_)

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
[added libperfmgr 1.2 in case some ROMs use wahoo powerhal]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
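A sketch of what such a central check could look like (function name and table
are hypothetical, assembled from the comm names listed above):

  /* fs/exec.c */
  static const char *const node_tamper_blacklist[] = {
      "init", "power@1.3-servi", "NodeLooperThrea",
      "perf@1.0-servic", "init.qcom.post_",
  };

  bool task_tampers_with_nodes(struct task_struct *task)
  {
      int i;

      for (i = 0; i < ARRAY_SIZE(node_tamper_blacklist); i++)
          if (!strcmp(task->comm, node_tamper_blacklist[i]))
              return true;
      return false;
  }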
Danny Lin [Fri, 29 Mar 2019 05:30:39 +0000 (22:30 -0700)]
setlocalversion: Reduce git commit hash length

This reduces the length of the git commit hash embedded in the kernel
version from 12 characters to 8 characters.

Before: 4.9.166-Proton-v17-gd503a36d7582
After: 4.9.166-Proton-v17-g47057aaa

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
wloot [Sat, 29 Jun 2019 15:34:40 +0000 (23:34 +0800)]
base: make device_{online,offline} no op

Yaroslav Furman [Fri, 24 Aug 2018 09:10:33 +0000 (12:10 +0300)]
msm_serial_hs: actually check if msm_serial_hs_tx_work failed to init

Just a small typo.

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Minchan Kim [Mon, 5 Nov 2018 14:44:42 +0000 (23:44 +0900)]
FROMLIST: zram: fix lockdep warning of free block handling

[  254.519728] ================================
[  254.520311] WARNING: inconsistent lock state
[  254.520898] 4.19.0+ #390 Not tainted
[  254.521387] --------------------------------
[  254.521732] inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
[  254.521732] zram_verify/2095 [HC0[0]:SC1[1]:HE1:SE0] takes:
[  254.521732] 00000000b1828693 (&(&zram->bitmap_lock)->rlock){+.?.}, at: put_entry_bdev+0x1e/0x50
[  254.521732] {SOFTIRQ-ON-W} state was registered at:
[  254.521732]   _raw_spin_lock+0x2c/0x40
[  254.521732]   zram_make_request+0x755/0xdc9
[  254.521732]   generic_make_request+0x373/0x6a0
[  254.521732]   submit_bio+0x6c/0x140
[  254.521732]   __swap_writepage+0x3a8/0x480
[  254.521732]   shrink_page_list+0x1102/0x1a60
[  254.521732]   shrink_inactive_list+0x21b/0x3f0
[  254.521732]   shrink_node_memcg.constprop.99+0x4f8/0x7e0
[  254.521732]   shrink_node+0x7d/0x2f0
[  254.521732]   do_try_to_free_pages+0xe0/0x300
[  254.521732]   try_to_free_pages+0x116/0x2b0
[  254.521732]   __alloc_pages_slowpath+0x3f4/0xf80
[  254.521732]   __alloc_pages_nodemask+0x2a2/0x2f0
[  254.521732]   __handle_mm_fault+0x42e/0xb50
[  254.521732]   handle_mm_fault+0x55/0xb0
[  254.521732]   __do_page_fault+0x235/0x4b0
[  254.521732]   page_fault+0x1e/0x30
[  254.521732] irq event stamp: 228412
[  254.521732] hardirqs last  enabled at (228412): [<ffffffff98245846>] __slab_free+0x3e6/0x600
[  254.521732] hardirqs last disabled at (228411): [<ffffffff98245625>] __slab_free+0x1c5/0x600
[  254.521732] softirqs last  enabled at (228396): [<ffffffff98e0031e>] __do_softirq+0x31e/0x427
[  254.521732] softirqs last disabled at (228403): [<ffffffff98072051>] irq_exit+0xd1/0xe0
[  254.521732]
[  254.521732] other info that might help us debug this:
[  254.521732]  Possible unsafe locking scenario:
[  254.521732]
[  254.521732]        CPU0
[  254.521732]        ----
[  254.521732]   lock(&(&zram->bitmap_lock)->rlock);
[  254.521732]   <Interrupt>
[  254.521732]     lock(&(&zram->bitmap_lock)->rlock);
[  254.521732]
[  254.521732]  *** DEADLOCK ***
[  254.521732]
[  254.521732] no locks held by zram_verify/2095.
[  254.521732]
[  254.521732] stack backtrace:
[  254.521732] CPU: 5 PID: 2095 Comm: zram_verify Not tainted 4.19.0+ #390
[  254.521732] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
[  254.521732] Call Trace:
[  254.521732]  <IRQ>
[  254.521732]  dump_stack+0x67/0x9b
[  254.521732]  print_usage_bug+0x1bd/0x1d3
[  254.521732]  mark_lock+0x4aa/0x540
[  254.521732]  ? check_usage_backwards+0x160/0x160
[  254.521732]  __lock_acquire+0x51d/0x1300
[  254.521732]  ? free_debug_processing+0x24e/0x400
[  254.521732]  ? bio_endio+0x6d/0x1a0
[  254.521732]  ? lockdep_hardirqs_on+0x9b/0x180
[  254.521732]  ? lock_acquire+0x90/0x180
[  254.521732]  lock_acquire+0x90/0x180
[  254.521732]  ? put_entry_bdev+0x1e/0x50
[  254.521732]  _raw_spin_lock+0x2c/0x40
[  254.521732]  ? put_entry_bdev+0x1e/0x50
[  254.521732]  put_entry_bdev+0x1e/0x50
[  254.521732]  zram_free_page+0xf6/0x110
[  254.521732]  zram_slot_free_notify+0x42/0xa0
[  254.521732]  end_swap_bio_read+0x5b/0x170
[  254.521732]  blk_update_request+0x8f/0x340
[  254.521732]  scsi_end_request+0x2c/0x1e0
[  254.521732]  scsi_io_completion+0x98/0x650
[  254.521732]  blk_done_softirq+0x9e/0xd0
[  254.521732]  __do_softirq+0xcc/0x427
[  254.521732]  irq_exit+0xd1/0xe0
[  254.521732]  do_IRQ+0x93/0x120
[  254.521732]  common_interrupt+0xf/0xf
[  254.521732]  </IRQ>

With the writeback feature, zram_slot_free_notify could be called
in softirq context by end_swap_bio_read. However, bitmap_lock
is not aware of that, so lockdep yells out. Thanks.

get_entry_bdev
spin_lock(bitmap->lock);
irq
softirq
end_swap_bio_read
zram_slot_free_notify
zram_slot_lock <-- deadlock prone
zram_free_page
put_entry_bdev
spin_lock(bitmap->lock); <-- deadlock prone

With akpm's suggestion (i.e. the bitmap operation is already atomic),
we could remove the bitmap lock. It might fail to find an empty slot
if serious contention happens. However, it's not a severe problem
because huge page writeback already has the possibility of failing if
there is severe memory pressure. Worst case is just keeping the
incompressible page in memory, not storage.

The other problem is zram_slot_lock in zram_slot_free_notify.
To make it safe, this patch introduces zram_slot_trylock, which
zram_slot_free_notify uses. Although it's rare to be contended,
this patch adds a new debug stat "miss_free" to keep monitoring
how often it happens.

(am from https://lore.kernel.org/lkml/20181203024045.153534-2-minchan@kernel.org/)
Reviewed-by: Joey Pabalinas <joeypabalinas@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
Bug: 117683663
Change-Id: I9030753558338e86f2bb3bdd96ec06faf4cdd305
Signed-off-by: Minchan Kim <minchan@google.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
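A sketch of the trylock path described above (field names follow the upstream
patch and may differ slightly in this tree):

  /* drivers/block/zram/zram_drv.c */
  static int zram_slot_trylock(struct zram *zram, u32 index)
  {
      return bit_spin_trylock(ZRAM_LOCK, &zram->table[index].value);
  }

  static void zram_slot_free_notify(struct block_device *bdev,
                                    unsigned long index)
  {
      struct zram *zram = bdev->bd_disk->private_data;

      if (!zram_slot_trylock(zram, index)) {
          atomic64_inc(&zram->stats.miss_free);
          return;    /* contended from softirq: count the miss instead of spinning */
      }
      zram_free_page(zram, index);
      zram_slot_unlock(zram, index);
  }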
Tyler Nijmeh [Fri, 24 May 2019 20:49:55 +0000 (13:49 -0700)]
mm: Increase vmstat interval

Red Hat Linux documentation states that vmstat updates can be mildly expensive.
Increase the interval from 1 second to 10 seconds.

Signed-off-by: Tyler Nijmeh <tylernij@gmail.com>
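A sketch of the one-liner this describes (mm/vmstat.c; the old value is the
upstream default):

  int sysctl_stat_interval __read_mostly = 10 * HZ;    /* was HZ (1 second) */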
Julian Liu [Tue, 17 Sep 2019 14:01:28 +0000 (22:01 +0800)]
zram: Limit the max disksize to 2GB

Philip Cuadra [Thu, 18 Jan 2018 19:32:26 +0000 (11:32 -0800)]
msm: bus_arb: disable debug logging

This debug logging consumes 10% of all the cpu cycles in the drivers
communicating with the dsp.  Disable the logging for the time being
until we add a config.

Bug: 71867957
Change-Id: I97d418fee3d4576b077ed284ed5ae4447da5a789
Signed-off-by: Philip Cuadra <philipcuadra@google.com>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
kdrag0n [Sun, 23 Dec 2018 06:29:33 +0000 (22:29 -0800)]
block: disable random pool contribution by default

Signed-off-by: kdrag0n <dragon@khronodragon.com>
wloot [Thu, 6 Jun 2019 07:04:53 +0000 (15:04 +0800)]
phy: Do not compile unused drivers

Ard Biesheuvel [Wed, 11 Jan 2017 16:41:52 +0000 (16:41 +0000)]
BACKPORT: crypto: arm64/aes - add scalar implementation

This adds a scalar implementation of AES, based on the precomputed tables
that are exposed by the generic AES code. Since rotates are cheap on arm64,
this implementation only uses the 4 core tables (of 1 KB each), and avoids
the prerotated ones, reducing the D-cache footprint by 75%.

On Cortex-A57, this code manages 13.0 cycles per byte, which is ~34% faster
than the generic C code. (Note that this is still >13x slower than the code
that uses the optional ARMv8 Crypto Extensions, which manages <1 cycles per
byte.)

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit bed593c0e852f5c1efd3ca4e984fd744c51cf6ee)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Alin Jerpelea [Sat, 1 Jul 2017 06:10:54 +0000 (08:10 +0200)]
tty: serial: msm_serial_hs: fix ipclog spam

Wrap the logging functions in #ifdef CONFIG_IPC_LOGGING to fix spam when the
config is not enabled

Signed-off-by: Alin Jerpelea <alin.jerpelea@sonymobile.com>
Signed-off-by: celtare21 <celtare21@gmail.com>
Angelo G. Del Regno [Sat, 27 May 2017 15:36:11 +0000 (17:36 +0200)]
soc: qcom: Further fixes for !IPC_LOGGING for IPA, SMD, SMEM.

Do not crash when IPC_LOGGING is not activated.

Signed-off-by: celtare21 <celtare21@gmail.com>
Alin Jerpelea [Wed, 4 Jan 2017 15:51:53 +0000 (16:51 +0100)]
soc: qcom: stop spam when IPC_LOGGING is disabled

Signed-off-by: Alin Jerpelea <alin.jerpelea@sonymobile.com>
Aaron Lu [Fri, 17 Aug 2018 22:49:14 +0000 (15:49 -0700)]
mm, page_alloc: double zone's batchsize

To improve page allocator's performance for order-0 pages, each CPU has
a Per-CPU-Pageset(PCP) per zone.  Whenever an order-0 page is needed,
PCP will be checked first before asking pages from Buddy.  When PCP is
used up, a batch of pages will be fetched from Buddy to improve
performance and the size of batch can affect performance.

zone's batch size was last doubled by commit ba56e91c9401 ("mm:
page_alloc: increase size of per-cpu-pages") over ten years ago.  Since
then, CPUs have evolved a lot and CPU cache sizes have also increased.

Dave Hansen is concerned the current batch size doesn't fit well with
modern hardware and suggested I do two things: first, use a page
allocator intensive benchmark, e.g.  will-it-scale/page_fault1 to find
out how performance changes with different batch sizes on various
machines and then choose a new default batch size; second, see how this
new batch size work with other workloads.

In the first test, we saw performance gains on high-core-count systems
and little to no effect on older systems with more modest core counts.
In this phase's test data, two candidates: 63 and 127 are chosen.

In the second step, ebizzy, oltp, kbuild, pigz, netperf, vm-scalability
and more will-it-scale sub-tests are tested to see how these two
candidates work with these workloads and decides a new default according
to their results.

Most test results are flat.  will-it-scale/page_fault2 process mode has
10%-18% performance increase on 4-sockets Skylake and Broadwell.
vm-scalability/lru-file-mmap-read has 17%-47% performance increase for
4-sockets servers while for 2-sockets servers, it caused 3%-8% performance
drop.  Further analysis showed that, with a larger pcp->batch and thus
larger pcp->high(the relationship of pcp->high=6 * pcp->batch is
maintained in this patch), zone lock contention shifted to LRU add side
lock contention and that caused performance drop.  This performance drop
might be mitigated by others' work on optimizing LRU lock.

Another downside of increasing pcp->batch is, when PCP is used up and need
to fetch a batch of pages from Buddy, since batch is increased, that time
can be longer than before.  My understanding is, this doesn't affect
slowpath where direct reclaim and compaction dominates.  For fastpath,
throughput is a win(according to will-it-scale/page_fault1) but worst
latency can be larger now.

Overall, I think double the batch size from 31 to 63 is relatively safe
and provide good performance boost for high-core-count systems.

The two phase's test results are listed below(all tests are done with THP
disabled).

Phase one(will-it-scale/page_fault1) test results:

Skylake-EX: increased batch size has a good effect on zone->lock
contention, though LRU contention will rise at the same time and
limited the final performance increase.

batch   score     change   zone_contention   lru_contention   total_contention
 31   15345900    +0.00%       64%                 8%           72%
 53   17903847   +16.67%       32%                38%           70%
 63   17992886   +17.25%       24%                45%           69%
 73   18022825   +17.44%       10%                61%           71%
119   18023401   +17.45%        4%                66%           70%
127   18029012   +17.48%        3%                66%           69%
137   18036075   +17.53%        4%                66%           70%
165   18035964   +17.53%        2%                67%           69%
188   18101105   +17.95%        2%                67%           69%
223   18130951   +18.15%        2%                67%           69%
255   18118898   +18.07%        2%                67%           69%
267   18101559   +17.96%        2%                67%           69%
299   18160468   +18.34%        2%                68%           70%
320   18139845   +18.21%        2%                67%           69%
393   18160869   +18.34%        2%                68%           70%
424   18170999   +18.41%        2%                68%           70%
458   18144868   +18.24%        2%                68%           70%
467   18142366   +18.22%        2%                68%           70%
498   18154549   +18.30%        1%                68%           69%
511   18134525   +18.17%        1%                69%           70%

Broadwell-EX: similar pattern as Skylake-EX.

batch   score     change   zone_contention   lru_contention   total_contention
 31   16703983    +0.00%       67%                 7%           74%
 53   18195393    +8.93%       43%                28%           71%
 63   18288885    +9.49%       38%                33%           71%
 73   18344329    +9.82%       35%                37%           72%
119   18535529   +10.96%       24%                46%           70%
127   18513596   +10.83%       23%                48%           71%
137   18514327   +10.84%       23%                48%           71%
165   18511840   +10.82%       22%                49%           71%
188   18593478   +11.31%       17%                53%           70%
223   18601667   +11.36%       17%                52%           69%
255   18774825   +12.40%       12%                58%           70%
267   18754781   +12.28%        9%                60%           69%
299   18892265   +13.10%        7%                63%           70%
320   18873812   +12.99%        8%                62%           70%
393   18891174   +13.09%        6%                64%           70%
424   18975108   +13.60%        6%                64%           70%
458   18932364   +13.34%        8%                62%           70%
467   18960891   +13.51%        5%                65%           70%
498   18944526   +13.41%        5%                64%           69%
511   18960839   +13.51%        5%                64%           69%

Skylake-EP: although increased batch reduced zone->lock contention, but
the effect is not as good as EX: zone->lock contention is still as high as
20% with a very high batch value instead of 1% on Skylake-EX or 5% on
Broadwell-EX.  Also, total_contention actually decreased with a higher
batch but that doesn't translate to performance increase.

batch   score    change   zone_contention   lru_contention   total_contention
 31   9554867    +0.00%       66%                 3%           69%
 53   9855486    +3.15%       63%                 3%           66%
 63   9980145    +4.45%       62%                 4%           66%
 73   10092774   +5.63%       62%                 5%           67%
119   10310061   +7.90%       45%                19%           64%
127   10342019   +8.24%       42%                19%           61%
137   10358182   +8.41%       42%                21%           63%
165   10397060   +8.81%       37%                24%           61%
188   10341808   +8.24%       34%                26%           60%
223   10349135   +8.31%       31%                27%           58%
255   10327189   +8.08%       28%                29%           57%
267   10344204   +8.26%       27%                29%           56%
299   10325043   +8.06%       25%                30%           55%
320   10310325   +7.91%       25%                31%           56%
393   10293274   +7.73%       21%                31%           52%
424   10311099   +7.91%       21%                32%           53%
458   10321375   +8.02%       21%                32%           53%
467   10303881   +7.84%       21%                32%           53%
498   10332462   +8.14%       20%                33%           53%
511   10325016   +8.06%       20%                32%           52%

Broadwell-EP: zone->lock and lru lock had an agreement to make sure
performance doesn't increase and they successfully managed to keep total
contention at 70%.

batch   score    change   zone_contention   lru_contention   total_contention
 31   10121178   +0.00%       19%                50%           69%
 53   10142366   +0.21%        6%                63%           69%
 63   10117984   -0.03%       11%                58%           69%
 73   10123330   +0.02%        7%                63%           70%
119   10108791   -0.12%        2%                67%           69%
127   10166074   +0.44%        3%                66%           69%
137   10141574   +0.20%        3%                66%           69%
165   10154499   +0.33%        2%                68%           70%
188   10124921   +0.04%        2%                67%           69%
223   10137399   +0.16%        2%                67%           69%
255   10143289   +0.22%        0%                68%           68%
267   10123535   +0.02%        1%                68%           69%
299   10140952   +0.20%        0%                68%           68%
320   10163170   +0.41%        0%                68%           68%
393   10000633   -1.19%        0%                69%           69%
424   10087998   -0.33%        0%                69%           69%
458   10187116   +0.65%        0%                69%           69%
467   10146790   +0.25%        0%                69%           69%
498   10197958   +0.76%        0%                69%           69%
511   10152326   +0.31%        0%                69%           69%

Haswell-EP: similar to Broadwell-EP.

batch   score   change   zone_contention   lru_contention   total_contention
 31   10442205   +0.00%       14%                48%           62%
 53   10442255   +0.00%        5%                57%           62%
 63   10452059   +0.09%        6%                57%           63%
 73   10482349   +0.38%        5%                59%           64%
119   10454644   +0.12%        3%                60%           63%
127   10431514   -0.10%        3%                59%           62%
137   10423785   -0.18%        3%                60%           63%
165   10481216   +0.37%        2%                61%           63%
188   10448755   +0.06%        2%                61%           63%
223   10467144   +0.24%        2%                61%           63%
255   10480215   +0.36%        2%                61%           63%
267   10484279   +0.40%        2%                61%           63%
299   10466450   +0.23%        2%                61%           63%
320   10452578   +0.10%        2%                61%           63%
393   10499678   +0.55%        1%                62%           63%
424   10481454   +0.38%        1%                62%           63%
458   10473562   +0.30%        1%                62%           63%
467   10484269   +0.40%        0%                62%           62%
498   10505599   +0.61%        0%                62%           62%
511   10483395   +0.39%        0%                62%           62%

Westmere-EP: contention is pretty small so not interesting.  Note too high
a batch value could hurt performance.

batch   score   change   zone_contention   lru_contention   total_contention
 31   4831523   +0.00%        2%                 3%            5%
 53   4834086   +0.05%        2%                 4%            6%
 63   4834262   +0.06%        2%                 3%            5%
 73   4832851   +0.03%        2%                 4%            6%
119   4830534   -0.02%        1%                 3%            4%
127   4827461   -0.08%        1%                 4%            5%
137   4827459   -0.08%        1%                 3%            4%
165   4820534   -0.23%        0%                 4%            4%
188   4817947   -0.28%        0%                 3%            3%
223   4809671   -0.45%        0%                 3%            3%
255   4802463   -0.60%        0%                 4%            4%
267   4801634   -0.62%        0%                 3%            3%
299   4798047   -0.69%        0%                 3%            3%
320   4793084   -0.80%        0%                 3%            3%
393   4785877   -0.94%        0%                 3%            3%
424   4782911   -1.01%        0%                 3%            3%
458   4779346   -1.08%        0%                 3%            3%
467   4780306   -1.06%        0%                 3%            3%
498   4780589   -1.05%        0%                 3%            3%
511   4773724   -1.20%        0%                 3%            3%

Skylake-Desktop: similar to Westmere-EP, nothing interesting.

batch   score   change   zone_contention   lru_contention   total_contention
 31   3906608   +0.00%        2%                 3%            5%
 53   3940164   +0.86%        2%                 3%            5%
 63   3937289   +0.79%        2%                 3%            5%
 73   3940201   +0.86%        2%                 3%            5%
119   3933240   +0.68%        2%                 3%            5%
127   3930514   +0.61%        2%                 4%            6%
137   3938639   +0.82%        0%                 3%            3%
165   3908755   +0.05%        0%                 3%            3%
188   3905621   -0.03%        0%                 3%            3%
223   3903015   -0.09%        0%                 4%            4%
255   3889480   -0.44%        0%                 3%            3%
267   3891669   -0.38%        0%                 4%            4%
299   3898728   -0.20%        0%                 4%            4%
320   3894547   -0.31%        0%                 4%            4%
393   3875137   -0.81%        0%                 4%            4%
424   3874521   -0.82%        0%                 3%            3%
458   3880432   -0.67%        0%                 4%            4%
467   3888715   -0.46%        0%                 3%            3%
498   3888633   -0.46%        0%                 4%            4%
511   3875305   -0.80%        0%                 5%            5%

Haswell-Desktop: zone->lock is pretty low as other desktops, though lru
contention is higher than other desktops.

batch   score   change   zone_contention   lru_contention   total_contention
 31   3511158   +0.00%        2%                 5%            7%
 53   3555445   +1.26%        2%                 6%            8%
 63   3561082   +1.42%        2%                 6%            8%
 73   3547218   +1.03%        2%                 6%            8%
119   3571319   +1.71%        1%                 7%            8%
127   3549375   +1.09%        0%                 6%            6%
137   3560233   +1.40%        0%                 6%            6%
165   3555176   +1.25%        2%                 6%            8%
188   3551501   +1.15%        0%                 8%            8%
223   3531462   +0.58%        0%                 7%            7%
255   3570400   +1.69%        0%                 7%            7%
267   3532235   +0.60%        1%                 8%            9%
299   3562326   +1.46%        0%                 6%            6%
320   3553569   +1.21%        0%                 8%            8%
393   3539519   +0.81%        0%                 7%            7%
424   3549271   +1.09%        0%                 8%            8%
458   3528885   +0.50%        0%                 8%            8%
467   3526554   +0.44%        0%                 7%            7%
498   3525302   +0.40%        0%                 9%            9%
511   3527556   +0.47%        0%                 8%            8%

Sandybridge-Desktop: the 0% contention figures are not exact; they are an
artifact of the fractional part being dropped. Since the contentions of the
individual contention paths are all under 1% here, adding them up means the
final deviation can be as large as 3% (e.g. three paths at 0.9% each are
each reported as 0%, yet really total about 2.7%).

batch   score   change   zone_contention   lru_contention   total_contention
 31   1744495   +0.00%        0%                 0%            0%
 53   1755341   +0.62%        0%                 0%            0%
 63   1758469   +0.80%        0%                 0%            0%
 73   1759626   +0.87%        0%                 0%            0%
119   1770417   +1.49%        0%                 0%            0%
127   1768252   +1.36%        0%                 0%            0%
137   1767848   +1.34%        0%                 0%            0%
165   1765088   +1.18%        0%                 0%            0%
188   1766918   +1.29%        0%                 0%            0%
223   1767866   +1.34%        0%                 0%            0%
255   1768074   +1.35%        0%                 0%            0%
267   1763187   +1.07%        0%                 0%            0%
299   1765620   +1.21%        0%                 0%            0%
320   1767603   +1.32%        0%                 0%            0%
393   1764612   +1.15%        0%                 0%            0%
424   1758476   +0.80%        0%                 0%            0%
458   1758593   +0.81%        0%                 0%            0%
467   1757915   +0.77%        0%                 0%            0%
498   1753363   +0.51%        0%                 0%            0%
511   1755548   +0.63%        0%                 0%            0%

Phase two test results:
Note: all percent changes are against the base (batch=31).

ebizzy.throughput (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    2410037±7%     2600451±2% +7.9%     2602878 +8.0%
lkp-bdw-ex1     1493328        1489243    -0.3%     1492145 -0.1%
lkp-skl-2sp2    1329674        1345891    +1.2%     1351056 +1.6%
lkp-bdw-ep2      711511         711511     0.0%      710708 -0.1%
lkp-wsm-ep2       75750          75528    -0.3%       75441 -0.4%
lkp-skl-d01      264126         262791    -0.5%      264113 +0.0%
lkp-hsw-d01      176601         176328    -0.2%      176368 -0.1%
lkp-sb02          98937          98937    +0.0%       99030 +0.1%

kbuild.buildtime (less is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     107.00        107.67  +0.6%        107.11  +0.1%
lkp-bdw-ex1       97.33         97.33  +0.0%         97.42  +0.1%
lkp-skl-2sp2     180.00        179.83  -0.1%        179.83  -0.1%
lkp-bdw-ep2      178.17        179.17  +0.6%        177.50  -0.4%
lkp-wsm-ep2      737.00        738.00  +0.1%        738.00  +0.1%
lkp-skl-d01      642.00        653.00  +1.7%        653.00  +1.7%
lkp-hsw-d01     1310.00       1316.00  +0.5%       1311.00  +0.1%

netperf/TCP_STREAM.Throughput_total_Mbps (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     948790        947144  -0.2%        948333 -0.0%
lkp-bdw-ex1      904224        904366  +0.0%        904926 +0.1%
lkp-skl-2sp2     239731        239607  -0.1%        239565 -0.1%
lkp-bdw-ep2      365764        365933  +0.0%        365951 +0.1%
lkp-wsm-ep2       93736         93803  +0.1%         93808 +0.1%
lkp-skl-d01       77314         77303  -0.0%         77375 +0.1%
lkp-hsw-d01       58617         60387  +3.0%         60208 +2.7%
lkp-sb02          29990         30137  +0.5%         30103 +0.4%

oltp.transactions (higher is better)

machine         batch=31      batch=63             batch=127
lkp-bdw-ex1      9073276       9100377     +0.3%    9036344     -0.4%
lkp-skl-2sp2     8898717       8852054     -0.5%    8894459     -0.0%
lkp-bdw-ep2     13426155      13384654     -0.3%   13333637     -0.7%
lkp-hsw-ep2     13146314      13232784     +0.7%   13193163     +0.4%
lkp-wsm-ep2      5035355       5019348     -0.3%    5033418     -0.0%
lkp-skl-d01       418485       4413339     -0.1%    4419039     +0.0%
lkp-hsw-d01      3517817±5%    3396120±3%  -3.5%    3455138±3%  -1.8%

pigz.throughput (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    1.513e+08     1.507e+08 -0.4%      1.511e+08 -0.2%
lkp-bdw-ex1     2.060e+08     2.052e+08 -0.4%      2.044e+08 -0.8%
lkp-skl-2sp2    8.836e+08     8.845e+08 +0.1%      8.836e+08 -0.0%
lkp-bdw-ep2     8.275e+08     8.464e+08 +2.3%      8.330e+08 +0.7%
lkp-wsm-ep2     2.224e+08     2.221e+08 -0.2%      2.218e+08 -0.3%
lkp-skl-d01     1.177e+08     1.177e+08 -0.0%      1.176e+08 -0.1%
lkp-hsw-d01     1.154e+08     1.154e+08 +0.1%      1.154e+08 -0.0%
lkp-sb02        0.633e+08     0.633e+08 +0.1%      0.633e+08 +0.0%

will-it-scale.malloc1.processes (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1      620181       620484 +0.0%         620240 +0.0%
lkp-bdw-ex1      1403610      1401201 -0.2%        1417900 +1.0%
lkp-skl-2sp2     1288097      1284145 -0.3%        1283907 -0.3%
lkp-bdw-ep2      1427879      1427675 -0.0%        1428266 +0.0%
lkp-hsw-ep2      1362546      1353965 -0.6%        1354759 -0.6%
lkp-wsm-ep2      2099657      2107576 +0.4%        2100226 +0.0%
lkp-skl-d01      1476835      1476358 -0.0%        1474487 -0.2%
lkp-hsw-d01      1308810      1303429 -0.4%        1301299 -0.6%
lkp-sb02          589286       589284 -0.0%         588101 -0.2%

will-it-scale.malloc1.threads (higher is better)
machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     21289         21125     -0.8%      21241     -0.2%
lkp-bdw-ex1      28114         28089     -0.1%      28007     -0.4%
lkp-skl-2sp2     91866         91946     +0.1%      92723     +0.9%
lkp-bdw-ep2      37637         37501     -0.4%      37317     -0.9%
lkp-hsw-ep2      43673         43590     -0.2%      43754     +0.2%
lkp-wsm-ep2      28577         28298     -1.0%      28545     -0.1%
lkp-skl-d01     175277        173343     -1.1%     173082     -1.3%
lkp-hsw-d01     130303        129566     -0.6%     129250     -0.8%
lkp-sb02        113742±3%     116911     +2.8%     116417±3%  +2.4%

will-it-scale.malloc2.processes (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    1.206e+09     1.206e+09 -0.0%      1.206e+09 +0.0%
lkp-bdw-ex1     1.319e+09     1.319e+09 -0.0%      1.319e+09 +0.0%
lkp-skl-2sp2    8.000e+08     8.021e+08 +0.3%      7.995e+08 -0.1%
lkp-bdw-ep2     6.582e+08     6.634e+08 +0.8%      6.513e+08 -1.1%
lkp-hsw-ep2     6.671e+08     6.669e+08 -0.0%      6.665e+08 -0.1%
lkp-wsm-ep2     1.805e+08     1.806e+08 +0.0%      1.804e+08 -0.1%
lkp-skl-d01     1.611e+08     1.611e+08 -0.0%      1.610e+08 -0.0%
lkp-hsw-d01     1.333e+08     1.332e+08 -0.0%      1.332e+08 -0.0%
lkp-sb02         82485104      82478206 -0.0%       82473546 -0.0%

will-it-scale.malloc2.threads (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    1.574e+09     1.574e+09 -0.0%      1.574e+09 -0.0%
lkp-bdw-ex1     1.737e+09     1.737e+09 +0.0%      1.737e+09 -0.0%
lkp-skl-2sp2    9.161e+08     9.162e+08 +0.0%      9.181e+08 +0.2%
lkp-bdw-ep2     7.856e+08     8.015e+08 +2.0%      8.113e+08 +3.3%
lkp-hsw-ep2     6.908e+08     6.904e+08 -0.1%      6.907e+08 -0.0%
lkp-wsm-ep2     2.409e+08     2.409e+08 +0.0%      2.409e+08 -0.0%
lkp-skl-d01     1.199e+08     1.199e+08 -0.0%      1.199e+08 -0.0%
lkp-hsw-d01     1.029e+08     1.029e+08 -0.0%      1.029e+08 +0.0%
lkp-sb02         68081213      68061423 -0.0%       68076037 -0.0%

will-it-scale.page_fault2.processes (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    14509125±4%   16472364 +13.5%       17123117 +18.0%
lkp-bdw-ex1     14736381      16196588  +9.9%       16364011 +11.0%
lkp-skl-2sp2     6354925       6435444  +1.3%        6436644  +1.3%
lkp-bdw-ep2      8749584       8834422  +1.0%        8827179  +0.9%
lkp-hsw-ep2      8762591       8845920  +1.0%        8825697  +0.7%
lkp-wsm-ep2      3036083       3030428  -0.2%        3021741  -0.5%
lkp-skl-d01      2307834       2304731  -0.1%        2286142  -0.9%
lkp-hsw-d01      1806237       1800786  -0.3%        1795943  -0.6%
lkp-sb02          842616        837844  -0.6%         833921  -1.0%

will-it-scale.page_fault2.threads (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     1623294       1615132±2% -0.5%     1656777    +2.1%
lkp-bdw-ex1      1995714       2025948    +1.5%     2113753±3% +5.9%
lkp-skl-2sp2     2346708       2415591    +2.9%     2416919    +3.0%
lkp-bdw-ep2      2342564       2344882    +0.1%     2300206    -1.8%
lkp-hsw-ep2      1820658       1831681    +0.6%     1844057    +1.3%
lkp-wsm-ep2      1725482       1733774    +0.5%     1740517    +0.9%
lkp-skl-d01      1832833       1823628    -0.5%     1806489    -1.4%
lkp-hsw-d01      1427913       1427287    -0.0%     1420226    -0.5%
lkp-sb02          750626        748615    -0.3%      746621    -0.5%

will-it-scale.page_fault3.processes (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    24382726      24400317 +0.1%       24668774 +1.2%
lkp-bdw-ex1     35399750      35683124 +0.8%       35829492 +1.2%
lkp-skl-2sp2    28136820      28068248 -0.2%       28147989 +0.0%
lkp-bdw-ep2     37269077      37459490 +0.5%       37373073 +0.3%
lkp-hsw-ep2     36224967      36114085 -0.3%       36104908 -0.3%
lkp-wsm-ep2     16820457      16911005 +0.5%       16968596 +0.9%
lkp-skl-d01      7721138       7725904 +0.1%        7756740 +0.5%
lkp-hsw-d01      7611979       7650928 +0.5%        7651323 +0.5%
lkp-sb02         3781546       3796502 +0.4%        3796827 +0.4%

will-it-scale.page_fault3.threads (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     1865820±3%   1900917±2%  +1.9%     1826245±4%  -2.1%
lkp-bdw-ex1      3094060      3148326     +1.8%     3150036     +1.8%
lkp-skl-2sp2     3952940      3953898     +0.0%     3989360     +0.9%
lkp-bdw-ep2      3420373±3%   3643964     +6.5%     3644910±5%  +6.6%
lkp-hsw-ep2      2609635±2%   2582310±3%  -1.0%     2780459     +6.5%
lkp-wsm-ep2      4395001      4417196     +0.5%     4432499     +0.9%
lkp-skl-d01      5363977      5400003     +0.7%     5411370     +0.9%
lkp-hsw-d01      5274131      5311294     +0.7%     5319359     +0.9%
lkp-sb02         2917314      2913004     -0.1%     2935286     +0.6%

will-it-scale.read1.processes (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    73762279±14%  69322519±10% -6.0%    69349855±13%  -6.0% (result unstable)
lkp-bdw-ex1     1.701e+08     1.704e+08    +0.1%    1.705e+08     +0.2%
lkp-skl-2sp2    63111570      63113953     +0.0%    63836573      +1.1%
lkp-bdw-ep2     79247409      79424610     +0.2%    78012656      -1.6%
lkp-hsw-ep2     67677026      68308800     +0.9%    67539106      -0.2%
lkp-wsm-ep2     13339630      13939817     +4.5%    13766865      +3.2%
lkp-skl-d01     10969487      10972650     +0.0%    no data
lkp-hsw-d01     9857342±2%    10080592±2%  +2.3%    10131560      +2.8%
lkp-sb02        5189076        5197473     +0.2%    5163253       -0.5%

will-it-scale.read1.threads (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    62468045±12%  73666726±7% +17.9%    79553123±12% +27.4% (result unstable)
lkp-bdw-ex1     1.62e+08      1.624e+08    +0.3%    1.614e+08     -0.3%
lkp-skl-2sp2    58319780      59181032     +1.5%    59821353      +2.6%
lkp-bdw-ep2     74057992      75698171     +2.2%    74990869      +1.3%
lkp-hsw-ep2     63672959      63639652     -0.1%    64387051      +1.1%
lkp-wsm-ep2     13489943      13526058     +0.3%    13259032      -1.7%
lkp-skl-d01     10297906      10338796     +0.4%    10407328      +1.1%
lkp-hsw-d01      9636721       9667376     +0.3%     9341147      -3.1%
lkp-sb02         4801938       4804496     +0.1%     4802290      +0.0%

will-it-scale.write1.processes (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    1.111e+08     1.104e+08±2%  -0.7%   1.122e+08±2%  +1.0%
lkp-bdw-ex1     1.392e+08     1.399e+08     +0.5%   1.397e+08     +0.4%
lkp-skl-2sp2     59369233      58994841     -0.6%    58715168     -1.1%
lkp-bdw-ep2      61820979      CPU throttle          63593123     +2.9%
lkp-hsw-ep2      57897587      57435605     -0.8%    56347450     -2.7%
lkp-wsm-ep2       7814203       7918017±2%  +1.3%     7669068     -1.9%
lkp-skl-d01       8886557       8971422     +1.0%     8818366     -0.8%
lkp-hsw-d01       9171001±5%    9189915     +0.2%     9483909     +3.4%
lkp-sb02          4475406       4475294     -0.0%     4501756     +0.6%

will-it-scale.write1.threads (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    1.058e+08     1.055e+08±2%  -0.2%   1.065e+08  +0.7%
lkp-bdw-ex1     1.316e+08     1.300e+08     -1.2%   1.308e+08  -0.6%
lkp-skl-2sp2     54492421      56086678     +2.9%    55975657  +2.7%
lkp-bdw-ep2      59360449      59003957     -0.6%    58101262  -2.1%
lkp-hsw-ep2      53346346±2%   52530876     -1.5%    52902487  -0.8%
lkp-wsm-ep2       7774006       7800092±2%  +0.3%     7558833  -2.8%
lkp-skl-d01       8346174       8235695     -1.3%     no data
lkp-hsw-d01       8636244       8655731     +0.2%     8658868  +0.3%
lkp-sb02          4181820       4204107     +0.5%     4182992  +0.0%

vm-scalability.anon-r-rand.throughput (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    11933873±3%   12356544±2%  +3.5%   12188624     +2.1%
lkp-bdw-ex1      7114424±2%    7330949±2%  +3.0%    7392419     +3.9%
lkp-skl-2sp2     6773277±5%    6492332±8%  -4.1%    6543962     -3.4%
lkp-bdw-ep2      7133846±4%    7233508     +1.4%    7013518±3%  -1.7%
lkp-hsw-ep2      4576626       4527098     -1.1%    4551679     -0.5%
lkp-wsm-ep2      2583599       2592492     +0.3%    2588039     +0.2%
lkp-hsw-d01       998199±2%    1028311     +3.0%    1006460±2%  +0.8%
lkp-sb02          570572        567854     -0.5%     568449     -0.4%

vm-scalability.anon-r-rand-mt.throughput (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     1789419       1787830     -0.1%    1788208     -0.1%
lkp-bdw-ex1      3492595±2%    3554966±2%  +1.8%    3558835±3%  +1.9%
lkp-skl-2sp2     3856238±2%    3975403±4%  +3.1%    3994600     +3.6%
lkp-bdw-ep2      3726963±11%   3809292±6%  +2.2%    3871924±4%  +3.9%
lkp-hsw-ep2      2131760±3%    2033578±4%  -4.6%    2130727±6%  -0.0%
lkp-wsm-ep2      2369731       2368384     -0.1%    2370252     +0.0%
lkp-skl-d01      1207128       1206220     -0.1%    1205801     -0.1%
lkp-hsw-d01       964317        992329±2%  +2.9%     992099±2%  +2.9%
lkp-sb02          567137        567346     +0.0%     566144     -0.2%

vm-scalability.lru-file-mmap-read.throughput (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1    19560469±6%   23018999     +17.7%   23418800     +19.7%
lkp-bdw-ex1     17769135±14%  26141676±3%  +47.1%   26284723±5%  +47.9%
lkp-skl-2sp2    14056512      13578884      -3.4%   13146214      -6.5%
lkp-bdw-ep2     15336542      14737654      -3.9%   14088159      -8.1%
lkp-hsw-ep2     16275498      15756296      -3.2%   15018090      -7.7%
lkp-wsm-ep2     11272160      11237231      -0.3%   11310047      +0.3%
lkp-skl-d01      7322119       7324569      +0.0%    7184148      -1.9%
lkp-hsw-d01      6449234       6404542      -0.7%    6356141      -1.4%
lkp-sb02         3517943       3520668      +0.1%    3527309      +0.3%

vm-scalability.lru-file-mmap-read-rand.throughput (higher is better)

machine         batch=31      batch=63             batch=127
lkp-skl-4sp1     1689052       1697553  +0.5%       1698726  +0.6%
lkp-bdw-ex1      1675246       1699764  +1.5%       1712226  +2.2%
lkp-skl-2sp2     1800533       1799749  -0.0%       1800581  +0.0%
lkp-bdw-ep2      1807422       1807758  +0.0%       1804932  -0.1%
lkp-hsw-ep2      1809807       1808781  -0.1%       1807811  -0.1%
lkp-wsm-ep2      1800198       1802434  +0.1%       1801236  +0.1%
lkp-skl-d01       696689        695537  -0.2%        694106  -0.4%
lkp-hsw-d01       698364        698666  +0.0%        696686  -0.2%
lkp-sb02          258939        258787  -0.1%        258199  -0.3%

Link: http://lkml.kernel.org/r/20180711055855.29072-1-aaron.lu@intel.com
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Suggested-by: Dave Hansen <dave.hansen@intel.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Kemi Wang <kemi.wang@intel.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: celtare21 <celtare21@gmail.com>
4 years agoBACKPORT: crypto: arm64/aes-ce-cipher - move assembler code to .S file
Ard Biesheuvel [Tue, 21 Nov 2017 13:40:17 +0000 (13:40 +0000)]
BACKPORT: crypto: arm64/aes-ce-cipher - move assembler code to .S file

Most crypto drivers involving kernel mode NEON take care to put the code
that actually touches the NEON register file in a separate compilation
unit, to prevent the compiler from reordering code that preserves or
restores the NEON context with code that may corrupt it. This is
necessary because we currently have no way to express the restrictions
imposed upon use of the NEON in kernel mode in a way that the compiler
understands.

However, in the case of aes-ce-cipher, it did not seem unreasonable to
deviate from this rule, given how it does not seem possible for the
compiler to reorder cross object function calls with asm blocks whose
in- and output constraints reflect that it reads from and writes to
memory.

Now that LTO is being proposed for the arm64 kernel, it is time to
revisit this. The link time optimization may replace the function
calls to kernel_neon_begin() and kernel_neon_end() with instantiations
of the IR that make up its implementation, allowing further reordering
with the asm block.

So let's clean this up, and move the asm() blocks into a separate .S
file.
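
For illustration, a minimal sketch of the resulting pattern, assuming hypothetical
symbol and file names (the exact ones in the patch may differ): the C glue only
brackets the call with kernel_neon_begin()/kernel_neon_end(), while the code that
touches the NEON register file is assembled from its own .S file.

  #include <linux/linkage.h>
  #include <linux/types.h>
  #include <asm/neon.h>

  /* implemented in a separate .S file, e.g. aes-ce-core.S (name assumed) */
  asmlinkage void ce_aes_encrypt_block(u8 out[], u8 const in[],
                                       u32 const rk[], int rounds);

  static void aes_ce_encrypt_one(u8 dst[], u8 const src[],
                                 u32 const rk[], int rounds)
  {
          kernel_neon_begin();            /* preserve the NEON context */
          ce_aes_encrypt_block(dst, src, rk, rounds);
          kernel_neon_end();              /* restore it */
  }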

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-By: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit 019cd46984d04703a39924178f503a98436ac0d7)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
4 years agoUPSTREAM: crypto: arm64/aes-ce-cipher: add non-SIMD generic fallback
Ard Biesheuvel [Mon, 24 Jul 2017 10:28:11 +0000 (11:28 +0100)]
UPSTREAM: crypto: arm64/aes-ce-cipher: add non-SIMD generic fallback

The arm64 kernel will shortly disallow nested kernel mode NEON, so
add a fallback to scalar code that can be invoked in that case.
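
As a hedged sketch of the pattern (the scalar helper and round count below follow
the arm64 generic AES fallback, but treat the exact names as assumptions):

  #include <asm/neon.h>
  #include <asm/simd.h>
  #include <crypto/aes.h>

  static void aes_cipher_encrypt(struct crypto_aes_ctx *ctx, u8 *dst, const u8 *src)
  {
          if (!may_use_simd()) {
                  /* NEON unusable here (e.g. nested kernel-mode NEON):
                     fall back to the scalar implementation */
                  __aes_arm64_encrypt(ctx->key_enc, dst, src,
                                      6 + ctx->key_length / 4);
                  return;
          }

          kernel_neon_begin();
          /* ... AES Crypto Extensions path ... */
          kernel_neon_end();
  }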

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit b8fb993a836cd432309410eadf083bbe9c0e9e9c)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
4 years agoUPSTREAM: crypto: arm64/aes-ce-cipher - match round key endianness with generic code
Ard Biesheuvel [Mon, 24 Jul 2017 10:28:10 +0000 (11:28 +0100)]
UPSTREAM: crypto: arm64/aes-ce-cipher - match round key endianness with generic code

In order to be able to reuse the generic AES code as a fallback for
situations where the NEON may not be used, update the key handling
to match the byte order of the generic code: it stores round keys
as sequences of 32-bit quantities rather than streams of bytes, and
so our code needs to be updated to reflect that.
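
As a hedged illustration of "sequences of 32-bit quantities" (the helper below is
made up for the example), the key bytes are consumed as little-endian 32-bit words,
matching the generic key schedule, instead of as a raw byte stream:

  #include <linux/types.h>
  #include <asm/unaligned.h>

  /* hypothetical helper: load the user key as CPU-order words from LE bytes */
  static void load_key_words(u32 *rk, const u8 *in_key, unsigned int key_len)
  {
          unsigned int i;

          for (i = 0; i < key_len / sizeof(u32); i++)
                  rk[i] = get_unaligned_le32(in_key + i * sizeof(u32));
  }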

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
(cherry picked from commit f402e3115e20b345bd6fbfcf463a506d958c7bf6)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
4 years agoMakefile: Use pipes rather than temporary files for intermediate steps
Danny Lin [Thu, 11 Jul 2019 02:18:40 +0000 (19:18 -0700)]
Makefile: Use pipes rather than temporary files for intermediate steps

GCC supports the use of pipes for intermediate compilation steps (e.g.
passing the generated assembly code to the assembler) as a replacement for
temporary files. This bypasses VFS and other layers which can introduce
substantial amounts of overhead and instead redirects data directly
between processes.

The final product and generated code are unaffected. Memory usage while
compiling is slightly higher.

Tests showed a substantial reduction in build time when using GCC to
compile an x86 4.19 kernel:
Using temporary files in tmpfs: 2m41s
Using pipes:                    2m36s

Similar benefits were observed with an Android arm64 4.9 kernel:
Using tmpfs: 5m34s
Using pipes: 4m33s

Enable the feature when possible (i.e. when the compiler supports it) to
speed up builds at effectively no cost for many setups, particularly
those with weaker CPUs.

Test: kernel compiles and boots
Signed-off-by: Danny Lin <danny@kdrag0n.dev>
4 years agomsm8998: Poll msm-thermal every 200ms
celtare21 [Wed, 6 Feb 2019 15:05:13 +0000 (15:05 +0000)]
msm8998: Poll msm-thermal every 200ms

Signed-off-by: celtare21 <celtare21@gmail.com>
4 years agoadreno: hardcode for A540v2
Park Ju Hyung [Sun, 7 Jul 2019 17:43:37 +0000 (02:43 +0900)]
adreno: hardcode for A540v2

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
[wight554: adapted A630 change for A540v2]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
4 years agogpu: adreno: only compile Adreno 5xx driver
kdrag0n [Tue, 13 Nov 2018 05:43:38 +0000 (21:43 -0800)]
gpu: adreno: only compile Adreno 5xx driver

We won't be using this with an Adreno 3xx/4xx GPU.

Signed-off-by: kdrag0n <dragon@khronodragon.com>
[wight554: adapted 6xx change for 5xx]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
4 years agokernel: time: Add delay after cpu_relax() in tight loops
Prasad Sodagudi [Mon, 24 Sep 2018 23:25:55 +0000 (16:25 -0700)]
kernel: time: Add delay after cpu_relax() in tight loops

Tight loops of spin_lock_irqsave() and spin_unlock_irqrestore()
in timer and hrtimer are causing scheduling delays. Add a delay of a
few nanoseconds after cpu_relax() in the timer/hrtimer tight loops.
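
A minimal sketch of the idea in a generic polling loop (not the literal
timer/hrtimer hunk): after cpu_relax(), back off for a few nanoseconds so the
lock holder gets a chance to make progress.

  #include <linux/atomic.h>
  #include <linux/delay.h>
  #include <asm/processor.h>

  static void wait_until_set(atomic_t *flag)
  {
          while (!atomic_read(flag)) {
                  cpu_relax();
                  ndelay(10);     /* tiny back-off instead of hammering the lock */
          }
  }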

Change-Id: Iaa0ab92da93f7b245b1d922b6edca2bebdc0fbce
Signed-off-by: Prasad Sodagudi <psodagud@codeaurora.org>
[wight554: fix conflicts in kernel/time/{hrtimer.c,tick-internal.h}]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
4 years agoion: ensure prefetch/drain requests are processed in order
Liam Mark [Mon, 7 Jan 2019 20:07:25 +0000 (12:07 -0800)]
ion: ensure prefetch/drain requests are processed in order

Currently, when there is an ION prefetch or drain request, it is added
to the start of the prefetch_list list.

This can cause the requests to be processed in the wrong order; for
example, if a prefetch request is made and is then followed by a drain
request (before the prefetch request has been processed), the drain
request will be handled first.

Ensure prefetch and drain requests are processed in order by appending
requests to the end of the prefetch_list list.
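
A hedged sketch of the change's shape (struct and field names are illustrative,
not the exact ION ones):

  #include <linux/list.h>

  struct pool_req {
          struct list_head list;
          bool is_drain;          /* prefetch or drain request */
  };

  static void queue_pool_req(struct list_head *prefetch_list, struct pool_req *req)
  {
          /* list_add() put new requests at the head, so a later drain could
             overtake an earlier prefetch; list_add_tail() keeps FIFO order */
          list_add_tail(&req->list, prefetch_list);
  }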

Change-Id: I935a0d69a52267e888e9b19e1e8c0d9bd68e295b
Signed-off-by: Liam Mark <lmark@codeaurora.org>
4 years agomsm: kgsl: Relax adreno spin idle tight loop
Prakash Kamliya [Thu, 27 Dec 2018 13:06:08 +0000 (18:36 +0530)]
msm: kgsl: Relax adreno spin idle tight loop

The tight loop in adreno_spin_idle() is causing RT throttling.
Relax the tight loop by giving other threads a chance to run.
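
One plausible shape of such a change, shown as a hedged sketch of a generic
idle-polling loop (the register, idle bit and back-off values are assumptions,
and the actual patch may use a different yield mechanism):

  #include <linux/delay.h>
  #include <linux/errno.h>
  #include <linux/io.h>
  #include <linux/jiffies.h>

  static int wait_for_gpu_idle(void __iomem *status_reg, unsigned long timeout)
  {
          unsigned long expires = jiffies + timeout;

          while (!(readl(status_reg) & 0x1)) {    /* hypothetical idle bit */
                  if (time_after(jiffies, expires))
                          return -ETIMEDOUT;
                  /* sleep briefly instead of spinning flat out, so other
                     threads get CPU time and RT throttling is avoided */
                  usleep_range(10, 20);
          }

          return 0;
  }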

Change-Id: Ic23d4551c0cc0b5f2fa7844ca73444d1412d480c
Signed-off-by: Prakash Kamliya <pkamliya@codeaurora.org>
4 years agoANDROID: block/cfq-iosched: make group_idle per io cgroup tunable
Rick Yiu [Wed, 26 Sep 2018 08:45:50 +0000 (16:45 +0800)]
ANDROID: block/cfq-iosched: make group_idle per io cgroup tunable

If group_idle is made tunable per io cgroup, it gives more flexibility
in tuning the performance of each group. If no value is set, the
original default value is used.

Bug: 117857342
Bug: 132282125
Test: values could be set to each group correctly
Signed-off-by: Rick Yiu <rickyiu@google.com>
Change-Id: I9aba172419f1819f459e8305b909630fa8305978

4 years agocfq: Suppress compiler warnings about comparisons
Bart Van Assche [Tue, 7 Aug 2018 23:17:29 +0000 (16:17 -0700)]
cfq: Suppress compiler warnings about comparisons

This patch does not change any functionality but avoids that gcc
reports the following warnings when building with W=1:

block/cfq-iosched.c: In function ‘cfq_back_seek_max_store’:
block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
  if (__data < (MIN))      \
             ^
block/cfq-iosched.c:4756:1: note: in expansion of macro ‘STORE_FUNCTION’
 STORE_FUNCTION(cfq_back_seek_max_store, &cfqd->cfq_back_max, 0, UINT_MAX, 0);
 ^~~~~~~~~~~~~~
block/cfq-iosched.c: In function ‘cfq_slice_idle_store’:
block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
  if (__data < (MIN))      \
             ^
block/cfq-iosched.c:4759:1: note: in expansion of macro ‘STORE_FUNCTION’
 STORE_FUNCTION(cfq_slice_idle_store, &cfqd->cfq_slice_idle, 0, UINT_MAX, 1);
 ^~~~~~~~~~~~~~
block/cfq-iosched.c: In function ‘cfq_group_idle_store’:
block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
  if (__data < (MIN))      \
             ^
block/cfq-iosched.c:4760:1: note: in expansion of macro ‘STORE_FUNCTION’
 STORE_FUNCTION(cfq_group_idle_store, &cfqd->cfq_group_idle, 0, UINT_MAX, 1);
 ^~~~~~~~~~~~~~
block/cfq-iosched.c: In function ‘cfq_low_latency_store’:
block/cfq-iosched.c:4741:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
  if (__data < (MIN))      \
             ^
block/cfq-iosched.c:4765:1: note: in expansion of macro ‘STORE_FUNCTION’
 STORE_FUNCTION(cfq_low_latency_store, &cfqd->cfq_latency, 0, 1, 0);
 ^~~~~~~~~~~~~~
block/cfq-iosched.c: In function ‘cfq_slice_idle_us_store’:
block/cfq-iosched.c:4775:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
  if (__data < (MIN))      \
             ^
block/cfq-iosched.c:4782:1: note: in expansion of macro ‘USEC_STORE_FUNCTION’
 USEC_STORE_FUNCTION(cfq_slice_idle_us_store, &cfqd->cfq_slice_idle, 0, UINT_MAX);
 ^~~~~~~~~~~~~~~~~~~
block/cfq-iosched.c: In function ‘cfq_group_idle_us_store’:
block/cfq-iosched.c:4775:13: warning: comparison of unsigned expression < 0 is always false [-Wtype-limits]
  if (__data < (MIN))      \
             ^
block/cfq-iosched.c:4783:1: note: in expansion of macro ‘USEC_STORE_FUNCTION’
 USEC_STORE_FUNCTION(cfq_group_idle_us_store, &cfqd->cfq_group_idle, 0, UINT_MAX);
 ^~~~~~~~~~~~~~~~~~~
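
The warning fires because MIN expands to the literal 0 while __data is unsigned,
so "__data < 0" can never be true. One common way to keep the clamping behaviour
while silencing -Wtype-limits is to route the bounds through local variables; a
hedged, generic sketch (not the literal cfq-iosched.c macro):

  #define CLAMP_STORE(dst, val, min, max)                                 \
  do {                                                                    \
          unsigned int __v = (val), __lo = (min), __hi = (max);           \
          if (__v < __lo)         /* no compare against a literal 0 */    \
                  __v = __lo;                                             \
          else if (__v > __hi)                                            \
                  __v = __hi;                                             \
          (dst) = __v;                                                    \
  } while (0)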

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[wight554: fix 4.4 backport conflicts]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
4 years agocfq: Annotate fall-through in a switch statement
Bart Van Assche [Tue, 7 Aug 2018 23:17:28 +0000 (16:17 -0700)]
cfq: Annotate fall-through in a switch statement

This patch avoids that gcc complains about fall-through when building
with W=1.

Signed-off-by: Bart Van Assche <bart.vanassche@wdc.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
4 years agocfq-iosched: fix the delay of cfq_group's vdisktime under iops mode
Hou Tao [Wed, 1 Mar 2017 01:02:33 +0000 (09:02 +0800)]
cfq-iosched: fix the delay of cfq_group's vdisktime under iops mode

When adding a cfq_group into the cfq service tree, we use CFQ_IDLE_DELAY
as the delay of cfq_group's vdisktime if there have been other cfq_groups
already.

When cfq is under iops mode, commit 9a7f38c42c2b ("cfq-iosched: Convert
from jiffies to nanoseconds") could result in a large iops delay and
lead to an abnormal io schedule delay for the added cfq_group. To fix
it, we just need to revert to the old CFQ_IDLE_DELAY value: HZ / 5
when iops mode is enabled.

Despite having the same value, the delay of a cfq_queue in idle class
and the delay of cfq_group are different things, so I define two new
macros for the delay of a cfq_group under time-slice mode and iops mode.
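
A hedged sketch of what the two macros and their use could look like (the names
and the helper are assumptions based on the description above):

  #include <linux/jiffies.h>
  #include <linux/time64.h>
  #include <linux/types.h>

  /* time-slice mode keeps the 200ms delay, now expressed in nanoseconds */
  #define CFQ_SLICE_MODE_GROUP_DELAY      (NSEC_PER_SEC / 5)
  /* iops mode reverts to the old jiffies-based value */
  #define CFQ_IOPS_MODE_GROUP_DELAY       (HZ / 5)

  static inline u64 cfq_group_vdisktime_delay(bool iops_mode)
  {
          return iops_mode ? CFQ_IOPS_MODE_GROUP_DELAY
                           : CFQ_SLICE_MODE_GROUP_DELAY;
  }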

Fixes: 9a7f38c42c2b ("cfq-iosched: Convert from jiffies to nanoseconds")
Cc: <stable@vger.kernel.org> # 4.8+
Signed-off-by: Hou Tao <houtao1@huawei.com>
Acked-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Delete unused function min_vdisktime()
Matthias Kaehlcke [Fri, 26 May 2017 21:22:37 +0000 (14:22 -0700)]
cfq-iosched: Delete unused function min_vdisktime()

This fixes the following warning when building with clang:

    block/cfq-iosched.c:970:19: error: unused function 'min_vdisktime'
        [-Werror,-Wunused-function]

Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Adjust one function call together with a variable assignment
Markus Elfring [Sat, 21 Jan 2017 21:44:07 +0000 (22:44 +0100)]
cfq-iosched: Adjust one function call together with a variable assignment

The script "checkpatch.pl" pointed information out like the following.

ERROR: do not use assignment in if condition

Thus fix the affected place in the source code.

Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agoblock: Initialize cfqq->ioprio_class in cfq_get_queue()
Alexander Potapenko [Mon, 23 Jan 2017 14:06:43 +0000 (15:06 +0100)]
block: Initialize cfqq->ioprio_class in cfq_get_queue()

KMSAN (KernelMemorySanitizer, a new error detection tool) reports use of
uninitialized memory in cfq_init_cfqq():

==================================================================
BUG: KMSAN: use of unitialized memory
...
Call Trace:
 [<     inline     >] __dump_stack lib/dump_stack.c:15
 [<ffffffff8202ac97>] dump_stack+0x157/0x1d0 lib/dump_stack.c:51
 [<ffffffff813e9b65>] kmsan_report+0x205/0x360 ??:?
 [<ffffffff813eabbb>] __msan_warning+0x5b/0xb0 ??:?
 [<     inline     >] cfq_init_cfqq block/cfq-iosched.c:3754
 [<ffffffff8201e110>] cfq_get_queue+0xc80/0x14d0 block/cfq-iosched.c:3857
...
origin:
 [<ffffffff8103ab37>] save_stack_trace+0x27/0x50 arch/x86/kernel/stacktrace.c:67
 [<ffffffff813e836b>] kmsan_internal_poison_shadow+0xab/0x150 ??:?
 [<ffffffff813e88ab>] kmsan_poison_slab+0xbb/0x120 ??:?
 [<     inline     >] allocate_slab mm/slub.c:1627
 [<ffffffff813e533f>] new_slab+0x3af/0x4b0 mm/slub.c:1641
 [<     inline     >] new_slab_objects mm/slub.c:2407
 [<ffffffff813e0ef3>] ___slab_alloc+0x323/0x4a0 mm/slub.c:2564
 [<     inline     >] __slab_alloc mm/slub.c:2606
 [<     inline     >] slab_alloc_node mm/slub.c:2669
 [<ffffffff813dfb42>] kmem_cache_alloc_node+0x1d2/0x1f0 mm/slub.c:2746
 [<ffffffff8201d90d>] cfq_get_queue+0x47d/0x14d0 block/cfq-iosched.c:3850
...
==================================================================
(the line numbers are relative to 4.8-rc6, but the bug persists
upstream)

The uninitialized struct cfq_queue is created by kmem_cache_alloc_node()
and then passed to cfq_init_cfqq(), which accesses cfqq->ioprio_class
before it's initialized.
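
A hedged sketch of the fix's shape, following the commit title (surrounding code
abbreviated; whether the assignment lands exactly here in the real patch is not
shown):

  /* in cfq_get_queue(), right after the cfq_queue is allocated: */
  cfqq->ioprio_class = IOPRIO_CLASS_NONE; /* defined value before first read */
  cfq_init_cfqq(cfqd, cfqq, current->pid, is_sync);
  cfq_init_prio_data(cfqq, cic);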

Signed-off-by: Alexander Potapenko <glider@google.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Charge at least 1 jiffie instead of 1 ns
Jan Kara [Tue, 28 Jun 2016 07:04:02 +0000 (09:04 +0200)]
cfq-iosched: Charge at least 1 jiffie instead of 1 ns

Commit 9a7f38c42c2b (cfq-iosched: Convert from jiffies to nanoseconds)
could result in charging just 1 ns to a cgroup submitting IO instead of the
1 jiffie we always charged before. It is arguable what the right amount to
charge is, but for now let's retain the old behavior of always charging at
least one jiffie.
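
A hedged sketch of the intended clamp (the helper name is made up; the actual
patch may apply this inline where the slice is accounted):

  #include <linux/jiffies.h>
  #include <linux/kernel.h>
  #include <linux/types.h>

  static inline u64 cfq_min_charge(u64 charge_ns)
  {
          /* never account less than one jiffy's worth of time */
          return max_t(u64, charge_ns, jiffies_to_nsecs(1));
  }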

Fixes: 9a7f38c42c2b92391d9dabaf9f51df7cfe5608e4
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Fix regression in bonnie++ rewrite performance
Jan Kara [Tue, 28 Jun 2016 07:04:01 +0000 (09:04 +0200)]
cfq-iosched: Fix regression in bonnie++ rewrite performance

Commit 9a7f38c42c2b (cfq-iosched: Convert from jiffies to nanoseconds)
broke the condition for detecting starved sync IO in
cfq_completed_request() because rq->start_time remained in jiffies but
we compared it with nanosecond values. This manifested as a regression
in bonnie++ rewrite performance because we always ended up considering
sync IO starved and thus never increased async IO queue depth.

Since rq->start_time is used in a lot of places, converting it to ns
values would be non-trivial. So just revert the condition in CFQ to use
comparison with jiffies. This will lead to suboptimal results if
cfq_fifo_expire[1] ever comes close to 1 jiffie, but so far we are
relatively far from that with the storage used with CFQ (the default
value is 128 ms).

Fixes: 9a7f38c42c2b92391d9dabaf9f51df7cfe5608e4
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Convert slice_resid from u64 to s64
Jan Kara [Tue, 28 Jun 2016 07:04:00 +0000 (09:04 +0200)]
cfq-iosched: Convert slice_resid from u64 to s64

slice_resid can be both positive and negative. Commit 9a7f38c42c2b
(cfq-iosched: Convert from jiffies to nanoseconds) converted it from
long to u64. Although this did not introduce any functional regression
(the operations just overflow and the result was fine), it is certainly
wrong and could cause issues in the future. So convert the type to the more
appropriate s64.
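
In other words, only the field's type changes; a stripped-down sketch (the real
struct cfq_queue has many more members):

  #include <linux/types.h>

  struct cfq_queue_fragment {     /* illustrative fragment only */
          /* ... */
          s64 slice_resid;        /* residual slice time, may go negative */
          /* ... */
  };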

Fixes: 9a7f38c42c2b92391d9dabaf9f51df7cfe5608e4
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Convert to use highres timers
Jan Kara [Wed, 8 Jun 2016 13:11:39 +0000 (15:11 +0200)]
cfq-iosched: Convert to use highres timers

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Expose microsecond interfaces
Jeff Moyer [Wed, 8 Jun 2016 13:11:38 +0000 (15:11 +0200)]
cfq-iosched: Expose microsecond interfaces

Expose interfaces to tune time slices of CFQ IO scheduler in
microseconds.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Convert from jiffies to nanoseconds
Jeff Moyer [Wed, 8 Jun 2016 14:55:34 +0000 (08:55 -0600)]
cfq-iosched: Convert from jiffies to nanoseconds

Convert all time-keeping in the CFQ IO scheduler from jiffies to nanoseconds
so that we can later make the intervals more fine-grained than jiffies.
One jiffie is several milliseconds and even for today's rotating disks
that is a noticeable amount of time, and thus we leave the disk unnecessarily
idle.

Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
Signed-off-by: Jan Kara <jack@suse.cz>
Signed-off-by: Jens Axboe <axboe@fb.com>
[wight554: fix backport conflict]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>
4 years agocfq-iosched: Allow parent cgroup to preempt its child
Jan Kara [Tue, 12 Jan 2016 15:24:19 +0000 (16:24 +0100)]
cfq-iosched: Allow parent cgroup to preempt its child

Currently we don't allow sync workload of one cgroup to preempt sync
workload of any other cgroup. This is because we want to achieve service
separation between cgroups. However, in cases where the preempting cgroup is
an ancestor of the current cgroup, there is no need for separation and
idling introduces unnecessary overhead. This hurts, for example, the case
when a workload is isolated within a cgroup but journalling threads are in
the root cgroup. A simple way to demonstrate the issue is using:

dbench4 -c /usr/share/dbench4/client.txt -t 10 -D /mnt 1

on ext4 filesystem on plain SATA drive (mounted with barrier=0 to make
difference more visible). When all processes are in the root cgroup,
reported throughput is 153.132 MB/sec. When dbench process gets its own
blkio cgroup, reported throughput drops to 26.1006 MB/sec.

Fix the problem by making check in cfq_should_preempt() more benevolent
and allow preemption by ancestor cgroup. This improves the throughput
reported by dbench4 to 48.9106 MB/sec.
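
A hedged sketch of the kind of check this adds to cfq_should_preempt() (the
descendant-test helper is named illustratively; the real patch may implement it
via the cgroup hierarchy helpers):

  /*
   * Allow an ancestor cgroup (e.g. the root cgroup, where journalling
   * threads live) to preempt its descendants: service separation against
   * one's own ancestor only adds idling overhead.
   */
  if (cfq_cfqq_sync(new_cfqq) && cfq_cfqq_sync(cfqq) &&
      cfqg_is_descendant(cfqq->cfqg, new_cfqq->cfqg))
          return true;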

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Allow sync noidle workloads to preempt each other
Jan Kara [Tue, 12 Jan 2016 15:24:17 +0000 (16:24 +0100)]
cfq-iosched: Allow sync noidle workloads to preempt each other

The original idea with preemption of sync noidle queues (introduced in
commit 718eee0579b8 "cfq-iosched: fairness for sync no-idle queues") was
that we service all sync noidle queues together, we don't idle on any of
the queues individually and we idle only if there is no sync noidle
queue to be served. This intention also matches the original test:

if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
   && new_cfqq->service_tree == cfqq->service_tree)
return true;

However since at that time cfqq->service_tree was not set for idling
queues, this test was unreliable and was replaced in commit e4a229196a7c
"cfq-iosched: fix no-idle preemption logic" by:

if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD &&
    cfqq_type(new_cfqq) == SYNC_NOIDLE_WORKLOAD &&
    new_cfqq->service_tree->count == 1)
return true;

That was a reliable test but was actually doing something different -
now we preempt sync noidle queue only if the new queue is the only one
busy in the service tree.

These days the cfq queue is kept in the service tree even if it is idling, and
thus the original check would be safe again. But since we actually check
that cfq queues are in the same cgroup, of the same priority class and
workload type (sync noidle), we know that new_cfqq is fine to preempt
cfqq. So just remove the service tree check.
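
With the service tree clause dropped, the check quoted above reduces to something
like:

  if (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD &&
      cfqq_type(new_cfqq) == SYNC_NOIDLE_WORKLOAD)
          return true;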

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agocfq-iosched: Reorder checks in cfq_should_preempt()
Jan Kara [Tue, 12 Jan 2016 15:24:16 +0000 (16:24 +0100)]
cfq-iosched: Reorder checks in cfq_should_preempt()

Move check for preemption by rt class up. There is no functional change
but it makes arguing about conditions simpler since we can be sure both
cfq queues are from the same ioprio class.

Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jan Kara <jack@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
4 years agof2fs: set ioprio of GC kthread to idle
Park Ju Hyung [Tue, 26 Sep 2017 01:51:24 +0000 (10:51 +0900)]
f2fs: set ioprio of GC kthread to idle

GC should run as conservatively as possible to reduce latency spikes for the user.

Setting the ioprio to the idle class allows the kernel to schedule the GC thread's
I/O so that it does not affect any other processes' I/O requests.
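
A hedged sketch of how that is typically done for the GC kthread (the field name
is assumed from the f2fs GC code):

  /* run all GC I/O in the idle I/O scheduling class */
  set_task_ioprio(gc_th->f2fs_gc_task,
                  IOPRIO_PRIO_VALUE(IOPRIO_CLASS_IDLE, 0));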

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
4 years agogpu: kgsl: Make max_gpuclk no op
wloot [Wed, 10 Apr 2019 19:29:10 +0000 (03:29 +0800)]
gpu: kgsl: Make max_gpuclk no op