git.osdn.net Git - sagit-ice-cold/kernel_xiaomi

Revert "arm64: spinlock: Drop the prfm from arch_spin_trylock"

This reverts commit 00d37423ceaf815799a1d3e4fd5c958b8b3c66a6.

Causes the following assembler errors:
/tmp/spinlock_debug-15a9d8.s:441: Error: attempt to move .org backwards

The reverted commit is an out of tree patch that does not exist in
4.14.y upstream. Mainline has outright removed the arch_spin_trylock()
function in this translation unit.

See also from mainline:
commit a4c1887d4c14 ("locking/arch: Remove dummy
arch_{read,spin,write}_lock_flags() implementations")
commit c11090474d70 ("arm64: locking: Replace ticket lock implementation
with qspinlock")

Bug: 117152549
Change-Id: I3eb40e5d18ed22c4e3ff1d2b8707e784067aabac
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>

qcacld-3.0: fix uninitialized variable

msm: vidc: silence some debugfs spam

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

soc/qcom: move icnss_initialize to async probe

To reduce boot time to first stage.

Before:
[ 1.221918] init: init first stage started!

After:
[ 1.199159] init: init first stage started!

Bug: 129688998
Test: reboot 100 times, camera, wifi, basic operation
Change-Id: Iff04c974605a2a5e5580c60a6b498b1fdbb44d12
Signed-off-by: Rick Yiu <rickyiu@google.com>

msm: kgsl: move kgsl_3d_init to async probe

To reduce boot time to first stage.

Before:
[ 1.221918] init: init first stage started!

After:
[ 1.132756] init: init first stage started!

Bug: 129688998
Test: reboot 100 times, camera, wifi, basic operation
Change-Id: Ibce1d55a615f5d7f10f70263c1b9a4c6a1b26222
Signed-off-by: Rick Yiu <rickyiu@google.com>

Revert "kernel: Only expose su when daemon is running"

This is not necessary anymore.

This reverts commit ad0d9b3dd0b46906ef1f875a8c83eb2b35ffed76.

proc: reject "." and ".." as filenames

Various subsystems can create files and directories in /proc with names
directly controlled by userspace.

Which means "/", "." and ".." are no-no.
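The rule can be sketched in plain userspace C. This is only an illustration of the check described here, not the kernel's code; the helper name proc_name_is_valid() is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if `name` is acceptable as a single
 * /proc entry name under the rules described in this commit. */
bool proc_name_is_valid(const char *name)
{
    if (name[0] == '\0')
        return false;
    /* A path separator would split the name into several components. */
    if (strchr(name, '/') != NULL)
        return false;
    /* "." and ".." are reserved directory entries. */
    if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0)
        return false;
    return true;
}
```

Note that a name such as "..." is still accepted; only the two reserved entries and anything containing a slash are refused.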

"/" split is already taken care of, do the other 2 prohibited names.

Link: http://lkml.kernel.org/r/20180310001223.GB12443@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Florian Westphal <fw@strlen.de>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Cong Wang <xiyou.wangcong@gmail.com>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

proc: do mmput ASAP for /proc/*/map_files

mm_struct is not needed while printing as all the data was already
extracted.

Link: http://lkml.kernel.org/r/20180309223120.GC3843@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

proc: register filesystem last

As soon as register_filesystem() exits, filesystem can be mounted. It
is better to present fully operational /proc.

Of course it doesn't matter because /proc is not modular but do it
anyway.

Drop error check, it should be handled by panicking.

Link: http://lkml.kernel.org/r/20180309222709.GA3843@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

proc: faster /proc/cmdline

Use seq_puts() and skip format string processing.

Link: http://lkml.kernel.org/r/20180309222948.GB3843@avx2
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

mm/vmalloc.c: fix typo in comment

Reported-by: Nicholas Joll <najoll@posteo.net>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

mm/vmalloc: pass VM_USERMAP flags directly to __vmalloc_node_range()

vmalloc_user*() calls differ from normal vmalloc() only in that they set
VM_USERMAP flags for the area. After the many changes vmalloc.c has seen
over its history, it is now possible to simply pass VM_USERMAP flags
directly to __vmalloc_node_range() instead of finding the area (which
obviously takes time) after the allocation.

Link: http://lkml.kernel.org/r/20190103145954.16942-4-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

mm/vmalloc.c: improve vfree() kerneldoc

vfree() might sleep if called not in interrupt context. Explain that in
the comment.

Link: http://lkml.kernel.org/r/20180914130512.10394-2-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

mm/vmalloc.c: make vmalloc_32_user() align base kernel virtual address to SHMLBA

This patch repeats the original one from David S Miller:

  2dca6999eed5 ("mm, perf_event: Make vmalloc_user() align base kernel virtual address to SHMLBA")

but for the missed vmalloc_32_user() case, which also requires correct
alignment of the virtual address on the kernel side to avoid D-cache
aliases. A bit of copy-paste from the original patch as a reminder of
what this is all about:

  When a vmalloc'd area is mmap'd into userspace, some kind of
  co-ordination is necessary for this to work on platforms with cpu
  D-caches which can have aliases.

  Otherwise kernel side writes won't be seen properly in userspace and
  vice versa.

  If the kernel side mapping and the user side one have the same
  alignment, modulo SHMLBA, this can work as long as VM_SHARED is set
  on the VMA, and for all current users this is true. VM_SHARED will
  force SHMLBA alignment of the user side mmap on platforms where
  D-cache aliasing matters.

  David S. Miller

> What are the user-visible runtime effects of this change?

In simple words: proper alignment avoids possible differences in the data
seen through different virtual mappings: userspace and kernel in our case.
I.e. userspace reads cache line A, kernel writes to cache line B.  Both
cache lines correspond to the same physical memory (thus aliases).

So this should fix data corruption for archs with vivt and vipt caches,
e.g. armv6.  Personally I've never worked with these archs, I just
spotted the strange difference in code: for one case we do alignment,
for another - not.  I have a strong feeling that David simply missed
vmalloc_32_user() case.
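A minimal sketch of the "same alignment, modulo SHMLBA" condition, in userspace C. The value EXAMPLE_SHMLBA is an assumption chosen for illustration; the real SHMLBA is architecture-specific, and both helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration only; SHMLBA depends on the architecture. */
#define EXAMPLE_SHMLBA ((uintptr_t)0x4000) /* 16 KiB */

/* Two virtual mappings of the same physical memory land on the same
 * cache lines of an aliasing D-cache when their addresses are equal
 * modulo SHMLBA. */
bool same_cache_colour(uintptr_t kernel_va, uintptr_t user_va)
{
    return (kernel_va % EXAMPLE_SHMLBA) == (user_va % EXAMPLE_SHMLBA);
}

/* Round an address up to the next SHMLBA boundary (SHMLBA is a power of two). */
uintptr_t align_to_shmlba(uintptr_t addr)
{
    return (addr + EXAMPLE_SHMLBA - 1) & ~(EXAMPLE_SHMLBA - 1);
}
```

If the kernel-side base address is SHMLBA-aligned and the user-side mmap is forced to SHMLBA alignment as well, same_cache_colour() holds for every offset into the area.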

>
> Is a -stable backport needed?

No, I do not think so.  The only one user of vmalloc_32_user() is
virtual frame buffer device drivers/video/fbdev/vfb.c, which has in the
description "The main use of this frame buffer device is testing and
debugging the frame buffer subsystem.  Do NOT enable it for normal
systems!".

And it seems to me that this vfb.c does not need 32bit addressable pages
(vmalloc_32_user() case), because it is a virtual device and should not
care about things like dma32 zones, etc.  It is probably better to clean
up the code, switch vfb.c from vmalloc_32_user() to vmalloc_user() and
wipe out vmalloc_32_user() from vmalloc.c completely.  But I'm not at
all sure that this is worth doing; it is so minor that we can leave it
as is.

Link: http://lkml.kernel.org/r/20190108110944.23591-1-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

vfree: add debug might_sleep()

Add might_sleep() call to vfree() to catch potential sleep-in-atomic bugs
earlier.

[aryabinin@virtuozzo.com: drop might_sleep_if() from kvfree()]
Link: http://lkml.kernel.org/r/7e19e4df-b1a6-29bd-9ae7-0266d50bef1d@virtuozzo.com
Link: http://lkml.kernel.org/r/20180914130512.10394-3-aryabinin@virtuozzo.com
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

mm/vmalloc: do not call kmemleak_free() on not yet accounted memory

__vmalloc_area_node() calls vfree() on error path, which in turn calls
kmemleak_free(), but area is not yet accounted by kmemleak_vmalloc().

Link: http://lkml.kernel.org/r/20190103145954.16942-3-rpenyaev@suse.de
Signed-off-by: Roman Penyaev <rpenyaev@suse.de>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Joe Perches <joe@perches.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

compiler: allow all arches to enable CONFIG_OPTIMIZE_INLINING

Commit 60a3cdd06394 ("x86: add optimized inlining") introduced
CONFIG_OPTIMIZE_INLINING, but it has been available only for x86.

The idea is obviously arch-agnostic.  This commit moves the config entry
from arch/x86/Kconfig.debug to lib/Kconfig.debug so that all
architectures can benefit from it.

This can make a huge difference in kernel image size especially when
CONFIG_OPTIMIZE_FOR_SIZE is enabled.

For example, I got 3.5% smaller arm64 kernel for v5.1-rc1.

  dec       file
  18983424  arch/arm64/boot/Image.before
  18321920  arch/arm64/boot/Image.after
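The figure can be checked from the two sizes above. The helper below is only a worked calculation with a hypothetical name; it is not part of the patch.

```c
/* Percentage by which `after` is smaller than `before`. */
double size_reduction_percent(unsigned long before, unsigned long after)
{
    return 100.0 * (double)(before - after) / (double)before;
}
```

With before = 18983424 and after = 18321920 the difference is 661504 bytes, which is a little under 3.5% of the original image.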

This also slightly improves the "Kernel hacking" Kconfig menu as
e61aca5158a8 ("Merge branch 'kconfig-diet' from Dave Hansen") suggested;
this config option would be a good fit in the "compiler option" menu.

Link: http://lkml.kernel.org/r/20190423034959.13525-12-yamada.masahiro@socionext.com
Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Borislav Petkov <bp@suse.de>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Boris Brezillon <bbrezillon@kernel.org>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Marek Vasut <marek.vasut@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Malaterre <malat@debian.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: Russell King <rmk+kernel@arm.linux.org.uk>
Cc: Stefan Agner <stefan@agner.ch>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
[kdrag0n: Backported to k4.14]
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

staging: qcacld-3.0: remove redefined macro NO_SESSION

Merge tag 'v4.4.207' into 10

Merge tag 'LA.UM.8.4.r1-04700-8x98.0' of https://source.codeaurora.org/quic/la/kernel/msm-4.4 into 10

"LA.UM.8.4.r1-04700-8x98.0"

{chiron,sagit}_defconfig: Enable optimized inlining

TODO: benchmark

Signed-off-by: Danny Lin <danny@kdrag0n.dev>

irq: silence 'irq no longer affine' messages

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

Revert "ASoC: msm: qdsp6v2: Fix sound break after 8.12.28 dev"

This reverts commit 3c4dc2139503d77f9b5123cc83668dc60be2545b.

tfa98xx: Disable enum-conversion warnings

Revert "msm: kgsl: Log context type in case of GPU faults"

This reverts commit de5400e150cf90489ce98e2c48c8439e3165f8a8.

ASoC: Split build special drives

ASoC: tfa98xx: do not touch debugging flag

ASoC: tfa98xx: select fw name by nxp id state

ASoC: tfa98xx: do not update profile index when the device is in operating mode

taken from xiaomi raphael-p-oss
to fix "Write profile error"

ASoC: tfa98xx: Always returns false on startup

taken from Xiaomi raphael-p-oss

ASoC: tfa98xx: clean up

ASoC: tfa98xx: import v6.6.3 driver

https://source.codeaurora.org/external/mas/tfa98xx/tree/?h=DIN_v6.6.3

msm_performance: checkout to msm-4.14

* HEAD: dfe33bfd3eea2224c34606846cd7cc852748db75
* repick <msm_performance: Limit boost frequency for per CPU cluster>
@wloot: backport to k4.4
Signed-off-by: Julian Liu <wlootlxt123@gmail.com>

selinux: use kmem_cache for ebitmap

The allocated size for each ebitmap_node is 192byte by kzalloc().
Then, ebitmap_node size is fixed, so it's possible to use only 144byte
for each object by kmem_cache_zalloc().
It can reduce some dynamic allocation size.
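The 192 versus 144 byte difference comes from kmalloc rounding requests up to a fixed size class, while a dedicated kmem_cache can be sized for the object itself. The sketch below models that rounding in userspace C. The list of size classes is an assumption (the real set depends on kernel configuration and architecture), and kmalloc_bucket() is a hypothetical name.

```c
#include <stddef.h>

/* Assumed general-purpose kmalloc size classes, for illustration only. */
static const size_t kmalloc_classes[] = {
    8, 16, 32, 64, 96, 128, 192, 256, 512, 1024
};

/* Size class a request of `size` bytes would be served from,
 * or 0 if it is larger than this sketch models. */
size_t kmalloc_bucket(size_t size)
{
    size_t n = sizeof(kmalloc_classes) / sizeof(kmalloc_classes[0]);

    for (size_t i = 0; i < n; i++) {
        if (size <= kmalloc_classes[i])
            return kmalloc_classes[i];
    }
    return 0;
}
```

Under these assumptions a 144-byte ebitmap_node lands in the 192-byte class, so 48 bytes per object are wasted when kzalloc() is used.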

Signed-off-by: Junil Lee <junil0814.lee@lge.com>
Signed-off-by: Paul Moore <paul@paul-moore.com>

selinux: use stack on security_sid_to_context() if possible

For temporary usage such as logging, there's no need to allocate
a dedicated memory.

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>

cpufreq: fallback to schedutil if governor is not found

Nowadays, custom ROMs are starting to set custom governor profiles via
init scripts, often times involving custom governors (like DrunkSauce,
which is impulse on little and ironactive on big). If the user sticks
with the built-in kernel, that is totally fine as everything will get
loaded up properly. However, if the user decides to switch to a custom
kernel and those governors are not present, this function will fail to
find them and the governor will not change. I attempt to work
around this by modifying how the method handles finding the governor.

Process without this patch:
1. Phone boots with performance governor
2. Init script attempts to write the custom governor to the sysfs path
3. The function cpufreq_parse_governor fails to locate the
   governor and returns -EINVAL since the input was invalid.
4. The phone does not switch from the performance governor.

Process with this patch:
1. Phone boots with performance governor
2. Init script attempts to write the custom governor to the sysfs path
3. The function cpufreq_parse_governor fails to locate the
   governor but then tries to locate the schedutil governor.
4. The schedutil governor is then applied.

A couple of notes:
1. If you have a different governor that you want to be set on
   boot, make sure to change schedutil to that governor's
   name
2. This is a little bit of a hack; ideally, it'd be nice to
   only switch to schedutil if the current governor is
   performance so that already applied governors/settings
   are not overwritten but there is not really a scenario
   where this will accidentally be activated since
   custom kernel tweaking apps will only show governors that
   are currently available. Maybe when I have more time I
   will explore this in depth.
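The lookup-with-fallback behaviour can be sketched in userspace C as follows. This is not the actual cpufreq code: the governor table and both function names are hypothetical, and only the control flow (try the requested name, then fall back to schedutil) is taken from the description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of governors built into the running kernel. */
static const char *const available_governors[] = {
    "performance", "schedutil", "powersave"
};

/* Returns the matching governor name, or NULL if it is not available. */
const char *find_governor(const char *name)
{
    size_t n = sizeof(available_governors) / sizeof(available_governors[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(available_governors[i], name) == 0)
            return available_governors[i];
    }
    return NULL;
}

/* Patched behaviour: an unknown governor falls back to schedutil
 * instead of failing outright. */
const char *parse_governor_with_fallback(const char *requested)
{
    const char *gov = find_governor(requested);

    if (gov == NULL)
        gov = find_governor("schedutil");
    return gov; /* still NULL if schedutil itself is missing */
}
```

To use a different default, the "schedutil" literal in the fallback is the one place that would change, which matches the first note above.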

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

adreno: disable snapshot, coresight and trace

Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
[wight554: adapted for 4.4 and a540]
Signed-off-by: Volodymyr Zhdanov <wight554@gmail.com>

cfq-iosched: Don't group_idle if cfqq has big thinktime

There is no point in idling on a cfq group if the only cfq queue that is
there has too big thinktime.

Signed-off-by: Jan Kara <jack@suse.com>
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Jens Axboe <axboe@fb.com>

quota_tree: Avoid dynamic memory allocations

Most allocations done here are rather small and can fit on the stack,
eliminating the need to allocate them dynamically. Reserve a 1024B
stack buffer for this purpose to avoid the overhead of dynamic
memory allocation.

1024B covers most use cases, and higher values were observed to cause
stack corruptions.
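The pattern is a fixed stack buffer with a heap fallback for oversized requests. A minimal userspace sketch, assuming a hypothetical workload (summing the bytes of a copied buffer) purely to have something to compute:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define STACK_BUF_SIZE 1024 /* same size as the buffer described above */

/* Copies `src` into a scratch buffer and sums its bytes. The scratch
 * buffer lives on the stack when it fits and on the heap otherwise.
 * Returns 0 on success, -1 if the heap fallback fails. */
int checksum_with_scratch(const unsigned char *src, size_t len,
                          unsigned int *sum, int *used_heap)
{
    unsigned char stack_buf[STACK_BUF_SIZE];
    unsigned char *buf = stack_buf;

    *used_heap = 0;
    if (len > sizeof(stack_buf)) {
        buf = malloc(len);
        if (buf == NULL)
            return -1;
        *used_heap = 1;
    }

    memcpy(buf, src, len);
    *sum = 0;
    for (size_t i = 0; i < len; i++)
        *sum += buf[i];

    /* Only free what was actually allocated dynamically. */
    if (buf != stack_buf)
        free(buf);
    return 0;
}
```

The common small case never touches the allocator, while larger requests keep working through the slower path.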

Co-authored-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: Park Ju Hyung <qkrwngud825@gmail.com>
Signed-off-by: Danny Lin <danny@kdrag0n.dev>

proc: Don't let Google Camera and Settings run in the background

Google Camera and Settings both burn through CPU in the background doing
nothing useful. In the case of Google Camera, it keeps polling sensors
in the background while it is doing nothing for the user.

Meanwhile, in Settings, when leaving the Adaptive brightness activity
via any means other than using the back button (e.g., the home button),
the GIF in the Adaptive brightness activity will continue playing in the
background. This bug applies to all of the dumb GIFs in Settings.

Kill both of these apps when they reach the background to stop them from
burning through battery.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

ARM: dts: msm8998: Increase UFS CPU latency requirement to 100 us

Voting for longer than 70 us provides access to another CPU idle state.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

{chiron,sagit}_defconfig: Use a timer frequency of 100 Hz

The use of high CPU frequencies combined with a large number of cores
makes latency decent with a 100 Hz scheduler tick. Reducing the tick
rate from 300 Hz to 100 Hz improves throughput and significantly reduces
the number of interrupts firing off per second, improving power
consumption.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

qcacld-3.0: Nuke as much debug bloat as possible

The overhead from all the debugging in this monstrosity of a driver is
measurably significant. Chop it all out.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

msm: kgsl: Relax CPU latency requirements to save power

Relaxing the CPU latency requirement by about 500 us won't significantly
hurt graphics performance. On the flip side, most SoCs have many idle
levels just below 1000 us in latency, with deeper idle levels having
latencies in excess of 2000 us. Changing the latency requirement to
1000 us allows most SoCs to use their deepest sub-1000-us idle state
while the GPU is active.

Additionally, since the lpm driver has been updated to allow power
levels with latencies equal to target latencies, change the wakeup
latency from 101 to 100 for clarity.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

sched/tune: Hard-code top-app's stune boost to 1

Hard-code top-app's stune boost to 1 so that top-app processes are still
preferred to run on big cluster CPUs without significantly affecting the
CPU governor's frequency selection.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

cpuidle: lpm-levels: Allow exit latencies equal to target latencies

This allows pm_qos votes with, say, 100 us for example to select power
levels with exit latencies equal to 100 us. The extra microsecond of
exit latency doesn't hurt.
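The difference is a strict versus non-strict comparison when choosing a power level. The sketch below is a simplified model, not the lpm-levels driver: pick_level() and its parameters are hypothetical, and levels are assumed to be sorted by increasing exit latency.

```c
#include <stddef.h>

/* Returns the index of the deepest level whose exit latency fits the
 * target, or -1 if none does. `allow_equal` selects the patched
 * behaviour (<=) instead of the old one (<). */
int pick_level(const unsigned int *exit_latency_us, size_t n,
               unsigned int target_us, int allow_equal)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        int fits = allow_equal ? exit_latency_us[i] <= target_us
                               : exit_latency_us[i] < target_us;
        if (fits)
            best = (int)i;
    }
    return best;
}
```

With example latencies of 1, 50, 100 and 600 us and a 100 us vote, the strict comparison stops at the 50 us level, while the relaxed one reaches the 100 us level.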

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

{chiron,sagit}_defconfig: Don't print pid and CPU in dmesg

This cruft is annoying.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

{chiron,sagit}_defconfig: enable devfreq boost driver

msm: mdss: Mark display-wake kthread as performance critical

This kthread is responsible for powering on the display, so it needs to
run as soon as possible to minimize lag when turning the display on.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

msm: mdss: Power on display asynchronously as early as possible

Currently, mdss powers on the display a long time after an unblank is
requested and gets completely blocked while powering on the display, so
if the display takes a very long time to turn on, then mdss will be
stuck unable to do anything else in the meantime. This results in a long
delay between trying to wake the device and the display actually
powering on.

In order to make the display turn on faster when waking the device from
sleep, start powering on the display as soon as the framebuffer unblank
event is received. This allows mdss to continue resuming while the
display takes its time powering on. A high-priority kthread is used here
to ensure the display powers on as quickly as possible.

In the event that the framebuffer unblank notifier is not used (such as
for AOD), the display will be powered on at the time that it is
requested via the MDSS_EVENT_LINK_READY event.

To make this work, kickoffs need to be blocked when they attempt to
power on the display, so that a kickoff won't continue while the display
is still powered off.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

msm: mdss: Boost DDR bus when committing a new frame

In order to reduce jank, request a DDR bus boost whenever a new frame is
ready to be rendered to the display. The boost should be sufficient
to render 60 FPS without any dropped frames when there is no
significant external source of load.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

devfreq_boost: Mark boost kthreads as performance critical

The boost kthreads are performance critical for obvious reasons.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

devfreq_boost: Introduce devfreq boost driver

This driver boosts enumerated devfreq devices upon input, and allows for
boosting specific devfreq devices on other custom events. The boost
frequencies for this driver should be set so that frame drops are
near-zero at the boosted frequencies and power consumption is minimized
at said frequencies. The goal of this driver is to provide an interface
to achieve optimal device performance by requesting boosts on key
events, such as when a frame is ready to be rendered to the display.

Currently, support is only present for boosting the cpubw devfreq
device, but the driver is structured in a way that makes it easy to add
support for new boostable devfreq devices in the future.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

cpufreq: Kill userspace CPU boosting entirely

Kernel-based CPU boosting is used now, so stop userspace from messing with
it by turning scaling_min_freq into a no-op. Note that this is done instead
of making scaling_min_freq read-only so that userspace doesn't spit out
error messages when it can't do its boosting.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

fs: exec: block nfs injector from launching

Another optimizer, duh...

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

{chiron,sagit}_defconfig: enable unwanted apps blocker

fs: introduce unwanted apps blocker

This is a POC commit which targets various Android optimizers
that use 2011-era tweaks and scripts to
"improve battery life and performance"
and completely mess up all the kernel settings, thus leading
to poor UX and complaints from users.

Currently blocking: L Speed, FDE.AI

Now blocking: L Speed
Based on [1] and [2]:
[1] https://github.com/RaphielGang/spins_kernel_xiaomi_sdm845/commit/75804821e68be3ece795402e175e36ebf7206540
[2] https://github.com/kerneltoast/android_kernel_google_bluecross/commit/18f25c985ce1d23ca98f765249671e7252886e4d

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

zram: Do not allow compression algorithm to be changed

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>

zram: Move default compression algorithm choice to Kconfig

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

kernel: bpf: move syscall allocations to stack

These are really small, freed in the same function, very frequent.
Allocating them on stack will improve performance.

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

kernel/printk: use on-stack allocations for kernel log

These allocations are just 1 KB in size, so using kmalloc is not
worth it for them. This should speed up printing of the kernel log
when uptime gets very long.

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

msm: gsi: disable debug driver

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

cpuidle: lpm-levels: Remove debug event logging

A measurably significant amount of CPU time is spent on logging events
for debugging purposes in lpm_cpuidle_enter. Kill the useless logging to
reduce overhead.

Signed-off-by: Danny Lin <danny@kdrag0n.dev>
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

ipa_v3: fix some maybe-uninitialised warnings

Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

mm/slab_common: Align all caches' objects to hardware cachelines

This only increases the memory used by all caches by about 10%, which is
relatively very little for the performance benefit of cacheline
alignment.
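The memory cost comes from rounding every object size up to a multiple of the cacheline size. A small sketch of that rounding, assuming a 64-byte cacheline (an assumption; the real value is hardware-specific) and a hypothetical helper name:

```c
#include <stddef.h>

/* Assumed cacheline size for illustration only. */
#define EXAMPLE_CACHELINE ((size_t)64)

/* Round an object size up to the next multiple of the cacheline size,
 * so that consecutive objects never share a cacheline. */
size_t cacheline_align(size_t size)
{
    return (size + EXAMPLE_CACHELINE - 1) & ~(EXAMPLE_CACHELINE - 1);
}
```

Small objects pay the most (a 1-byte object grows to 64 bytes), while objects that are already a multiple of the cacheline size cost nothing extra.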

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

ASoC: msm: qdsp6v2: Make version checking no-op

After the Pie tag was released, CAF added functions
for checking the fw version that are not supported
by our DSP.

The kernel tells us about it by spamming:

[10186.137518] q6core_get_service_version: Failed to get service size for service id 7 with error -95
[10186.141517] q6core_get_service_version: Failed to get service size for service id 8 with error -95
[10186.151816] q6core_get_service_version: Failed to get service size for service id 7 with error -95
[10254.278514] q6core_get_service_version: Failed to get service size for service id 7 with error -95
[10254.282274] q6core_get_service_version: Failed to get service size for service id 8 with error -95
[10254.292154] q6core_get_service_version: Failed to get service size for service id 7 with error -95
[10294.549313] q6core_get_service_version: Failed to get service size for service id 7 with error -95
[10294.553506] q6core_get_service_version: Failed to get service size for service id 8 with error -95
[10294.563891] q6core_get_service_version: Failed to get service size for service id 7 with error -95

This results in certain audio apps breaking
after the system suspends and then resumes.

Change-Id: I09dfa1ee3adad8df62f79bc79a88a74f60d73b23
Signed-off-by: Yaroslav Furman <yaro330@gmail.com>

Revert "{chiron,sagit}_defconfig: enable bfq"

This reverts commit e34df45d2cab26ba0dbc4fdc106548f3e13eb395.

Revert "BACKPORT: zsmalloc: introduce zs_huge_class_size()"

This reverts commit 4d1ddb8d3b84e9e162217bf55e8aad6fa796b836.

ARM: dts: msm8998: Tune mincpubw configs

f2fs: avoid kernel panic on corruption test

xfstests/generic/475 complains about a kernel warning/panic while testing a corrupted disk.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: Check write pointer consistency of non-open zones

To catch f2fs bugs in write pointer handling code for zoned block
devices, check write pointers of non-open zones that current segments do
not point to. Do this check at mount time, after the fsync data recovery
and current segments' write pointer consistency fix. Check two items
comparing write pointers with valid block maps in SIT.

The first item is a check for zones with no valid blocks. When there are no
valid blocks in a zone, the write pointer should be at the start of the
zone. If not, the next write operation to the zone will cause an unaligned
write error. If the write pointer is not at the zone start, make the mount
fail and ask users to run fsck.

The second item is a check between the write pointer position and the last
valid block in the zone. It is unexpected that the last valid block
position is beyond the write pointer. In such a case, report it as a bug.
A fix is not required for such a zone, because the zone is not selected for
the next write operation until the zone gets discarded.

Also move a constant F2FS_REPORT_ZONE from super.c to f2fs.h to use it
in segment.c also.
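The two checks can be modelled as a small decision function. This is a simplified userspace sketch, not the f2fs code: positions are block offsets within one zone, and the enum and function names are hypothetical.

```c
/* Outcome of checking one non-open zone. */
enum wp_check {
    WP_OK,          /* write pointer is consistent */
    WP_NEEDS_FSCK,  /* empty zone with a moved write pointer: fail mount */
    WP_BUG          /* valid block at or beyond the write pointer: report */
};

/* `wp` is the write pointer offset in the zone, `valid_blocks` the
 * number of valid blocks recorded in SIT, `last_valid` the offset of
 * the last valid block (only meaningful when valid_blocks > 0). */
enum wp_check check_zone(unsigned int wp, unsigned int valid_blocks,
                         unsigned int last_valid)
{
    /* First item: a zone without valid blocks must have its write
     * pointer at the zone start. */
    if (valid_blocks == 0)
        return wp == 0 ? WP_OK : WP_NEEDS_FSCK;

    /* Second item: every valid block must sit below the write pointer. */
    if (last_valid >= wp)
        return WP_BUG;

    return WP_OK;
}
```

Only the first case blocks the mount; the second is reported but left alone, since such a zone is not picked for writing until it is discarded.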

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: Check write pointer consistency of open zones

On sudden f2fs shutdown, write pointers of zoned block devices can go
further but f2fs meta data keeps current segments at positions before the
write operations. After remounting the f2fs, this inconsistency causes
write operations not at write pointers and "Unaligned write command"
error is reported.

To avoid the error, compare current segments with write pointers of open
zones the current segments point to, during mount operation. If the write
pointer position is not aligned with the current segment position, assign
a new zone to the current segments. Also check the newly assigned zone
has write pointer at zone start. If not, make mount fail and ask users to
run fsck.

Perform the consistency check twice. Once during fsync recovery. Not to
lose the fsync data, do the check after fsync data gets restored and
before checkpoint commit which flushes data at current segment positions.
The second check is done at end of f2fs_fill_super() to make sure the
write pointer consistency regardless of fsync data recovery execution.

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@wdc.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix wrong description in document

As reported in bugzilla, default value of DEF_RAM_THRESHOLD was fixed by
commit 29710bcf9426 ("f2fs: fix wrong percentage"), however leaving wrong
description in document, fix it.

https://bugzilla.kernel.org/show_bug.cgi?id=205203

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: cache global IPU bio

In commit 8648de2c581e ("f2fs: add bio cache for IPU"), we added
f2fs_submit_ipu_bio() in __write_data_page() as below:

__write_data_page()

	if (!S_ISDIR(inode->i_mode) && !IS_NOQUOTA(inode)) {
		f2fs_submit_ipu_bio(sbi, bio, page);
		....
	}

in order to avoid below deadlock:

Thread A                                        Thread B
- __write_data_page (inode x, page y)
 - f2fs_do_write_data_page
  - set_page_writeback        ---- set writeback flag in page y
  - f2fs_inplace_write_data
 - f2fs_balance_fs
                                                - lock gc_mutex
 - lock gc_mutex
  - f2fs_gc
   - do_garbage_collect
    - gc_data_segment
     - move_data_page
      - f2fs_wait_on_page_writeback
       - wait_on_page_writeback  --- wait writeback of page y

However, the bio submission breaks the merge of IPU IOs.

So in this patch let's add a global bio cache for merged IPU pages,
then f2fs_wait_on_page_writeback() is able to submit bio if a
writebacked page is cached in global bio cache.

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to avoid memory leakage in f2fs_listxattr

In f2fs_listxattr, there is no boundary check before
copying e_name to the buffer with memcpy.
If e_name_len is corrupted,
unexpected memory contents may be returned to the buffer.
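The kind of check that is missing can be shown on a simplified entry layout. This sketch does not use the f2fs on-disk format: it assumes a region of entries, each a one-byte length followed by that many name bytes and terminated by a zero length, and list_names() is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copies each name, NUL-terminated, into `out`. Returns the number of
 * bytes written, or -1 if an entry claims to extend past the end of
 * the region or the output buffer is too small. */
long list_names(const uint8_t *region, size_t region_len,
                char *out, size_t out_len)
{
    size_t pos = 0, written = 0;

    while (pos < region_len && region[pos] != 0) {
        size_t name_len = region[pos];

        /* Boundary check before memcpy: a corrupted length must not
         * let us read beyond the region. */
        if (name_len > region_len - pos - 1)
            return -1;
        if (name_len + 1 > out_len - written)
            return -1;

        memcpy(out + written, region + pos + 1, name_len);
        out[written + name_len] = '\0';
        written += name_len + 1;
        pos += 1 + name_len;
    }
    return (long)written;
}
```

Without the first check, an entry with a corrupted length would make memcpy read whatever follows the region and hand it back to the caller.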

Signed-off-by: Randall Huang <huangrandall@google.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: check total_segments from devices in raw_super

For multi-device F2FS, we should check if the sum of total_segments from
all devices matches segment_count.
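A minimal sketch of the consistency check, with a hypothetical function name and plain arrays standing in for the superblock fields:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the per-device segment totals add up to the filesystem-wide
 * segment count. The sum is kept in 64 bits so it cannot wrap. */
bool device_segments_match(const uint32_t *dev_total_segments,
                           size_t ndevs, uint32_t segment_count)
{
    uint64_t sum = 0;

    for (size_t i = 0; i < ndevs; i++)
        sum += dev_total_segments[i];
    return sum == segment_count;
}
```

A mismatch means the device table and the segment count disagree, so the superblock should be treated as invalid.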

Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: update multi-dev metadata in resize_fs

Multi-device metadata should be updated in resize_fs as well.

Also, we check that the new FS size still reaches the last device.

Signed-off-by: Qiuyang Sun <sunqiuyang@huawei.com>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: mark recovery flag correctly in read_raw_super_block()

When the first superblock read fails and the second succeeds, the
recovery flag is never set, because the err variable is reused across
loop iterations and the later success overwrites the earlier error.
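
The pattern can be sketched in userspace as follows. The function names
and the two stand-in readers are hypothetical; the point is only that the
flag is set where the failure happens instead of being derived from err
after the loop.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for reading and validating one superblock copy:
 * each returns 0 on success or a negative value on failure. */
int demo_first_copy_bad(int block) { return block == 0 ? -5 : 0; }
int demo_both_copies_ok(int block) { (void)block; return 0; }

/*
 * Try both superblock copies. Returns 0 if at least one copy is valid.
 * *recovery is set as soon as any copy fails, so a later success cannot
 * hide the earlier failure.
 */
int demo_read_super(int (*read_copy)(int block), bool *recovery)
{
        int err = -5;   /* assume failure until a copy validates */
        int block;

        *recovery = false;
        for (block = 0; block < 2; block++) {
                int ret = read_copy(block);

                if (ret < 0) {
                        *recovery = true;  /* record the failure right here */
                        continue;
                }
                err = 0;
        }
        return err;
}
```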

Signed-off-by: Chengguang Xu <cgxu519@zoho.com.cn>
Reviewed-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

f2fs: fix to update time in lazytime mode

generic/018 reports an inconsistent status of atime, the
testcase is as below:
- open file with O_SYNC
- write file to construct fragmented space
- calc md5 of file
- record {a,c,m}time
- defrag file --- do nothing
- umount & mount
- check {a,c,m}time

The root cause is that, because f2fs enables lazytime by default, an
atime update dirties the VFS inode rather than the f2fs inode (which is
done by setting FI_DIRTY_INODE), so a later f2fs_write_inode() called
from VFS fails to update the inode page due to our skip:

f2fs_write_inode()
        if (!is_inode_flag_set(inode, FI_DIRTY_INODE))
                return 0;

So eventually, after evict(), we lose the last atime forever.

To fix this issue, we need to check whether {a,c,m,cr}time is
consistent between the inode cache and the inode page, and only skip
f2fs_update_inode() if the f2fs inode is not dirty and the times are
consistent as well.
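
The resulting skip condition could look roughly like this. The struct and
function names are made up for the sketch and do not mirror the f2fs
helpers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative timestamp set; the real code compares timespec fields. */
struct demo_times {
        int64_t atime, ctime, mtime, crtime;
};

static bool demo_time_consistent(const struct demo_times *cached,
                                 const struct demo_times *on_page)
{
        return cached->atime == on_page->atime &&
               cached->ctime == on_page->ctime &&
               cached->mtime == on_page->mtime &&
               cached->crtime == on_page->crtime;
}

/* True only when writing the inode page can safely be skipped. */
bool demo_can_skip_update(bool f2fs_inode_dirty,
                          const struct demo_times *cached,
                          const struct demo_times *on_page)
{
        return !f2fs_inode_dirty && demo_time_consistent(cached, on_page);
}
```

With lazytime, a changed atime in the cached inode makes the comparison
fail, so the update is no longer skipped even though the f2fs dirty flag
is clear.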

Signed-off-by: Chao Yu <yuchao0@huawei.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>

simple_lmk: Change Kconfig defaults

Commit "simple_lmk: Make reclaim deterministic" changed Simple LMK's
behavior, so the default parameters must be updated as well to
compensate.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

simple_lmk: Clean up some code style nitpicks

Using a parameter to pass around an unmodified pointer to a global
variable is crufty; just use the `victims` variable directly instead.
Also, compress the code in simple_lmk_init_set() a bit to make it look
cleaner.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: celtare21 <celtare21@gmail.com>

simple_lmk: Make reclaim deterministic

The 20 ms delay in the reclaim thread is a hacky fudge factor that can
cause Simple LMK to behave wildly differently depending on the
circumstances of when it is invoked. When kswapd doesn't get enough CPU
time to finish up and go back to sleep within 20 ms, Simple LMK performs
superfluous reclaims.

This is suboptimal, so make Simple LMK more deterministic by eliminating
the delay and instead queuing up reclaim requests from kswapd.

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: celtare21 <celtare21@gmail.com>

simple_lmk: Fix broken multicopy atomicity for victims_to_kill

When the reclaim thread writes to victims_to_kill on one CPU, it expects
the updated value to be immediately reflected on all CPUs in order for
simple_lmk_mm_freed() to work correctly. Due to the lack of memory
barriers to guarantee multicopy atomicity, simple_lmk_mm_freed() can be
given a victim's mm without knowing the correct victims_to_kill value,
which can cause the reclaim thread to remain stuck waiting forever for
all victims to be freed. This scenario, despite being rare, has been
observed.

Fix this by using proper atomic helpers with memory barriers.
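
As a userspace analogue only, the same idea can be sketched with C11
atomics. The kernel patch uses the kernel's own atomic helpers; the
variable and function names below are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_int demo_victims_to_kill;
static atomic_int demo_victims_freed;

/* Reclaim side: publish how many victims must be freed. */
void demo_begin_reclaim(int nr_victims)
{
        atomic_store_explicit(&demo_victims_freed, 0, memory_order_relaxed);
        /* Release store: writes before it are visible to a thread whose
         * acquire load observes this value. */
        atomic_store_explicit(&demo_victims_to_kill, nr_victims,
                              memory_order_release);
}

/*
 * Called once per freed victim. Returns true for the caller that frees
 * the last victim, which is when the reclaim side may be woken.
 */
bool demo_victim_freed(void)
{
        int target = atomic_load_explicit(&demo_victims_to_kill,
                                          memory_order_acquire);
        int freed = atomic_fetch_add_explicit(&demo_victims_freed, 1,
                                              memory_order_acq_rel) + 1;

        return freed == target;
}
```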

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>
Signed-off-by: celtare21 <celtare21@gmail.com>

msm: mdss: Silence debug logs

simple_lmk: Don't give victims privileges

* When Simple LMK is triggered, there are usually heavy tasks running.
Increasing the victim's priority may result in unnecessary preemption and lag.

defconfig: Disable CC_STACKPROTECTOR_STRONG

* Saves a little size; the other effects are not a concern here.

Kbuild: don't pass "-C" to preprocessor when processing linker scripts

For some odd historical reason, we preprocessed the linker scripts with
"-C", which keeps comments around. That makes no sense, since the
comments are not meaningful for the build anyway.

And it actually breaks things, since linker scripts can't have C++ style
"//" comments in them, so keeping comments after preprocessing now
limits us in odd and surprising ways in our header files for no good
reason.

The -C option goes back to pre-git and pre-bitkeeper times, but seems to
have been historically used (along with "-traditional") for some
odd-ball architectures (ia64, MIPS and SH). It probably didn't matter
back then either, but might possibly have been used to minimize the
difference between the original file and the pre-processed result.

The reason for this may be lost in time, but let's not perpetuate it
only because we can't remember why we did this crazy thing.

This was triggered by the recent addition of SPDX lines to the source
tree, where people apparently were confused about why header files
couldn't use the C++ comment format.

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Greg KH <gregkh@linuxfoundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

qcacld-3.0: Disable open-source flag

* This disables some debug code and makes the driver behave more like the OEM's.

qpnp-fg-gen3: Don't ratelimit interaction related props

* State changes for these props should be reported to userspace immediately.

loop: avoid EAGAIN, if offset or block_size are changed

This patch avoids the EAGAIN returned when nrpages != 0. That check was
originally added to drop stale pages, which could otherwise result in wrong
data access.

Report: https://bugs.chromium.org/p/chromium/issues/detail?id=938958#c38

Cc: <stable@vger.kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: linux-block@vger.kernel.org
Cc: Bart Van Assche <bvanassche@acm.org>
Fixes: 5db470e229e2 ("loop: drop caches if offset or block_size are changed")
Reported-by: Gwendal Grignou <gwendal@chromium.org>
Reported-by: grygorii tertychnyi <gtertych@cisco.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Julian Liu <wlootlxt123@gmail.com>

scatterlist: Speed up for_each_sg() loop macro

Scatterlists are chained in predictable arrays of up to
SG_MAX_SINGLE_ALLOC sg structs in length. Using this knowledge, speed up
for_each_sg() by using constant operations to determine when to simply
increment the sg pointer by one or get the next sg array in the chain.

Rudimentary measurements with a trivial loop body show that this yields
roughly a 2x performance gain.

The following simple test module proves the correctness of the new loop
definition by testing all the different edge cases of sg chains:
#include <linux/module.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

static int __init test_for_each_sg(void)
{
        static const gfp_t gfp_flags = GFP_KERNEL | __GFP_NOFAIL;
        struct scatterlist *sg;
        struct sg_table *table;
        long old = 0, new = 0;
        unsigned int i, nents;

        table = kmalloc(sizeof(*table), gfp_flags);
        for (nents = 1; nents <= 3 * SG_MAX_SINGLE_ALLOC; nents++) {
                BUG_ON(sg_alloc_table(table, nents, gfp_flags));
                for (sg = table->sgl; sg; sg = sg_next(sg))
                        old ^= (long)sg;
                for_each_sg(table->sgl, sg, nents, i)
                        new ^= (long)sg;
                sg_free_table(table);
        }

        BUG_ON(old != new);
        kfree(table);
        return 0;
}
module_init(test_for_each_sg);

Signed-off-by: Sultan Alsawaf <sultan@kerneltoast.com>

mm/shmem.c: fix unlikely() test of info->seals to test only for WRITE and GROW

Running my likely/unlikely profiler, I discovered that the test in
shmem_write_begin() that tests for info->seals as unlikely, is always
incorrect. This is because shmem_get_inode() sets info->seals to have
F_SEAL_SEAL set by default, and it is unlikely to be cleared when
shmem_write_begin() is called. Thus, the if statement is very likely.

But as the if statement block only cares about F_SEAL_WRITE and
F_SEAL_GROW, change the test to only test those two bits.
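
The difference between the two tests can be shown with a small userspace
sketch. The DEMO_ constants mirror the F_SEAL_* values from the Linux
uapi headers but are defined locally so the example is self-contained;
the function names are illustrative.

```c
#include <stdbool.h>

/* Local copies of the F_SEAL_* bit values, for illustration. */
#define DEMO_F_SEAL_SEAL   0x0001
#define DEMO_F_SEAL_SHRINK 0x0002
#define DEMO_F_SEAL_GROW   0x0004
#define DEMO_F_SEAL_WRITE  0x0008

/* Old test: true whenever any seal bit is set, including F_SEAL_SEAL. */
bool demo_old_test(unsigned int seals)
{
        return seals != 0;
}

/* New test: true only for the two bits the block actually handles. */
bool demo_new_test(unsigned int seals)
{
        return (seals & (DEMO_F_SEAL_WRITE | DEMO_F_SEAL_GROW)) != 0;
}
```

For the default value (only F_SEAL_SEAL set), the old test is true and the
new one is false, which is what makes the unlikely() annotation accurate
again.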

Link: http://lkml.kernel.org/r/20170203105656.7aec6237@gandalf.local.home
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Hugh Dickins <hughd@google.com>
Cc: David Herrmann <dh.herrmann@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

sched/core: Remove unlikely() annotation from sched_move_task()

The check for 'running' in sched_move_task() has an unlikely() around it. That
is, it is unlikely that the task being moved is running. That used to be
true. But with a couple of recent updates, it is now likely that the task
will be running.

The first change came from ea86cb4b7621 ("sched/cgroup: Fix
cpu_cgroup_fork() handling") that moved around the use case of
sched_move_task() in do_fork() where the call is now done after the task is
woken (hence it is running).

The second change came from 8e5bfa8c1f84 ("sched/autogroup: Do not use
autogroup->tg in zombie threads") where sched_move_task() is called by the
exit path, by the task that is exiting. Hence it too is running.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: http://lkml.kernel.org/r/20170206110426.27ca6426@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>

locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()

Running my likely/unlikely profiler for 3 weeks on two production
machines, I discovered that the unlikely() test in
__rt_mutex_slowlock() checking if state is TASK_INTERRUPTIBLE is hit
100% of the time, making it a very likely case.

The reason is, on a vanilla kernel, the majority case of calling
rt_mutex() is from the futex code. This code is always called as
TASK_INTERRUPTIBLE. In the -rt patch, this code is commonly called when
PREEMPT_RT is enabled with TASK_UNINTERRUPTIBLE. But that's not the
likely scenario.

The rt_mutex() code should be optimized for the common vanilla case,
and that is from a futex, with TASK_INTERRUPTIBLE as the state.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/20170119113234.1efeedd1@gandalf.local.home
Signed-off-by: Ingo Molnar <mingo@kernel.org>

locking/rtmutex: Only warn once on a trylock from bad context

One warning should be enough to get one motivated to fix this. It is
possible that this happens more than once and that starts flooding the
output. Later the prints will be suppressed so we only get half of it.
Depending on the console system used it might not be helpful.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1464356838-1755-1-git-send-email-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>

rtmutex: Make wait_lock irq safe

Sasha reported a lockdep splat about a potential deadlock between RCU boosting
rtmutex and the posix timer it_lock.

CPU0                                    CPU1

rtmutex_lock(&rcu->rt_mutex)
  spin_lock(&rcu->rt_mutex.wait_lock)
                                        local_irq_disable()
                                        spin_lock(&timer->it_lock)
                                        spin_lock(&rcu->mutex.wait_lock)
--> Interrupt
    spin_lock(&timer->it_lock)

This is caused by the following code sequence on CPU1

     rcu_read_lock()
     x = lookup();
     if (x)
      spin_lock_irqsave(&x->it_lock);
     rcu_read_unlock();
     return x;

We could fix that in the posix timer code by keeping rcu read locked across
the spinlocked and irq disabled section, but the above sequence is common and
there is no reason not to support it.

Taking rt_mutex.wait_lock irq safe prevents the deadlock.

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>

ASoC: tas2559: use power efficient workqueues

power: supply: use power efficient workqueues

* YEEEEEEE

ASoC: wcd9335: use power efficient workqueues

mm, vmstat: Add likelihood labels to quiet_vmstat conditions

These labels are based on observations from a running system as well as
from inspecting the code:

!delayed_work_pending:
  true  = 3509732
  false = 7495535

!need_update:
  true  = 6656251
  false = 840000
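
How such counts translate into a branch hint can be sketched as below. The
demo_ macros imitate the kernel's likely()/unlikely() using the GCC/Clang
builtin, and the helper names are made up; which hint the patch applies to
each condition is inferred from the counts, not quoted from the diff.

```c
/* Branch hints in the style of the kernel's likely()/unlikely(). */
#define demo_likely(x)   __builtin_expect(!!(x), 1)
#define demo_unlikely(x) __builtin_expect(!!(x), 0)

/* Fraction of observations in which the condition was true. */
double demo_true_ratio(unsigned long n_true, unsigned long n_false)
{
        unsigned long total = n_true + n_false;

        return total ? (double)n_true / (double)total : 0.0;
}

/* Example of applying a hint to a condition that is usually true. */
int demo_should_return_early(int need_update)
{
        if (demo_likely(!need_update))
                return 1;
        return 0;
}
```

With the counts above, !delayed_work_pending is true in about 32% of the
samples and !need_update in about 89%, which suggests an unlikely() hint
for the first condition and a likely() hint for the second.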

Signed-off-by: Danny Lin <danny@kdrag0n.dev>