OSDN Git Service

uclinux-h8/linux.git
5 years agopowerpc/64s: Use early_mmu_has_feature() in set_kuap()
Michael Ellerman [Wed, 8 May 2019 03:06:42 +0000 (13:06 +1000)]
powerpc/64s: Use early_mmu_has_feature() in set_kuap()

When implementing the KUAP support on Radix we fixed one case where
mmu_has_feature() was being called too early in boot via
__put_user_size().

However since then some new code in linux-next has created a new path
via which we can end up calling mmu_has_feature() too early.

On P9 this leads to crashes early in boot if we have both PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. Our early boot code
calls printk() which calls probe_kernel_read(), that does a
__copy_from_user_inatomic() which calls into set_kuap() and that uses
mmu_has_feature().

At that point in boot we haven't patched MMU features yet so the debug
code in mmu_has_feature() complains, and calls printk(). At that point
we recurse, eg:

  ...
  dump_stack+0xdc
  probe_kernel_read+0x1a4
  check_pointer+0x58
  ...
  printk+0x40
  dump_stack_print_info+0xbc
  dump_stack+0x8
  probe_kernel_read+0x1a4
  probe_kernel_read+0x19c
  check_pointer+0x58
  ...
  printk+0x40
  cpufeatures_process_feature+0xc8
  scan_cpufeatures_subnodes+0x380
  of_scan_flat_dt_subnodes+0xb4
  dt_cpu_ftrs_scan_callback+0x158
  of_scan_flat_dt+0xf0
  dt_cpu_ftrs_scan+0x3c
  early_init_devtree+0x360
  early_setup+0x9c

And so on for infinity, symptom is a dead system.

Even more fun is what happens when using the hash MMU (ie. p8 or p9
with Radix disabled), and when we don't have
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG enabled. With the debug disabled
we don't check if static keys have been initialised, we just rely on
the jump label. But the jump label defaults to true so we just whack
the AMR even though Radix is not enabled.

Clearing the AMR is fine, but after we've done the user copy we write
(0b11 << 62) into AMR. When using hash that makes all pages with key
zero no longer readable or writable. All kernel pages implicitly have
key zero, and so all of a sudden the kernel can't read or write any of
its memory. Again dead system.

In the medium term we have several options for fixing this.
probe_kernel_read() doesn't need to touch AMR at all, it's not doing a
user access after all, but it uses __copy_from_user_inatomic() just
because it's easy, we could fix that.

It would also be safe to default to not writing to the AMR during
early boot, until we've detected features. But it's not clear that
flipping all the MMU features to static_key_false won't introduce
other bugs.

But for now just switch to early_mmu_has_feature() in set_kuap(), that
avoids all the problems with jump labels. It adds the overhead of a
global lookup and test, but that's probably trivial compared to the
writes to the AMR anyway.

Fixes: 890274c2dc4c ("powerpc/64s: Implement KUAP for Radix MMU")
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Russell Currey <ruscur@russell.cc>
5 years agopowerpc/book3s/64: check for NULL pointer in pgd_alloc()
Rick Lindsley [Mon, 6 May 2019 00:20:43 +0000 (17:20 -0700)]
powerpc/book3s/64: check for NULL pointer in pgd_alloc()

When the memset code was added to pgd_alloc(), it failed to consider
that kmem_cache_alloc() can return NULL. It's uncommon, but not
impossible under heavy memory contention. Example oops:

  Unable to handle kernel paging request for data at address 0x00000000
  Faulting instruction address: 0xc0000000000a4000
  Oops: Kernel access of bad area, sig: 11 [#1]
  LE SMP NR_CPUS=2048 NUMA pSeries
  CPU: 70 PID: 48471 Comm: entrypoint.sh Kdump: loaded Not tainted 4.14.0-115.6.1.el7a.ppc64le #1
  task: c000000334a00000 task.stack: c000000331c00000
  NIP:  c0000000000a4000 LR: c00000000012f43c CTR: 0000000000000020
  REGS: c000000331c039c0 TRAP: 0300   Not tainted  (4.14.0-115.6.1.el7a.ppc64le)
  MSR:  800000010280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE,TM[E]>  CR: 44022840  XER: 20040000
  CFAR: c000000000008874 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
  ...
  NIP [c0000000000a4000] memset+0x68/0x104
  LR [c00000000012f43c] mm_init+0x27c/0x2f0
  Call Trace:
    mm_init+0x260/0x2f0 (unreliable)
    copy_mm+0x11c/0x638
    copy_process.isra.28.part.29+0x6fc/0x1080
    _do_fork+0xdc/0x4c0
    ppc_clone+0x8/0xc
  Instruction dump:
  409e000c b0860000 38c60002 409d000c 90860000 38c60004 78a0d183 78a506a0
  7c0903a6 41820034 60000000 60420000 <f8860000f8860008 f8860010 f8860018

Fixes: fc5c2f4a55a2 ("powerpc/mm/hash64: Zero PGD pages on allocation")
Cc: stable@vger.kernel.org # v4.16+
Signed-off-by: Rick Lindsley <ricklind@vnet.linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: Fix hugetlb page initialization
Sachin Sant [Mon, 6 May 2019 12:03:33 +0000 (17:33 +0530)]
powerpc/mm: Fix hugetlb page initialization

This patch fixes a regression by using correct kernel config variable
for HUGETLB_PAGE_SIZE_VARIABLE.

Without this huge pages are disabled during kernel boot.
[0.309496] hugetlbfs: disabling because there are no supported hugepage sizes

Fixes: c5710cd20735 ("powerpc/mm: cleanup HPAGE_SHIFT setup")
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>
Reviewed-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Fix return value check in afu_ioctl()
Wei Yongjun [Sat, 4 May 2019 07:04:30 +0000 (07:04 +0000)]
ocxl: Fix return value check in afu_ioctl()

In case of error, the function eventfd_ctx_fdget() returns ERR_PTR() and
never returns NULL. The NULL test in the return value check should be
replaced with IS_ERR().

This issue was detected by using the Coccinelle software.

Fixes: 060146614643 ("ocxl: move event_fd handling to frontend")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: fix section mismatch for setup_kup()
Christophe Leroy [Mon, 6 May 2019 08:10:43 +0000 (08:10 +0000)]
powerpc/mm: fix section mismatch for setup_kup()

commit b28c97505eb1 ("powerpc/64: Setup KUP on secondary CPUs")
moved setup_kup() out of the __init section. As stated in that commit,
"this is only for 64-bit". But this function is also used on PPC32,
where the two functions called by setup_kup() are in the __init
section, so setup_kup() has to either be kept in the __init
section on PPC32 or marked __ref.

This patch marks it __ref, it fixes the below build warnings.

  MODPOST vmlinux.o
WARNING: vmlinux.o(.text+0x169ec): Section mismatch in reference from the function setup_kup() to the function .init.text:setup_kuep()
The function setup_kup() references
the function __init setup_kuep().
This is often because setup_kup lacks a __init
annotation or the annotation of setup_kuep is wrong.

WARNING: vmlinux.o(.text+0x16a04): Section mismatch in reference from the function setup_kup() to the function .init.text:setup_kuap()
The function setup_kup() references
the function __init setup_kuap().
This is often because setup_kup lacks a __init
annotation or the annotation of setup_kuap is wrong.

Fixes: b28c97505eb1 ("powerpc/64: Setup KUP on secondary CPUs")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: fix redundant inclusion of pgtable-frag.o in Makefile
Christophe Leroy [Mon, 6 May 2019 06:47:55 +0000 (06:47 +0000)]
powerpc/mm: fix redundant inclusion of pgtable-frag.o in Makefile

The patch identified below added pgtable-frag.o to obj-y
but some merge witchery kept it also for obj-CONFIG_PPC_BOOK3S_64

This patch clears the duplication.

Fixes: 737b434d3d55 ("powerpc/mm: convert Book3E 64 to pte_fragment")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: Fix makefile for KASAN
Christophe Leroy [Mon, 6 May 2019 06:21:01 +0000 (06:21 +0000)]
powerpc/mm: Fix makefile for KASAN

In commit 17312f258cf6 ("powerpc/mm: Move book3s32 specifics in
subdirectory mm/book3s64"), ppc_mmu_32.c was moved and renamed.

This patch fixes Makefiles to disable KASAN instrumentation on
the new name and location.

Fixes: f072015c7b74 ("powerpc: disable KASAN instrumentation on early/critical files.")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/kasan: add missing/lost Makefile
Christophe Leroy [Mon, 6 May 2019 06:21:00 +0000 (06:21 +0000)]
powerpc/kasan: add missing/lost Makefile

For unknown reason (aka. mpe is a doofus), the new Makefile added via
the KASAN support patch didn't land into arch/powerpc/mm/kasan/

This patch restores it.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoselftests/powerpc: Add a signal fuzzer selftest
Breno Leitao [Thu, 17 Jan 2019 17:01:54 +0000 (15:01 -0200)]
selftests/powerpc: Add a signal fuzzer selftest

This is a new selftest that raises SIGUSR1 signals and handles it in a
set of different ways, trying to create different scenario for testing
purpose.

This test works raising a signal and calling sigreturn interleaved
with TM operations, as starting, suspending and terminating a
transaction. The test depends on random numbers, and, based on them,
it sets different TM states.

Other than that, the test fills out the user context struct that is
passed to the sigreturn system call with random data, in order to make
sure that the signal handler syscall can handle different and invalid
states properly.

This selftest has command line parameters to control what kind of
tests the user wants to run, as for example, if a transaction should
be started prior to signal being raised, or, after the signal being
raised and before the sigreturn. If no parameter is given, the default
is enabling all options.

This test does not check if the user context is being read and set
properly by the kernel. Its purpose, at this time, is basically
guaranteeing that the kernel does not crash on invalid scenarios.

Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/booke64: set RI in default MSR
Laurentiu Tudor [Mon, 15 Apr 2019 11:52:11 +0000 (14:52 +0300)]
powerpc/booke64: set RI in default MSR

Set RI in the default kernel's MSR so that the architected way of
detecting unrecoverable machine check interrupts has a chance to work.
This is inline with the MSR setup of the rest of booke powerpc
architectures configured here.

Signed-off-by: Laurentiu Tudor <laurentiu.tudor@nxp.com>
Cc: stable@vger.kernel.org
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Provide global MMIO accessors for external drivers
Alastair D'Silva [Wed, 27 Mar 2019 05:31:36 +0000 (16:31 +1100)]
ocxl: Provide global MMIO accessors for external drivers

External drivers that communicate via OpenCAPI will need to make
MMIO calls to interact with the devices.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: move event_fd handling to frontend
Alastair D'Silva [Wed, 27 Mar 2019 05:31:35 +0000 (16:31 +1100)]
ocxl: move event_fd handling to frontend

Event_fd is only used in the driver frontend, so it does not
need to exist in the backend code. Relocate it to the frontend
and provide an opaque mechanism for consumers instead.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: afu_irq only deals with IRQ IDs, not offsets
Alastair D'Silva [Wed, 27 Mar 2019 05:31:34 +0000 (16:31 +1100)]
ocxl: afu_irq only deals with IRQ IDs, not offsets

The use of offsets is required only in the frontend, so alter
the IRQ API to only work with IRQ IDs in the backend.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Allow external drivers to use OpenCAPI contexts
Alastair D'Silva [Wed, 27 Mar 2019 05:31:33 +0000 (16:31 +1100)]
ocxl: Allow external drivers to use OpenCAPI contexts

Most OpenCAPI operations require a valid context, so
exposing these functions to external drivers is necessary.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Create a clear delineation between ocxl backend & frontend
Alastair D'Silva [Wed, 27 Mar 2019 05:31:32 +0000 (16:31 +1100)]
ocxl: Create a clear delineation between ocxl backend & frontend

The OCXL driver contains both frontend code for interacting with userspace,
as well as backend code for interacting with the hardware.

This patch separates the backend code from the frontend so that it can be
used by other device drivers that communicate via OpenCAPI.

Relocate dev, cdev & sysfs files to the frontend code to allow external
drivers to maintain their own devices.

Reference counting on the device in the backend is replaced with kref
counting.

Move file & sysfs layer initialisation from core.c (backend) to
pci.c (frontend).

Create an ocxl_function oriented interface for initing devices &
enumerating AFUs.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Don't pass pci_dev around
Alastair D'Silva [Wed, 27 Mar 2019 05:31:31 +0000 (16:31 +1100)]
ocxl: Don't pass pci_dev around

This data is already available in a struct

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Split pci.c
Alastair D'Silva [Wed, 27 Mar 2019 05:31:30 +0000 (16:31 +1100)]
ocxl: Split pci.c

In preparation for making core code available for external drivers,
move the core code out of pci.c and into core.c

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Remove some unused exported symbols
Alastair D'Silva [Mon, 25 Mar 2019 05:34:55 +0000 (16:34 +1100)]
ocxl: Remove some unused exported symbols

Remove some unused exported symbols.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Remove superfluous 'extern' from headers
Alastair D'Silva [Mon, 25 Mar 2019 05:34:54 +0000 (16:34 +1100)]
ocxl: Remove superfluous 'extern' from headers

The 'extern' keyword adds no value here.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: read_pasid never returns an error, so make it void
Alastair D'Silva [Mon, 25 Mar 2019 05:34:53 +0000 (16:34 +1100)]
ocxl: read_pasid never returns an error, so make it void

No need for a return value in read_pasid as it only returns 0.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoocxl: Rename struct link to ocxl_link
Alastair D'Silva [Mon, 25 Mar 2019 05:34:52 +0000 (16:34 +1100)]
ocxl: Rename struct link to ocxl_link

The term 'link' is ambiguous (especially when the struct is used for a
list), so rename it for clarity.

Signed-off-by: Alastair D'Silva <alastair@d-silva.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Acked-by: Frederic Barrat <fbarrat@linux.ibm.com>
Acked-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Trace imc PMU functions
Anju T Sudhakar [Tue, 16 Apr 2019 09:48:31 +0000 (15:18 +0530)]
powerpc/perf: Trace imc PMU functions

Add PMU functions to support trace-imc.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Trace imc events detection and cpuhotplug
Anju T Sudhakar [Tue, 16 Apr 2019 09:48:30 +0000 (15:18 +0530)]
powerpc/perf: Trace imc events detection and cpuhotplug

Patch detects trace-imc events, does memory initilizations for each online
cpu, and registers cpuhotplug call-backs.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Add privileged access check for thread_imc
Madhavan Srinivasan [Tue, 16 Apr 2019 09:48:29 +0000 (15:18 +0530)]
powerpc/perf: Add privileged access check for thread_imc

Add code to restrict user access to thread_imc pmu since
some event report privilege level information.

Fixes: f74c89bd80fb3 ("powerpc/perf: Add thread IMC PMU support")
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Rearrange setting of ldbar for thread-imc
Anju T Sudhakar [Tue, 16 Apr 2019 09:48:28 +0000 (15:18 +0530)]
powerpc/perf: Rearrange setting of ldbar for thread-imc

LDBAR holds the memory address allocated for each cpu. For thread-imc
the mode bit (i.e bit 1) of LDBAR is set to accumulation.
Currently, ldbar is loaded with per cpu memory address and mode set to
accumulation at boot time.

To enable trace-imc, the mode bit of ldbar should be set to 'trace'. So to
accommodate trace-mode of IMC, reposition setting of ldbar for thread-imc
to thread_imc_event_add(). Also reset ldbar at thread_imc_event_del().

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/include: Add data structures and macros for IMC trace mode
Anju T Sudhakar [Tue, 16 Apr 2019 09:48:27 +0000 (15:18 +0530)]
powerpc/include: Add data structures and macros for IMC trace mode

Add the macros needed for IMC (In-Memory Collection Counters) trace-mode
and data structure to hold the trace-imc record data.
Also, add the new type "OPAL_IMC_COUNTERS_TRACE" in 'opal-api.h', since
there is a new switch case added in the opal-calls for IMC.

Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Fix loop exit condition in nest_imc_event_init
Anju T Sudhakar [Tue, 18 Dec 2018 06:20:41 +0000 (11:50 +0530)]
powerpc/perf: Fix loop exit condition in nest_imc_event_init

The data structure (i.e struct imc_mem_info) to hold the memory address
information for nest imc units is allocated based on the number of nodes
in the system.

nest_imc_event_init() traverse this struct array to calculate the memory
base address for the event-cpu. If we fail to find a match for the event
cpu's chip-id in imc_mem_info struct array, then the do-while loop will
iterate until we crash.

Fix this by changing the loop exit condition based on the number of
non zero vbase elements in the array, since the allocation is done for
nr_chips + 1.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 885dcd709ba91 ("powerpc/perf: Add nest IMC PMU support")
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Return accordingly on invalid chip-id in
Anju T Sudhakar [Tue, 27 Nov 2018 08:24:52 +0000 (13:54 +0530)]
powerpc/perf: Return accordingly on invalid chip-id in

Nest hardware counter memory resides in a per-chip reserve-memory.
During nest_imc_event_init(), chip-id of the event-cpu is considered to
calculate the base memory addresss for that cpu. Return, proper error
condition if the chip_id calculated is invalid.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 885dcd709ba91 ("powerpc/perf: Add nest IMC PMU support")
Reviewed-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Anju T Sudhakar <anju@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Remove PM_BR_CMPL_ALT from power9 event list
Madhavan Srinivasan [Mon, 1 Apr 2019 06:20:39 +0000 (11:50 +0530)]
powerpc/perf: Remove PM_BR_CMPL_ALT from power9 event list

PM_BR_CMPL_ALT event is not supported, remove it from the power9 event
list.

Fixes: 24bedcb7c811 ("powerpc/perf: Fix branch event code for power9")
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: Add generic compat mode pmu driver
Madhavan Srinivasan [Thu, 4 Apr 2019 11:54:50 +0000 (17:24 +0530)]
powerpc/perf: Add generic compat mode pmu driver

Most of the power processor generation performance monitoring
unit (PMU) driver code is bundled in the kernel and one of those
is enabled/registered based on the oprofile_cpu_type check at
the boot.

But things get little tricky incase of "compat" mode boot.
IBM POWER System Server based processors has a compactibility
mode feature, which simpily put is, Nth generation processor
(lets say POWER8) will act and appear in a mode consistent
with an earlier generation (N-1) processor (that is POWER7).
And in this "compat" mode boot, kernel modify the
"oprofile_cpu_type" to be Nth generation (POWER8). If Nth
generation pmu driver is bundled (POWER8), it gets registered.

Key dependency here is to have distro support for latest
processor performance monitoring support. Patch here adds
a generic "compat-mode" performance monitoring driver to
be register in absence of powernv platform specific pmu driver.

Driver supports only "cycles" and "instruction" events.
"0x0001e" used as event code for "cycles" and "0x00002"
used as event code for "instruction" events. New file
called "generic-compat-pmu.c" is created to contain the driver
specific code. And base raw event code format modeled
on PPMU_ARCH_207S.

Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Use SPDX tag for license]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/perf: init pmu from core-book3s
Madhavan Srinivasan [Thu, 4 Apr 2019 11:54:49 +0000 (17:24 +0530)]
powerpc/perf: init pmu from core-book3s

Currenty pmu driver file for each ppc64 generation processor
has a __init call in itself. Refactor the code by moving the
__init call to core-books.c. This also clean's up compat mode
pmu driver registration.

Suggested-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
[mpe: Use SPDX tag for license]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/powernv/ioda2: Add __printf format/argument verification
Joe Perches [Thu, 30 Mar 2017 10:19:25 +0000 (03:19 -0700)]
powerpc/powernv/ioda2: Add __printf format/argument verification

Fix fallout too.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoDocumentation: powerpc: Expand the DAWR acronym
Joel Stanley [Mon, 1 Apr 2019 06:11:56 +0000 (16:41 +1030)]
Documentation: powerpc: Expand the DAWR acronym

Those not of us not drowning in POWER might not know what this means.

Signed-off-by: Joel Stanley <joel@jms.id.au>
Acked-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/configs: Add (back) MLX5 ethernet support to skiroot_defconfig
Joel Stanley [Wed, 3 Apr 2019 00:49:26 +0000 (11:19 +1030)]
powerpc/configs: Add (back) MLX5 ethernet support to skiroot_defconfig

It turns out that some defconfig changes and kernel config option
changes meant we accidentally dropped Ethernet support for Mellanox
CLX5 cards.

Fixes: cbc39809a398 ("powerpc/configs: Update skiroot defconfig")
Reported-by: Carol L Soto <clsoto@us.ibm.com>
Suggested-by: Carol L Soto <clsoto@us.ibm.com>
Signed-off-by: Stewart Smith <stewart@linux.ibm.com>
Signed-off-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/hmi: Fix kernel hang when TB is in error state.
Mahesh Salgaonkar [Mon, 4 Mar 2019 19:42:19 +0000 (01:12 +0530)]
powerpc/hmi: Fix kernel hang when TB is in error state.

On TOD/TB errors timebase register stops/freezes until HMI error recovery
gets TOD/TB back into running state. On successful recovery, TB starts
running again and udelay() that relies on TB value continues to function
properly. But in case when HMI fails to recover from TOD/TB errors, the
TB register stay freezed. With TB not running the __delay() function
keeps looping and never return. If __delay() is called while in panic
path then system hangs and never reboots after panic.

Signed-off-by: Mahesh Salgaonkar <mahesh@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/xmon: add read-only mode
Christopher M. Riedl [Tue, 16 Apr 2019 03:26:38 +0000 (22:26 -0500)]
powerpc/xmon: add read-only mode

Operations which write to memory and special purpose registers should be
restricted on systems with integrity guarantees (such as Secure Boot)
and, optionally, to avoid self-destructive behaviors.

Add a config option, XMON_DEFAULT_RO_MODE, to set default xmon behavior.
The kernel cmdline options xmon=ro and xmon=rw override this default.

The following xmon operations are affected:
memops:
disable memmove
disable memset
disable memzcan
memex:
no-op'd mwrite
super_regs:
no-op'd write_spr
bpt_cmds:
disable
proc_call:
disable

Signed-off-by: Christopher M. Riedl <cmr@informatik.wtf>
Reviewed-by: Oliver O'Halloran <oohall@gmail.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/boot: Fix missing check of lseek() return value
Bo YU [Tue, 30 Oct 2018 13:21:55 +0000 (09:21 -0400)]
powerpc/boot: Fix missing check of lseek() return value

This is detected by Coverity scan: CID: 1440481

Signed-off-by: Bo YU <tsu.yubo@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/entry: Remove unneeded need_resched() loop
Valentin Schneider [Mon, 11 Mar 2019 22:47:46 +0000 (22:47 +0000)]
powerpc/entry: Remove unneeded need_resched() loop

Since the enabling and disabling of IRQs within preempt_schedule_irq()
is contained in a need_resched() loop, we don't need the outer arch
code loop.

Signed-off-by: Valentin Schneider <valentin.schneider@arm.com>
[mpe: Rebase since CURRENT_THREAD_INFO() removal]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/dts/fsl: add crypto node alias for B4
Horia Geantă [Wed, 20 Mar 2019 12:55:16 +0000 (14:55 +0200)]
powerpc/dts/fsl: add crypto node alias for B4

crypto node alias is needed by U-boot to identify the node and
perform fix-ups, like adding "fsl,sec-era" property.

Signed-off-by: Horia Geantă <horia.geanta@nxp.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/prom_init: get rid of PROM_SCRATCH_SIZE
Christophe Leroy [Tue, 2 Apr 2019 09:08:38 +0000 (09:08 +0000)]
powerpc/prom_init: get rid of PROM_SCRATCH_SIZE

PROM_SCRATCH_SIZE is same as sizeof(prom_scratch)

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/security: Show powerpc_security_features in debugfs
Michael Ellerman [Tue, 9 Apr 2019 13:14:20 +0000 (23:14 +1000)]
powerpc/security: Show powerpc_security_features in debugfs

This can be helpful for debugging problems with the security feature
flags, especially on guests where the flags come from the hypervisor
via an hcall and so can't be observed in the device tree.

Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: Warn if W+X pages found on boot
Russell Currey [Thu, 2 May 2019 07:39:47 +0000 (17:39 +1000)]
powerpc/mm: Warn if W+X pages found on boot

Implement code to walk all pages and warn if any are found to be both
writable and executable.  Depends on STRICT_KERNEL_RWX enabled, and is
behind the DEBUG_WX config option.

This only runs on boot and has no runtime performance implications.

Very heavily influenced (and in some cases copied verbatim) from the
ARM64 code written by Laura Abbott (thanks!), since our ptdump
infrastructure is similar.

Signed-off-by: Russell Currey <ruscur@russell.cc>
[mpe: Fixup build error when disabled]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm/ptdump: Wrap seq_printf() to handle NULL pointers
Russell Currey [Thu, 2 May 2019 07:39:46 +0000 (17:39 +1000)]
powerpc/mm/ptdump: Wrap seq_printf() to handle NULL pointers

Lovingly borrowed from the arch/arm64 ptdump code.

This doesn't seem to be an issue in practice, but is necessary for my
upcoming commit.

Signed-off-by: Russell Currey <ruscur@russell.cc>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: remove the __kernel_io_end export
Christoph Hellwig [Tue, 30 Apr 2019 18:27:39 +0000 (14:27 -0400)]
powerpc: remove the __kernel_io_end export

This export was added in this merge window, but without any actual
user, or justification for a modular user.

Fixes: a35a3c6f6065 ("powerpc/mm/hash64: Add a variable to track the end of IO mapping")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agoMAINTAINERS: Update cxl/ocxl email address
Andrew Donnellan [Thu, 2 May 2019 06:00:41 +0000 (16:00 +1000)]
MAINTAINERS: Update cxl/ocxl email address

Use my @linux.ibm.com email to avoid a layer of redirection.

Signed-off-by: Andrew Donnellan <ajd@linux.ibm.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/64: Don't trace code that runs with the soft irq mask unreconciled
Nicholas Piggin [Thu, 2 May 2019 05:21:07 +0000 (15:21 +1000)]
powerpc/64: Don't trace code that runs with the soft irq mask unreconciled

"Reconciling" in terms of interrupt handling, is to bring the soft irq
mask state in to synch with the hardware, after an interrupt causes
MSR[EE] to be cleared (while the soft mask may be enabled, and hard
irqs not marked disabled).

General kernel code should not be called while unreconciled, because
local_irq_disable, etc. manipulations can cause surprising irq traces,
and it's fragile because the soft irq code does not really expect to
be called in this situation.

When exiting from an interrupt, MSR[EE] is cleared to prevent races,
but soft irq state is enabled for the returned-to context, so this is
now an unreconciled state. restore_math is called in this state, and
that can be ftraced, and the ftrace subsystem disables local irqs.

Mark restore_math and its callees as notrace. Restore a sanity check
in the soft irq code that had to be disabled for this case, by commit
4da1f79227ad4 ("powerpc/64: Disable irq restore warning for now").

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/irq: drop __irq_offset_value
Christophe Leroy [Sat, 9 Mar 2019 17:47:27 +0000 (18:47 +0100)]
powerpc/irq: drop __irq_offset_value

This patch drops__irq_offset_value which has not been used since
commit 9c4cb8251513 ("powerpc: Remove use of CONFIG_PPC_MERGE")

This removes a sparse warning.

Fixes: 9c4cb8251513 ("powerpc: Remove use of CONFIG_PPC_MERGE")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/setup: replace ifdefs by IS_ENABLED() wherever possible.
Christophe Leroy [Fri, 22 Mar 2019 08:08:45 +0000 (08:08 +0000)]
powerpc/setup: replace ifdefs by IS_ENABLED() wherever possible.

Compared to ifdefs, IS_ENABLED() provide a cleaner code and allows
to detect compilation failure regardless of the selected options.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/setup: cleanup the #ifdef CONFIG_TAU block
Christophe Leroy [Fri, 22 Mar 2019 08:08:44 +0000 (08:08 +0000)]
powerpc/setup: cleanup the #ifdef CONFIG_TAU block

Use cpu_has_feature() instead of opencoding

Use IS_ENABLED() instead of #ifdef for CONFIG_TAU_AVERAGE

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/setup: cleanup ifdef mess in check_cache_coherency()
Christophe Leroy [Fri, 22 Mar 2019 08:08:43 +0000 (08:08 +0000)]
powerpc/setup: cleanup ifdef mess in check_cache_coherency()

Use IS_ENABLED() instead of #ifdefs

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/setup: Remove unnecessary #ifdef CONFIG_ALTIVEC
Christophe Leroy [Fri, 22 Mar 2019 08:08:42 +0000 (08:08 +0000)]
powerpc/setup: Remove unnecessary #ifdef CONFIG_ALTIVEC

CPU_FTR_ALTIVEC is only set when CONFIG_ALTIVEC is selected, so
the ifdef is unnecessary.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: define an empty mm_iommu_init()
Christophe Leroy [Fri, 22 Mar 2019 08:08:40 +0000 (08:08 +0000)]
powerpc/mm: define an empty mm_iommu_init()

To avoid ifdefs, define a empty static inline mm_iommu_init() function
when CONFIG_SPAPR_TCE_IOMMU is not selected.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/fadump: define an empty fadump_cleanup()
Christophe Leroy [Fri, 22 Mar 2019 08:08:39 +0000 (08:08 +0000)]
powerpc/fadump: define an empty fadump_cleanup()

To avoid #ifdefs, define an static inline fadump_cleanup() function
when CONFIG_FADUMP is not selected

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: Don't add dummy frames when calling trace_hardirqs_on/off
Christophe Leroy [Tue, 30 Apr 2019 12:39:05 +0000 (12:39 +0000)]
powerpc/32: Don't add dummy frames when calling trace_hardirqs_on/off

No need to add dummy frames when calling trace_hardirqs_on or
trace_hardirqs_off. GCC properly handles empty stacks.

In addition, powerpc doesn't set CONFIG_FRAME_POINTER, therefore
__builtin_return_address(1..) returns NULL at all time. So the
dummy frames are definitely unneeded here.

In the meantime, avoid reading memory for loading r1 with a value
we already know.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: don't do syscall stuff in transfer_to_handler
Christophe Leroy [Tue, 30 Apr 2019 12:39:04 +0000 (12:39 +0000)]
powerpc/32: don't do syscall stuff in transfer_to_handler

As syscalls are now handled via a fast entry path, syscall related
actions can be removed from the generic transfer_to_handler path.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: implement fast entry for syscalls on BOOKE
Christophe Leroy [Tue, 30 Apr 2019 12:39:03 +0000 (12:39 +0000)]
powerpc/32: implement fast entry for syscalls on BOOKE

This patch implements a fast entry for syscalls.

Syscalls don't have to preserve non volatile registers except LR.

This patch then implement a fast entry for syscalls, where
volatile registers get clobbered.

As this entry is dedicated to syscall it always sets MSR_EE
and warns in case MSR_EE was previously off

It also assumes that the call is always from user, system calls are
unexpected from kernel.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: implement fast entry for syscalls on non BOOKE
Christophe Leroy [Tue, 30 Apr 2019 12:39:02 +0000 (12:39 +0000)]
powerpc/32: implement fast entry for syscalls on non BOOKE

This patch implements a fast entry for syscalls.

Syscalls don't have to preserve non volatile registers except LR.

This patch then implement a fast entry for syscalls, where
volatile registers get clobbered.

As this entry is dedicated to syscall it always sets MSR_EE
and warns in case MSR_EE was previously off

It also assumes that the call is always from user, system calls are
unexpected from kernel.

The overall series improves null_syscall selftest by 12,5% on an 83xx
and by 17% on a 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: Fix 32-bit handling of MSR_EE on exceptions
Christophe Leroy [Tue, 30 Apr 2019 12:39:01 +0000 (12:39 +0000)]
powerpc: Fix 32-bit handling of MSR_EE on exceptions

[text mostly copied from benh's RFC/WIP]

ppc32 are still doing something rather gothic and wrong on 32-bit
which we stopped doing on 64-bit a while ago.

We have that thing where some handlers "copy" the EE value from the
original stack frame into the new MSR before transferring to the
handler.

Thus for a number of exceptions, we enter the handlers with interrupts
enabled.

This is rather fishy, some of the stuff that handlers might do early
on such as irq_enter/exit or user_exit, context tracking, etc...
should be run with interrupts off afaik.

Generally our handlers know when to re-enable interrupts if needed.

The problem we were having is that we assumed these interrupts would
return with interrupts enabled. However that isn't the case.

Instead, this patch changes things so that we always enter exception
handlers with interrupts *off* with the notable exception of syscalls
which are special (and get a fast path).

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: get rid of COPY_EE in exception entry
Christophe Leroy [Tue, 30 Apr 2019 12:39:00 +0000 (12:39 +0000)]
powerpc/32: get rid of COPY_EE in exception entry

EXC_XFER_TEMPLATE() is not called with COPY_EE anymore so
we can get rid of copyee parameters and related COPY_EE and NOCOPY
macros.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[splited out from benh RFC patch]

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: Enter exceptions with MSR_EE unset
Christophe Leroy [Tue, 30 Apr 2019 12:38:59 +0000 (12:38 +0000)]
powerpc/32: Enter exceptions with MSR_EE unset

All exceptions handlers know when to reenable interrupts, so
it is safer to enter all of them with MSR_EE unset, except
for syscalls.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[splited out from benh RFC patch]

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: enter syscall with MSR_EE inconditionaly set
Christophe Leroy [Tue, 30 Apr 2019 12:38:58 +0000 (12:38 +0000)]
powerpc/32: enter syscall with MSR_EE inconditionaly set

syscalls are expected to be entered with MSR_EE set. Lets
make it inconditional by forcing MSR_EE on syscalls.

This patch adds EXC_XFER_SYS for that.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[splited out from benh RFC patch]

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/fsl_booke: ensure SPEFloatingPointException() reenables interrupts
Christophe Leroy [Tue, 30 Apr 2019 12:38:57 +0000 (12:38 +0000)]
powerpc/fsl_booke: ensure SPEFloatingPointException() reenables interrupts

SPEFloatingPointException() is the only exception handler which 'forgets' to
re-enable interrupts. This patch makes sure it does.

Suggested-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/40x: Refactor exception entry macros by using head_32.h
Christophe Leroy [Tue, 30 Apr 2019 12:38:56 +0000 (12:38 +0000)]
powerpc/40x: Refactor exception entry macros by using head_32.h

Refactor exception entry macros by using the ones defined in head_32.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/40x: Split and rename NORMAL_EXCEPTION_PROLOG
Christophe Leroy [Tue, 30 Apr 2019 12:38:55 +0000 (12:38 +0000)]
powerpc/40x: Split and rename NORMAL_EXCEPTION_PROLOG

This patch splits NORMAL_EXCEPTION_PROLOG in the same way as in
head_8xx.S and head_32.S and renames it EXCEPTION_PROLOG() as well
to match head_32.h

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/40x: add exception frame marker
Christophe Leroy [Tue, 30 Apr 2019 12:38:54 +0000 (12:38 +0000)]
powerpc/40x: add exception frame marker

This patch adds STACK_FRAME_REGS_MARKER in the stack at exception entry
in order to see interrupts in call traces as below:

[    0.013964] Call Trace:
[    0.014014] [c0745db0] [c007a9d4] tick_periodic.constprop.5+0xd8/0x104 (unreliable)
[    0.014086] [c0745dc0] [c007aa20] tick_handle_periodic+0x20/0x9c
[    0.014181] [c0745de0] [c0009cd0] timer_interrupt+0xa0/0x264
[    0.014258] [c0745e10] [c000e484] ret_from_except+0x0/0x14
[    0.014390] --- interrupt: 901 at console_unlock.part.7+0x3f4/0x528
[    0.014390]     LR = console_unlock.part.7+0x3f0/0x528
[    0.014455] [c0745ee0] [c0050334] console_unlock.part.7+0x114/0x528 (unreliable)
[    0.014542] [c0745f30] [c00524e0] register_console+0x3d8/0x44c
[    0.014625] [c0745f60] [c0675aac] cpm_uart_console_init+0x18/0x2c
[    0.014709] [c0745f70] [c06614f4] console_init+0x114/0x1cc
[    0.014795] [c0745fb0] [c0658b68] start_kernel+0x300/0x3d8
[    0.014864] [c0745ff0] [c00022cc] start_here+0x44/0x98

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/40x: Don't use SPRN_SPRG_SCRATCH2 in EXCEPTION_PROLOG
Christophe Leroy [Tue, 30 Apr 2019 12:38:53 +0000 (12:38 +0000)]
powerpc/40x: Don't use SPRN_SPRG_SCRATCH2 in EXCEPTION_PROLOG

Unlike said in the comment, r1 is not reused by the critical
exception handler, as it uses a dedicated critirq_ctx stack.
Decrementing r1 early is then unneeded.

Should the above be valid, the code is crap buggy anyway as
r1 gets some intermediate values that would jeopardise the
whole process (for instance after mfspr   r1,SPRN_SPRG_THREAD)

Using SPRN_SPRG_SCRATCH2 to save r1 is then not needed, r11 can be
used instead. This avoids one mtspr and one mfspr and makes the
prolog closer to what's done on 6xx and 8xx.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: make the 6xx/8xx EXC_XFER_TEMPLATE() similar to the 40x/booke one
Christophe Leroy [Tue, 30 Apr 2019 12:38:52 +0000 (12:38 +0000)]
powerpc/32: make the 6xx/8xx EXC_XFER_TEMPLATE() similar to the 40x/booke one

6xx/8xx EXC_XFER_TEMPLATE() macro adds a i##n symbol which is
unused and can be removed.
40x and booke EXC_XFER_TEMPLATE() macros takes msr from the caller
while the 6xx/8xx version uses only MSR_KERNEL as msr value.

This patch modifies the 6xx/8xx version to make it similar to the
40x and booke versions.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: move LOAD_MSR_KERNEL() into head_32.h and use it
Christophe Leroy [Tue, 30 Apr 2019 12:38:51 +0000 (12:38 +0000)]
powerpc/32: move LOAD_MSR_KERNEL() into head_32.h and use it

As preparation for using head_32.h for head_40x.S, move
LOAD_MSR_KERNEL() there and use it to load r10 with MSR_KERNEL value.

In the mean time, this patch modifies it so that it takes into account
the size of the passed value to determine if 'li' can be used or if
'lis/ori' is needed instead of using the size of MSR_KERNEL. This is
done by using gas macro.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: Refactor EXCEPTION entry macros for head_8xx.S and head_32.S
Christophe Leroy [Tue, 30 Apr 2019 12:38:50 +0000 (12:38 +0000)]
powerpc/32: Refactor EXCEPTION entry macros for head_8xx.S and head_32.S

EXCEPTION_PROLOG is similar in head_8xx.S and head_32.S

This patch creates head_32.h and moves EXCEPTION_PROLOG macro
into it. It also converts it from a GCC macro to a GAS macro
in order to ease refactorisation with 40x later, since
GAS macros allows the use of #ifdef/#else/#endif inside it.
And it also has the advantage of not requiring the uggly "; \"
at the end of each line.

This patch also moves EXCEPTION() and EXC_XFER_XXXX() macros which
are also similar while adding START_EXCEPTION() out of EXCEPTION().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: print hash info in a helper
Christophe Leroy [Fri, 26 Apr 2019 16:36:39 +0000 (16:36 +0000)]
powerpc/mm: print hash info in a helper

Reduce #ifdef mess by defining a helper to print
hash info at startup.

In the meantime, remove the display of hash table address
to reduce leak of non necessary information.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32s: don't try to print hash table address.
Christophe Leroy [Fri, 26 Apr 2019 16:36:37 +0000 (16:36 +0000)]
powerpc/32s: don't try to print hash table address.

Due to %p, (ptrval) is printed in lieu of the hash table address.

showing the hash table address isn't an operationnal need so just
don't print it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32s: drop Hash_end
Christophe Leroy [Fri, 26 Apr 2019 16:36:36 +0000 (16:36 +0000)]
powerpc/32s: drop Hash_end

Hash_end has never been used, drop it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32s: map kasan zero shadow with PAGE_READONLY instead of PAGE_KERNEL_RO
Christophe Leroy [Fri, 26 Apr 2019 16:23:37 +0000 (16:23 +0000)]
powerpc/32s: map kasan zero shadow with PAGE_READONLY instead of PAGE_KERNEL_RO

For hash32, the zero shadow page gets mapped with PAGE_READONLY instead
of PAGE_KERNEL_RO, because the PP bits don't provide a RO kernel, so
PAGE_KERNEL_RO is equivalent to PAGE_KERNEL. By using PAGE_READONLY,
the page is RO for both kernel and user, but this is not a security issue
as it contains only zeroes.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32s: set up an early static hash table for KASAN.
Christophe Leroy [Fri, 26 Apr 2019 16:23:36 +0000 (16:23 +0000)]
powerpc/32s: set up an early static hash table for KASAN.

KASAN requires early activation of hash table, before memblock()
functions are available.

This patch implements an early hash_table statically defined in
__initdata.

During early boot, a single page table is used.

For hash32, when doing the final init, one page table is allocated
for each PGD entry because of the _PAGE_HASHPTE flag which can't be
common to several virt pages. This is done after memblock get
available but before switching to the final hash table, otherwise
there are issues with TLB flushing due to the shared entries.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32s: move hash code patching out of MMU_init_hw()
Christophe Leroy [Fri, 26 Apr 2019 16:23:35 +0000 (16:23 +0000)]
powerpc/32s: move hash code patching out of MMU_init_hw()

For KASAN, hash table handling will be activated early for
accessing to KASAN shadow areas.

In order to avoid any modification of the hash functions while
they are still used with the early hash table, the code patching
is moved out of MMU_init_hw() and put close to the big-bang switch
to the final hash table.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: Add KASAN support
Christophe Leroy [Fri, 26 Apr 2019 16:23:34 +0000 (16:23 +0000)]
powerpc/32: Add KASAN support

This patch adds KASAN support for PPC32. The following patch
will add an early activation of hash table for book3s. Until
then, a warning will be raised if trying to use KASAN on an
hash 6xx.

To support KASAN, this patch initialises that MMU mapings for
accessing to the KASAN shadow area defined in a previous patch.

An early mapping is set as soon as the kernel code has been
relocated at its definitive place.

Then the definitive mapping is set once paging is initialised.

For modules, the shadow area is allocated at module_alloc().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: disable KASAN instrumentation on early/critical files.
Christophe Leroy [Fri, 26 Apr 2019 16:23:33 +0000 (16:23 +0000)]
powerpc: disable KASAN instrumentation on early/critical files.

All files containing functions run before kasan_early_init() is called
must have KASAN instrumentation disabled.

For those file, branch profiling also have to be disabled otherwise
each if () generates a call to ftrace_likely_update().

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: prepare shadow area for KASAN
Christophe Leroy [Fri, 26 Apr 2019 16:23:32 +0000 (16:23 +0000)]
powerpc/32: prepare shadow area for KASAN

This patch prepares a shadow area for KASAN.

The shadow area will be at the top of the kernel virtual
memory space above the fixmap area and will occupy one
eighth of the total kernel virtual memory space.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: make KVIRT_TOP dependent on FIXMAP_START
Christophe Leroy [Fri, 26 Apr 2019 16:23:31 +0000 (16:23 +0000)]
powerpc/32: make KVIRT_TOP dependent on FIXMAP_START

When we add KASAN shadow area, KVIRT_TOP can't be anymore fixed
at 0xfe000000.

This patch uses FIXADDR_START to define KVIRT_TOP.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: use memset() instead of memset_io() to zero BSS
Christophe Leroy [Fri, 26 Apr 2019 16:23:30 +0000 (16:23 +0000)]
powerpc/32: use memset() instead of memset_io() to zero BSS

Since commit 400c47d81ca38 ("powerpc32: memset: only use dcbz once cache is
enabled"), memset() can be used before activation of the cache,
so no need to use memset_io() for zeroing the BSS.

Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: don't use direct assignation during early boot.
Christophe Leroy [Fri, 26 Apr 2019 16:23:29 +0000 (16:23 +0000)]
powerpc: don't use direct assignation during early boot.

In kernel/cputable.c, explicitly use memcpy() instead of *y = *x;
This will allow GCC to replace it with __memcpy() when KASAN is
selected.

Acked-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/prom_init: don't use string functions from lib/
Christophe Leroy [Fri, 26 Apr 2019 16:23:28 +0000 (16:23 +0000)]
powerpc/prom_init: don't use string functions from lib/

When KASAN is active, the string functions in lib/ are doing the
KASAN checks. This is too early for prom_init.

This patch implements dedicated string functions for prom_init,
which will be compiled in with KASAN disabled.

Size of prom_init before the patch:
   text    data     bss     dec     hex filename
  12060     488    6960   19508    4c34 arch/powerpc/kernel/prom_init.o

Size of prom_init after the patch:
   text    data     bss     dec     hex filename
  12460     488    6960   19908    4dc4 arch/powerpc/kernel/prom_init.o

This increases the size of prom_init a bit, but as prom_init is
in __init section, it is freed after boot anyway.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: remove CONFIG_CMDLINE #ifdef mess
Christophe Leroy [Fri, 26 Apr 2019 16:23:27 +0000 (16:23 +0000)]
powerpc: remove CONFIG_CMDLINE #ifdef mess

This patch makes CONFIG_CMDLINE defined at all time. It avoids
having to enclose related code inside #ifdef CONFIG_CMDLINE

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc: prepare string/mem functions for KASAN
Christophe Leroy [Fri, 26 Apr 2019 16:23:26 +0000 (16:23 +0000)]
powerpc: prepare string/mem functions for KASAN

CONFIG_KASAN implements wrappers for memcpy() memmove() and memset()
Those wrappers are doing the verification then call respectively
__memcpy() __memmove() and __memset(). The arches are therefore
expected to rename their optimised functions that way.

For files on which KASAN is inhibited, #defines are used to allow
them to directly call optimised versions of the functions without
going through the KASAN wrappers.

See commit 393f203f5fd5 ("x86_64: kasan: add interceptors for
memset/memmove/memcpy functions") for details.

Other string / mem functions do not (yet) have kasan wrappers,
we therefore have to fallback to the generic versions when
KASAN is active, otherwise KASAN checks will be skipped.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
[mpe: Fixups to keep selftests working]
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/32: Move early_init() in a separate file
Christophe Leroy [Fri, 26 Apr 2019 16:23:25 +0000 (16:23 +0000)]
powerpc/32: Move early_init() in a separate file

In preparation of KASAN, move early_init() into a separate
file in order to allow deactivation of KASAN for that function.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: refactor pgd_alloc() and pgd_free() on nohash
Christophe Leroy [Fri, 26 Apr 2019 15:58:13 +0000 (15:58 +0000)]
powerpc/mm: refactor pgd_alloc() and pgd_free() on nohash

pgd_alloc() and pgd_free() are identical on nohash 32 and 64.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: refactor pmd_pgtable()
Christophe Leroy [Fri, 26 Apr 2019 15:58:12 +0000 (15:58 +0000)]
powerpc/mm: refactor pmd_pgtable()

pmd_pgtable() is identical on the 4 subarches, refactor it.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: refactor pgtable freeing functions on nohash
Christophe Leroy [Fri, 26 Apr 2019 15:58:11 +0000 (15:58 +0000)]
powerpc/mm: refactor pgtable freeing functions on nohash

pgtable_free() and others are identical on nohash/32 and 64,
so move them into asm/nohash/pgalloc.h

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: Only keep one version of pmd_populate() functions on nohash/32
Christophe Leroy [Fri, 26 Apr 2019 15:58:10 +0000 (15:58 +0000)]
powerpc/mm: Only keep one version of pmd_populate() functions on nohash/32

Use IS_ENABLED(CONFIG_BOOKE) to make single versions of
pmd_populate() and pmd_populate_kernel()

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: refactor definition of pgtable_cache[]
Christophe Leroy [Fri, 26 Apr 2019 15:58:09 +0000 (15:58 +0000)]
powerpc/mm: refactor definition of pgtable_cache[]

pgtable_cache[] is the same for the 4 subarches, lets make it common.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: refactor pte_alloc_one() and pte_free() families definition.
Christophe Leroy [Fri, 26 Apr 2019 15:58:08 +0000 (15:58 +0000)]
powerpc/mm: refactor pte_alloc_one() and pte_free() families definition.

Functions pte_alloc_one(), pte_alloc_one_kernel(), pte_free(),
pte_free_kernel() are identical for the four subarches.

This patch moves their definition in a common place.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: inline pte_alloc_one_kernel() and pte_alloc_one() on PPC32
Christophe Leroy [Fri, 26 Apr 2019 15:58:07 +0000 (15:58 +0000)]
powerpc/mm: inline pte_alloc_one_kernel() and pte_alloc_one() on PPC32

pte_alloc_one_kernel() and pte_alloc_one() are simple calls to
pte_fragment_alloc(), so they are good candidates for inlining as
already done on PPC64.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: don't use pte_alloc_kernel() until slab is available on PPC32
Christophe Leroy [Fri, 26 Apr 2019 15:58:06 +0000 (15:58 +0000)]
powerpc/mm: don't use pte_alloc_kernel() until slab is available on PPC32

In the same way as PPC64, implement early allocation functions and
avoid calling pte_alloc_kernel() before slab is available.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/book3e: move early_alloc_pgtable() to init section
Christophe Leroy [Fri, 26 Apr 2019 15:58:05 +0000 (15:58 +0000)]
powerpc/book3e: move early_alloc_pgtable() to init section

early_alloc_pgtable() is only used during init.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/Kconfig: select PPC_MM_SLICES from subarch type
Christophe Leroy [Fri, 26 Apr 2019 15:58:04 +0000 (15:58 +0000)]
powerpc/Kconfig: select PPC_MM_SLICES from subarch type

Lets select PPC_MM_SLICES from the subarch config item instead of
doing it via defaults declaration in the PPC_MM_SLICES item itself.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: get rid of nohash/32/mmu.h and nohash/64/mmu.h
Christophe Leroy [Fri, 26 Apr 2019 15:58:03 +0000 (15:58 +0000)]
powerpc/mm: get rid of nohash/32/mmu.h and nohash/64/mmu.h

Those files have no real added values, especially the 64 bit
which only includes the common book3e mmu.h which is also
included from 32 bits side.

So lets do the final inclusion directly from nohash/mmu.h

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: move pgtable_t in asm/mmu.h
Christophe Leroy [Fri, 26 Apr 2019 15:58:02 +0000 (15:58 +0000)]
powerpc/mm: move pgtable_t in asm/mmu.h

pgtable_t is now identical for all subarches, move it to the
top level asm/mmu.h

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: convert Book3E 64 to pte_fragment
Christophe Leroy [Fri, 26 Apr 2019 15:58:01 +0000 (15:58 +0000)]
powerpc/mm: convert Book3E 64 to pte_fragment

Book3E 64 is the only subarch not using pte_fragment. In order
to allow refactorisation, this patch converts it to pte_fragment.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: drop __bad_pte()
Christophe Leroy [Fri, 26 Apr 2019 15:57:59 +0000 (15:57 +0000)]
powerpc/mm: drop __bad_pte()

This has never been called (since Kernel has been in git at least),
drop it.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
5 years agopowerpc/mm: flatten function __find_linux_pte() step 3
Christophe Leroy [Fri, 26 Apr 2019 05:59:53 +0000 (05:59 +0000)]
powerpc/mm: flatten function __find_linux_pte() step 3

__find_linux_pte() is full of if/else which is hard to
follow allthough the handling is pretty simple.

Previous patches left a { } block. This patch removes it.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>