OSDN Git Service

uclinux-h8/linux.git
5 years agoKVM/VMX: Add hv tlb range flush support
Lan Tianyu [Thu, 6 Dec 2018 13:21:07 +0000 (21:21 +0800)]
KVM/VMX: Add hv tlb range flush support

This patch is to register tlb_remote_flush_with_range callback with
hv tlb range flush interface.

Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agox86/hyper-v: Add HvFlushGuestAddressList hypercall support
Lan Tianyu [Thu, 6 Dec 2018 13:21:05 +0000 (21:21 +0800)]
x86/hyper-v: Add HvFlushGuestAddressList hypercall support

Hyper-V provides HvFlushGuestAddressList() hypercall to flush EPT tlb
with specified ranges. This patch is to add the hypercall support.

Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
Lan Tianyu [Thu, 6 Dec 2018 13:21:04 +0000 (21:21 +0800)]
KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops

Add flush range call back in the kvm_x86_ops and platform can use it
to register its associated function. The parameter "kvm_tlb_range"
accepts a single range and flush list which contains a list of ranges.

Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Disable Intel PT when VMXON in L1 guest
Luwei Kang [Wed, 24 Oct 2018 08:05:16 +0000 (16:05 +0800)]
KVM: x86: Disable Intel PT when VMXON in L1 guest

Currently, Intel Processor Trace do not support tracing in L1 guest
VMX operation(IA32_VMX_MISC[bit 14] is 0). As mentioned in SDM,
on these type of processors, execution of the VMXON instruction will
clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL
causes a general-protection exception (#GP).

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Set intercept for Intel PT MSRs read/write
Chao Peng [Wed, 24 Oct 2018 08:05:15 +0000 (16:05 +0800)]
KVM: x86: Set intercept for Intel PT MSRs read/write

To save performance overhead, disable intercept Intel PT MSRs
read/write when Intel PT is enabled in guest.
MSR_IA32_RTIT_CTL is an exception that will always be intercepted.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Implement Intel PT MSRs read/write emulation
Chao Peng [Wed, 24 Oct 2018 08:05:14 +0000 (16:05 +0800)]
KVM: x86: Implement Intel PT MSRs read/write emulation

This patch implement Intel Processor Trace MSRs read/write
emulation.
Intel PT MSRs read/write need to be emulated when Intel PT
MSRs is intercepted in guest and during live migration.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Introduce a function to initialize the PT configuration
Luwei Kang [Wed, 24 Oct 2018 08:05:13 +0000 (16:05 +0800)]
KVM: x86: Introduce a function to initialize the PT configuration

Initialize the Intel PT configuration when cpuid update.
Include cpuid inforamtion, rtit_ctl bit mask and the number of
address ranges.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Add Intel PT context switch for each vcpu
Chao Peng [Wed, 24 Oct 2018 08:05:12 +0000 (16:05 +0800)]
KVM: x86: Add Intel PT context switch for each vcpu

Load/Store Intel Processor Trace register in context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
In Host-Guest mode, we need load/resore PT MSRs only when PT
is enabled in guest.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Add Intel Processor Trace cpuid emulation
Chao Peng [Wed, 24 Oct 2018 08:05:11 +0000 (16:05 +0800)]
KVM: x86: Add Intel Processor Trace cpuid emulation

Expose Intel Processor Trace to guest only when
the PT works in Host-Guest mode.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoKVM: x86: Add Intel PT virtualization work mode
Chao Peng [Wed, 24 Oct 2018 08:05:10 +0000 (16:05 +0800)]
KVM: x86: Add Intel PT virtualization work mode

Intel Processor Trace virtualization can be work in one
of 2 possible modes:

a. System-Wide mode (default):
   When the host configures Intel PT to collect trace packets
   of the entire system, it can leave the relevant VMX controls
   clear to allow VMX-specific packets to provide information
   across VMX transitions.
   KVM guest will not aware this feature in this mode and both
   host and KVM guest trace will output to host buffer.

b. Host-Guest mode:
   Host can configure trace-packet generation while in
   VMX non-root operation for guests and root operation
   for native executing normally.
   Intel PT will be exposed to KVM guest in this mode, and
   the trace output to respective buffer of host and guest.
   In this mode, tht status of PT will be saved and disabled
   before VM-entry and restored after VM-exit if trace
   a virtual machine.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoperf/x86/intel/pt: add new capability for Intel PT
Luwei Kang [Wed, 24 Oct 2018 08:05:09 +0000 (16:05 +0800)]
perf/x86/intel/pt: add new capability for Intel PT

This adds support for "output to Trace Transport subsystem"
capability of Intel PT. It means that PT can output its
trace to an MMIO address range rather than system memory buffer.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoperf/x86/intel/pt: Add new bit definitions for PT MSRs
Luwei Kang [Wed, 24 Oct 2018 08:05:08 +0000 (16:05 +0800)]
perf/x86/intel/pt: Add new bit definitions for PT MSRs

Add bit definitions for Intel PT MSRs to support trace output
directed to the memeory subsystem and holds a count if packet
bytes that have been sent out.

These are required by the upcoming PT support in KVM guests
for MSRs read/write emulation.

Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoperf/x86/intel/pt: Introduce intel_pt_validate_cap()
Luwei Kang [Wed, 24 Oct 2018 08:05:07 +0000 (16:05 +0800)]
perf/x86/intel/pt: Introduce intel_pt_validate_cap()

intel_pt_validate_hw_cap() validates whether a given PT capability is
supported by the hardware. It checks the PT capability array which
reflects the capabilities of the hardware on which the code is executed.

For setting up PT for KVM guests this is not correct as the capability
array for the guest can be different from the host array.

Provide a new function to check against a given capability array.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoperf/x86/intel/pt: Export pt_cap_get()
Chao Peng [Wed, 24 Oct 2018 08:05:06 +0000 (16:05 +0800)]
perf/x86/intel/pt: Export pt_cap_get()

pt_cap_get() is required by the upcoming PT support in KVM guests.

Export it and move the capabilites enum to a global header.

As a global functions, "pt_*" is already used for ptrace and
other things, so it makes sense to use "intel_pt_*" as a prefix.

Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoperf/x86/intel/pt: Move Intel PT MSRs bit defines to global header
Chao Peng [Wed, 24 Oct 2018 08:05:05 +0000 (16:05 +0800)]
perf/x86/intel/pt: Move Intel PT MSRs bit defines to global header

The Intel Processor Trace (PT) MSR bit defines are in a private
header. The upcoming support for PT virtualization requires these defines
to be accessible from KVM code.

Move them to the global MSR header file.

Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agokvm: selftests: aarch64: dirty_log_test: support greater than 40-bit IPAs
Andrew Jones [Tue, 6 Nov 2018 13:57:12 +0000 (14:57 +0100)]
kvm: selftests: aarch64: dirty_log_test: support greater than 40-bit IPAs

When KVM has KVM_CAP_ARM_VM_IPA_SIZE we can test with > 40-bit IPAs by
using the 'type' field of KVM_CREATE_VM.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: selftests: add pa-48/va-48 VM modes
Andrew Jones [Tue, 6 Nov 2018 13:57:11 +0000 (14:57 +0100)]
kvm: selftests: add pa-48/va-48 VM modes

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: selftests: dirty_log_test: improve mode param management
Andrew Jones [Tue, 6 Nov 2018 13:57:10 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: improve mode param management

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: selftests: dirty_log_test: reset guest test phys offset
Andrew Jones [Tue, 6 Nov 2018 13:57:09 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: reset guest test phys offset

We need to reset the offset for each mode as it will change
depending on the number of guest physical address bits.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: selftests: dirty_log_test: always use -t
Andrew Jones [Tue, 6 Nov 2018 13:57:08 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: always use -t

There's no reason not to always test the topmost physical
addresses, and if the user wants to try lower addresses
then '-p' (used to be '-o before this patch) can be used.
Let's remove the '-t' option and just always do what it did.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: selftests: dirty_log_test: don't identity map the test mem
Andrew Jones [Tue, 6 Nov 2018 13:57:07 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: don't identity map the test mem

It isn't necessary and can even cause problems when testing high
guest physical addresses. This patch leaves the test memory id-
mapped by default, but when using '-t' the test memory virtual
addresses stay the same even though the physical addresses switch
to the topmost valid addresses.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: selftests: x86_64: dirty_log_test: fix -t
Andrew Jones [Tue, 6 Nov 2018 13:57:06 +0000 (14:57 +0100)]
kvm: selftests: x86_64: dirty_log_test: fix -t

Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoKVM: fix some typos
Wei Yang [Mon, 5 Nov 2018 06:45:03 +0000 (14:45 +0800)]
KVM: fix some typos

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
[Preserved the iff and a probably intentional weird bracket notation.
 Also dropped the style change to make a single-purpose patch. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agox86/kvmclock: convert to SPDX identifiers
Peng Hao [Fri, 2 Nov 2018 09:05:17 +0000 (17:05 +0800)]
x86/kvmclock: convert to SPDX identifiers

Update the verbose license text with the matching SPDX
license identifier.

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
[Changed deprecated GPL-2.0+ to GPL-2.0-or-later. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoKVM: x86: Remove KF() macro placeholder
Sean Christopherson [Mon, 5 Nov 2018 18:44:32 +0000 (10:44 -0800)]
KVM: x86: Remove KF() macro placeholder

Although well-intentioned, keeping the KF() definition as a hint for
handling scattered CPUID features may be counter-productive.  Simply
redefining the bit position only works for directly manipulating the
guest's CPUID leafs, e.g. it doesn't make guest_cpuid_has() magically
work.  Taking an alternative approach, e.g. ensuring the bit position
is identical between the Linux-defined and hardware-defined features,
may be a simpler and/or more effective method of exposing scattered
features to the guest.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: vmx: Allow guest read access to IA32_TSC
Jim Mattson [Fri, 9 Nov 2018 17:35:11 +0000 (09:35 -0800)]
kvm: vmx: Allow guest read access to IA32_TSC

Let the guest read the IA32_TSC MSR with the generic RDMSR instruction
as well as the specific RDTSC(P) instructions. Note that the hardware
applies the TSC multiplier and offset (when applicable) to the result of
RDMSR(IA32_TSC), just as it does to the result of RDTSC(P).

Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: nVMX: NMI-window and interrupt-window exiting should wake L2 from HLT
Jim Mattson [Mon, 26 Nov 2018 19:22:32 +0000 (11:22 -0800)]
kvm: nVMX: NMI-window and interrupt-window exiting should wake L2 from HLT

According to the SDM, "NMI-window exiting" VM-exits wake a logical
processor from the same inactive states as would an NMI and
"interrupt-window exiting" VM-exits wake a logical processor from the
same inactive states as would an external interrupt. Specifically, they
wake a logical processor from the shutdown state and from the states
entered using the HLT and MWAIT instructions.

Fixes: 6dfacadd5858 ("KVM: nVMX: Add support for activity state HLT")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
[Squashed comments of two Jim's patches and used the simplified code
 hunk provided by Sean. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoKVM: nSVM: Fix nested guest support for PAUSE filtering.
Tambe, William [Tue, 13 Nov 2018 16:51:20 +0000 (16:51 +0000)]
KVM: nSVM: Fix nested guest support for PAUSE filtering.

Currently, the nested guest's PAUSE intercept intentions are not being
honored.  Instead, since the L0 hypervisor's pause_filter_count and
pause_filter_thresh values are still in place, these values are used
instead of those programmed in the VMCB by the L1 hypervisor.

To honor the desired PAUSE intercept support of the L1 hypervisor, the L0
hypervisor must use the PAUSE filtering fields of the L1 hypervisor. This
requires saving and restoring of both the L0 and L1 hypervisor's PAUSE
filtering fields.

Signed-off-by: William Tambe <william.tambe@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: Change offset in kvm_write_guest_offset_cached to unsigned
Jim Mattson [Fri, 14 Dec 2018 22:34:43 +0000 (14:34 -0800)]
kvm: Change offset in kvm_write_guest_offset_cached to unsigned

Since the offset is added directly to the hva from the
gfn_to_hva_cache, a negative offset could result in an out of bounds
write. The existing BUG_ON only checks for addresses beyond the end of
the gfn_to_hva_cache, not for addresses before the start of the
gfn_to_hva_cache.

Note that all current call sites have non-negative offsets.

Fixes: 4ec6e8636256 ("kvm: Introduce kvm_write_guest_offset_cached()")
Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Cfir Cohen <cfir@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agokvm: Disallow wraparound in kvm_gfn_to_hva_cache_init
Jim Mattson [Mon, 17 Dec 2018 21:53:33 +0000 (13:53 -0800)]
kvm: Disallow wraparound in kvm_gfn_to_hva_cache_init

Previously, in the case where (gpa + len) wrapped around, the entire
region was not validated, as the comment claimed. It doesn't actually
seem that wraparound should be allowed here at all.

Furthermore, since some callers don't check the return code from this
function, it seems prudent to clear ghc->memslot in the event of an
error.

Fixes: 8f964525a121f ("KVM: Allow cross page reads and writes from cached translations.")
Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Cfir Cohen <cfir@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Cc: Andrew Honig <ahonig@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoKVM: VMX: Remove duplicated include from vmx.c
YueHaibing [Tue, 18 Dec 2018 01:01:49 +0000 (01:01 +0000)]
KVM: VMX: Remove duplicated include from vmx.c

Remove duplicated include.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoselftests: kvm: report failed stage when exit reason is unexpected
Vitaly Kuznetsov [Wed, 19 Dec 2018 11:15:18 +0000 (12:15 +0100)]
selftests: kvm: report failed stage when exit reason is unexpected

When we get a report like

==== Test Assertion Failure ====
  x86_64/state_test.c:157: run->exit_reason == KVM_EXIT_IO
  pid=955 tid=955 - Success
     1 0x0000000000401350: main at state_test.c:154
     2 0x00007fc31c9e9412: ?? ??:0
     3 0x000000000040159d: _start at ??:?
  Unexpected exit reason: 8 (SHUTDOWN),

it is not obvious which particular stage failed. Add the info.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoKVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported
Vitaly Kuznetsov [Wed, 19 Dec 2018 11:06:13 +0000 (12:06 +0100)]
KVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported

AMD doesn't seem to implement MSR_IA32_MCG_EXT_CTL and svm code in kvm
knows nothing about it, however, this MSR is among emulated_msrs and
thus returned with KVM_GET_MSR_INDEX_LIST. The consequent KVM_GET_MSRS,
of course, fails.

Report the MSR as unsupported to not confuse userspace.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
5 years agoKVM: x86: fix size of x86_fpu_cache objects
Paolo Bonzini [Fri, 21 Dec 2018 10:25:59 +0000 (11:25 +0100)]
KVM: x86: fix size of x86_fpu_cache objects

The memory allocation in b666a4b69739 ("kvm: x86: Dynamically allocate
guest_fpu", 2018-11-06) is wrong, there are other members in struct fpu
before the fpregs_state union and the patch should be doing something
similar to the code in fpu__init_task_struct_size.  It's enough to run
a guest and then rmmod kvm to see slub errors which are actually caused
by memory corruption.

For now let's revert it to sizeof(struct fpu), which is conservative.
I have plans to move fsave/fxsave/xsave directly in KVM, without using
the kernel FPU helpers, and once it's done, the size of the object in
the cache will be something like kvm_xstate_size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
5 years agoInput: synaptics - enable SMBus for HP EliteBook 840 G4
Mantas Mikulėnas [Fri, 21 Dec 2018 09:04:26 +0000 (01:04 -0800)]
Input: synaptics - enable SMBus for HP EliteBook 840 G4

dmesg reports that "Your touchpad (PNP: SYN3052 SYN0100 SYN0002 PNP0f13)
says it can support a different bus."

I've tested the offered psmouse.synaptics_intertouch=1 with 4.18.x and
4.19.x and it seems to work well. No problems seen with suspend/resume.

Also, it appears that RMI/SMBus mode is actually required for 3-4 finger
multitouch gestures to work -- otherwise they are not reported at all.

Information from dmesg in both modes:

  psmouse serio3: synaptics: Touchpad model: 1, fw: 8.2, id: 0x1e2b1,
      caps: 0xf00123/0x840300/0x2e800/0x0, board id: 3139, fw id: 2000742

  psmouse serio3: synaptics: Trying to set up SMBus access
  rmi4_smbus 6-002c: registering SMbus-connected sensor
  rmi4_f01 rmi4-00.fn01: found RMI device,
      manufacturer: Synaptics, product: TM3139-001, fw id: 2000742

Signed-off-by: Mantas Mikulėnas <grawity@gmail.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
5 years agoMerge branches 'pm-devfreq', 'pm-avs' and 'pm-tools'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:07:37 +0000 (10:07 +0100)]
Merge branches 'pm-devfreq', 'pm-avs' and 'pm-tools'

* pm-devfreq:
  PM / devfreq: add devfreq_suspend/resume() functions
  PM / devfreq: add support for suspend/resume of a devfreq device
  PM / devfreq: refactor set_target frequency function

* pm-avs:
  PM / AVS: SmartReflex: Switch to SPDX Licence ID
  PM / AVS: SmartReflex: NULL check before some freeing functions is not needed
  PM / AVS: SmartReflex: remove unused function

* pm-tools:
  tools/power/x86/intel_pstate_tracer: Fix non root execution for post processing a trace file
  tools/power turbostat: consolidate duplicate model numbers
  tools/power turbostat: fix goldmont C-state limit decoding
  cpupower : Auto-completion for cpupower tool
  tools/power turbostat: reduce debug output
  tools/power turbosat: fix AMD APIC-id output

5 years agoMerge branches 'pm-core', 'pm-qos', 'pm-domains' and 'pm-sleep'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:06:44 +0000 (10:06 +0100)]
Merge branches 'pm-core', 'pm-qos', 'pm-domains' and 'pm-sleep'

* pm-core:
  PM-runtime: Switch autosuspend over to using hrtimers

* pm-qos:
  PM / QoS: Change to use DEFINE_SHOW_ATTRIBUTE macro

* pm-domains:
  PM / Domains: remove define_genpd_open_function() and define_genpd_debugfs_fops()

* pm-sleep:
  PM / sleep: convert to DEFINE_SHOW_ATTRIBUTE

5 years agoMerge branch 'pm-opp'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:06:18 +0000 (10:06 +0100)]
Merge branch 'pm-opp'

* pm-opp:
  PM / Domains: Propagate performance state updates
  PM / Domains: Factorize dev_pm_genpd_set_performance_state()
  PM / Domains: Save OPP table pointer in genpd
  OPP: Don't return 0 on error from of_get_required_opp_performance_state()
  OPP: Add dev_pm_opp_xlate_performance_state() helper
  OPP: Improve _find_table_of_opp_np()
  PM / Domains: Make genpd performance states orthogonal to the idlestates
  OPP: Fix missing debugfs supply directory for OPPs
  OPP: Use opp_table->regulators to verify no regulator case
  OPP: Remove of_dev_pm_opp_find_required_opp()
  OPP: Rename and relocate of_genpd_opp_to_performance_state()
  OPP: Configure all required OPPs
  OPP: Add dev_pm_opp_{set|put}_genpd_virt_dev() helper
  PM / Domains: Add genpd_opp_to_performance_state()
  OPP: Populate OPPs from "required-opps" property
  OPP: Populate required opp tables from "required-opps" property
  OPP: Separate out custom OPP handler specific code
  OPP: Identify and mark genpd OPP tables
  PM / Domains: Rename genpd virtual devices as virt_dev

5 years agoMerge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-cpufreq-sched'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:06:06 +0000 (10:06 +0100)]
Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-cpufreq-sched'

* pm-cpuidle:
  cpuidle: Add 'above' and 'below' idle state metrics
  cpuidle: big.LITTLE: fix refcount leak
  cpuidle: Add cpuidle.governor= command line parameter
  cpuidle: poll_state: Disregard disable idle states
  Documentation: admin-guide: PM: Add cpuidle document

* pm-cpufreq:
  cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver
  dt-bindings: cpufreq: Introduce QCOM cpufreq firmware bindings
  cpufreq: nforce2: Remove meaningless return
  cpufreq: ia64: Remove unused header files
  cpufreq: imx6q: save one condition block for normal case of nvmem read
  cpufreq: imx6q: remove unused code
  cpufreq: pmac64: add of_node_put()
  cpufreq: powernv: add of_node_put()
  Documentation: intel_pstate: Clarify coordination of P-State limits
  cpufreq: intel_pstate: Force HWP min perf before offline
  cpufreq: s3c24xx: Change to use DEFINE_SHOW_ATTRIBUTE macro

* pm-cpufreq-sched:
  sched/cpufreq: Add the SPDX tags

5 years agoMerge branch 'acpi-pci'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:04:23 +0000 (10:04 +0100)]
Merge branch 'acpi-pci'

* acpi-pci:
  ACPI: Make PCI slot detection driver depend on PCI
  ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set
  arm64: select ACPI PCI code only when both features are enabled
  PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set
  ACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset
  ACPI: Allow CONFIG_PCI to be unset for reboot
  ACPI: Move PCI reset to a separate function

5 years agoMerge branches 'acpi-tables', 'acpi-soc', 'acpi-apei' and 'acpi-misc'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:04:14 +0000 (10:04 +0100)]
Merge branches 'acpi-tables', 'acpi-soc', 'acpi-apei' and 'acpi-misc'

* acpi-tables:
  ACPI / tables: Add an ifdef around amlcode and dsdt_amlcode
  ACPI / tables: add DSDT AmlCode new declaration name support
  ACPI: SPCR: Consider baud rate 0 as preconfigured state

* acpi-soc:
  ACPI / LPSS: Ignore acpi_device_fix_up_power() return value
  ACPI / APD: Add clock frequency for Hisilicon Hip08 SPI controller

* acpi-apei:
  ACPI/APEI: Clear GHES block_status before panic()
  ACPI, APEI, EINJ: Change to use DEFINE_SHOW_ATTRIBUTE macro

* acpi-misc:
  ACPI: fix acpi_find_child_device() invocation in acpi_preset_companion()

5 years agoMerge branch 'acpica'
Rafael J. Wysocki [Fri, 21 Dec 2018 09:03:16 +0000 (10:03 +0100)]
Merge branch 'acpica'

* acpica:
  ACPICA: Update version to 20181213
  ACPICA: change coding style to match ACPICA, no functional change
  ACPICA: Debug output: Add option to display method/object evaluation
  ACPICA: disassembler: disassemble OEMx tables as AML
  ACPICA: Add "Windows 2018.2" string in the _OSI support
  ACPICA: Expressions in package elements are not supported
  ACPICA: Update buffer-to-string conversions
  ACPICA: add comments, no functional change
  ACPICA: Remove defines that use deprecated flag
  ACPICA: Add "Windows 2018" string in the _OSI support
  ACPICA: Update version to 20181031
  ACPICA: iASL: Enhance error detection
  ACPICA: iASL: adding definition and disassembly for TPM2 revision 3
  ACPICA: Use %d for signed int print formatting instead of %u
  ACPICA: Debugger: refactor to fix unused variable warning

5 years agoInput: elantech - disable elan-i2c for P52 and P72
Benjamin Tissoires [Fri, 21 Dec 2018 08:42:38 +0000 (00:42 -0800)]
Input: elantech - disable elan-i2c for P52 and P72

The current implementation of elan_i2c is known to not support those
2 laptops.

A proper fix is to tweak both elantech and elan_i2c to transmit the
correct information from PS/2, which would make a bad candidate for
stable.

So to give us some time for fixing the root of the problem, disable
elan_i2c for the devices we know are not behaving properly.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1803600
Link: https://bugs.archlinux.org/task/59714
Fixes: df077237cf55 Input: elantech - detect new ICs and setup Host Notify for them

Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
5 years agogpio: mvebu: only fail on missing clk if pwm is actually to be used
Uwe Kleine-König [Mon, 17 Dec 2018 08:43:13 +0000 (09:43 +0100)]
gpio: mvebu: only fail on missing clk if pwm is actually to be used

The gpio IP on Armada 370 at offset 0x18180 has neither a clk nor pwm
registers. So there is no need for a clk as the pwm isn't used anyhow.
So only check for the clk in the presence of the pwm registers. This fixes
a failure to probe the gpio driver for the above mentioned gpio device.

Fixes: 757642f9a584 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
5 years agogpio: max7301: fix driver for use with CONFIG_VMAP_STACK
Christophe Leroy [Fri, 7 Dec 2018 13:07:55 +0000 (13:07 +0000)]
gpio: max7301: fix driver for use with CONFIG_VMAP_STACK

spi_read() and spi_write() require DMA-safe memory. When
CONFIG_VMAP_STACK is selected, those functions cannot be used
with buffers on stack.

This patch replaces calls to spi_read() and spi_write() by
spi_write_then_read() which doesn't require DMA-safe buffers.

Fixes: 0c36ec314735 ("gpio: gpio driver for max7301 SPI GPIO expander")
Cc: <stable@vger.kernel.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
5 years agogpio: gpio-omap: Revert deferred wakeup quirk handling for regressions
Tony Lindgren [Fri, 7 Dec 2018 19:08:29 +0000 (11:08 -0800)]
gpio: gpio-omap: Revert deferred wakeup quirk handling for regressions

Commit ec0daae685b2 ("gpio: omap: Add level wakeup handling for omap4
based SoCs") attempted to fix omap4 GPIO wakeup handling as it was
blocking deeper SoC idle states. However this caused a regression for
GPIOs during runtime having over second long latencies for Ethernet
GPIO interrupt as reportedy by Russell King <rmk+kernel@armlinux.org.uk>.

Let's fix this issue by doing a partial revert of the breaking commit.
We still want to keep the quirk handling around as it is also used for
OMAP_GPIO_QUIRK_IDLE_REMOVE_TRIGGER.

The real fix for omap4 GPIO wakeup handling involves fixes for
omap_set_gpio_trigger() and omap_gpio_unmask_irq() and will be posted
separately. And we must keep the wakeup bit enabled during runtime
because of module doing clock autogating with autoidle configured.

Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Fixes: ec0daae685b2 ("gpio: omap: Add level wakeup handling for omap4
based SoCs")
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Ladislav Michl <ladis@linux-mips.org>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
5 years agoKVM: PPC: Book3S HV: Keep rc bits in shadow pgtable in sync with host
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:43 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Keep rc bits in shadow pgtable in sync with host

The rc bits contained in ptes are used to track whether a page has been
accessed and whether it is dirty. The accessed bit is used to age a page
and the dirty bit to track whether a page is dirty or not.

Now that we support nested guests there are three ptes which track the
state of the same page:
- The partition-scoped page table in the L1 guest, mapping L2->L1 address
- The partition-scoped page table in the host for the L1 guest, mapping
  L1->L0 address
- The shadow partition-scoped page table for the nested guest in the host,
  mapping L2->L0 address

The idea is to attempt to keep the rc state of these three ptes in sync,
both when setting and when clearing rc bits.

When setting the bits we achieve consistency by:
- Initially setting the bits in the shadow page table as the 'and' of the
  other two.
- When updating in software the rc bits in the shadow page table we
  ensure the state is consistent with the other two locations first, and
  update these before reflecting the change into the shadow page table.
  i.e. only set the bits in the L2->L0 pte if also set in both the
       L2->L1 and the L1->L0 pte.

When clearing the bits we achieve consistency by:
- The rc bits in the shadow page table are only cleared when discarding
  a pte, and we don't need to record this as if either bit is set then
  it must also be set in the pte mapping L1->L0.
- When L1 clears an rc bit in the L2->L1 mapping it __should__ issue a
  tlbie instruction
  - This means we will discard the pte from the shadow page table
    meaning the mapping will have to be setup again.
  - When setup the pte again in the shadow page table we will ensure
    consistency with the L2->L1 pte.
- When the host clears an rc bit in the L1->L0 mapping we need to also
  clear the bit in any ptes in the shadow page table which map the same
  gfn so we will be notified if a nested guest accesses the page.
  This case is what this patch specifically concerns.
  - We can search the nest_rmap list for that given gfn and clear the
    same bit from all corresponding ptes in shadow page tables.
  - If a nested guest causes either of the rc bits to be set by software
    in future then we will update the L1->L0 pte and maintain consistency.

With the process outlined above we aim to maintain consistency of the 3
pte locations where we track rc for a given guest page.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years agoKVM: PPC: Book3S HV: Introduce kvmhv_update_nest_rmap_rc_list()
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:42 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Introduce kvmhv_update_nest_rmap_rc_list()

Introduce a function kvmhv_update_nest_rmap_rc_list() which for a given
nest_rmap list will traverse it, find the corresponding pte in the shadow
page tables, and if it still maps the same host page update the rc bits
accordingly.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years agoKVM: PPC: Book3S HV: Apply combination of host and l1 pte rc for nested guest
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:41 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Apply combination of host and l1 pte rc for nested guest

The shadow page table contains ptes for translations from nested guest
address to host address. Currently when creating these ptes we take the
rc bits from the pte for the L1 guest address to host address
translation. This is incorrect as we must also factor in the rc bits
from the pte for the nested guest address to L1 guest address
translation (as contained in the L1 guest partition table for the nested
guest).

By not calculating these bits correctly L1 may not have been correctly
notified when it needed to update its rc bits in the partition table it
maintains for its nested guest.

Modify the code so that the rc bits in the resultant pte for the L2->L0
translation are the 'and' of the rc bits in the L2->L1 pte and the L1->L0
pte, also accounting for whether this was a write access when setting
the dirty bit.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years agoKVM: PPC: Book3S HV: Align gfn to L1 page size when inserting nest-rmap entry
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:40 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Align gfn to L1 page size when inserting nest-rmap entry

Nested rmap entries are used to store the translation from L1 gpa to L2
gpa when entries are inserted into the shadow (nested) page tables. This
rmap list is located by indexing the rmap array in the memslot by L1
gfn. When we come to search for these entries we only know the L1 page size
(which could be PAGE_SIZE, 2M or a 1G page) and so can only select a gfn
aligned to that size. This means that when we insert the entry, so we can
find it later, we need to align the gfn we use to select the rmap list
in which to insert the entry to L1 page size as well.

By not doing this we were missing nested rmap entries when modifying L1
ptes which were for a page also passed through to an L2 guest.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years agoKVM: PPC: Book3S HV: Hold kvm->mmu_lock across updating nested pte rc bits
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:39 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Hold kvm->mmu_lock across updating nested pte rc bits

We already hold the kvm->mmu_lock spin lock across updating the rc bits
in the pte for the L1 guest. Continue to hold the lock across updating
the rc bits in the pte for the nested guest as well to prevent
invalidations from occurring.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
5 years agotcp: fix a race in inet_diag_dump_icsk()
Eric Dumazet [Thu, 20 Dec 2018 23:28:56 +0000 (15:28 -0800)]
tcp: fix a race in inet_diag_dump_icsk()

Alexei reported use after frees in inet_diag_dump_icsk() [1]

Because we use refcount_set() when various sockets are setup and
inserted into ehash, we also need to make sure inet_diag_dump_icsk()
wont race with the refcount_set() operations.

Jonathan Lemon sent a patch changing net_twsk_hashdance() but
other spots would need risky changes.

Instead, fix inet_diag_dump_icsk() as this bug came with
linux-4.10 only.

[1] Quoting Alexei :

First something iterating over sockets finds already freed tw socket:

refcount_t: increment on 0; use-after-free.
WARNING: CPU: 2 PID: 2738 at lib/refcount.c:153 refcount_inc+0x26/0x30
RIP: 0010:refcount_inc+0x26/0x30
RSP: 0018:ffffc90004c8fbc0 EFLAGS: 00010282
RAX: 000000000000002b RBX: 0000000000000000 RCX: 0000000000000000
RDX: ffff88085ee9d680 RSI: ffff88085ee954c8 RDI: ffff88085ee954c8
RBP: ffff88010ecbd2c0 R08: 0000000000000000 R09: 000000000000174c
R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: 0000000000000000
R13: ffff8806ba9bf210 R14: ffffffff82304600 R15: ffff88010ecbd328
FS:  00007f81f5a7d700(0000) GS:ffff88085ee80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f81e2a95000 CR3: 000000069b2eb006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 inet_diag_dump_icsk+0x2b3/0x4e0 [inet_diag]  // sock_hold(sk); in net/ipv4/inet_diag.c:1002
 ? kmalloc_large_node+0x37/0x70
 ? __kmalloc_node_track_caller+0x1cb/0x260
 ? __alloc_skb+0x72/0x1b0
 ? __kmalloc_reserve.isra.40+0x2e/0x80
 __inet_diag_dump+0x3b/0x80 [inet_diag]
 netlink_dump+0x116/0x2a0
 netlink_recvmsg+0x205/0x3c0
 sock_read_iter+0x89/0xd0
 __vfs_read+0xf7/0x140
 vfs_read+0x8a/0x140
 SyS_read+0x3f/0xa0
 do_syscall_64+0x5a/0x100

then a minute later twsk timer fires and hits two bad refcnts
for this freed socket:

refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 31 PID: 0 at lib/refcount.c:228 refcount_dec+0x2e/0x40
Modules linked in:
RIP: 0010:refcount_dec+0x2e/0x40
RSP: 0018:ffff88085f5c3ea8 EFLAGS: 00010296
RAX: 000000000000002c RBX: ffff88010ecbd2c0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f
RBP: ffffc90003c77280 R08: 0000000000000000 R09: 00000000000017d3
R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffffffff82ad2d80
R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 inet_twsk_kill+0x9d/0xc0  // inet_twsk_bind_unhash(tw, hashinfo);
 call_timer_fn+0x29/0x110
 run_timer_softirq+0x36b/0x3a0

refcount_t: underflow; use-after-free.
WARNING: CPU: 31 PID: 0 at lib/refcount.c:187 refcount_sub_and_test+0x46/0x50
RIP: 0010:refcount_sub_and_test+0x46/0x50
RSP: 0018:ffff88085f5c3eb8 EFLAGS: 00010296
RAX: 0000000000000026 RBX: ffff88010ecbd2c0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: 00000000000000f6 RDI: 000000000000003f
RBP: ffff88010ecbd358 R08: 0000000000000000 R09: 000000000000185b
R10: ffffffff81e7c5a0 R11: 0000000000000000 R12: ffff88010ecbd358
R13: ffffffff8182de00 R14: ffff88085f5c3ef8 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff88085f5c0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbe42685250 CR3: 0000000002209001 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 inet_twsk_put+0x12/0x20  // inet_twsk_put(tw);
 call_timer_fn+0x29/0x110
 run_timer_softirq+0x36b/0x3a0

Fixes: 67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling tcp_get_info()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMAINTAINERS: update cxgb4 and cxgb3 maintainer
Ganesh Goudar [Thu, 20 Dec 2018 13:26:09 +0000 (18:56 +0530)]
MAINTAINERS: update cxgb4 and cxgb3 maintainer

Arjun Vynipadath will be taking over as maintainer from now.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoipv6: frags: Fix bogus skb->sk in reassembled packets
Herbert Xu [Thu, 20 Dec 2018 13:20:10 +0000 (21:20 +0800)]
ipv6: frags: Fix bogus skb->sk in reassembled packets

It was reported that IPsec would crash when it encounters an IPv6
reassembled packet because skb->sk is non-zero and not a valid
pointer.

This is because skb->sk is now a union with ip_defrag_offset.

This patch fixes this by resetting skb->sk when exiting from
the reassembly code.

Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes: 219badfaade9 ("ipv6: frags: get rid of ip6frag_skb_cb/...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomscc: Configured MAC entries should be locked.
Allan W. Nielsen [Thu, 20 Dec 2018 08:37:17 +0000 (09:37 +0100)]
mscc: Configured MAC entries should be locked.

The MAC table in Ocelot supports auto aging (normal) and static entries.
MAC entries that is manually configured should be static and not subject
to aging.

Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Allan Nielsen <allan.nielsen@microchip.com>
Reviewed-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa...
Linus Torvalds [Thu, 20 Dec 2018 22:49:56 +0000 (14:49 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux

Pull i2c fixes from Wolfram Sang:
 "I2C has a MAINTAINERS update for you, so people will be immediately
  pointed to the right person for this previously orphaned driver.

  And one of Arnd's build warning fixes for a new driver added this
  cycle"

* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
  i2c: nvidia-gpu: mark resume function as __maybe_unused
  MAINTAINERS: add entry for i2c-axxia driver

5 years agoMerge tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs
Linus Torvalds [Thu, 20 Dec 2018 22:17:24 +0000 (14:17 -0800)]
Merge tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs

Pull UBI/UBIFS fixes from Richard Weinberger:

 - Kconfig dependency fixes for our new auth feature

 - Fix for selecting the right compressor when creating a fs

 - Bugfix for a bug in UBIFS's O_TMPFILE implementation

 - Refcounting fixes for UBI

* tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs:
  ubifs: Handle re-linking of inodes correctly while recovery
  ubi: Do not drop UBI device reference before using
  ubi: Put MTD device after it is not used
  ubifs: Fix default compression selection in ubifs
  ubifs: Fix memory leak on error condition
  ubifs: auth: Add CONFIG_KEYS dependency
  ubifs: CONFIG_UBIFS_FS_AUTHENTICATION should depend on UBIFS_FS
  ubifs: replay: Fix high stack usage

5 years agoACPI / tables: Add an ifdef around amlcode and dsdt_amlcode
Nathan Chancellor [Thu, 20 Dec 2018 19:38:56 +0000 (12:38 -0700)]
ACPI / tables: Add an ifdef around amlcode and dsdt_amlcode

Clang warns:

drivers/acpi/tables.c:715:14: warning: unused variable 'amlcode'
[-Wunused-variable]
static void *amlcode __attribute__ ((weakref("AmlCode")));
             ^
drivers/acpi/tables.c:716:14: warning: unused variable 'dsdt_amlcode'
[-Wunused-variable]
static void *dsdt_amlcode __attribute__ ((weakref("dsdt_aml_code")));
             ^
2 warnings generated.

The only uses of these variables are hiddem behind CONFIG_ACPI_CUSTOM_DSDT
so do the same thing here.

Fixes: 82e4eb4e9653 (ACPI / tables: add DSDT AmlCode new declaration name support)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoACPI/APEI: Clear GHES block_status before panic()
Lenny Szubowicz [Wed, 19 Dec 2018 16:50:52 +0000 (11:50 -0500)]
ACPI/APEI: Clear GHES block_status before panic()

In __ghes_panic() clear the block status in the APEI generic
error status block for that generic hardware error source before
calling panic() to prevent a second panic() in the crash kernel
for exactly the same fatal error.

Otherwise ghes_probe(), running in the crash kernel, would see
an unhandled error in the APEI generic error status block and
panic again, thereby precluding any crash dump.

Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Signed-off-by: David Arcari <darcari@redhat.com>
Tested-by: Tyler Baicar <baicar.tyler@gmail.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoMIPS: math-emu: Write-protect delay slot emulation pages
Paul Burton [Thu, 20 Dec 2018 17:45:43 +0000 (17:45 +0000)]
MIPS: math-emu: Write-protect delay slot emulation pages

Mapping the delay slot emulation page as both writeable & executable
presents a security risk, in that if an exploit can write to & jump into
the page then it can be used as an easy way to execute arbitrary code.

Prevent this by mapping the page read-only for userland, and using
access_process_vm() with the FOLL_FORCE flag to write to it from
mips_dsemul().

This will likely be less efficient due to copy_to_user_page() performing
cache maintenance on a whole page, rather than a single line as in the
previous use of flush_cache_sigtramp(). However this delay slot
emulation code ought not to be running in any performance critical paths
anyway so this isn't really a problem, and we can probably do better in
copy_to_user_page() anyway in future.

A major advantage of this approach is that the fix is small & simple to
backport to stable kernels.

Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes: 432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot instructions")
Cc: stable@vger.kernel.org # v4.8+
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Rich Felker <dalias@libc.org>
Cc: David Daney <david.daney@cavium.com>
5 years agoMerge tag 'drm-misc-fixes-2018-12-20' of git://anongit.freedesktop.org/drm/drm-misc...
Daniel Vetter [Thu, 20 Dec 2018 17:13:53 +0000 (18:13 +0100)]
Merge tag 'drm-misc-fixes-2018-12-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes

Fix spectre v1 vuln in drm_ioctl

Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20181220165740.GA42344@art_vandelay
5 years agoMerge remote-tracking branches 'spi/topic/mem' and 'spi/topic/mtd' into spi-next
Mark Brown [Thu, 20 Dec 2018 16:01:30 +0000 (16:01 +0000)]
Merge remote-tracking branches 'spi/topic/mem' and 'spi/topic/mtd' into spi-next

5 years agoMerge branch 'spi-4.21' into spi-next
Mark Brown [Thu, 20 Dec 2018 16:01:28 +0000 (16:01 +0000)]
Merge branch 'spi-4.21' into spi-next

5 years agoMerge branch 'spi-4.20' into spi-linus
Mark Brown [Thu, 20 Dec 2018 16:01:26 +0000 (16:01 +0000)]
Merge branch 'spi-4.20' into spi-linus

5 years agoMerge tag 'm68k-for-v4.20-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 20 Dec 2018 15:35:16 +0000 (07:35 -0800)]
Merge tag 'm68k-for-v4.20-tag2' of git://git./linux/kernel/git/geert/linux-m68k

Pull m68k fix from Geert Uytterhoeven:
 "Fix memblock-related crashes"

* tag 'm68k-for-v4.20-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
  m68k: Fix memblock-related crashes

5 years agoMerge tag 'kbuild-fixes-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Thu, 20 Dec 2018 15:33:09 +0000 (07:33 -0800)]
Merge tag 'kbuild-fixes-v4.20-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild fix from Masahiro Yamada:
 "Fix false positive warning/error about missing library for objtool"

* tag 'kbuild-fixes-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: fix false positive warning/error about missing libelf

5 years agoMerge tag 'char-misc-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregk...
Linus Torvalds [Thu, 20 Dec 2018 15:30:37 +0000 (07:30 -0800)]
Merge tag 'char-misc-4.20-rc8' of git://git./linux/kernel/git/gregkh/char-misc

Pull char/misc driver fixes from Greg KH:
 "Here are three tiny last-minute driver fixes for 4.20-rc8 that resolve
  some reported issues, and one MAINTAINERS file update.

  All of them are related to the hyper-v subsystem, it seems people are
  actually testing and using it now, which is nice to see :)

  The fixes are:
   - uio_hv_generic: fix for opening multiple times
   - Remove PCI dependancy on hyperv drivers
   - return proper error code for an unopened channel.

  And Sasha has signed up to help out with the hyperv maintainership.

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'char-misc-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
  Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels
  x86, hyperv: remove PCI dependency
  MAINTAINERS: Patch monkey for the Hyper-V code
  uio_hv_generic: set callbacks on open

5 years agoMerge tag 'tty-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty
Linus Torvalds [Thu, 20 Dec 2018 15:29:11 +0000 (07:29 -0800)]
Merge tag 'tty-4.20-rc8' of git://git./linux/kernel/git/gregkh/tty

Pull tty/serial fix from Greg KH:
 "Here is a single fix, a revert, for the 8250 serial driver to resolve
  a reported problem.

  There was some attempted patches to fix the issue, but people are
  arguing about them, so reverting the patch to revert back to the 4.19
  and older behavior is the best thing to do at this late in the release
  cycle.

  The revert has been in linux-next with no reported issues"

* tag 'tty-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
  Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"

5 years agoMerge tag 'usb-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Linus Torvalds [Thu, 20 Dec 2018 15:27:39 +0000 (07:27 -0800)]
Merge tag 'usb-4.20-rc8' of git://git./linux/kernel/git/gregkh/usb

Pull USB fixes and ids from Greg KH:
 "Here are some late xhci fixes for 4.20-rc8 as well as a few new device
  ids for the option usb-serial driver.

  The xhci fixes resolve some many-reported issues and all of these have
  been in linux-next for a while with no reported problems"

* tag 'usb-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
  USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd
  xhci: Don't prevent USB2 bus suspend in state check intended for USB3 only
  USB: serial: option: add Telit LN940 series
  USB: serial: option: add Fibocom NL668 series
  USB: serial: option: add Simcom SIM7500/SIM7600 (MBIM mode)
  USB: serial: option: add GosunCn ZTE WeLink ME3630
  USB: serial: option: add HP lt4132

5 years agoMerge tag 'mmc-v4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc
Linus Torvalds [Thu, 20 Dec 2018 15:25:31 +0000 (07:25 -0800)]
Merge tag 'mmc-v4.20-rc7' of git://git./linux/kernel/git/ulfh/mmc

Pull MMC fixes from Ulf Hansson:
 "MMC core:
   - Restore code to allow BKOPS and CACHE ctrl even if no HPI support
   - Reset HPI enabled state during re-init
   - Use a default minimum timeout when enabling CACHE ctrl

  MMC host:
   - omap_hsmmc: Fix DMA API warning
   - sdhci-tegra: Fix dt parsing of SDMMC pads autocal values
   - Correct register accesses when enabling v4 mode"

* tag 'mmc-v4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
  mmc: core: Use a minimum 1600ms timeout when enabling CACHE ctrl
  mmc: core: Allow BKOPS and CACHE ctrl even if no HPI support
  mmc: core: Reset HPI enabled state during re-init and in case of errors
  mmc: omap_hsmmc: fix DMA API warning
  mmc: tegra: Fix for SDMMC pads autocal parsing from dt
  mmc: sdhci: Fix sdhci_do_enable_v4_mode

5 years agoiomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()"
Dave Chinner [Thu, 20 Dec 2018 12:23:24 +0000 (23:23 +1100)]
iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()"

This reverts commit 61c6de667263184125d5ca75e894fcad632b0dd3.

The reverted commit added page reference counting to iomap page
structures that are used to track block size < page size state. This
was supposed to align the code with page migration page accounting
assumptions, but what it has done instead is break XFS filesystems.
Every fstests run I've done on sub-page block size XFS filesystems
has since picking up this commit 2 days ago has failed with bad page
state errors such as:

# ./run_check.sh "-m rmapbt=1,reflink=1 -i sparse=1 -b size=1k" "generic/038"
....
SECTION       -- xfs
FSTYP         -- xfs (debug)
PLATFORM      -- Linux/x86_64 test1 4.20.0-rc6-dgc+
MKFS_OPTIONS  -- -f -m rmapbt=1,reflink=1 -i sparse=1 -b size=1k /dev/sdc
MOUNT_OPTIONS -- /dev/sdc /mnt/scratch

generic/038 454s ...
 run fstests generic/038 at 2018-12-20 18:43:05
 XFS (sdc): Unmounting Filesystem
 XFS (sdc): Mounting V5 Filesystem
 XFS (sdc): Ending clean mount
 BUG: Bad page state in process kswapd0  pfn:3a7fa
 page:ffffea0000ccbeb0 count:0 mapcount:0 mapping:ffff88800d9b6360 index:0x1
 flags: 0xfffffc0000000()
 raw: 000fffffc0000000 dead000000000100 dead000000000200 ffff88800d9b6360
 raw: 0000000000000001 0000000000000000 00000000ffffffff
 page dumped because: non-NULL mapping
 CPU: 0 PID: 676 Comm: kswapd0 Not tainted 4.20.0-rc6-dgc+ #915
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
 Call Trace:
  dump_stack+0x67/0x90
  bad_page.cold.116+0x8a/0xbd
  free_pcppages_bulk+0x4bf/0x6a0
  free_unref_page_list+0x10f/0x1f0
  shrink_page_list+0x49d/0xf50
  shrink_inactive_list+0x19d/0x3b0
  shrink_node_memcg.constprop.77+0x398/0x690
  ? shrink_slab.constprop.81+0x278/0x3f0
  shrink_node+0x7a/0x2f0
  kswapd+0x34b/0x6d0
  ? node_reclaim+0x240/0x240
  kthread+0x11f/0x140
  ? __kthread_bind_mask+0x60/0x60
  ret_from_fork+0x24/0x30
 Disabling lock debugging due to kernel taint
....

The failures are from anyway that frees pages and empties the
per-cpu page magazines, so it's not a predictable failure or an easy
to debug failure.

generic/038 is a reliable reproducer of this problem - it has a 9 in
10 failure rate on one of my test machines. Failure on other
machines have been at random points in fstests runs but every run
has ended up tripping this problem. Hence generic/038 was used to
bisect the failure because it was the most reliable failure.

It is too close to the 4.20 release (not to mention holidays) to
try to diagnose, fix and test the underlying cause of the problem,
so reverting the commit is the only option we have right now. The
revert has been tested against a current tot 4.20-rc7+ kernel across
multiple machines running sub-page block size XFs filesystems and
none of the bad page state failures have been seen.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Cc: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years agospi: sh-msiof: Reduce the number of times write to and perform the transmission from...
Hoan Nguyen An [Thu, 20 Dec 2018 08:48:42 +0000 (17:48 +0900)]
spi: sh-msiof: Reduce the number of times write to and perform the transmission from FIFO

The current state of the spi-sh-msiof, in master transfer mode: if t-> bits_per_word <= 8,
if the data length is divisible by 4 ((len & 3) = 0), the length of each word will be 32 bits
In case of data length can not be divisible by 4 ((len & 3) != 0), always set each word to be
8 bits, this will increase the number of times that write to FIFO, increasing the number of
times it should be transmitted. Assume that the number of bytes of data length more than 64 bytes,
each transmission will write 64 times into the TFDR then transmit, a maximum one-time
transmission will transmit 64 bytes if each word is 8 bits long.

Switch to setting if t->bits_per_word <= 8, the word length will be 32 bits although the data
length is not divisible by 4, then if leftover, will transmit the balance and the length of each
words is 1 byte. The maximum each can transmit up to 64 x 4 (Data Size = 32 bits (4 bytes)) = 256 bytes.
TMDR2 : Bits 28 to 24  BITLEN1[4:0] Data Size (8 to 32 bits)
        Bits 23 to 16  WDLEN1[7:0]  Word Count (1 to 64 words)

Signed-off-by: Hoan Nguyen An <na-hoan@jinso.co.jp>
Signed-off-by: Mark Brown <broonie@kernel.org>
5 years agoregulator: convert to DEFINE_SHOW_ATTRIBUTE
Yangtao Li [Sat, 15 Dec 2018 08:21:48 +0000 (03:21 -0500)]
regulator: convert to DEFINE_SHOW_ATTRIBUTE

Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.

Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
5 years agoregulator: mcp16502: Fix missing n_voltages setting
Axel Lin [Wed, 19 Dec 2018 08:58:04 +0000 (16:58 +0800)]
regulator: mcp16502: Fix missing n_voltages setting

The n_voltages setting is not set, fix it.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
5 years agoregulator: mcp16502: Use #ifdef CONFIG_PM_SLEEP around mcp16502_suspend/resume_noirq
Axel Lin [Thu, 20 Dec 2018 13:14:13 +0000 (21:14 +0800)]
regulator: mcp16502: Use #ifdef CONFIG_PM_SLEEP around mcp16502_suspend/resume_noirq

mcp16502_suspend/resume_noirq is only used by SET_NOIRQ_SYSTEM_SLEEP_PM_OPS
when CONFIG_PM_SLEEP is defined.
So use #ifdef CONFIG_PM_SLEEP instead CONFIG_SUSPEND guard.

Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
5 years agoMerge tag 'kvm-ppc-next-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git...
Radim Krčmář [Thu, 20 Dec 2018 13:54:09 +0000 (14:54 +0100)]
Merge tag 'kvm-ppc-next-4.21-1' of git://git./linux/kernel/git/paulus/powerpc

PPC KVM update for 4.21 from Paul Mackerras

The main new feature this time is support in HV nested KVM for passing
a device that is emulated by a level 0 hypervisor and presented to
level 1 as a PCI device through to a level 2 guest using VFIO.

Apart from that there are improvements for migration of radix guests
under HV KVM and some other fixes and cleanups.

5 years agomedia: cx23885: only reset DMA on problematic CPUs
Brad Love [Wed, 19 Dec 2018 17:07:01 +0000 (12:07 -0500)]
media: cx23885: only reset DMA on problematic CPUs

It is reported that commit 95f408bbc4e4 ("media: cx23885: Ryzen DMA
related RiSC engine stall fixes") caused regresssions with other CPUs.

Ensure that the quirk will be applied only for the CPUs that
are known to cause problems.

A module option is added for explicit control of the behaviour.

Fixes: 95f408bbc4e4 ("media: cx23885: Ryzen DMA related RiSC engine stall fixes")

Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
5 years agomedia: ddbridge: Move asm includes after linux ones
Nathan Chancellor [Mon, 10 Dec 2018 23:35:14 +0000 (18:35 -0500)]
media: ddbridge: Move asm includes after linux ones

Without this, cpumask_t and bool are not defined:

In file included from drivers/media/pci/ddbridge/ddbridge-ci.c:19:
In file included from drivers/media/pci/ddbridge/ddbridge.h:22:
./arch/arm/include/asm/irq.h:35:50: error: unknown type name 'cpumask_t'
extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
                                                 ^
./arch/arm/include/asm/irq.h:36:9: error: unknown type name 'bool'
                                           bool exclude_self);
                                           ^

Doing a survey of the kernel tree, this appears to be expected because
'#include <asm/irq.h>' is always after the linux includes.

This also fixes warnings of this variety (with Clang):

In file included from drivers/media/pci/ddbridge/ddbridge-ci.c:19:
In file included from drivers/media/pci/ddbridge/ddbridge.h:56:
In file included from ./include/media/dvb_net.h:22:
In file included from ./include/linux/netdevice.h:50:
In file included from ./include/uapi/linux/neighbour.h:6:
In file included from ./include/linux/netlink.h:9:
In file included from ./include/net/scm.h:11:
In file included from ./include/linux/sched/signal.h:6:
./include/linux/signal.h:87:11: warning: array index 3 is past the end
of the array (which contains 2 elements) [-Warray-bounds]
                return (set->sig[3] | set->sig[2] |
                        ^        ~
./arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
        unsigned long sig[_NSIG_WORDS];
        ^

Fixes: b6973637c4cc ("media: ddbridge: remove another duplicate of io.h and sort includes")

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
5 years agoACPI: Make PCI slot detection driver depend on PCI
Sinan Kaya [Wed, 19 Dec 2018 22:46:59 +0000 (22:46 +0000)]
ACPI: Make PCI slot detection driver depend on PCI

Since this is ACPI PCI slot detection driver for PCI, it doesn't make sense
to compile this without PCI support in place.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set
Sinan Kaya [Wed, 19 Dec 2018 22:46:58 +0000 (22:46 +0000)]
ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set

Remove PCI dependent code out of iort.c when CONFIG_PCI is not defined.
A quick search reveals the following functions:
1. pci_request_acs()
2. pci_domain_nr()
3. pci_is_root_bus()
4. to_pci_dev()

Both pci_domain_nr() and pci_is_root_bus() are defined in linux/pci.h.
pci_domain_nr() is a stub function when CONFIG_PCI is not set and
pci_is_root_bus() just returns a reference to a structure member which
is still valid without CONFIG_PCI set.

to_pci_dev() is a macro that expands to container_of.

pci_request_acs() is the only code that gets pulled in from drivers/pci/*.c

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoarm64: select ACPI PCI code only when both features are enabled
Sinan Kaya [Wed, 19 Dec 2018 22:46:57 +0000 (22:46 +0000)]
arm64: select ACPI PCI code only when both features are enabled

ACPI and PCI are no longer coupled to each other. Specify requirements
for both when pulling in code.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoPCI/ACPI: Allow ACPI to be built without CONFIG_PCI set
Sinan Kaya [Wed, 19 Dec 2018 22:46:56 +0000 (22:46 +0000)]
PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set

We are compiling PCI code today for systems with ACPI and no PCI
device present. Remove the useless code and reduce the tight
dependency.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset
Sinan Kaya [Wed, 19 Dec 2018 22:46:55 +0000 (22:46 +0000)]
ACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset

Allow ACPI to be built without PCI support in place.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoACPI: Allow CONFIG_PCI to be unset for reboot
Sinan Kaya [Wed, 19 Dec 2018 22:46:54 +0000 (22:46 +0000)]
ACPI: Allow CONFIG_PCI to be unset for reboot

Make PCI reboot conditional on CONFIG_PCI set on the kernel configuration.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoACPI: Move PCI reset to a separate function
Sinan Kaya [Wed, 19 Dec 2018 22:46:53 +0000 (22:46 +0000)]
ACPI: Move PCI reset to a separate function

Getting ready to factor out PCI specific code when CONFIG_PCI is unset.
Create a acpi_pci_reset() that kick starts PCI specific reset.

Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 20 Dec 2018 07:34:33 +0000 (23:34 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Off by one in netlink parsing of mac802154_hwsim, from Alexander
    Aring.

 2) nf_tables RCU usage fix from Taehee Yoo.

 3) Flow dissector needs nhoff and thoff clamping, from Stanislav
    Fomichev.

 4) Missing sin6_flowinfo initialization in SCTP, from Xin Long.

 5) Spectrev1 in ipmr and ip6mr, from Gustavo A. R. Silva.

 6) Fix r8169 crash when DEBUG_SHIRQ is enabled, from Heiner Kallweit.

 7) Fix SKB leak in rtlwifi, from Larry Finger.

 8) Fix state pruning in bpf verifier, from Jakub Kicinski.

 9) Don't handle completely duplicate fragments as overlapping, from
    Michal Kubecek.

10) Fix memory corruption with macb and 64-bit DMA, from Anssi Hannula.

11) Fix TCP fallback socket release in smc, from Myungho Jung.

12) gro_cells_destroy needs to napi_disable, from Lorenzo Bianconi.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (130 commits)
  rds: Fix warning.
  neighbor: NTF_PROXY is a valid ndm_flag for a dump request
  net: mvpp2: fix the phylink mode validation
  net/sched: cls_flower: Remove old entries from rhashtable
  net/tls: allocate tls context using GFP_ATOMIC
  iptunnel: make TUNNEL_FLAGS available in uapi
  gro_cell: add napi_disable in gro_cells_destroy
  lan743x: Remove MAC Reset from initialization
  net/mlx5e: Remove the false indication of software timestamping support
  net/mlx5: Typo fix in del_sw_hw_rule
  net/mlx5e: RX, Fix wrong early return in receive queue poll
  ipv6: explicitly initialize udp6_addr in udp_sock_create6()
  bnxt_en: Fix ethtool self-test loopback.
  net/rds: remove user triggered WARN_ON in rds_sendmsg
  net/rds: fix warn in rds_message_alloc_sgs
  ath10k: skip sending quiet mode cmd for WCN3990
  mac80211: free skb fraglist before freeing the skb
  nl80211: fix memory leak if validate_pae_over_nl80211() fails
  net/smc: fix TCP fallback socket release
  vxge: ensure data0 is initialized in when fetching firmware version information
  ...

5 years agodrm/ioctl: Fix Spectre v1 vulnerabilities
Gustavo A. R. Silva [Thu, 20 Dec 2018 00:00:15 +0000 (18:00 -0600)]
drm/ioctl: Fix Spectre v1 vulnerabilities

nr is indirectly controlled by user-space, hence leading to a
potential exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

drivers/gpu/drm/drm_ioctl.c:805 drm_ioctl() warn: potential spectre issue 'dev->driver->ioctls' [r]
drivers/gpu/drm/drm_ioctl.c:810 drm_ioctl() warn: potential spectre issue 'drm_ioctls' [r] (local cap)
drivers/gpu/drm/drm_ioctl.c:892 drm_ioctl_flags() warn: potential spectre issue 'drm_ioctls' [r] (local cap)

Fix this by sanitizing nr before using it to index dev->driver->ioctls
and drm_ioctls.

Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&m=152449131114778&w=2

Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181220000015.GA18973@embeddedor
5 years agords: Fix warning.
David S. Miller [Thu, 20 Dec 2018 04:53:18 +0000 (20:53 -0800)]
rds: Fix warning.

>> net/rds/send.c:1109:42: warning: Using plain integer as NULL pointer

Fixes: ea010070d0a7 ("net/rds: fix warn in rds_message_alloc_sgs")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Thu, 20 Dec 2018 02:40:48 +0000 (18:40 -0800)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost

Pull virtio fix from Michael Tsirkin:
 "A last-minute fix for a test build"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  virtio: fix test build after uio.h change

5 years agoMerge tag 'nfs-for-4.20-6' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Linus Torvalds [Thu, 20 Dec 2018 02:38:54 +0000 (18:38 -0800)]
Merge tag 'nfs-for-4.20-6' of git://git.linux-nfs.org/projects/trondmy/linux-nfs

Pull NFS client bugfixes from Trond Myklebust:

 - Fix TCP socket disconnection races by ensuring we always call
   xprt_disconnect_done() after releasing the socket.

 - Fix a race when clearing both XPRT_CONNECTING and XPRT_LOCKED

 - Remove xprt_connect_status() so it does not mask errors that should
   be handled by call_connect_status()

* tag 'nfs-for-4.20-6' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
  SUNRPC: Remove xprt_connect_status()
  SUNRPC: Fix a race with XPRT_CONNECTING
  SUNRPC: Fix disconnection races

5 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Thu, 20 Dec 2018 02:27:58 +0000 (18:27 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:

 -  One nasty use-after-free bugfix, from this merge window however

 -  A less nasty use-after-free that can only zero some words at the
    beginning of the page, and hence is not really exploitable

 -  A NULL pointer dereference

 -  A dummy implementation of an AMD chicken bit MSR that Windows uses
    for some unknown reason

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs
  KVM: X86: Fix NULL deref in vcpu_scan_ioapic
  KVM: Fix UAF in nested posted interrupt processing
  KVM: fix unregistering coalesced mmio zone from wrong bus

5 years agoMerge tag 'dma-mapping-4.20-4' of git://git.infradead.org/users/hch/dma-mapping
Linus Torvalds [Thu, 20 Dec 2018 02:16:17 +0000 (18:16 -0800)]
Merge tag 'dma-mapping-4.20-4' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping fix from Christoph Hellwig:
 "Fix a regression in dma-direct that didn't take account the magic AMD
  memory encryption mask in the DMA address"

* tag 'dma-mapping-4.20-4' of git://git.infradead.org/users/hch/dma-mapping:
  dma-direct: do not include SME mask in the DMA supported check

5 years agoneighbor: NTF_PROXY is a valid ndm_flag for a dump request
David Ahern [Thu, 20 Dec 2018 00:54:38 +0000 (16:54 -0800)]
neighbor: NTF_PROXY is a valid ndm_flag for a dump request

When dumping proxy entries the dump request has NTF_PROXY set in
ndm_flags. strict mode checking needs to be updated to allow this
flag.

Fixes: 51183d233b5a ("net/neighbor: Update neigh_dump_info for strict data checking")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mvpp2: fix the phylink mode validation
Antoine Tenart [Wed, 19 Dec 2018 17:00:12 +0000 (18:00 +0100)]
net: mvpp2: fix the phylink mode validation

The mvpp2_phylink_validate() sets all modes that are supported by a
given PPv2 port. An mistake made the 10000baseT_Full mode being
advertised in some cases when a port wasn't configured to perform at
10G. This patch fixes this.

Fixes: d97c9f4ab000 ("net: mvpp2: 1000baseX support")
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/sched: cls_flower: Remove old entries from rhashtable
Roi Dayan [Wed, 19 Dec 2018 16:07:56 +0000 (18:07 +0200)]
net/sched: cls_flower: Remove old entries from rhashtable

When replacing a rule we add the new rule to the rhashtable
but only remove the old if not in skip_sw.
This commit fix this and remove the old rule anyway.

Fixes: 35cc3cefc4de ("net/sched: cls_flower: Reject duplicated rules also under skip_sw")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/tls: allocate tls context using GFP_ATOMIC
Ganesh Goudar [Wed, 19 Dec 2018 11:48:22 +0000 (17:18 +0530)]
net/tls: allocate tls context using GFP_ATOMIC

create_ctx can be called from atomic context, hence use
GFP_ATOMIC instead of GFP_KERNEL.

[  395.962599] BUG: sleeping function called from invalid context at mm/slab.h:421
[  395.979896] in_atomic(): 1, irqs_disabled(): 0, pid: 16254, name: openssl
[  395.996564] 2 locks held by openssl/16254:
[  396.010492]  #0: 00000000347acb52 (sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.isra.44+0x13b/0x9a0
[  396.029838]  #1: 000000006c9552b5 (device_spinlock){+...}, at: tls_init+0x1d/0x280
[  396.047675] CPU: 5 PID: 16254 Comm: openssl Tainted: G           O      4.20.0-rc6+ #25
[  396.066019] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0c 09/25/2017
[  396.083537] Call Trace:
[  396.096265]  dump_stack+0x5e/0x8b
[  396.109876]  ___might_sleep+0x216/0x250
[  396.123940]  kmem_cache_alloc_trace+0x1b0/0x240
[  396.138800]  create_ctx+0x1f/0x60
[  396.152504]  tls_init+0xbd/0x280
[  396.166135]  tcp_set_ulp+0x191/0x2d0
[  396.180035]  ? tcp_set_ulp+0x2c/0x2d0
[  396.193960]  do_tcp_setsockopt.isra.44+0x148/0x9a0
[  396.209013]  __sys_setsockopt+0x7c/0xe0
[  396.223054]  __x64_sys_setsockopt+0x20/0x30
[  396.237378]  do_syscall_64+0x4a/0x180
[  396.251200]  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Fixes: df9d4a178022 ("net/tls: sleeping function from invalid context")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoiptunnel: make TUNNEL_FLAGS available in uapi
wenxu [Wed, 19 Dec 2018 06:11:15 +0000 (14:11 +0800)]
iptunnel: make TUNNEL_FLAGS available in uapi

ip l add dev tun type gretap external
ip r a 10.0.0.1 encap ip dst 192.168.152.171 id 1000 dev gretap

For gretap Key example when the command set the id but don't set the
TUNNEL_KEY flags. There is no key field in the send packet

In the lwtunnel situation, some TUNNEL_FLAGS should can be set by
userspace

Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agogro_cell: add napi_disable in gro_cells_destroy
Lorenzo Bianconi [Wed, 19 Dec 2018 22:23:00 +0000 (23:23 +0100)]
gro_cell: add napi_disable in gro_cells_destroy

Add napi_disable routine in gro_cells_destroy since starting from
commit c42858eaf492 ("gro_cells: remove spinlock protecting receive
queues") gro_cell_poll and gro_cells_destroy can run concurrently on
napi_skbs list producing a kernel Oops if the tunnel interface is
removed while gro_cell_poll is running. The following Oops has been
triggered removing a vxlan device while the interface is receiving
traffic

[ 5628.948853] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 5628.949981] PGD 0 P4D 0
[ 5628.950308] Oops: 0002 [#1] SMP PTI
[ 5628.950748] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.20.0-rc6+ #41
[ 5628.952940] RIP: 0010:gro_cell_poll+0x49/0x80
[ 5628.955615] RSP: 0018:ffffc9000004fdd8 EFLAGS: 00010202
[ 5628.956250] RAX: 0000000000000000 RBX: ffffe8ffffc08150 RCX: 0000000000000000
[ 5628.957102] RDX: 0000000000000000 RSI: ffff88802356bf00 RDI: ffffe8ffffc08150
[ 5628.957940] RBP: 0000000000000026 R08: 0000000000000000 R09: 0000000000000000
[ 5628.958803] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000040
[ 5628.959661] R13: ffffe8ffffc08100 R14: 0000000000000000 R15: 0000000000000040
[ 5628.960682] FS:  0000000000000000(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000
[ 5628.961616] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5628.962359] CR2: 0000000000000008 CR3: 000000000221c000 CR4: 00000000000006b0
[ 5628.963188] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5628.964034] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5628.964871] Call Trace:
[ 5628.965179]  net_rx_action+0xf0/0x380
[ 5628.965637]  __do_softirq+0xc7/0x431
[ 5628.966510]  run_ksoftirqd+0x24/0x30
[ 5628.966957]  smpboot_thread_fn+0xc5/0x160
[ 5628.967436]  kthread+0x113/0x130
[ 5628.968283]  ret_from_fork+0x3a/0x50
[ 5628.968721] Modules linked in:
[ 5628.969099] CR2: 0000000000000008
[ 5628.969510] ---[ end trace 9d9dedc7181661fe ]---
[ 5628.970073] RIP: 0010:gro_cell_poll+0x49/0x80
[ 5628.972965] RSP: 0018:ffffc9000004fdd8 EFLAGS: 00010202
[ 5628.973611] RAX: 0000000000000000 RBX: ffffe8ffffc08150 RCX: 0000000000000000
[ 5628.974504] RDX: 0000000000000000 RSI: ffff88802356bf00 RDI: ffffe8ffffc08150
[ 5628.975462] RBP: 0000000000000026 R08: 0000000000000000 R09: 0000000000000000
[ 5628.976413] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000040
[ 5628.977375] R13: ffffe8ffffc08100 R14: 0000000000000000 R15: 0000000000000040
[ 5628.978296] FS:  0000000000000000(0000) GS:ffff88803ea00000(0000) knlGS:0000000000000000
[ 5628.979327] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5628.980044] CR2: 0000000000000008 CR3: 000000000221c000 CR4: 00000000000006b0
[ 5628.980929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5628.981736] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5628.982409] Kernel panic - not syncing: Fatal exception in interrupt
[ 5628.983307] Kernel Offset: disabled

Fixes: c42858eaf492 ("gro_cells: remove spinlock protecting receive queues")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agolan743x: Remove MAC Reset from initialization
Bryan Whitehead [Wed, 19 Dec 2018 21:55:15 +0000 (16:55 -0500)]
lan743x: Remove MAC Reset from initialization

The MAC Reset was noticed to erase important EEPROM settings.
It is also unnecessary since a chip wide reset was done earlier
in initialization, and that reset preserves EEPROM settings.

There for this patch removes the unnecessary MAC specific reset.

Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovirtio: fix test build after uio.h change
Michael S. Tsirkin [Wed, 19 Dec 2018 23:21:51 +0000 (18:21 -0500)]
virtio: fix test build after uio.h change

Fixes: d38499530e5 ("fs: decouple READ and WRITE from the block layer ops")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>