OSDN Git Service
Lan Tianyu [Thu, 6 Dec 2018 13:21:07 +0000 (21:21 +0800)]
KVM/VMX: Add hv tlb range flush support
This patch is to register tlb_remote_flush_with_range callback with
hv tlb range flush interface.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lan Tianyu [Thu, 6 Dec 2018 13:21:05 +0000 (21:21 +0800)]
x86/hyper-v: Add HvFlushGuestAddressList hypercall support
Hyper-V provides HvFlushGuestAddressList() hypercall to flush EPT tlb
with specified ranges. This patch is to add the hypercall support.
Reviewed-by: Michael Kelley <mikelley@microsoft.com>
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Lan Tianyu [Thu, 6 Dec 2018 13:21:04 +0000 (21:21 +0800)]
KVM: Add tlb_remote_flush_with_range callback in kvm_x86_ops
Add flush range call back in the kvm_x86_ops and platform can use it
to register its associated function. The parameter "kvm_tlb_range"
accepts a single range and flush list which contains a list of ranges.
Signed-off-by: Lan Tianyu <Tianyu.Lan@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Luwei Kang [Wed, 24 Oct 2018 08:05:16 +0000 (16:05 +0800)]
KVM: x86: Disable Intel PT when VMXON in L1 guest
Currently, Intel Processor Trace do not support tracing in L1 guest
VMX operation(IA32_VMX_MISC[bit 14] is 0). As mentioned in SDM,
on these type of processors, execution of the VMXON instruction will
clears IA32_RTIT_CTL.TraceEn and any attempt to write IA32_RTIT_CTL
causes a general-protection exception (#GP).
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:15 +0000 (16:05 +0800)]
KVM: x86: Set intercept for Intel PT MSRs read/write
To save performance overhead, disable intercept Intel PT MSRs
read/write when Intel PT is enabled in guest.
MSR_IA32_RTIT_CTL is an exception that will always be intercepted.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:14 +0000 (16:05 +0800)]
KVM: x86: Implement Intel PT MSRs read/write emulation
This patch implement Intel Processor Trace MSRs read/write
emulation.
Intel PT MSRs read/write need to be emulated when Intel PT
MSRs is intercepted in guest and during live migration.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Luwei Kang [Wed, 24 Oct 2018 08:05:13 +0000 (16:05 +0800)]
KVM: x86: Introduce a function to initialize the PT configuration
Initialize the Intel PT configuration when cpuid update.
Include cpuid inforamtion, rtit_ctl bit mask and the number of
address ranges.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:12 +0000 (16:05 +0800)]
KVM: x86: Add Intel PT context switch for each vcpu
Load/Store Intel Processor Trace register in context switch.
MSR IA32_RTIT_CTL is loaded/stored automatically from VMCS.
In Host-Guest mode, we need load/resore PT MSRs only when PT
is enabled in guest.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:11 +0000 (16:05 +0800)]
KVM: x86: Add Intel Processor Trace cpuid emulation
Expose Intel Processor Trace to guest only when
the PT works in Host-Guest mode.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:10 +0000 (16:05 +0800)]
KVM: x86: Add Intel PT virtualization work mode
Intel Processor Trace virtualization can be work in one
of 2 possible modes:
a. System-Wide mode (default):
When the host configures Intel PT to collect trace packets
of the entire system, it can leave the relevant VMX controls
clear to allow VMX-specific packets to provide information
across VMX transitions.
KVM guest will not aware this feature in this mode and both
host and KVM guest trace will output to host buffer.
b. Host-Guest mode:
Host can configure trace-packet generation while in
VMX non-root operation for guests and root operation
for native executing normally.
Intel PT will be exposed to KVM guest in this mode, and
the trace output to respective buffer of host and guest.
In this mode, tht status of PT will be saved and disabled
before VM-entry and restored after VM-exit if trace
a virtual machine.
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Luwei Kang [Wed, 24 Oct 2018 08:05:09 +0000 (16:05 +0800)]
perf/x86/intel/pt: add new capability for Intel PT
This adds support for "output to Trace Transport subsystem"
capability of Intel PT. It means that PT can output its
trace to an MMIO address range rather than system memory buffer.
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Luwei Kang [Wed, 24 Oct 2018 08:05:08 +0000 (16:05 +0800)]
perf/x86/intel/pt: Add new bit definitions for PT MSRs
Add bit definitions for Intel PT MSRs to support trace output
directed to the memeory subsystem and holds a count if packet
bytes that have been sent out.
These are required by the upcoming PT support in KVM guests
for MSRs read/write emulation.
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Luwei Kang [Wed, 24 Oct 2018 08:05:07 +0000 (16:05 +0800)]
perf/x86/intel/pt: Introduce intel_pt_validate_cap()
intel_pt_validate_hw_cap() validates whether a given PT capability is
supported by the hardware. It checks the PT capability array which
reflects the capabilities of the hardware on which the code is executed.
For setting up PT for KVM guests this is not correct as the capability
array for the guest can be different from the host array.
Provide a new function to check against a given capability array.
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:06 +0000 (16:05 +0800)]
perf/x86/intel/pt: Export pt_cap_get()
pt_cap_get() is required by the upcoming PT support in KVM guests.
Export it and move the capabilites enum to a global header.
As a global functions, "pt_*" is already used for ptrace and
other things, so it makes sense to use "intel_pt_*" as a prefix.
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Chao Peng [Wed, 24 Oct 2018 08:05:05 +0000 (16:05 +0800)]
perf/x86/intel/pt: Move Intel PT MSRs bit defines to global header
The Intel Processor Trace (PT) MSR bit defines are in a private
header. The upcoming support for PT virtualization requires these defines
to be accessible from KVM code.
Move them to the global MSR header file.
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:12 +0000 (14:57 +0100)]
kvm: selftests: aarch64: dirty_log_test: support greater than 40-bit IPAs
When KVM has KVM_CAP_ARM_VM_IPA_SIZE we can test with > 40-bit IPAs by
using the 'type' field of KVM_CREATE_VM.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:11 +0000 (14:57 +0100)]
kvm: selftests: add pa-48/va-48 VM modes
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:10 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: improve mode param management
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:09 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: reset guest test phys offset
We need to reset the offset for each mode as it will change
depending on the number of guest physical address bits.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:08 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: always use -t
There's no reason not to always test the topmost physical
addresses, and if the user wants to try lower addresses
then '-p' (used to be '-o before this patch) can be used.
Let's remove the '-t' option and just always do what it did.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:07 +0000 (14:57 +0100)]
kvm: selftests: dirty_log_test: don't identity map the test mem
It isn't necessary and can even cause problems when testing high
guest physical addresses. This patch leaves the test memory id-
mapped by default, but when using '-t' the test memory virtual
addresses stay the same even though the physical addresses switch
to the topmost valid addresses.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Andrew Jones [Tue, 6 Nov 2018 13:57:06 +0000 (14:57 +0100)]
kvm: selftests: x86_64: dirty_log_test: fix -t
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Wei Yang [Mon, 5 Nov 2018 06:45:03 +0000 (14:45 +0800)]
KVM: fix some typos
Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
[Preserved the iff and a probably intentional weird bracket notation.
Also dropped the style change to make a single-purpose patch. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Peng Hao [Fri, 2 Nov 2018 09:05:17 +0000 (17:05 +0800)]
x86/kvmclock: convert to SPDX identifiers
Update the verbose license text with the matching SPDX
license identifier.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
[Changed deprecated GPL-2.0+ to GPL-2.0-or-later. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Sean Christopherson [Mon, 5 Nov 2018 18:44:32 +0000 (10:44 -0800)]
KVM: x86: Remove KF() macro placeholder
Although well-intentioned, keeping the KF() definition as a hint for
handling scattered CPUID features may be counter-productive. Simply
redefining the bit position only works for directly manipulating the
guest's CPUID leafs, e.g. it doesn't make guest_cpuid_has() magically
work. Taking an alternative approach, e.g. ensuring the bit position
is identical between the Linux-defined and hardware-defined features,
may be a simpler and/or more effective method of exposing scattered
features to the guest.
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Jim Mattson [Fri, 9 Nov 2018 17:35:11 +0000 (09:35 -0800)]
kvm: vmx: Allow guest read access to IA32_TSC
Let the guest read the IA32_TSC MSR with the generic RDMSR instruction
as well as the specific RDTSC(P) instructions. Note that the hardware
applies the TSC multiplier and offset (when applicable) to the result of
RDMSR(IA32_TSC), just as it does to the result of RDTSC(P).
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Reviewed-by: Liran Alon <liran.alon@oracle.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Jim Mattson [Mon, 26 Nov 2018 19:22:32 +0000 (11:22 -0800)]
kvm: nVMX: NMI-window and interrupt-window exiting should wake L2 from HLT
According to the SDM, "NMI-window exiting" VM-exits wake a logical
processor from the same inactive states as would an NMI and
"interrupt-window exiting" VM-exits wake a logical processor from the
same inactive states as would an external interrupt. Specifically, they
wake a logical processor from the shutdown state and from the states
entered using the HLT and MWAIT instructions.
Fixes:
6dfacadd5858 ("KVM: nVMX: Add support for activity state HLT")
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
[Squashed comments of two Jim's patches and used the simplified code
hunk provided by Sean. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Tambe, William [Tue, 13 Nov 2018 16:51:20 +0000 (16:51 +0000)]
KVM: nSVM: Fix nested guest support for PAUSE filtering.
Currently, the nested guest's PAUSE intercept intentions are not being
honored. Instead, since the L0 hypervisor's pause_filter_count and
pause_filter_thresh values are still in place, these values are used
instead of those programmed in the VMCB by the L1 hypervisor.
To honor the desired PAUSE intercept support of the L1 hypervisor, the L0
hypervisor must use the PAUSE filtering fields of the L1 hypervisor. This
requires saving and restoring of both the L0 and L1 hypervisor's PAUSE
filtering fields.
Signed-off-by: William Tambe <william.tambe@amd.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Jim Mattson [Fri, 14 Dec 2018 22:34:43 +0000 (14:34 -0800)]
kvm: Change offset in kvm_write_guest_offset_cached to unsigned
Since the offset is added directly to the hva from the
gfn_to_hva_cache, a negative offset could result in an out of bounds
write. The existing BUG_ON only checks for addresses beyond the end of
the gfn_to_hva_cache, not for addresses before the start of the
gfn_to_hva_cache.
Note that all current call sites have non-negative offsets.
Fixes:
4ec6e8636256 ("kvm: Introduce kvm_write_guest_offset_cached()")
Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Cfir Cohen <cfir@google.com>
Reviewed-by: Peter Shier <pshier@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Jim Mattson [Mon, 17 Dec 2018 21:53:33 +0000 (13:53 -0800)]
kvm: Disallow wraparound in kvm_gfn_to_hva_cache_init
Previously, in the case where (gpa + len) wrapped around, the entire
region was not validated, as the comment claimed. It doesn't actually
seem that wraparound should be allowed here at all.
Furthermore, since some callers don't check the return code from this
function, it seems prudent to clear ghc->memslot in the event of an
error.
Fixes:
8f964525a121f ("KVM: Allow cross page reads and writes from cached translations.")
Reported-by: Cfir Cohen <cfir@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
Reviewed-by: Cfir Cohen <cfir@google.com>
Reviewed-by: Marc Orr <marcorr@google.com>
Cc: Andrew Honig <ahonig@google.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
YueHaibing [Tue, 18 Dec 2018 01:01:49 +0000 (01:01 +0000)]
KVM: VMX: Remove duplicated include from vmx.c
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Vitaly Kuznetsov [Wed, 19 Dec 2018 11:15:18 +0000 (12:15 +0100)]
selftests: kvm: report failed stage when exit reason is unexpected
When we get a report like
==== Test Assertion Failure ====
x86_64/state_test.c:157: run->exit_reason == KVM_EXIT_IO
pid=955 tid=955 - Success
1 0x0000000000401350: main at state_test.c:154
2 0x00007fc31c9e9412: ?? ??:0
3 0x000000000040159d: _start at ??:?
Unexpected exit reason: 8 (SHUTDOWN),
it is not obvious which particular stage failed. Add the info.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Vitaly Kuznetsov [Wed, 19 Dec 2018 11:06:13 +0000 (12:06 +0100)]
KVM: x86: svm: report MSR_IA32_MCG_EXT_CTL as unsupported
AMD doesn't seem to implement MSR_IA32_MCG_EXT_CTL and svm code in kvm
knows nothing about it, however, this MSR is among emulated_msrs and
thus returned with KVM_GET_MSR_INDEX_LIST. The consequent KVM_GET_MSRS,
of course, fails.
Report the MSR as unsupported to not confuse userspace.
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Paolo Bonzini [Fri, 21 Dec 2018 10:25:59 +0000 (11:25 +0100)]
KVM: x86: fix size of x86_fpu_cache objects
The memory allocation in
b666a4b69739 ("kvm: x86: Dynamically allocate
guest_fpu", 2018-11-06) is wrong, there are other members in struct fpu
before the fpregs_state union and the patch should be doing something
similar to the code in fpu__init_task_struct_size. It's enough to run
a guest and then rmmod kvm to see slub errors which are actually caused
by memory corruption.
For now let's revert it to sizeof(struct fpu), which is conservative.
I have plans to move fsave/fxsave/xsave directly in KVM, without using
the kernel FPU helpers, and once it's done, the size of the object in
the cache will be something like kvm_xstate_size.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Mantas Mikulėnas [Fri, 21 Dec 2018 09:04:26 +0000 (01:04 -0800)]
Input: synaptics - enable SMBus for HP EliteBook 840 G4
dmesg reports that "Your touchpad (PNP: SYN3052 SYN0100 SYN0002 PNP0f13)
says it can support a different bus."
I've tested the offered psmouse.synaptics_intertouch=1 with 4.18.x and
4.19.x and it seems to work well. No problems seen with suspend/resume.
Also, it appears that RMI/SMBus mode is actually required for 3-4 finger
multitouch gestures to work -- otherwise they are not reported at all.
Information from dmesg in both modes:
psmouse serio3: synaptics: Touchpad model: 1, fw: 8.2, id: 0x1e2b1,
caps: 0xf00123/0x840300/0x2e800/0x0, board id: 3139, fw id:
2000742
psmouse serio3: synaptics: Trying to set up SMBus access
rmi4_smbus 6-002c: registering SMbus-connected sensor
rmi4_f01 rmi4-00.fn01: found RMI device,
manufacturer: Synaptics, product: TM3139-001, fw id:
2000742
Signed-off-by: Mantas Mikulėnas <grawity@gmail.com>
Reviewed-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Rafael J. Wysocki [Fri, 21 Dec 2018 09:07:37 +0000 (10:07 +0100)]
Merge branches 'pm-devfreq', 'pm-avs' and 'pm-tools'
* pm-devfreq:
PM / devfreq: add devfreq_suspend/resume() functions
PM / devfreq: add support for suspend/resume of a devfreq device
PM / devfreq: refactor set_target frequency function
* pm-avs:
PM / AVS: SmartReflex: Switch to SPDX Licence ID
PM / AVS: SmartReflex: NULL check before some freeing functions is not needed
PM / AVS: SmartReflex: remove unused function
* pm-tools:
tools/power/x86/intel_pstate_tracer: Fix non root execution for post processing a trace file
tools/power turbostat: consolidate duplicate model numbers
tools/power turbostat: fix goldmont C-state limit decoding
cpupower : Auto-completion for cpupower tool
tools/power turbostat: reduce debug output
tools/power turbosat: fix AMD APIC-id output
Rafael J. Wysocki [Fri, 21 Dec 2018 09:06:44 +0000 (10:06 +0100)]
Merge branches 'pm-core', 'pm-qos', 'pm-domains' and 'pm-sleep'
* pm-core:
PM-runtime: Switch autosuspend over to using hrtimers
* pm-qos:
PM / QoS: Change to use DEFINE_SHOW_ATTRIBUTE macro
* pm-domains:
PM / Domains: remove define_genpd_open_function() and define_genpd_debugfs_fops()
* pm-sleep:
PM / sleep: convert to DEFINE_SHOW_ATTRIBUTE
Rafael J. Wysocki [Fri, 21 Dec 2018 09:06:18 +0000 (10:06 +0100)]
Merge branch 'pm-opp'
* pm-opp:
PM / Domains: Propagate performance state updates
PM / Domains: Factorize dev_pm_genpd_set_performance_state()
PM / Domains: Save OPP table pointer in genpd
OPP: Don't return 0 on error from of_get_required_opp_performance_state()
OPP: Add dev_pm_opp_xlate_performance_state() helper
OPP: Improve _find_table_of_opp_np()
PM / Domains: Make genpd performance states orthogonal to the idlestates
OPP: Fix missing debugfs supply directory for OPPs
OPP: Use opp_table->regulators to verify no regulator case
OPP: Remove of_dev_pm_opp_find_required_opp()
OPP: Rename and relocate of_genpd_opp_to_performance_state()
OPP: Configure all required OPPs
OPP: Add dev_pm_opp_{set|put}_genpd_virt_dev() helper
PM / Domains: Add genpd_opp_to_performance_state()
OPP: Populate OPPs from "required-opps" property
OPP: Populate required opp tables from "required-opps" property
OPP: Separate out custom OPP handler specific code
OPP: Identify and mark genpd OPP tables
PM / Domains: Rename genpd virtual devices as virt_dev
Rafael J. Wysocki [Fri, 21 Dec 2018 09:06:06 +0000 (10:06 +0100)]
Merge branches 'pm-cpuidle', 'pm-cpufreq' and 'pm-cpufreq-sched'
* pm-cpuidle:
cpuidle: Add 'above' and 'below' idle state metrics
cpuidle: big.LITTLE: fix refcount leak
cpuidle: Add cpuidle.governor= command line parameter
cpuidle: poll_state: Disregard disable idle states
Documentation: admin-guide: PM: Add cpuidle document
* pm-cpufreq:
cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver
dt-bindings: cpufreq: Introduce QCOM cpufreq firmware bindings
cpufreq: nforce2: Remove meaningless return
cpufreq: ia64: Remove unused header files
cpufreq: imx6q: save one condition block for normal case of nvmem read
cpufreq: imx6q: remove unused code
cpufreq: pmac64: add of_node_put()
cpufreq: powernv: add of_node_put()
Documentation: intel_pstate: Clarify coordination of P-State limits
cpufreq: intel_pstate: Force HWP min perf before offline
cpufreq: s3c24xx: Change to use DEFINE_SHOW_ATTRIBUTE macro
* pm-cpufreq-sched:
sched/cpufreq: Add the SPDX tags
Rafael J. Wysocki [Fri, 21 Dec 2018 09:04:23 +0000 (10:04 +0100)]
Merge branch 'acpi-pci'
* acpi-pci:
ACPI: Make PCI slot detection driver depend on PCI
ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set
arm64: select ACPI PCI code only when both features are enabled
PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set
ACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset
ACPI: Allow CONFIG_PCI to be unset for reboot
ACPI: Move PCI reset to a separate function
Rafael J. Wysocki [Fri, 21 Dec 2018 09:04:14 +0000 (10:04 +0100)]
Merge branches 'acpi-tables', 'acpi-soc', 'acpi-apei' and 'acpi-misc'
* acpi-tables:
ACPI / tables: Add an ifdef around amlcode and dsdt_amlcode
ACPI / tables: add DSDT AmlCode new declaration name support
ACPI: SPCR: Consider baud rate 0 as preconfigured state
* acpi-soc:
ACPI / LPSS: Ignore acpi_device_fix_up_power() return value
ACPI / APD: Add clock frequency for Hisilicon Hip08 SPI controller
* acpi-apei:
ACPI/APEI: Clear GHES block_status before panic()
ACPI, APEI, EINJ: Change to use DEFINE_SHOW_ATTRIBUTE macro
* acpi-misc:
ACPI: fix acpi_find_child_device() invocation in acpi_preset_companion()
Rafael J. Wysocki [Fri, 21 Dec 2018 09:03:16 +0000 (10:03 +0100)]
Merge branch 'acpica'
* acpica:
ACPICA: Update version to
20181213
ACPICA: change coding style to match ACPICA, no functional change
ACPICA: Debug output: Add option to display method/object evaluation
ACPICA: disassembler: disassemble OEMx tables as AML
ACPICA: Add "Windows 2018.2" string in the _OSI support
ACPICA: Expressions in package elements are not supported
ACPICA: Update buffer-to-string conversions
ACPICA: add comments, no functional change
ACPICA: Remove defines that use deprecated flag
ACPICA: Add "Windows 2018" string in the _OSI support
ACPICA: Update version to
20181031
ACPICA: iASL: Enhance error detection
ACPICA: iASL: adding definition and disassembly for TPM2 revision 3
ACPICA: Use %d for signed int print formatting instead of %u
ACPICA: Debugger: refactor to fix unused variable warning
Benjamin Tissoires [Fri, 21 Dec 2018 08:42:38 +0000 (00:42 -0800)]
Input: elantech - disable elan-i2c for P52 and P72
The current implementation of elan_i2c is known to not support those
2 laptops.
A proper fix is to tweak both elantech and elan_i2c to transmit the
correct information from PS/2, which would make a bad candidate for
stable.
So to give us some time for fixing the root of the problem, disable
elan_i2c for the devices we know are not behaving properly.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1803600
Link: https://bugs.archlinux.org/task/59714
Fixes:
df077237cf55 Input: elantech - detect new ICs and setup Host Notify for them
Cc: stable@vger.kernel.org # v4.18+
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com>
Acked-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Uwe Kleine-König [Mon, 17 Dec 2018 08:43:13 +0000 (09:43 +0100)]
gpio: mvebu: only fail on missing clk if pwm is actually to be used
The gpio IP on Armada 370 at offset 0x18180 has neither a clk nor pwm
registers. So there is no need for a clk as the pwm isn't used anyhow.
So only check for the clk in the presence of the pwm registers. This fixes
a failure to probe the gpio driver for the above mentioned gpio device.
Fixes:
757642f9a584 ("gpio: mvebu: Add limited PWM support")
Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Gregory CLEMENT <gregory.clement@bootlin.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Christophe Leroy [Fri, 7 Dec 2018 13:07:55 +0000 (13:07 +0000)]
gpio: max7301: fix driver for use with CONFIG_VMAP_STACK
spi_read() and spi_write() require DMA-safe memory. When
CONFIG_VMAP_STACK is selected, those functions cannot be used
with buffers on stack.
This patch replaces calls to spi_read() and spi_write() by
spi_write_then_read() which doesn't require DMA-safe buffers.
Fixes:
0c36ec314735 ("gpio: gpio driver for max7301 SPI GPIO expander")
Cc: <stable@vger.kernel.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Tony Lindgren [Fri, 7 Dec 2018 19:08:29 +0000 (11:08 -0800)]
gpio: gpio-omap: Revert deferred wakeup quirk handling for regressions
Commit
ec0daae685b2 ("gpio: omap: Add level wakeup handling for omap4
based SoCs") attempted to fix omap4 GPIO wakeup handling as it was
blocking deeper SoC idle states. However this caused a regression for
GPIOs during runtime having over second long latencies for Ethernet
GPIO interrupt as reportedy by Russell King <rmk+kernel@armlinux.org.uk>.
Let's fix this issue by doing a partial revert of the breaking commit.
We still want to keep the quirk handling around as it is also used for
OMAP_GPIO_QUIRK_IDLE_REMOVE_TRIGGER.
The real fix for omap4 GPIO wakeup handling involves fixes for
omap_set_gpio_trigger() and omap_gpio_unmask_irq() and will be posted
separately. And we must keep the wakeup bit enabled during runtime
because of module doing clock autogating with autoidle configured.
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Fixes:
ec0daae685b2 ("gpio: omap: Add level wakeup handling for omap4
based SoCs")
Cc: Aaro Koskinen <aaro.koskinen@iki.fi>
Cc: Grygorii Strashko <grygorii.strashko@ti.com>
Cc: Keerthy <j-keerthy@ti.com>
Cc: Ladislav Michl <ladis@linux-mips.org>
Cc: Russell King <rmk+kernel@armlinux.org.uk>
Cc: Tero Kristo <t-kristo@ti.com>
Signed-off-by: Tony Lindgren <tony@atomide.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:43 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Keep rc bits in shadow pgtable in sync with host
The rc bits contained in ptes are used to track whether a page has been
accessed and whether it is dirty. The accessed bit is used to age a page
and the dirty bit to track whether a page is dirty or not.
Now that we support nested guests there are three ptes which track the
state of the same page:
- The partition-scoped page table in the L1 guest, mapping L2->L1 address
- The partition-scoped page table in the host for the L1 guest, mapping
L1->L0 address
- The shadow partition-scoped page table for the nested guest in the host,
mapping L2->L0 address
The idea is to attempt to keep the rc state of these three ptes in sync,
both when setting and when clearing rc bits.
When setting the bits we achieve consistency by:
- Initially setting the bits in the shadow page table as the 'and' of the
other two.
- When updating in software the rc bits in the shadow page table we
ensure the state is consistent with the other two locations first, and
update these before reflecting the change into the shadow page table.
i.e. only set the bits in the L2->L0 pte if also set in both the
L2->L1 and the L1->L0 pte.
When clearing the bits we achieve consistency by:
- The rc bits in the shadow page table are only cleared when discarding
a pte, and we don't need to record this as if either bit is set then
it must also be set in the pte mapping L1->L0.
- When L1 clears an rc bit in the L2->L1 mapping it __should__ issue a
tlbie instruction
- This means we will discard the pte from the shadow page table
meaning the mapping will have to be setup again.
- When setup the pte again in the shadow page table we will ensure
consistency with the L2->L1 pte.
- When the host clears an rc bit in the L1->L0 mapping we need to also
clear the bit in any ptes in the shadow page table which map the same
gfn so we will be notified if a nested guest accesses the page.
This case is what this patch specifically concerns.
- We can search the nest_rmap list for that given gfn and clear the
same bit from all corresponding ptes in shadow page tables.
- If a nested guest causes either of the rc bits to be set by software
in future then we will update the L1->L0 pte and maintain consistency.
With the process outlined above we aim to maintain consistency of the 3
pte locations where we track rc for a given guest page.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:42 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Introduce kvmhv_update_nest_rmap_rc_list()
Introduce a function kvmhv_update_nest_rmap_rc_list() which for a given
nest_rmap list will traverse it, find the corresponding pte in the shadow
page tables, and if it still maps the same host page update the rc bits
accordingly.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:41 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Apply combination of host and l1 pte rc for nested guest
The shadow page table contains ptes for translations from nested guest
address to host address. Currently when creating these ptes we take the
rc bits from the pte for the L1 guest address to host address
translation. This is incorrect as we must also factor in the rc bits
from the pte for the nested guest address to L1 guest address
translation (as contained in the L1 guest partition table for the nested
guest).
By not calculating these bits correctly L1 may not have been correctly
notified when it needed to update its rc bits in the partition table it
maintains for its nested guest.
Modify the code so that the rc bits in the resultant pte for the L2->L0
translation are the 'and' of the rc bits in the L2->L1 pte and the L1->L0
pte, also accounting for whether this was a write access when setting
the dirty bit.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:40 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Align gfn to L1 page size when inserting nest-rmap entry
Nested rmap entries are used to store the translation from L1 gpa to L2
gpa when entries are inserted into the shadow (nested) page tables. This
rmap list is located by indexing the rmap array in the memslot by L1
gfn. When we come to search for these entries we only know the L1 page size
(which could be PAGE_SIZE, 2M or a 1G page) and so can only select a gfn
aligned to that size. This means that when we insert the entry, so we can
find it later, we need to align the gfn we use to select the rmap list
in which to insert the entry to L1 page size as well.
By not doing this we were missing nested rmap entries when modifying L1
ptes which were for a page also passed through to an L2 guest.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Suraj Jitindar Singh [Fri, 21 Dec 2018 03:28:39 +0000 (14:28 +1100)]
KVM: PPC: Book3S HV: Hold kvm->mmu_lock across updating nested pte rc bits
We already hold the kvm->mmu_lock spin lock across updating the rc bits
in the pte for the L1 guest. Continue to hold the lock across updating
the rc bits in the pte for the nested guest as well to prevent
invalidations from occurring.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Paul Mackerras <paulus@ozlabs.org>
Eric Dumazet [Thu, 20 Dec 2018 23:28:56 +0000 (15:28 -0800)]
tcp: fix a race in inet_diag_dump_icsk()
Alexei reported use after frees in inet_diag_dump_icsk() [1]
Because we use refcount_set() when various sockets are setup and
inserted into ehash, we also need to make sure inet_diag_dump_icsk()
wont race with the refcount_set() operations.
Jonathan Lemon sent a patch changing net_twsk_hashdance() but
other spots would need risky changes.
Instead, fix inet_diag_dump_icsk() as this bug came with
linux-4.10 only.
[1] Quoting Alexei :
First something iterating over sockets finds already freed tw socket:
refcount_t: increment on 0; use-after-free.
WARNING: CPU: 2 PID: 2738 at lib/refcount.c:153 refcount_inc+0x26/0x30
RIP: 0010:refcount_inc+0x26/0x30
RSP: 0018:
ffffc90004c8fbc0 EFLAGS:
00010282
RAX:
000000000000002b RBX:
0000000000000000 RCX:
0000000000000000
RDX:
ffff88085ee9d680 RSI:
ffff88085ee954c8 RDI:
ffff88085ee954c8
RBP:
ffff88010ecbd2c0 R08:
0000000000000000 R09:
000000000000174c
R10:
ffffffff81e7c5a0 R11:
0000000000000000 R12:
0000000000000000
R13:
ffff8806ba9bf210 R14:
ffffffff82304600 R15:
ffff88010ecbd328
FS:
00007f81f5a7d700(0000) GS:
ffff88085ee80000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007f81e2a95000 CR3:
000000069b2eb006 CR4:
00000000003606e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
inet_diag_dump_icsk+0x2b3/0x4e0 [inet_diag] // sock_hold(sk); in net/ipv4/inet_diag.c:1002
? kmalloc_large_node+0x37/0x70
? __kmalloc_node_track_caller+0x1cb/0x260
? __alloc_skb+0x72/0x1b0
? __kmalloc_reserve.isra.40+0x2e/0x80
__inet_diag_dump+0x3b/0x80 [inet_diag]
netlink_dump+0x116/0x2a0
netlink_recvmsg+0x205/0x3c0
sock_read_iter+0x89/0xd0
__vfs_read+0xf7/0x140
vfs_read+0x8a/0x140
SyS_read+0x3f/0xa0
do_syscall_64+0x5a/0x100
then a minute later twsk timer fires and hits two bad refcnts
for this freed socket:
refcount_t: decrement hit 0; leaking memory.
WARNING: CPU: 31 PID: 0 at lib/refcount.c:228 refcount_dec+0x2e/0x40
Modules linked in:
RIP: 0010:refcount_dec+0x2e/0x40
RSP: 0018:
ffff88085f5c3ea8 EFLAGS:
00010296
RAX:
000000000000002c RBX:
ffff88010ecbd2c0 RCX:
000000000000083f
RDX:
0000000000000000 RSI:
00000000000000f6 RDI:
000000000000003f
RBP:
ffffc90003c77280 R08:
0000000000000000 R09:
00000000000017d3
R10:
ffffffff81e7c5a0 R11:
0000000000000000 R12:
ffffffff82ad2d80
R13:
ffffffff8182de00 R14:
ffff88085f5c3ef8 R15:
0000000000000000
FS:
0000000000000000(0000) GS:
ffff88085f5c0000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fbe42685250 CR3:
0000000002209001 CR4:
00000000003606e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<IRQ>
inet_twsk_kill+0x9d/0xc0 // inet_twsk_bind_unhash(tw, hashinfo);
call_timer_fn+0x29/0x110
run_timer_softirq+0x36b/0x3a0
refcount_t: underflow; use-after-free.
WARNING: CPU: 31 PID: 0 at lib/refcount.c:187 refcount_sub_and_test+0x46/0x50
RIP: 0010:refcount_sub_and_test+0x46/0x50
RSP: 0018:
ffff88085f5c3eb8 EFLAGS:
00010296
RAX:
0000000000000026 RBX:
ffff88010ecbd2c0 RCX:
000000000000083f
RDX:
0000000000000000 RSI:
00000000000000f6 RDI:
000000000000003f
RBP:
ffff88010ecbd358 R08:
0000000000000000 R09:
000000000000185b
R10:
ffffffff81e7c5a0 R11:
0000000000000000 R12:
ffff88010ecbd358
R13:
ffffffff8182de00 R14:
ffff88085f5c3ef8 R15:
0000000000000000
FS:
0000000000000000(0000) GS:
ffff88085f5c0000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
00007fbe42685250 CR3:
0000000002209001 CR4:
00000000003606e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<IRQ>
inet_twsk_put+0x12/0x20 // inet_twsk_put(tw);
call_timer_fn+0x29/0x110
run_timer_softirq+0x36b/0x3a0
Fixes:
67db3e4bfbc9 ("tcp: no longer hold ehash lock while calling tcp_get_info()")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: Alexei Starovoitov <ast@kernel.org>
Cc: Jonathan Lemon <jonathan.lemon@gmail.com>
Acked-by: Jonathan Lemon <jonathan.lemon@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Thu, 20 Dec 2018 13:26:09 +0000 (18:56 +0530)]
MAINTAINERS: update cxgb4 and cxgb3 maintainer
Arjun Vynipadath will be taking over as maintainer from now.
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Herbert Xu [Thu, 20 Dec 2018 13:20:10 +0000 (21:20 +0800)]
ipv6: frags: Fix bogus skb->sk in reassembled packets
It was reported that IPsec would crash when it encounters an IPv6
reassembled packet because skb->sk is non-zero and not a valid
pointer.
This is because skb->sk is now a union with ip_defrag_offset.
This patch fixes this by resetting skb->sk when exiting from
the reassembly code.
Reported-by: Xiumei Mu <xmu@redhat.com>
Fixes:
219badfaade9 ("ipv6: frags: get rid of ip6frag_skb_cb/...")
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
Allan W. Nielsen [Thu, 20 Dec 2018 08:37:17 +0000 (09:37 +0100)]
mscc: Configured MAC entries should be locked.
The MAC table in Ocelot supports auto aging (normal) and static entries.
MAC entries that is manually configured should be static and not subject
to aging.
Fixes:
a556c76adc05 ("net: mscc: Add initial Ocelot switch support")
Signed-off-by: Allan Nielsen <allan.nielsen@microchip.com>
Reviewed-by: Steen Hegelund <steen.hegelund@microchip.com>
Signed-off-by: Steen Hegelund <steen.hegelund@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 20 Dec 2018 22:49:56 +0000 (14:49 -0800)]
Merge branch 'i2c/for-current' of git://git./linux/kernel/git/wsa/linux
Pull i2c fixes from Wolfram Sang:
"I2C has a MAINTAINERS update for you, so people will be immediately
pointed to the right person for this previously orphaned driver.
And one of Arnd's build warning fixes for a new driver added this
cycle"
* 'i2c/for-current' of git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux:
i2c: nvidia-gpu: mark resume function as __maybe_unused
MAINTAINERS: add entry for i2c-axxia driver
Linus Torvalds [Thu, 20 Dec 2018 22:17:24 +0000 (14:17 -0800)]
Merge tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs
Pull UBI/UBIFS fixes from Richard Weinberger:
- Kconfig dependency fixes for our new auth feature
- Fix for selecting the right compressor when creating a fs
- Bugfix for a bug in UBIFS's O_TMPFILE implementation
- Refcounting fixes for UBI
* tag 'upstream-4.20-rc7' of git://git.infradead.org/linux-ubifs:
ubifs: Handle re-linking of inodes correctly while recovery
ubi: Do not drop UBI device reference before using
ubi: Put MTD device after it is not used
ubifs: Fix default compression selection in ubifs
ubifs: Fix memory leak on error condition
ubifs: auth: Add CONFIG_KEYS dependency
ubifs: CONFIG_UBIFS_FS_AUTHENTICATION should depend on UBIFS_FS
ubifs: replay: Fix high stack usage
Nathan Chancellor [Thu, 20 Dec 2018 19:38:56 +0000 (12:38 -0700)]
ACPI / tables: Add an ifdef around amlcode and dsdt_amlcode
Clang warns:
drivers/acpi/tables.c:715:14: warning: unused variable 'amlcode'
[-Wunused-variable]
static void *amlcode __attribute__ ((weakref("AmlCode")));
^
drivers/acpi/tables.c:716:14: warning: unused variable 'dsdt_amlcode'
[-Wunused-variable]
static void *dsdt_amlcode __attribute__ ((weakref("dsdt_aml_code")));
^
2 warnings generated.
The only uses of these variables are hiddem behind CONFIG_ACPI_CUSTOM_DSDT
so do the same thing here.
Fixes:
82e4eb4e9653 (ACPI / tables: add DSDT AmlCode new declaration name support)
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Lenny Szubowicz [Wed, 19 Dec 2018 16:50:52 +0000 (11:50 -0500)]
ACPI/APEI: Clear GHES block_status before panic()
In __ghes_panic() clear the block status in the APEI generic
error status block for that generic hardware error source before
calling panic() to prevent a second panic() in the crash kernel
for exactly the same fatal error.
Otherwise ghes_probe(), running in the crash kernel, would see
an unhandled error in the APEI generic error status block and
panic again, thereby precluding any crash dump.
Signed-off-by: Lenny Szubowicz <lszubowi@redhat.com>
Signed-off-by: David Arcari <darcari@redhat.com>
Tested-by: Tyler Baicar <baicar.tyler@gmail.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Paul Burton [Thu, 20 Dec 2018 17:45:43 +0000 (17:45 +0000)]
MIPS: math-emu: Write-protect delay slot emulation pages
Mapping the delay slot emulation page as both writeable & executable
presents a security risk, in that if an exploit can write to & jump into
the page then it can be used as an easy way to execute arbitrary code.
Prevent this by mapping the page read-only for userland, and using
access_process_vm() with the FOLL_FORCE flag to write to it from
mips_dsemul().
This will likely be less efficient due to copy_to_user_page() performing
cache maintenance on a whole page, rather than a single line as in the
previous use of flush_cache_sigtramp(). However this delay slot
emulation code ought not to be running in any performance critical paths
anyway so this isn't really a problem, and we can probably do better in
copy_to_user_page() anyway in future.
A major advantage of this approach is that the fix is small & simple to
backport to stable kernels.
Reported-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Paul Burton <paul.burton@mips.com>
Fixes:
432c6bacbd0c ("MIPS: Use per-mm page to execute branch delay slot instructions")
Cc: stable@vger.kernel.org # v4.8+
Cc: linux-mips@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: Rich Felker <dalias@libc.org>
Cc: David Daney <david.daney@cavium.com>
Daniel Vetter [Thu, 20 Dec 2018 17:13:53 +0000 (18:13 +0100)]
Merge tag 'drm-misc-fixes-2018-12-20' of git://anongit.freedesktop.org/drm/drm-misc into drm-fixes
Fix spectre v1 vuln in drm_ioctl
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
From: Sean Paul <sean@poorly.run>
Link: https://patchwork.freedesktop.org/patch/msgid/20181220165740.GA42344@art_vandelay
Mark Brown [Thu, 20 Dec 2018 16:01:30 +0000 (16:01 +0000)]
Merge remote-tracking branches 'spi/topic/mem' and 'spi/topic/mtd' into spi-next
Mark Brown [Thu, 20 Dec 2018 16:01:28 +0000 (16:01 +0000)]
Merge branch 'spi-4.21' into spi-next
Mark Brown [Thu, 20 Dec 2018 16:01:26 +0000 (16:01 +0000)]
Merge branch 'spi-4.20' into spi-linus
Linus Torvalds [Thu, 20 Dec 2018 15:35:16 +0000 (07:35 -0800)]
Merge tag 'm68k-for-v4.20-tag2' of git://git./linux/kernel/git/geert/linux-m68k
Pull m68k fix from Geert Uytterhoeven:
"Fix memblock-related crashes"
* tag 'm68k-for-v4.20-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/linux-m68k:
m68k: Fix memblock-related crashes
Linus Torvalds [Thu, 20 Dec 2018 15:33:09 +0000 (07:33 -0800)]
Merge tag 'kbuild-fixes-v4.20-2' of git://git./linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild fix from Masahiro Yamada:
"Fix false positive warning/error about missing library for objtool"
* tag 'kbuild-fixes-v4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
kbuild: fix false positive warning/error about missing libelf
Linus Torvalds [Thu, 20 Dec 2018 15:30:37 +0000 (07:30 -0800)]
Merge tag 'char-misc-4.20-rc8' of git://git./linux/kernel/git/gregkh/char-misc
Pull char/misc driver fixes from Greg KH:
"Here are three tiny last-minute driver fixes for 4.20-rc8 that resolve
some reported issues, and one MAINTAINERS file update.
All of them are related to the hyper-v subsystem, it seems people are
actually testing and using it now, which is nice to see :)
The fixes are:
- uio_hv_generic: fix for opening multiple times
- Remove PCI dependancy on hyperv drivers
- return proper error code for an unopened channel.
And Sasha has signed up to help out with the hyperv maintainership.
All of these have been in linux-next for a while with no reported
issues"
* tag 'char-misc-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc:
Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels
x86, hyperv: remove PCI dependency
MAINTAINERS: Patch monkey for the Hyper-V code
uio_hv_generic: set callbacks on open
Linus Torvalds [Thu, 20 Dec 2018 15:29:11 +0000 (07:29 -0800)]
Merge tag 'tty-4.20-rc8' of git://git./linux/kernel/git/gregkh/tty
Pull tty/serial fix from Greg KH:
"Here is a single fix, a revert, for the 8250 serial driver to resolve
a reported problem.
There was some attempted patches to fix the issue, but people are
arguing about them, so reverting the patch to revert back to the 4.19
and older behavior is the best thing to do at this late in the release
cycle.
The revert has been in linux-next with no reported issues"
* tag 'tty-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty:
Revert "serial: 8250: Fix clearing FIFOs in RS485 mode again"
Linus Torvalds [Thu, 20 Dec 2018 15:27:39 +0000 (07:27 -0800)]
Merge tag 'usb-4.20-rc8' of git://git./linux/kernel/git/gregkh/usb
Pull USB fixes and ids from Greg KH:
"Here are some late xhci fixes for 4.20-rc8 as well as a few new device
ids for the option usb-serial driver.
The xhci fixes resolve some many-reported issues and all of these have
been in linux-next for a while with no reported problems"
* tag 'usb-4.20-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb:
USB: xhci: fix 'broken_suspend' placement in struct xchi_hcd
xhci: Don't prevent USB2 bus suspend in state check intended for USB3 only
USB: serial: option: add Telit LN940 series
USB: serial: option: add Fibocom NL668 series
USB: serial: option: add Simcom SIM7500/SIM7600 (MBIM mode)
USB: serial: option: add GosunCn ZTE WeLink ME3630
USB: serial: option: add HP lt4132
Linus Torvalds [Thu, 20 Dec 2018 15:25:31 +0000 (07:25 -0800)]
Merge tag 'mmc-v4.20-rc7' of git://git./linux/kernel/git/ulfh/mmc
Pull MMC fixes from Ulf Hansson:
"MMC core:
- Restore code to allow BKOPS and CACHE ctrl even if no HPI support
- Reset HPI enabled state during re-init
- Use a default minimum timeout when enabling CACHE ctrl
MMC host:
- omap_hsmmc: Fix DMA API warning
- sdhci-tegra: Fix dt parsing of SDMMC pads autocal values
- Correct register accesses when enabling v4 mode"
* tag 'mmc-v4.20-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/ulfh/mmc:
mmc: core: Use a minimum 1600ms timeout when enabling CACHE ctrl
mmc: core: Allow BKOPS and CACHE ctrl even if no HPI support
mmc: core: Reset HPI enabled state during re-init and in case of errors
mmc: omap_hsmmc: fix DMA API warning
mmc: tegra: Fix for SDMMC pads autocal parsing from dt
mmc: sdhci: Fix sdhci_do_enable_v4_mode
Dave Chinner [Thu, 20 Dec 2018 12:23:24 +0000 (23:23 +1100)]
iomap: Revert "fs/iomap.c: get/put the page in iomap_page_create/release()"
This reverts commit
61c6de667263184125d5ca75e894fcad632b0dd3.
The reverted commit added page reference counting to iomap page
structures that are used to track block size < page size state. This
was supposed to align the code with page migration page accounting
assumptions, but what it has done instead is break XFS filesystems.
Every fstests run I've done on sub-page block size XFS filesystems
has since picking up this commit 2 days ago has failed with bad page
state errors such as:
# ./run_check.sh "-m rmapbt=1,reflink=1 -i sparse=1 -b size=1k" "generic/038"
....
SECTION -- xfs
FSTYP -- xfs (debug)
PLATFORM -- Linux/x86_64 test1 4.20.0-rc6-dgc+
MKFS_OPTIONS -- -f -m rmapbt=1,reflink=1 -i sparse=1 -b size=1k /dev/sdc
MOUNT_OPTIONS -- /dev/sdc /mnt/scratch
generic/038 454s ...
run fstests generic/038 at 2018-12-20 18:43:05
XFS (sdc): Unmounting Filesystem
XFS (sdc): Mounting V5 Filesystem
XFS (sdc): Ending clean mount
BUG: Bad page state in process kswapd0 pfn:3a7fa
page:
ffffea0000ccbeb0 count:0 mapcount:0 mapping:
ffff88800d9b6360 index:0x1
flags: 0xfffffc0000000()
raw:
000fffffc0000000 dead000000000100 dead000000000200 ffff88800d9b6360
raw:
0000000000000001 0000000000000000 00000000ffffffff
page dumped because: non-NULL mapping
CPU: 0 PID: 676 Comm: kswapd0 Not tainted 4.20.0-rc6-dgc+ #915
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.1-1 04/01/2014
Call Trace:
dump_stack+0x67/0x90
bad_page.cold.116+0x8a/0xbd
free_pcppages_bulk+0x4bf/0x6a0
free_unref_page_list+0x10f/0x1f0
shrink_page_list+0x49d/0xf50
shrink_inactive_list+0x19d/0x3b0
shrink_node_memcg.constprop.77+0x398/0x690
? shrink_slab.constprop.81+0x278/0x3f0
shrink_node+0x7a/0x2f0
kswapd+0x34b/0x6d0
? node_reclaim+0x240/0x240
kthread+0x11f/0x140
? __kthread_bind_mask+0x60/0x60
ret_from_fork+0x24/0x30
Disabling lock debugging due to kernel taint
....
The failures are from anyway that frees pages and empties the
per-cpu page magazines, so it's not a predictable failure or an easy
to debug failure.
generic/038 is a reliable reproducer of this problem - it has a 9 in
10 failure rate on one of my test machines. Failure on other
machines have been at random points in fstests runs but every run
has ended up tripping this problem. Hence generic/038 was used to
bisect the failure because it was the most reliable failure.
It is too close to the 4.20 release (not to mention holidays) to
try to diagnose, fix and test the underlying cause of the problem,
so reverting the commit is the only option we have right now. The
revert has been tested against a current tot 4.20-rc7+ kernel across
multiple machines running sub-page block size XFs filesystems and
none of the bad page state failures have been seen.
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Cc: Piotr Jaroszynski <pjaroszynski@nvidia.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: William Kucharski <william.kucharski@oracle.com>
Cc: Darrick J. Wong <darrick.wong@oracle.com>
Cc: Brian Foster <bfoster@redhat.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Hoan Nguyen An [Thu, 20 Dec 2018 08:48:42 +0000 (17:48 +0900)]
spi: sh-msiof: Reduce the number of times write to and perform the transmission from FIFO
The current state of the spi-sh-msiof, in master transfer mode: if t-> bits_per_word <= 8,
if the data length is divisible by 4 ((len & 3) = 0), the length of each word will be 32 bits
In case of data length can not be divisible by 4 ((len & 3) != 0), always set each word to be
8 bits, this will increase the number of times that write to FIFO, increasing the number of
times it should be transmitted. Assume that the number of bytes of data length more than 64 bytes,
each transmission will write 64 times into the TFDR then transmit, a maximum one-time
transmission will transmit 64 bytes if each word is 8 bits long.
Switch to setting if t->bits_per_word <= 8, the word length will be 32 bits although the data
length is not divisible by 4, then if leftover, will transmit the balance and the length of each
words is 1 byte. The maximum each can transmit up to 64 x 4 (Data Size = 32 bits (4 bytes)) = 256 bytes.
TMDR2 : Bits 28 to 24 BITLEN1[4:0] Data Size (8 to 32 bits)
Bits 23 to 16 WDLEN1[7:0] Word Count (1 to 64 words)
Signed-off-by: Hoan Nguyen An <na-hoan@jinso.co.jp>
Signed-off-by: Mark Brown <broonie@kernel.org>
Yangtao Li [Sat, 15 Dec 2018 08:21:48 +0000 (03:21 -0500)]
regulator: convert to DEFINE_SHOW_ATTRIBUTE
Use DEFINE_SHOW_ATTRIBUTE macro to simplify the code.
Signed-off-by: Yangtao Li <tiny.windzz@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Axel Lin [Wed, 19 Dec 2018 08:58:04 +0000 (16:58 +0800)]
regulator: mcp16502: Fix missing n_voltages setting
The n_voltages setting is not set, fix it.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Axel Lin [Thu, 20 Dec 2018 13:14:13 +0000 (21:14 +0800)]
regulator: mcp16502: Use #ifdef CONFIG_PM_SLEEP around mcp16502_suspend/resume_noirq
mcp16502_suspend/resume_noirq is only used by SET_NOIRQ_SYSTEM_SLEEP_PM_OPS
when CONFIG_PM_SLEEP is defined.
So use #ifdef CONFIG_PM_SLEEP instead CONFIG_SUSPEND guard.
Signed-off-by: Axel Lin <axel.lin@ingics.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Radim Krčmář [Thu, 20 Dec 2018 13:54:09 +0000 (14:54 +0100)]
Merge tag 'kvm-ppc-next-4.21-1' of git://git./linux/kernel/git/paulus/powerpc
PPC KVM update for 4.21 from Paul Mackerras
The main new feature this time is support in HV nested KVM for passing
a device that is emulated by a level 0 hypervisor and presented to
level 1 as a PCI device through to a level 2 guest using VFIO.
Apart from that there are improvements for migration of radix guests
under HV KVM and some other fixes and cleanups.
Brad Love [Wed, 19 Dec 2018 17:07:01 +0000 (12:07 -0500)]
media: cx23885: only reset DMA on problematic CPUs
It is reported that commit
95f408bbc4e4 ("media: cx23885: Ryzen DMA
related RiSC engine stall fixes") caused regresssions with other CPUs.
Ensure that the quirk will be applied only for the CPUs that
are known to cause problems.
A module option is added for explicit control of the behaviour.
Fixes:
95f408bbc4e4 ("media: cx23885: Ryzen DMA related RiSC engine stall fixes")
Signed-off-by: Brad Love <brad@nextdimension.cc>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Nathan Chancellor [Mon, 10 Dec 2018 23:35:14 +0000 (18:35 -0500)]
media: ddbridge: Move asm includes after linux ones
Without this, cpumask_t and bool are not defined:
In file included from drivers/media/pci/ddbridge/ddbridge-ci.c:19:
In file included from drivers/media/pci/ddbridge/ddbridge.h:22:
./arch/arm/include/asm/irq.h:35:50: error: unknown type name 'cpumask_t'
extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
^
./arch/arm/include/asm/irq.h:36:9: error: unknown type name 'bool'
bool exclude_self);
^
Doing a survey of the kernel tree, this appears to be expected because
'#include <asm/irq.h>' is always after the linux includes.
This also fixes warnings of this variety (with Clang):
In file included from drivers/media/pci/ddbridge/ddbridge-ci.c:19:
In file included from drivers/media/pci/ddbridge/ddbridge.h:56:
In file included from ./include/media/dvb_net.h:22:
In file included from ./include/linux/netdevice.h:50:
In file included from ./include/uapi/linux/neighbour.h:6:
In file included from ./include/linux/netlink.h:9:
In file included from ./include/net/scm.h:11:
In file included from ./include/linux/sched/signal.h:6:
./include/linux/signal.h:87:11: warning: array index 3 is past the end
of the array (which contains 2 elements) [-Warray-bounds]
return (set->sig[3] | set->sig[2] |
^ ~
./arch/arm/include/asm/signal.h:17:2: note: array 'sig' declared here
unsigned long sig[_NSIG_WORDS];
^
Fixes:
b6973637c4cc ("media: ddbridge: remove another duplicate of io.h and sort includes")
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Sinan Kaya [Wed, 19 Dec 2018 22:46:59 +0000 (22:46 +0000)]
ACPI: Make PCI slot detection driver depend on PCI
Since this is ACPI PCI slot detection driver for PCI, it doesn't make sense
to compile this without PCI support in place.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sinan Kaya [Wed, 19 Dec 2018 22:46:58 +0000 (22:46 +0000)]
ACPI/IORT: Stub out ACS functions when CONFIG_PCI is not set
Remove PCI dependent code out of iort.c when CONFIG_PCI is not defined.
A quick search reveals the following functions:
1. pci_request_acs()
2. pci_domain_nr()
3. pci_is_root_bus()
4. to_pci_dev()
Both pci_domain_nr() and pci_is_root_bus() are defined in linux/pci.h.
pci_domain_nr() is a stub function when CONFIG_PCI is not set and
pci_is_root_bus() just returns a reference to a structure member which
is still valid without CONFIG_PCI set.
to_pci_dev() is a macro that expands to container_of.
pci_request_acs() is the only code that gets pulled in from drivers/pci/*.c
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sinan Kaya [Wed, 19 Dec 2018 22:46:57 +0000 (22:46 +0000)]
arm64: select ACPI PCI code only when both features are enabled
ACPI and PCI are no longer coupled to each other. Specify requirements
for both when pulling in code.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sinan Kaya [Wed, 19 Dec 2018 22:46:56 +0000 (22:46 +0000)]
PCI/ACPI: Allow ACPI to be built without CONFIG_PCI set
We are compiling PCI code today for systems with ACPI and no PCI
device present. Remove the useless code and reduce the tight
dependency.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Acked-by: Bjorn Helgaas <bhelgaas@google.com> # PCI parts
Acked-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sinan Kaya [Wed, 19 Dec 2018 22:46:55 +0000 (22:46 +0000)]
ACPICA: Remove PCI bits from ACPICA when CONFIG_PCI is unset
Allow ACPI to be built without PCI support in place.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sinan Kaya [Wed, 19 Dec 2018 22:46:54 +0000 (22:46 +0000)]
ACPI: Allow CONFIG_PCI to be unset for reboot
Make PCI reboot conditional on CONFIG_PCI set on the kernel configuration.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Sinan Kaya [Wed, 19 Dec 2018 22:46:53 +0000 (22:46 +0000)]
ACPI: Move PCI reset to a separate function
Getting ready to factor out PCI specific code when CONFIG_PCI is unset.
Create a acpi_pci_reset() that kick starts PCI specific reset.
Signed-off-by: Sinan Kaya <okaya@kernel.org>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Linus Torvalds [Thu, 20 Dec 2018 07:34:33 +0000 (23:34 -0800)]
Merge git://git./linux/kernel/git/davem/net
Pull networking fixes from David Miller:
1) Off by one in netlink parsing of mac802154_hwsim, from Alexander
Aring.
2) nf_tables RCU usage fix from Taehee Yoo.
3) Flow dissector needs nhoff and thoff clamping, from Stanislav
Fomichev.
4) Missing sin6_flowinfo initialization in SCTP, from Xin Long.
5) Spectrev1 in ipmr and ip6mr, from Gustavo A. R. Silva.
6) Fix r8169 crash when DEBUG_SHIRQ is enabled, from Heiner Kallweit.
7) Fix SKB leak in rtlwifi, from Larry Finger.
8) Fix state pruning in bpf verifier, from Jakub Kicinski.
9) Don't handle completely duplicate fragments as overlapping, from
Michal Kubecek.
10) Fix memory corruption with macb and 64-bit DMA, from Anssi Hannula.
11) Fix TCP fallback socket release in smc, from Myungho Jung.
12) gro_cells_destroy needs to napi_disable, from Lorenzo Bianconi.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (130 commits)
rds: Fix warning.
neighbor: NTF_PROXY is a valid ndm_flag for a dump request
net: mvpp2: fix the phylink mode validation
net/sched: cls_flower: Remove old entries from rhashtable
net/tls: allocate tls context using GFP_ATOMIC
iptunnel: make TUNNEL_FLAGS available in uapi
gro_cell: add napi_disable in gro_cells_destroy
lan743x: Remove MAC Reset from initialization
net/mlx5e: Remove the false indication of software timestamping support
net/mlx5: Typo fix in del_sw_hw_rule
net/mlx5e: RX, Fix wrong early return in receive queue poll
ipv6: explicitly initialize udp6_addr in udp_sock_create6()
bnxt_en: Fix ethtool self-test loopback.
net/rds: remove user triggered WARN_ON in rds_sendmsg
net/rds: fix warn in rds_message_alloc_sgs
ath10k: skip sending quiet mode cmd for WCN3990
mac80211: free skb fraglist before freeing the skb
nl80211: fix memory leak if validate_pae_over_nl80211() fails
net/smc: fix TCP fallback socket release
vxge: ensure data0 is initialized in when fetching firmware version information
...
Gustavo A. R. Silva [Thu, 20 Dec 2018 00:00:15 +0000 (18:00 -0600)]
drm/ioctl: Fix Spectre v1 vulnerabilities
nr is indirectly controlled by user-space, hence leading to a
potential exploitation of the Spectre variant 1 vulnerability.
This issue was detected with the help of Smatch:
drivers/gpu/drm/drm_ioctl.c:805 drm_ioctl() warn: potential spectre issue 'dev->driver->ioctls' [r]
drivers/gpu/drm/drm_ioctl.c:810 drm_ioctl() warn: potential spectre issue 'drm_ioctls' [r] (local cap)
drivers/gpu/drm/drm_ioctl.c:892 drm_ioctl_flags() warn: potential spectre issue 'drm_ioctls' [r] (local cap)
Fix this by sanitizing nr before using it to index dev->driver->ioctls
and drm_ioctls.
Notice that given that speculation windows are large, the policy is
to kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].
[1] https://marc.info/?l=linux-kernel&m=
152449131114778&w=2
Cc: stable@vger.kernel.org
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
Link: https://patchwork.freedesktop.org/patch/msgid/20181220000015.GA18973@embeddedor
David S. Miller [Thu, 20 Dec 2018 04:53:18 +0000 (20:53 -0800)]
rds: Fix warning.
>> net/rds/send.c:1109:42: warning: Using plain integer as NULL pointer
Fixes:
ea010070d0a7 ("net/rds: fix warn in rds_message_alloc_sgs")
Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Linus Torvalds [Thu, 20 Dec 2018 02:40:48 +0000 (18:40 -0800)]
Merge tag 'for_linus' of git://git./linux/kernel/git/mst/vhost
Pull virtio fix from Michael Tsirkin:
"A last-minute fix for a test build"
* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
virtio: fix test build after uio.h change
Linus Torvalds [Thu, 20 Dec 2018 02:38:54 +0000 (18:38 -0800)]
Merge tag 'nfs-for-4.20-6' of git://git.linux-nfs.org/projects/trondmy/linux-nfs
Pull NFS client bugfixes from Trond Myklebust:
- Fix TCP socket disconnection races by ensuring we always call
xprt_disconnect_done() after releasing the socket.
- Fix a race when clearing both XPRT_CONNECTING and XPRT_LOCKED
- Remove xprt_connect_status() so it does not mask errors that should
be handled by call_connect_status()
* tag 'nfs-for-4.20-6' of git://git.linux-nfs.org/projects/trondmy/linux-nfs:
SUNRPC: Remove xprt_connect_status()
SUNRPC: Fix a race with XPRT_CONNECTING
SUNRPC: Fix disconnection races
Linus Torvalds [Thu, 20 Dec 2018 02:27:58 +0000 (18:27 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
- One nasty use-after-free bugfix, from this merge window however
- A less nasty use-after-free that can only zero some words at the
beginning of the page, and hence is not really exploitable
- A NULL pointer dereference
- A dummy implementation of an AMD chicken bit MSR that Windows uses
for some unknown reason
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: x86: Add AMD's EX_CFG to the list of ignored MSRs
KVM: X86: Fix NULL deref in vcpu_scan_ioapic
KVM: Fix UAF in nested posted interrupt processing
KVM: fix unregistering coalesced mmio zone from wrong bus
Linus Torvalds [Thu, 20 Dec 2018 02:16:17 +0000 (18:16 -0800)]
Merge tag 'dma-mapping-4.20-4' of git://git.infradead.org/users/hch/dma-mapping
Pull dma-mapping fix from Christoph Hellwig:
"Fix a regression in dma-direct that didn't take account the magic AMD
memory encryption mask in the DMA address"
* tag 'dma-mapping-4.20-4' of git://git.infradead.org/users/hch/dma-mapping:
dma-direct: do not include SME mask in the DMA supported check
David Ahern [Thu, 20 Dec 2018 00:54:38 +0000 (16:54 -0800)]
neighbor: NTF_PROXY is a valid ndm_flag for a dump request
When dumping proxy entries the dump request has NTF_PROXY set in
ndm_flags. strict mode checking needs to be updated to allow this
flag.
Fixes:
51183d233b5a ("net/neighbor: Update neigh_dump_info for strict data checking")
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Antoine Tenart [Wed, 19 Dec 2018 17:00:12 +0000 (18:00 +0100)]
net: mvpp2: fix the phylink mode validation
The mvpp2_phylink_validate() sets all modes that are supported by a
given PPv2 port. An mistake made the 10000baseT_Full mode being
advertised in some cases when a port wasn't configured to perform at
10G. This patch fixes this.
Fixes:
d97c9f4ab000 ("net: mvpp2: 1000baseX support")
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Roi Dayan [Wed, 19 Dec 2018 16:07:56 +0000 (18:07 +0200)]
net/sched: cls_flower: Remove old entries from rhashtable
When replacing a rule we add the new rule to the rhashtable
but only remove the old if not in skip_sw.
This commit fix this and remove the old rule anyway.
Fixes:
35cc3cefc4de ("net/sched: cls_flower: Reject duplicated rules also under skip_sw")
Signed-off-by: Roi Dayan <roid@mellanox.com>
Reviewed-by: Vlad Buslov <vladbu@mellanox.com>
Acked-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Ganesh Goudar [Wed, 19 Dec 2018 11:48:22 +0000 (17:18 +0530)]
net/tls: allocate tls context using GFP_ATOMIC
create_ctx can be called from atomic context, hence use
GFP_ATOMIC instead of GFP_KERNEL.
[ 395.962599] BUG: sleeping function called from invalid context at mm/slab.h:421
[ 395.979896] in_atomic(): 1, irqs_disabled(): 0, pid: 16254, name: openssl
[ 395.996564] 2 locks held by openssl/16254:
[ 396.010492] #0:
00000000347acb52 (sk_lock-AF_INET){+.+.}, at: do_tcp_setsockopt.isra.44+0x13b/0x9a0
[ 396.029838] #1:
000000006c9552b5 (device_spinlock){+...}, at: tls_init+0x1d/0x280
[ 396.047675] CPU: 5 PID: 16254 Comm: openssl Tainted: G O 4.20.0-rc6+ #25
[ 396.066019] Hardware name: Supermicro X10SRA-F/X10SRA-F, BIOS 2.0c 09/25/2017
[ 396.083537] Call Trace:
[ 396.096265] dump_stack+0x5e/0x8b
[ 396.109876] ___might_sleep+0x216/0x250
[ 396.123940] kmem_cache_alloc_trace+0x1b0/0x240
[ 396.138800] create_ctx+0x1f/0x60
[ 396.152504] tls_init+0xbd/0x280
[ 396.166135] tcp_set_ulp+0x191/0x2d0
[ 396.180035] ? tcp_set_ulp+0x2c/0x2d0
[ 396.193960] do_tcp_setsockopt.isra.44+0x148/0x9a0
[ 396.209013] __sys_setsockopt+0x7c/0xe0
[ 396.223054] __x64_sys_setsockopt+0x20/0x30
[ 396.237378] do_syscall_64+0x4a/0x180
[ 396.251200] entry_SYSCALL_64_after_hwframe+0x49/0xbe
Fixes:
df9d4a178022 ("net/tls: sleeping function from invalid context")
Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
wenxu [Wed, 19 Dec 2018 06:11:15 +0000 (14:11 +0800)]
iptunnel: make TUNNEL_FLAGS available in uapi
ip l add dev tun type gretap external
ip r a 10.0.0.1 encap ip dst 192.168.152.171 id 1000 dev gretap
For gretap Key example when the command set the id but don't set the
TUNNEL_KEY flags. There is no key field in the send packet
In the lwtunnel situation, some TUNNEL_FLAGS should can be set by
userspace
Signed-off-by: wenxu <wenxu@ucloud.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
Lorenzo Bianconi [Wed, 19 Dec 2018 22:23:00 +0000 (23:23 +0100)]
gro_cell: add napi_disable in gro_cells_destroy
Add napi_disable routine in gro_cells_destroy since starting from
commit
c42858eaf492 ("gro_cells: remove spinlock protecting receive
queues") gro_cell_poll and gro_cells_destroy can run concurrently on
napi_skbs list producing a kernel Oops if the tunnel interface is
removed while gro_cell_poll is running. The following Oops has been
triggered removing a vxlan device while the interface is receiving
traffic
[ 5628.948853] BUG: unable to handle kernel NULL pointer dereference at
0000000000000008
[ 5628.949981] PGD 0 P4D 0
[ 5628.950308] Oops: 0002 [#1] SMP PTI
[ 5628.950748] CPU: 0 PID: 9 Comm: ksoftirqd/0 Not tainted 4.20.0-rc6+ #41
[ 5628.952940] RIP: 0010:gro_cell_poll+0x49/0x80
[ 5628.955615] RSP: 0018:
ffffc9000004fdd8 EFLAGS:
00010202
[ 5628.956250] RAX:
0000000000000000 RBX:
ffffe8ffffc08150 RCX:
0000000000000000
[ 5628.957102] RDX:
0000000000000000 RSI:
ffff88802356bf00 RDI:
ffffe8ffffc08150
[ 5628.957940] RBP:
0000000000000026 R08:
0000000000000000 R09:
0000000000000000
[ 5628.958803] R10:
0000000000000001 R11:
0000000000000000 R12:
0000000000000040
[ 5628.959661] R13:
ffffe8ffffc08100 R14:
0000000000000000 R15:
0000000000000040
[ 5628.960682] FS:
0000000000000000(0000) GS:
ffff88803ea00000(0000) knlGS:
0000000000000000
[ 5628.961616] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 5628.962359] CR2:
0000000000000008 CR3:
000000000221c000 CR4:
00000000000006b0
[ 5628.963188] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 5628.964034] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 5628.964871] Call Trace:
[ 5628.965179] net_rx_action+0xf0/0x380
[ 5628.965637] __do_softirq+0xc7/0x431
[ 5628.966510] run_ksoftirqd+0x24/0x30
[ 5628.966957] smpboot_thread_fn+0xc5/0x160
[ 5628.967436] kthread+0x113/0x130
[ 5628.968283] ret_from_fork+0x3a/0x50
[ 5628.968721] Modules linked in:
[ 5628.969099] CR2:
0000000000000008
[ 5628.969510] ---[ end trace
9d9dedc7181661fe ]---
[ 5628.970073] RIP: 0010:gro_cell_poll+0x49/0x80
[ 5628.972965] RSP: 0018:
ffffc9000004fdd8 EFLAGS:
00010202
[ 5628.973611] RAX:
0000000000000000 RBX:
ffffe8ffffc08150 RCX:
0000000000000000
[ 5628.974504] RDX:
0000000000000000 RSI:
ffff88802356bf00 RDI:
ffffe8ffffc08150
[ 5628.975462] RBP:
0000000000000026 R08:
0000000000000000 R09:
0000000000000000
[ 5628.976413] R10:
0000000000000001 R11:
0000000000000000 R12:
0000000000000040
[ 5628.977375] R13:
ffffe8ffffc08100 R14:
0000000000000000 R15:
0000000000000040
[ 5628.978296] FS:
0000000000000000(0000) GS:
ffff88803ea00000(0000) knlGS:
0000000000000000
[ 5628.979327] CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
[ 5628.980044] CR2:
0000000000000008 CR3:
000000000221c000 CR4:
00000000000006b0
[ 5628.980929] DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
[ 5628.981736] DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
[ 5628.982409] Kernel panic - not syncing: Fatal exception in interrupt
[ 5628.983307] Kernel Offset: disabled
Fixes:
c42858eaf492 ("gro_cells: remove spinlock protecting receive queues")
Signed-off-by: Lorenzo Bianconi <lorenzo.bianconi@redhat.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bryan Whitehead [Wed, 19 Dec 2018 21:55:15 +0000 (16:55 -0500)]
lan743x: Remove MAC Reset from initialization
The MAC Reset was noticed to erase important EEPROM settings.
It is also unnecessary since a chip wide reset was done earlier
in initialization, and that reset preserves EEPROM settings.
There for this patch removes the unnecessary MAC specific reset.
Signed-off-by: Bryan Whitehead <Bryan.Whitehead@microchip.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Michael S. Tsirkin [Wed, 19 Dec 2018 23:21:51 +0000 (18:21 -0500)]
virtio: fix test build after uio.h change
Fixes:
d38499530e5 ("fs: decouple READ and WRITE from the block layer ops")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>