OSDN Git Service
Srinivasarao P [Tue, 9 Jan 2018 11:00:10 +0000 (16:30 +0530)]
Merge android-4.4.110 (
5cc8c2e) into msm-4.4
* refs/heads/tmp-
5cc8c2e
Linux 4.4.110
kaiser: Set _PAGE_NX only if supported
x86/kasan: Clear kasan_zero_page after TLB flush
x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap
x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
KPTI: Report when enabled
KPTI: Rename to PAGE_TABLE_ISOLATION
x86/kaiser: Move feature detection up
kaiser: disabled on Xen PV
x86/kaiser: Reenable PARAVIRT
x86/paravirt: Dont patch flush_tlb_single
kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
kaiser: asm/tlbflush.h handle noPGE at lower level
kaiser: drop is_atomic arg to kaiser_pagetable_walk()
kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
x86/kaiser: Check boottime cmdline params
x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
kaiser: add "nokaiser" boot option, using ALTERNATIVE
kaiser: fix unlikely error in alloc_ldt_struct()
kaiser: _pgd_alloc() without __GFP_REPEAT to avoid stalls
kaiser: paranoid_entry pass cr3 need to paranoid_exit
kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
kaiser: PCID 0 for kernel and 128 for user
kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
kaiser: enhanced by kernel and user PCIDs
kaiser: vmstat show NR_KAISERTABLE as nr_overhead
kaiser: delete KAISER_REAL_SWITCH option
kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
kaiser: cleanups while trying for gold link
kaiser: kaiser_remove_mapping() move along the pgd
kaiser: tidied up kaiser_add/remove_mapping slightly
kaiser: tidied up asm/kaiser.h somewhat
kaiser: ENOMEM if kaiser_pagetable_walk() NULL
kaiser: fix perf crashes
kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER
kaiser: KAISER depends on SMP
kaiser: fix build and FIXME in alloc_ldt_struct()
kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
kaiser: do not set _PAGE_NX on pgd_none
kaiser: merged update
KAISER: Kernel Address Isolation
x86/boot: Add early cmdline parsing for options with arguments
ANDROID: sdcardfs: Add default_normal option
ANDROID: sdcardfs: notify lower file of opens
Conflicts:
kernel/fork.c
Change-Id: I9c8c12e63321d79dc2c89fb470ca8de587366911
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Srinivasarao P [Tue, 9 Jan 2018 10:42:59 +0000 (16:12 +0530)]
Merge android-4.4.109 (
8cbe01c) into msm-4.4
* refs/heads/tmp-
8cbe01c
Linux 4.4.109
mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP
n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
x86/smpboot: Remove stale TLB flush invocations
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
USB: Fix off by one in type-specific length check of BOS SSP capability
usb: add RESET_RESUME for ELSA MicroLink 56K
usb: Add device quirk for Logitech HD Pro Webcam C925e
USB: serial: option: adding support for YUGA CLM920-NC5
USB: serial: option: add support for Telit ME910 PID 0x1101
USB: serial: qcserial: add Sierra Wireless EM7565
USB: serial: ftdi_sio: add id for Airbus DS P8GR
usbip: vhci: stop printing kernel pointer addresses in messages
usbip: stub: stop printing kernel pointer addresses in messages
usbip: fix usbip bind writing random string after command in match_busid
sock: free skb in skb_complete_tx_timestamp on error
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
net: Fix double free and memory corruption in get_net_ns_by_id()
net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
ipv4: Fix use-after-free when flushing FIB tables
sctp: Replace use of sockets_allocated with specified macro.
net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case
net: ipv4: fix for a race condition in raw_sendmsg
tg3: Fix rx hang on MTU change with 5717/5719
tcp md5sig: Use skb's saddr when replying to an incoming segment
net: reevalulate autoflowlabel setting after sysctl setting
net: qmi_wwan: add Sierra EM7565 1199:9091
netlink: Add netns check on taps
net: igmp: Use correct source address on IGMPv3 reports
ipv6: mcast: better catch silly mtu values
ipv4: igmp: guard against silly MTU values
kbuild: add '-fno-stack-check' to kernel build options
x86/mm/64: Fix reboot interaction with CR4.PCIDE
x86/mm: Enable CR4.PCIDE on supported systems
x86/mm: Add the 'nopcid' boot option to turn off PCID
x86/mm: Disable PCID on 32-bit kernels
x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
ALSA: hda - fix headset mic detection issue on a Dell machine
ALSA: hda: Drop useless WARN_ON()
ASoC: twl4030: fix child-node lookup
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
iw_cxgb4: Only validate the MSN for successful completions
ring-buffer: Mask out the info bits when returning buffer page length
tracing: Fix crash when it fails to alloc ring buffer
tracing: Fix possible double free on failure of allocating trace buffer
tracing: Remove extra zeroing out of the ring buffer page
net: mvneta: clear interface link status on port disable
powerpc/perf: Dereference BHRB entries safely
kvm: x86: fix RSM when PCID is non-zero
KVM: X86: Fix load RFLAGS w/o the fixed bit
spi: xilinx: Detect stall with Unknown commands
parisc: Hide Diva-built-in serial aux and graphics card
PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()
ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU
ALSA: rawmidi: Avoid racy info ioctl via ctl device
mfd: twl6040: Fix child-node lookup
mfd: twl4030-audio: Fix sibling-node lookup
mfd: cros ec: spi: Don't send first message too soon
crypto: mcryptd - protect the per-CPU queue with a lock
ACPI: APEI / ERST: Fix missing error handling in erst_reader()
Change-Id: I3823f793c0c85d1639e9be10358cf70cfcd13afc
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Srinivasarao P [Tue, 9 Jan 2018 10:40:13 +0000 (16:10 +0530)]
Merge android-4.4.108 (
55b3b8c) into msm-4.4
* refs/heads/tmp-
55b3b8c
Linux 4.4.108
alpha: fix build failures
ALSA: hda - Fix yet another i915 pointer leftover in error path
ALSA: hda - Degrade i915 binding failure message
ALSA: hda - Clear the leftover component assignment at snd_hdac_i915_exit()
Revert "Bluetooth: btusb: driver to enable the usb-wakeup feature"
MIPS: math-emu: Fix final emulation phase for certain instructions
thermal: hisilicon: Handle return value of clk_prepare_enable
cpuidle: fix broadcast control when broadcast can not be entered
rtc: set the alarm to the next expiring timer
tcp: fix under-evaluated ssthresh in TCP Vegas
fm10k: ensure we process SM mbx when processing VF mbx
scsi: lpfc: PLOGI failures during NPIV testing
scsi: lpfc: Fix secure firmware updates
PCI/AER: Report non-fatal errors only to the affected endpoint
ixgbe: fix use of uninitialized padding
igb: check memory allocation failure
PCI: Create SR-IOV virtfn/physfn links before attaching driver
scsi: mpt3sas: Fix IO error occurs on pulling out a drive from RAID1 volume created on two SATA drive
scsi: cxgb4i: fix Tx skb leak
PCI: Avoid bus reset if bridge itself is broken
net: phy: at803x: Change error to EINVAL for invalid MAC
rtc: pl031: make interrupt optional
crypto: crypto4xx - increase context and scatter ring buffer elements
backlight: pwm_bl: Fix overflow condition
bnxt_en: Fix NULL pointer dereference in reopen failure path
cpuidle: powernv: Pass correct drv->cpumask for registration
ARM: dma-mapping: disallow dma_get_sgtable() for non-kernel managed memory
netfilter: nfnetlink_queue: fix secctx memory leak
xhci: plat: Register shutdown for xhci_plat
isdn: kcapi: avoid uninitialized data
KVM: pci-assign: do not map smm memory slot pages in vt-d page tables
ARM: dts: am335x-evmsk: adjust mmc2 param to allow suspend
netfilter: nf_nat_snmp: Fix panic when snmp_trap_helper fails to register
netfilter: nfnl_cthelper: fix a race when walk the nf_ct_helper_hash table
irda: vlsi_ir: fix check for DMA mapping errors
RDMA/iser: Fix possible mr leak on device removal event
i40e: Do not enable NAPI on q_vectors that have no rings
net: Do not allow negative values for busy_read and busy_poll sysctl interfaces
bna: avoid writing uninitialized data into hw registers
s390/qeth: no ETH header for outbound AF_IUCV
r8152: prevent the driver from transmitting packets with carrier off
HID: xinmo: fix for out of range for THT 2P arcade controller.
hwmon: (asus_atk0110) fix uninitialized data access
ARM: dts: ti: fix PCI bus dtc warnings
KVM: VMX: Fix enable VPID conditions
KVM: x86: correct async page present tracepoint
scsi: lpfc: Fix PT2PT PRLI reject
pinctrl: st: add irq_request/release_resources callbacks
inet: frag: release spinlock before calling icmp_send()
netfilter: nfnl_cthelper: Fix memory leak
netfilter: nfnl_cthelper: fix runtime expectation policy updates
usb: gadget: udc: remove pointer dereference after free
usb: gadget: f_uvc: Sanity check wMaxPacketSize for SuperSpeed
net: qmi_wwan: Add USB IDs for MDM6600 modem on Motorola Droid 4
bna: integer overflow bug in debugfs
sch_dsmark: fix invalid skb_cow() usage
crypto: deadlock between crypto_alg_sem/rtnl_mutex/genl_mutex
r8152: fix the list rx_done may be used without initialization
cpuidle: Validate cpu_dev in cpuidle_add_sysfs()
arm: kprobes: Align stack to 8-bytes in test code
arm: kprobes: Fix the return address of multiple kretprobes
ALSA: hda - add support for docking station for HP 840 G3
ALSA: hda - add support for docking station for HP 820 G2
x86/irq: Do not substract irq_tlb_count from irq_call_count
sched/core: Idle_task_exit() shouldn't use switch_mm_irqs_off()
ARM: Hide finish_arch_post_lock_switch() from modules
x86/mm, sched/core: Turn off IRQs in switch_mm()
x86/mm, sched/core: Uninline switch_mm()
x86/mm: Build arch/x86/mm/tlb.c even on !SMP
sched/core: Add switch_mm_irqs_off() and use it in the scheduler
mm/mmu_context, sched/core: Fix mmu_context.h assumption
mm/rmap: batched invalidations should use existing api
x86/mm: If INVPCID is available, use it to flush global mappings
x86/mm: Add a 'noinvpcid' boot option to turn off INVPCID
x86/mm: Fix INVPCID asm constraint
x86/mm: Add INVPCID helpers
cxl: Check if vphb exists before iterating over AFU devices
arm64: Initialise high_memory global variable earlier
ANDROID: binder: Remove obsolete proc waitqueue.
Change-Id: Ie954ccd1dbd861672345bb0ee879273be4d0a441
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Srinivasarao P [Tue, 9 Jan 2018 10:35:02 +0000 (16:05 +0530)]
Merge android-4.4.107 (
79f138a) into msm-4.4
* refs/heads/tmp-
79f138a
Linux 4.4.107
ath9k: fix tx99 potential info leak
IB/ipoib: Grab rtnl lock on heavy flush when calling ndo_open/stop
RDMA/cma: Avoid triggering undefined behavior
macvlan: Only deliver one copy of the frame to the macvlan interface
udf: Avoid overflow when session starts at large offset
scsi: bfa: integer overflow in debugfs
scsi: sd: change allow_restart to bool in sysfs interface
scsi: sd: change manage_start_stop to bool in sysfs interface
vt6655: Fix a possible sleep-in-atomic bug in vt6655_suspend
scsi: scsi_devinfo: Add REPORTLUN2 to EMC SYMMETRIX blacklist entry
raid5: Set R5_Expanded on parity devices as well as data.
pinctrl: adi2: Fix Kconfig build problem
usb: musb: da8xx: fix babble condition handling
tty fix oops when rmmod 8250
powerpc/perf/hv-24x7: Fix incorrect comparison in memord
scsi: hpsa: destroy sas transport properties before scsi_host
scsi: hpsa: cleanup sas_phy structures in sysfs when unloading
PCI: Detach driver before procfs & sysfs teardown on device remove
xfs: fix incorrect extent state in xfs_bmap_add_extent_unwritten_real
xfs: fix log block underflow during recovery cycle verification
l2tp: cleanup l2tp_tunnel_delete calls
bcache: fix wrong cache_misses statistics
bcache: explicitly destroy mutex while exiting
GFS2: Take inode off order_write list when setting jdata flag
thermal/drivers/step_wise: Fix temperature regulation misbehavior
ppp: Destroy the mutex when cleanup
clk: tegra: Fix cclk_lp divisor register
clk: imx6: refine hdmi_isfr's parent to make HDMI work on i.MX6 SoCs w/o VPU
clk: mediatek: add the option for determining PLL source clock
mm: Handle 0 flags in _calc_vm_trans() macro
crypto: tcrypt - fix buffer lengths in test_aead_speed()
arm-ccn: perf: Prevent module unload while PMU is in use
target/file: Do not return error for UNMAP if length is zero
target:fix condition return in core_pr_dump_initiator_port()
iscsi-target: fix memory leak in lio_target_tiqn_addtpg()
target/iscsi: Fix a race condition in iscsit_add_reject_from_cmd()
powerpc/ipic: Fix status get and status clear
powerpc/opal: Fix EBUSY bug in acquiring tokens
netfilter: ipvs: Fix inappropriate output of procfs
powerpc/powernv/cpufreq: Fix the frequency read by /proc/cpuinfo
PCI/PME: Handle invalid data when reading Root Status
dmaengine: ti-dma-crossbar: Correct am335x/am43xx mux value type
rtc: pcf8563: fix output clock rate
video: fbdev: au1200fb: Return an error code if a memory allocation fails
video: fbdev: au1200fb: Release some resources if a memory allocation fails
video: udlfb: Fix read EDID timeout
fbdev: controlfb: Add missing modes to fix out of bounds access
sfc: don't warn on successful change of MAC
target: fix race during implicit transition work flushes
target: fix ALUA transition timeout handling
target: Use system workqueue for ALUA transitions
btrfs: add missing memset while reading compressed inline extents
NFSv4.1 respect server's max size in CREATE_SESSION
efi/esrt: Cleanup bad memory map log messages
perf symbols: Fix symbols__fixup_end heuristic for corner cases
net/mlx4_core: Avoid delays during VF driver device shutdown
afs: Fix afs_kill_pages()
afs: Fix page leak in afs_write_begin()
afs: Populate and use client modification time
afs: Fix the maths in afs_fs_store_data()
afs: Prevent callback expiry timer overflow
afs: Migrate vlocation fields to 64-bit
afs: Flush outstanding writes when an fd is closed
afs: Adjust mode bits processing
afs: Populate group ID from vnode status
afs: Fix missing put_page()
drm/radeon: reinstate oland workaround for sclk
mmc: mediatek: Fixed bug where clock frequency could be set wrong
sched/deadline: Use deadline instead of period when calculating overflow
sched/deadline: Throttle a constrained deadline task activated after the deadline
sched/deadline: Make sure the replenishment timer fires in the next period
drm/radeon/si: add dpm quirk for Oland
fjes: Fix wrong netdevice feature flags
scsi: hpsa: limit outstanding rescans
scsi: hpsa: update check for logical volume status
openrisc: fix issue handling 8 byte get_user calls
intel_th: pci: Add Gemini Lake support
mlxsw: reg: Fix SPVMLR max record count
mlxsw: reg: Fix SPVM max record count
net: Resend IGMP memberships upon peer notification.
dmaengine: Fix array index out of bounds warning in __get_unmap_pool()
net: wimax/i2400m: fix NULL-deref at probe
writeback: fix memory leak in wb_queue_work()
netfilter: bridge: honor frag_max_size when refragmenting
drm/omap: fix dmabuf mmap for dma_alloc'ed buffers
Input: i8042 - add TUXEDO BU1406 (N24_25BU) to the nomux list
NFSD: fix nfsd_reset_versions for NFSv4.
NFSD: fix nfsd_minorversion(.., NFSD_AVAIL)
net: bcmgenet: Power up the internal PHY before probing the MII
net: bcmgenet: power down internal phy if open or resume fails
net: bcmgenet: reserved phy revisions must be checked first
net: bcmgenet: correct MIB access of UniMAC RUNT counters
net: bcmgenet: correct the RBUF_OVFL_CNT and RBUF_ERR_CNT MIB values
net: initialize msg.msg_flags in recvfrom
userfaultfd: selftest: vm: allow to build in vm/ directory
userfaultfd: shmem: __do_fault requires VM_FAULT_NOPAGE
md-cluster: free md_cluster_info if node leave cluster
usb: phy: isp1301: Add OF device ID table
mac80211: Fix addition of mesh configuration element
KEYS: add missing permission check for request_key() destination
ext4: fix crash when a directory's i_size is too small
ext4: fix fdatasync(2) after fallocate(2) operation
dmaengine: dmatest: move callback wait queue to thread context
sched/rt: Do not pull from current CPU if only one CPU to pull
xhci: Don't add a virt_dev to the devs array before it's fully allocated
Bluetooth: btusb: driver to enable the usb-wakeup feature
ceph: drop negative child dentries before try pruning inode's alias
usbip: fix stub_send_ret_submit() vulnerability to null transfer_buffer
USB: core: prevent malicious bNumInterfaces overflow
USB: uas and storage: Add US_FL_BROKEN_FUA for another JMicron JMS567 ID
tracing: Allocate mask_str buffer dynamically
autofs: fix careless error in recent commit
crypto: salsa20 - fix blkcipher_walk API usage
crypto: hmac - require that the underlying hash algorithm is unkeyed
UPSTREAM: arm64: setup: introduce kaslr_offset()
UPSTREAM: kcov: fix comparison callback signature
UPSTREAM: kcov: support comparison operands collection
UPSTREAM: kcov: remove pointless current != NULL check
UPSTREAM: kcov: support compat processes
UPSTREAM: kcov: simplify interrupt check
UPSTREAM: kcov: make kcov work properly with KASLR enabled
UPSTREAM: kcov: add more missing includes
UPSTREAM: kcov: add missing #include <linux/sched.h>
UPSTREAM: kcov: properly check if we are in an interrupt
UPSTREAM: kcov: don't profile branches in kcov
UPSTREAM: kcov: don't trace the code coverage code
BACKPORT: kernel: add kcov code coverage
Conflicts:
Makefile
mm/kasan/Makefile
scripts/Makefile.lib
Change-Id: Ic19953706ea2e700621b0ba94d1c90bbffa4f471
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Srinivasarao P [Tue, 9 Jan 2018 10:29:02 +0000 (15:59 +0530)]
Merge android-4.4.106 (
2fea039) into msm-4.4
* refs/heads/tmp-
2fea039
Linux 4.4.106
usb: gadget: ffs: Forbid usb_ep_alloc_request from sleeping
arm: KVM: Fix VTTBR_BADDR_MASK BUG_ON off-by-one
Revert "x86/mm/pat: Ensure cpa->pfn only contains page frame numbers"
Revert "x86/efi: Hoist page table switching code into efi_call_virt()"
Revert "x86/efi: Build our own page table structures"
net/packet: fix a race in packet_bind() and packet_notifier()
packet: fix crash in fanout_demux_rollover()
sit: update frag_off info
rds: Fix NULL pointer dereference in __rds_rdma_map
tipc: fix memory leak in tipc_accept_from_sock()
more bio_map_user_iov() leak fixes
s390: always save and restore all registers on context switch
ipmi: Stop timers before cleaning up the module
audit: ensure that 'audit=1' actually enables audit for PID 1
ipvlan: fix ipv6 outbound device
afs: Connect up the CB.ProbeUuid
IB/mlx5: Assign send CQ and recv CQ of UMR QP
IB/mlx4: Increase maximal message size under UD QP
xfrm: Copy policy family in clone_policy
jump_label: Invoke jump_label_test() via early_initcall()
atm: horizon: Fix irq release error
sctp: use the right sk after waking up from wait_buf sleep
sctp: do not free asoc when it is already dead in sctp_sendmsg
sparc64/mm: set fields in deferred pages
block: wake up all tasks blocked in get_request()
sunrpc: Fix rpc_task_begin trace point
NFS: Fix a typo in nfs_rename()
dynamic-debug-howto: fix optional/omitted ending line number to be LARGE instead of 0
lib/genalloc.c: make the avail variable an atomic_long_t
route: update fnhe_expires for redirect when the fnhe exists
route: also update fnhe_genid when updating a route cache
mac80211_hwsim: Fix memory leak in hwsim_new_radio_nl()
kbuild: pkg: use --transform option to prefix paths in tar
EDAC, i5000, i5400: Fix definition of NRECMEMB register
EDAC, i5000, i5400: Fix use of MTR_DRAM_WIDTH macro
powerpc/powernv/ioda2: Gracefully fail if too many TCE levels requested
drm/amd/amdgpu: fix console deadlock if late init failed
axonram: Fix gendisk handling
netfilter: don't track fragmented packets
zram: set physical queue limits to avoid array out of bounds accesses
i2c: riic: fix restart condition
crypto: s5p-sss - Fix completing crypto request in IRQ handler
ipv6: reorder icmpv6_init() and ip6_mr_init()
bnx2x: do not rollback VF MAC/VLAN filters we did not configure
bnx2x: fix possible overrun of VFPF multicast addresses array
bnx2x: prevent crash when accessing PTP with interface down
spi_ks8995: fix "BUG: key
accdaa28 not in .data!"
arm64: KVM: Survive unknown traps from guests
arm: KVM: Survive unknown traps from guests
KVM: nVMX: reset nested_run_pending if the vCPU is going to be reset
irqchip/crossbar: Fix incorrect type of register size
scsi: lpfc: Fix crash during Hardware error recovery on SLI3 adapters
workqueue: trigger WARN if queue_delayed_work() is called with NULL @wq
libata: drop WARN from protocol error in ata_sff_qc_issue()
kvm: nVMX: VMCLEAR should not cause the vCPU to shut down
USB: gadgetfs: Fix a potential memory leak in 'dev_config()'
usb: gadget: configs: plug memory leak
HID: chicony: Add support for another ASUS Zen AiO keyboard
gpio: altera: Use handle_level_irq when configured as a level_high
ARM: OMAP2+: Release device node after it is no longer needed.
ARM: OMAP2+: Fix device node reference counts
module: set __jump_table alignment to 8
selftest/powerpc: Fix false failures for skipped tests
x86/hpet: Prevent might sleep splat on resume
ARM: OMAP2+: gpmc-onenand: propagate error on initialization failure
vti6: Don't report path MTU below IPV6_MIN_MTU.
Revert "s390/kbuild: enable modversions for symbols exported from asm"
Revert "spi: SPI_FSL_DSPI should depend on HAS_DMA"
Revert "drm/armada: Fix compile fail"
mm: drop unused pmdp_huge_get_and_clear_notify()
thp: fix MADV_DONTNEED vs. numa balancing race
thp: reduce indentation level in change_huge_pmd()
scsi: storvsc: Workaround for virtual DVD SCSI version
ARM: avoid faulting on qemu
ARM: BUG if jumping to usermode address in kernel mode
arm64: fpsimd: Prevent registers leaking from dead tasks
KVM: VMX: remove I/O port 0x80 bypass on Intel hosts
arm64: KVM: fix VTTBR_BADDR_MASK BUG_ON off-by-one
media: dvb: i2c transfers over usb cannot be done from stack
drm/exynos: gem: Drop NONCONTIG flag for buffers allocated without IOMMU
drm: extra printk() wrapper macros
kdb: Fix handling of kallsyms_symbol_next() return value
s390: fix compat system call table
iommu/vt-d: Fix scatterlist offset handling
ALSA: usb-audio: Add check return value for usb_string()
ALSA: usb-audio: Fix out-of-bound error
ALSA: seq: Remove spurious WARN_ON() at timer check
ALSA: pcm: prevent UAF in snd_pcm_info
x86/PCI: Make broadcom_postcore_init() check acpi_disabled
X.509: reject invalid BIT STRING for subjectPublicKey
ASN.1: check for error from ASN1_OP_END__ACT actions
ASN.1: fix out-of-bounds read when parsing indefinite length item
efi: Move some sysfs files to be read-only by root
scsi: libsas: align sata_device's rps_resp on a cacheline
isa: Prevent NULL dereference in isa_bus driver callbacks
hv: kvp: Avoid reading past allocated blocks from KVP file
virtio: release virtio index when fail to device_register
can: usb_8dev: cancel urb on -EPIPE and -EPROTO
can: esd_usb2: cancel urb on -EPIPE and -EPROTO
can: ems_usb: cancel urb on -EPIPE and -EPROTO
can: kvaser_usb: cancel urb on -EPIPE and -EPROTO
can: kvaser_usb: ratelimit errors if incomplete messages are received
can: kvaser_usb: Fix comparison bug in kvaser_usb_read_bulk_callback()
can: kvaser_usb: free buf in error paths
can: ti_hecc: Fix napi poll return value for repoll
BACKPORT: irq: Make the irqentry text section unconditional
UPSTREAM: arch, ftrace: for KASAN put hard/soft IRQ entries into separate sections
UPSTREAM: x86, kasan, ftrace: Put APIC interrupt handlers into .irqentry.text
UPSTREAM: kasan: make get_wild_bug_type() static
UPSTREAM: kasan: separate report parts by empty lines
UPSTREAM: kasan: improve double-free report format
UPSTREAM: kasan: print page description after stacks
UPSTREAM: kasan: improve slab object description
UPSTREAM: kasan: change report header
UPSTREAM: kasan: simplify address description logic
UPSTREAM: kasan: change allocation and freeing stack traces headers
UPSTREAM: kasan: unify report headers
UPSTREAM: kasan: introduce helper functions for determining bug type
BACKPORT: kasan: report only the first error by default
UPSTREAM: kasan: fix races in quarantine_remove_cache()
UPSTREAM: kasan: resched in quarantine_remove_cache()
BACKPORT: kasan, sched/headers: Uninline kasan_enable/disable_current()
BACKPORT: kasan: drain quarantine of memcg slab objects
UPSTREAM: kasan: eliminate long stalls during quarantine reduction
UPSTREAM: kasan: support panic_on_warn
UPSTREAM: x86/suspend: fix false positive KASAN warning on suspend/resume
UPSTREAM: kasan: support use-after-scope detection
UPSTREAM: kasan/tests: add tests for user memory access functions
UPSTREAM: mm, kasan: add a ksize() test
UPSTREAM: kasan: test fix: warn if the UAF could not be detected in kmalloc_uaf2
UPSTREAM: kasan: modify kmalloc_large_oob_right(), add kmalloc_pagealloc_oob_right()
UPSTREAM: lib/stackdepot: export save/fetch stack for drivers
UPSTREAM: lib/stackdepot.c: bump stackdepot capacity from 16MB to 128MB
BACKPORT: kprobes: Unpoison stack in jprobe_return() for KASAN
UPSTREAM: kasan: remove the unnecessary WARN_ONCE from quarantine.c
UPSTREAM: kasan: avoid overflowing quarantine size on low memory systems
UPSTREAM: kasan: improve double-free reports
BACKPORT: mm: coalesce split strings
BACKPORT: mm/kasan: get rid of ->state in struct kasan_alloc_meta
UPSTREAM: mm/kasan: get rid of ->alloc_size in struct kasan_alloc_meta
UPSTREAM: mm: kasan: remove unused 'reserved' field from struct kasan_alloc_meta
UPSTREAM: mm/kasan, slub: don't disable interrupts when object leaves quarantine
UPSTREAM: mm/kasan: don't reduce quarantine in atomic contexts
UPSTREAM: mm/kasan: fix corruptions and false positive reports
UPSTREAM: lib/stackdepot.c: use __GFP_NOWARN for stack allocations
BACKPORT: mm, kasan: switch SLUB to stackdepot, enable memory quarantine for SLUB
UPSTREAM: kasan/quarantine: fix bugs on qlist_move_cache()
UPSTREAM: mm: mempool: kasan: don't poot mempool objects in quarantine
UPSTREAM: kasan: change memory hot-add error messages to info messages
BACKPORT: mm/kasan: add API to check memory regions
UPSTREAM: mm/kasan: print name of mem[set,cpy,move]() caller in report
UPSTREAM: mm: kasan: initial memory quarantine implementation
UPSTREAM: lib/stackdepot: avoid to return 0 handle
UPSTREAM: lib/stackdepot.c: allow the stack trace hash to be zero
UPSTREAM: mm, kasan: fix compilation for CONFIG_SLAB
BACKPORT: mm, kasan: stackdepot implementation. Enable stackdepot for SLAB
BACKPORT: mm, kasan: add GFP flags to KASAN API
UPSTREAM: mm, kasan: SLAB support
UPSTREAM: mm/slab: align cache size first before determination of OFF_SLAB candidate
UPSTREAM: mm/slab: use more appropriate condition check for debug_pagealloc
UPSTREAM: mm/slab: factor out debugging initialization in cache_init_objs()
UPSTREAM: mm/slab: remove object status buffer for DEBUG_SLAB_LEAK
UPSTREAM: mm/slab: alternative implementation for DEBUG_SLAB_LEAK
UPSTREAM: mm/slab: clean up DEBUG_PAGEALLOC processing code
UPSTREAM: mm/slab: activate debug_pagealloc in SLAB when it is actually enabled
sched: EAS/WALT: Don't take into account of running task's util
BACKPORT: schedutil: Reset cached freq if it is not in sync with next_freq
UPSTREAM: kasan: add functions to clear stack poison
Conflicts:
arch/arm/include/asm/kvm_arm.h
arch/arm64/kernel/vmlinux.lds.S
include/linux/kasan.h
kernel/softirq.c
lib/Kconfig
lib/Kconfig.kasan
lib/Makefile
lib/stackdepot.c
mm/kasan/kasan.c
sound/usb/mixer.c
Change-Id: If70ced6da5f19be3dd92d10a8d8cd4d5841e5870
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Srinivasarao P [Tue, 2 Jan 2018 10:42:14 +0000 (16:12 +0530)]
Merge android-4.4.105 (
8a53962) into msm-4.4
* refs/heads/tmp-
8a53962
Linux 4.4.105
xen-netfront: avoid crashing on resume after a failure in talk_to_netback()
usb: host: fix incorrect updating of offset
USB: usbfs: Filter flags passed in from user space
USB: devio: Prevent integer overflow in proc_do_submiturb()
USB: Increase usbfs transfer limit
USB: core: Add type-specific length check of BOS descriptors
usb: ch9: Add size macro for SSP dev cap descriptor
usb: Add USB 3.1 Precision time measurement capability descriptor support
usb: xhci: fix panic in xhci_free_virt_devices_depth_first
usb: hub: Cycle HUB power when initialization fails
Revert "ocfs2: should wait dio before inode lock in ocfs2_setattr()"
net: fec: fix multicast filtering hardware setup
xen-netfront: Improve error handling during initialization
mm: avoid returning VM_FAULT_RETRY from ->page_mkwrite handlers
tcp: correct memory barrier usage in tcp_check_space()
dmaengine: pl330: fix double lock
tipc: fix cleanup at module unload
net: sctp: fix array overrun read on sctp_timer_tbl
drm/exynos/decon5433: set STANDALONE_UPDATE_F on output enablement
NFSv4: Fix client recovery when server reboots multiple times
KVM: arm/arm64: Fix occasional warning from the timer work function
nfs: Don't take a reference on fl->fl_file for LOCK operation
ravb: Remove Rx overflow log messages
net/appletalk: Fix kernel memory disclosure
vti6: fix device register to report IFLA_INFO_KIND
ARM: OMAP1: DMA: Correct the number of logical channels
net: systemport: Pad packet before inserting TSB
net: systemport: Utilize skb_put_padto()
kprobes/x86: Disable preemption in ftrace-based jprobes
perf test attr: Fix ignored test case result
sysrq : fix Show Regs call trace on ARM
EDAC, sb_edac: Fix missing break in switch
x86/entry: Use SYSCALL_DEFINE() macros for sys_modify_ldt()
serial: 8250: Preserve DLD[7:4] for PORT_XR17V35X
usb: phy: tahvo: fix error handling in tahvo_usb_probe()
spi: sh-msiof: Fix DMA transfer size check
serial: 8250_fintek: Fix rs485 disablement on invalid ioctl()
selftests/x86/ldt_get: Add a few additional tests for limits
s390/pci: do not require AIS facility
ima: fix hash algorithm initialization
USB: serial: option: add Quectel BG96 id
s390/runtime instrumentation: simplify task exit handling
serial: 8250_pci: Add Amazon PCI serial device ID
usb: quirks: Add no-lpm quirk for KY-688 USB 3.1 Type-C Hub
uas: Always apply US_FL_NO_ATA_1X quirk to Seagate devices
bcache: recover data from backing when data is clean
bcache: only permit to recovery read error when cache device is clean
ANDROID: initramfs: call free_initrd() when skipping init
Conflicts:
drivers/usb/core/config.c
include/linux/usb.h
include/uapi/linux/usb/ch9.h
Change-Id: Ibada5100be12f3a1389461f7738ee2ecb0d427af
Signed-off-by: Srinivasarao P <spathi@codeaurora.org>
Greg Kroah-Hartman [Sat, 6 Jan 2018 09:53:18 +0000 (10:53 +0100)]
Merge 4.4.110 into android-4.4
Changes in 4.4.110
x86/boot: Add early cmdline parsing for options with arguments
KAISER: Kernel Address Isolation
kaiser: merged update
kaiser: do not set _PAGE_NX on pgd_none
kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
kaiser: fix build and FIXME in alloc_ldt_struct()
kaiser: KAISER depends on SMP
kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER
kaiser: fix perf crashes
kaiser: ENOMEM if kaiser_pagetable_walk() NULL
kaiser: tidied up asm/kaiser.h somewhat
kaiser: tidied up kaiser_add/remove_mapping slightly
kaiser: kaiser_remove_mapping() move along the pgd
kaiser: cleanups while trying for gold link
kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
kaiser: delete KAISER_REAL_SWITCH option
kaiser: vmstat show NR_KAISERTABLE as nr_overhead
kaiser: enhanced by kernel and user PCIDs
kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
kaiser: PCID 0 for kernel and 128 for user
kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
kaiser: paranoid_entry pass cr3 need to paranoid_exit
kaiser: _pgd_alloc() without __GFP_REPEAT to avoid stalls
kaiser: fix unlikely error in alloc_ldt_struct()
kaiser: add "nokaiser" boot option, using ALTERNATIVE
x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
x86/kaiser: Check boottime cmdline params
kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
kaiser: drop is_atomic arg to kaiser_pagetable_walk()
kaiser: asm/tlbflush.h handle noPGE at lower level
kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
x86/paravirt: Dont patch flush_tlb_single
x86/kaiser: Reenable PARAVIRT
kaiser: disabled on Xen PV
x86/kaiser: Move feature detection up
KPTI: Rename to PAGE_TABLE_ISOLATION
KPTI: Report when enabled
x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap
x86/kasan: Clear kasan_zero_page after TLB flush
kaiser: Set _PAGE_NX only if supported
Linux 4.4.110
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Greg Kroah-Hartman [Fri, 5 Jan 2018 14:44:27 +0000 (15:44 +0100)]
Linux 4.4.110
Guenter Roeck [Thu, 4 Jan 2018 21:41:55 +0000 (13:41 -0800)]
kaiser: Set _PAGE_NX only if supported
This resolves a crash if loaded under qemu + haxm under windows.
See https://www.spinics.net/lists/kernel/msg2689835.html for details.
Here is a boot log (the log is from chromeos-4.4, but Tao Wu says that
the same log is also seen with vanilla v4.4.110-rc1).
[ 0.712750] Freeing unused kernel memory: 552K
[ 0.721821] init: Corrupted page table at address
57b029b332e0
[ 0.722761] PGD
80000000bb238067 PUD
bc36a067 PMD
bc369067 PTE
45d2067
[ 0.722761] Bad pagetable: 000b [#1] PREEMPT SMP
[ 0.722761] Modules linked in:
[ 0.722761] CPU: 1 PID: 1 Comm: init Not tainted 4.4.96 #31
[ 0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[ 0.722761] task:
ffff8800bc290000 ti:
ffff8800bc28c000 task.ti:
ffff8800bc28c000
[ 0.722761] RIP: 0010:[<
ffffffff83f4129e>] [<
ffffffff83f4129e>] __clear_user+0x42/0x67
[ 0.722761] RSP: 0000:
ffff8800bc28fcf8 EFLAGS:
00010202
[ 0.722761] RAX:
0000000000000000 RBX:
00000000000001a4 RCX:
00000000000001a4
[ 0.722761] RDX:
0000000000000000 RSI:
0000000000000008 RDI:
000057b029b332e0
[ 0.722761] RBP:
ffff8800bc28fd08 R08:
ffff8800bc290000 R09:
ffff8800bb2f4000
[ 0.722761] R10:
ffff8800bc290000 R11:
ffff8800bb2f4000 R12:
000057b029b332e0
[ 0.722761] R13:
0000000000000000 R14:
000057b029b33340 R15:
ffff8800bb1e2a00
[ 0.722761] FS:
0000000000000000(0000) GS:
ffff8800bfb00000(0000) knlGS:
0000000000000000
[ 0.722761] CS: 0010 DS: 0000 ES: 0000 CR0:
000000008005003b
[ 0.722761] CR2:
000057b029b332e0 CR3:
00000000bb2f8000 CR4:
00000000000006e0
[ 0.722761] Stack:
[ 0.722761]
000057b029b332e0 ffff8800bb95fa80 ffff8800bc28fd18 ffffffff83f4120c
[ 0.722761]
ffff8800bc28fe18 ffffffff83e9e7a1 ffff8800bc28fd68 0000000000000000
[ 0.722761]
ffff8800bc290000 ffff8800bc290000 ffff8800bc290000 ffff8800bc290000
[ 0.722761] Call Trace:
[ 0.722761] [<
ffffffff83f4120c>] clear_user+0x2e/0x30
[ 0.722761] [<
ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7
[ 0.722761] [<
ffffffff83de2088>] search_binary_handler+0x86/0x19c
[ 0.722761] [<
ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98
[ 0.722761] [<
ffffffff844febe0>] ? rest_init+0x87/0x87
[ 0.722761] [<
ffffffff83de40be>] do_execve+0x23/0x25
[ 0.722761] [<
ffffffff83c002e3>] run_init_process+0x2b/0x2d
[ 0.722761] [<
ffffffff844fec4d>] kernel_init+0x6d/0xda
[ 0.722761] [<
ffffffff84505b2f>] ret_from_fork+0x3f/0x70
[ 0.722761] [<
ffffffff844febe0>] ? rest_init+0x87/0x87
[ 0.722761] Code: 86 84 be 12 00 00 00 e8 87 0d e8 ff 66 66 90 48 89 d8 48 c1
eb 03 4c 89 e7 83 e0 07 48 89 d9 be 08 00 00 00 31 d2 48 85 c9 74 0a <48> 89 17
48 01 f7 ff c9 75 f6 48 89 c1 85 c9 74 09 88 17 48 ff
[ 0.722761] RIP [<
ffffffff83f4129e>] __clear_user+0x42/0x67
[ 0.722761] RSP <
ffff8800bc28fcf8>
[ 0.722761] ---[ end trace
def703879b4ff090 ]---
[ 0.722761] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/kernel/locking/rwsem.c:21
[ 0.722761] in_atomic(): 0, irqs_disabled(): 1, pid: 1, name: init
[ 0.722761] CPU: 1 PID: 1 Comm: init Tainted: G D 4.4.96 #31
[ 0.722761] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
rel-1.7.5.1-0-g8936dbb-20141113_115728-nilsson.home.kraxel.org 04/01/2014
[ 0.722761]
0000000000000086 dcb5d76098c89836 ffff8800bc28fa30 ffffffff83f34004
[ 0.722761]
ffffffff84839dc2 0000000000000015 ffff8800bc28fa40 ffffffff83d57dc9
[ 0.722761]
ffff8800bc28fa68 ffffffff83d57e6a ffffffff84a53640 0000000000000000
[ 0.722761] Call Trace:
[ 0.722761] [<
ffffffff83f34004>] dump_stack+0x4d/0x63
[ 0.722761] [<
ffffffff83d57dc9>] ___might_sleep+0x13a/0x13c
[ 0.722761] [<
ffffffff83d57e6a>] __might_sleep+0x9f/0xa6
[ 0.722761] [<
ffffffff84502788>] down_read+0x20/0x31
[ 0.722761] [<
ffffffff83cc5d9b>] __blocking_notifier_call_chain+0x35/0x63
[ 0.722761] [<
ffffffff83cc5ddd>] blocking_notifier_call_chain+0x14/0x16
[ 0.800374] usb 1-1: new full-speed USB device number 2 using uhci_hcd
[ 0.722761] [<
ffffffff83cefe97>] profile_task_exit+0x1a/0x1c
[ 0.802309] [<
ffffffff83cac84e>] do_exit+0x39/0xe7f
[ 0.802309] [<
ffffffff83ce5938>] ? vprintk_default+0x1d/0x1f
[ 0.802309] [<
ffffffff83d7bb95>] ? printk+0x57/0x73
[ 0.802309] [<
ffffffff83c46e25>] oops_end+0x80/0x85
[ 0.802309] [<
ffffffff83c7b747>] pgtable_bad+0x8a/0x95
[ 0.802309] [<
ffffffff83ca7f4a>] __do_page_fault+0x8c/0x352
[ 0.802309] [<
ffffffff83eefba5>] ? file_has_perm+0xc4/0xe5
[ 0.802309] [<
ffffffff83ca821c>] do_page_fault+0xc/0xe
[ 0.802309] [<
ffffffff84507682>] page_fault+0x22/0x30
[ 0.802309] [<
ffffffff83f4129e>] ? __clear_user+0x42/0x67
[ 0.802309] [<
ffffffff83f4127f>] ? __clear_user+0x23/0x67
[ 0.802309] [<
ffffffff83f4120c>] clear_user+0x2e/0x30
[ 0.802309] [<
ffffffff83e9e7a1>] load_elf_binary+0xa7f/0x18f7
[ 0.802309] [<
ffffffff83de2088>] search_binary_handler+0x86/0x19c
[ 0.802309] [<
ffffffff83de389e>] do_execveat_common.isra.26+0x909/0xf98
[ 0.802309] [<
ffffffff844febe0>] ? rest_init+0x87/0x87
[ 0.802309] [<
ffffffff83de40be>] do_execve+0x23/0x25
[ 0.802309] [<
ffffffff83c002e3>] run_init_process+0x2b/0x2d
[ 0.802309] [<
ffffffff844fec4d>] kernel_init+0x6d/0xda
[ 0.802309] [<
ffffffff84505b2f>] ret_from_fork+0x3f/0x70
[ 0.802309] [<
ffffffff844febe0>] ? rest_init+0x87/0x87
[ 0.830559] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 0.830559]
[ 0.831305] Kernel Offset: 0x2c00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 0.831305] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
The crash part of this problem may be solved with the following patch
(thanks to Hugh for the hint). There is still another problem, though -
with this patch applied, the qemu session aborts with "VCPU Shutdown
request", whatever that means.
Cc: lepton <ytht.net@gmail.com>
Signed-off-by: Guenter Roeck <groeck@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andrey Ryabinin [Mon, 11 Jan 2016 12:51:18 +0000 (15:51 +0300)]
x86/kasan: Clear kasan_zero_page after TLB flush
commit
69e0210fd01ff157d332102219aaf5c26ca8069b upstream.
Currently we clear kasan_zero_page before __flush_tlb_all(). This
works with current implementation of native_flush_tlb[_global]()
because it doesn't cause do any writes to kasan shadow memory.
But any subtle change made in native_flush_tlb*() could break this.
Also current code seems doesn't work for paravirt guests (lguest).
Only after the TLB flush we can be sure that kasan_zero_page is not
used as early shadow anymore (instrumented code will not write to it).
So it should cleared it only after the TLB flush.
Signed-off-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Luis R. Rodriguez <mcgrof@suse.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Toshi Kani <toshi.kani@hp.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/1452516679-32040-2-git-send-email-aryabinin@virtuozzo.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jamie Iles <jamie.iles@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Fri, 11 Dec 2015 03:20:20 +0000 (19:20 -0800)]
x86/vdso: Get pvclock data from the vvar VMA instead of the fixmap
commit
dac16fba6fc590fa7239676b35ed75dae4c4cd2b upstream.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/9d37826fdc7e2d2809efe31d5345f97186859284.1449702533.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jamie Iles <jamie.iles@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Fri, 11 Dec 2015 03:20:19 +0000 (19:20 -0800)]
x86, vdso, pvclock: Simplify and speed up the vdso pvclock reader
commit
6b078f5de7fc0851af4102493c7b5bb07e49c4cb upstream.
The pvclock vdso code was too abstracted to understand easily
and excessively paranoid. Simplify it for a huge speedup.
This opens the door for additional simplifications, as the vdso
no longer accesses the pvti for any vcpu other than vcpu 0.
Before, vclock_gettime using kvm-clock took about 45ns on my
machine. With this change, it takes 29ns, which is almost as
fast as the pure TSC implementation.
Signed-off-by: Andy Lutomirski <luto@amacapital.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/6b51dcc41f1b101f963945c5ec7093d72bdac429.1449702533.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Jamie Iles <jamie.iles@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Wed, 3 Jan 2018 18:43:32 +0000 (10:43 -0800)]
KPTI: Report when enabled
Make sure dmesg reports when KPTI is enabled.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kees Cook [Wed, 3 Jan 2018 18:43:15 +0000 (10:43 -0800)]
KPTI: Rename to PAGE_TABLE_ISOLATION
This renames CONFIG_KAISER to CONFIG_PAGE_TABLE_ISOLATION.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Mon, 25 Dec 2017 12:57:16 +0000 (13:57 +0100)]
x86/kaiser: Move feature detection up
... before the first use of kaiser_enabled as otherwise funky
things happen:
about to get started...
(XEN) d0v0 Unhandled page fault fault/trap [#14, ec=0000]
(XEN) Pagetable walk from
ffff88022a449090:
(XEN) L4[0x110] =
0000000229e0e067 0000000000001e0e
(XEN) L3[0x008] =
0000000000000000 ffffffffffffffff
(XEN) domain_crash_sync called from entry.S: fault at
ffff82d08033fd08
entry.o#create_bounce_frame+0x135/0x14d
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.9.1_02-3.21 x86_64 debug=n Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<
ffffffff81007460>]
(XEN) RFLAGS:
0000000000000286 EM: 1 CONTEXT: pv guest (d0v0)
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Jiri Kosina [Tue, 2 Jan 2018 13:19:49 +0000 (14:19 +0100)]
kaiser: disabled on Xen PV
Kaiser cannot be used on paravirtualized MMUs (namely reading and writing CR3).
This does not work with KAISER as the CR3 switch from and to user space PGD
would require to map the whole XEN_PV machinery into both.
More importantly, enabling KAISER on Xen PV doesn't make too much sense, as PV
guests use distinct %cr3 values for kernel and user already.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Tue, 2 Jan 2018 13:19:49 +0000 (14:19 +0100)]
x86/kaiser: Reenable PARAVIRT
Now that the required bits have been addressed, reenable
PARAVIRT.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Gleixner [Mon, 4 Dec 2017 14:07:30 +0000 (15:07 +0100)]
x86/paravirt: Dont patch flush_tlb_single
commit
a035795499ca1c2bd1928808d1a156eda1420383 upstream
native_flush_tlb_single() will be changed with the upcoming
PAGE_TABLE_ISOLATION feature. This requires to have more code in
there than INVLPG.
Remove the paravirt patching for it.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Juergen Gross <jgross@suse.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David Laight <David.Laight@aculab.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Eduardo Valentin <eduval@amazon.com>
Cc: Greg KH <gregkh@linuxfoundation.org>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: aliguori@amazon.com
Cc: daniel.gruss@iaik.tugraz.at
Cc: hughd@google.com
Cc: keescook@google.com
Cc: linux-mm@kvack.org
Cc: michael.schwarz@iaik.tugraz.at
Cc: moritz.lipp@iaik.tugraz.at
Cc: richard.fellner@student.tugraz.at
Link: https://lkml.kernel.org/r/20171204150606.828111617@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 5 Nov 2017 01:43:06 +0000 (18:43 -0700)]
kaiser: kaiser_flush_tlb_on_return_to_user() check PCID
Let kaiser_flush_tlb_on_return_to_user() do the X86_FEATURE_PCID
check, instead of each caller doing it inline first: nobody needs
to optimize for the noPCID case, it's clearer this way, and better
suits later changes. Replace those no-op X86_CR3_PCID_KERN_FLUSH lines
by a BUILD_BUG_ON() in load_new_mm_cr3(), in case something changes.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 5 Nov 2017 01:23:24 +0000 (18:23 -0700)]
kaiser: asm/tlbflush.h handle noPGE at lower level
I found asm/tlbflush.h too twisty, and think it safer not to avoid
__native_flush_tlb_global_irq_disabled() in the kaiser_enabled case,
but instead let it handle kaiser_enabled along with cr3: it can just
use __native_flush_tlb() for that, no harm in re-disabling preemption.
(This is not the same change as Kirill and Dave have suggested for
upstream, flipping PGE in cr4: that's neat, but needs a cpu_has_pge
check; cr3 is enough for kaiser, and thought to be cheaper than cr4.)
Also delete the X86_FEATURE_INVPCID invpcid_flush_all_nonglobals()
preference from __native_flush_tlb(): unlike the invpcid_flush_all()
preference in __native_flush_tlb_global(), it's not seen in upstream
4.14, and was recently reported to be surprisingly slow.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 29 Oct 2017 18:36:19 +0000 (11:36 -0700)]
kaiser: drop is_atomic arg to kaiser_pagetable_walk()
I have not observed a might_sleep() warning from setup_fixmap_gdt()'s
use of kaiser_add_mapping() in our tree (why not?), but like upstream
we have not provided a way for that to pass is_atomic true down to
kaiser_pagetable_walk(), and at startup it's far from a likely source
of trouble: so just delete the walk's is_atomic arg and might_sleep().
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Wed, 4 Oct 2017 03:49:04 +0000 (20:49 -0700)]
kaiser: use ALTERNATIVE instead of x86_cr3_pcid_noflush
Now that we're playing the ALTERNATIVE game, use that more efficient
method: instead of user-mapping an extra page, and reading an extra
cacheline each time for x86_cr3_pcid_noflush.
Neel has found that __stringify(bts $X86_CR3_PCID_NOFLUSH_BIT, %rax)
is a working substitute for the "bts $63, %rax" in these ALTERNATIVEs;
but the one line with $63 in looks clearer, so let's stick with that.
Worried about what happens with an ALTERNATIVE between the jump and
jump label in another ALTERNATIVE? I was, but have checked the
combinations in SWITCH_KERNEL_CR3_NO_STACK at entry_SYSCALL_64,
and it does a good job.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Tue, 2 Jan 2018 13:19:48 +0000 (14:19 +0100)]
x86/kaiser: Check boottime cmdline params
AMD (and possibly other vendors) are not affected by the leak
KAISER is protecting against.
Keep the "nopti" for traditional reasons and add pti=<on|off|auto>
like upstream.
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Borislav Petkov [Tue, 2 Jan 2018 13:19:48 +0000 (14:19 +0100)]
x86/kaiser: Rename and simplify X86_FEATURE_KAISER handling
Concentrate it in arch/x86/mm/kaiser.c and use the upstream string "nopti".
Signed-off-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 24 Sep 2017 23:59:49 +0000 (16:59 -0700)]
kaiser: add "nokaiser" boot option, using ALTERNATIVE
Added "nokaiser" boot option: an early param like "noinvpcid".
Most places now check int kaiser_enabled (#defined 0 when not
CONFIG_KAISER) instead of #ifdef CONFIG_KAISER; but entry_64.S
and entry_64_compat.S are using the ALTERNATIVE technique, which
patches in the preferred instructions at runtime. That technique
is tied to x86 cpu features, so X86_FEATURE_KAISER is fabricated.
Prior to "nokaiser", Kaiser #defined _PAGE_GLOBAL 0: revert that,
but be careful with both _PAGE_GLOBAL and CR4.PGE: setting them when
nokaiser like when !CONFIG_KAISER, but not setting either when kaiser -
neither matters on its own, but it's hard to be sure that _PAGE_GLOBAL
won't get set in some obscure corner, or something add PGE into CR4.
By omitting _PAGE_GLOBAL from __supported_pte_mask when kaiser_enabled,
all page table setup which uses pte_pfn() masks it out of the ptes.
It's slightly shameful that the same declaration versus definition of
kaiser_enabled appears in not one, not two, but in three header files
(asm/kaiser.h, asm/pgtable.h, asm/tlbflush.h). I felt safer that way,
than with #including any of those in any of the others; and did not
feel it worth an asm/kaiser_enabled.h - kernel/cpu/common.c includes
them all, so we shall hear about it if they get out of synch.
Cleanups while in the area: removed the silly #ifdef CONFIG_KAISER
from kaiser.c; removed the unused native_get_normal_pgd(); removed
the spurious reg clutter from SWITCH_*_CR3 macro stubs; corrected some
comments. But more interestingly, set CR4.PSE in secondary_startup_64:
the manual is clear that it does not matter whether it's 0 or 1 when
4-level-pts are enabled, but I was distracted to find cr4 different on
BSP and auxiliaries - BSP alone was adding PSE, in probe_page_size_mask().
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Tue, 5 Dec 2017 04:13:35 +0000 (20:13 -0800)]
kaiser: fix unlikely error in alloc_ldt_struct()
An error from kaiser_add_mapping() here is not at all likely, but
Eric Biggers rightly points out that __free_ldt_struct() relies on
new_ldt->size being initialized: move that up.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Fri, 13 Oct 2017 19:10:00 +0000 (12:10 -0700)]
kaiser: _pgd_alloc() without __GFP_REPEAT to avoid stalls
Synthetic filesystem mempressure testing has shown softlockups, with
hour-long page allocation stalls, and pgd_alloc() trying for order:1
with __GFP_REPEAT in one of the backtraces each time.
That's _pgd_alloc() going for a Kaiser double-pgd, using the __GFP_REPEAT
common to all page table allocations, but actually having no effect on
order:0 (see should_alloc_oom() and should_continue_reclaim() in this
tree, but beware that ports to another tree might behave differently).
Order:1 stack allocation has been working satisfactorily without
__GFP_REPEAT forever, and page table allocation only asks __GFP_REPEAT
for awkward occasions in a long-running process: it's not appropriate
at fork or exec time, and seems to be doing much more harm than good:
getting those contiguous pages under very heavy mempressure can be
hard (though even without it, Kaiser does generate more mempressure).
Mask out that __GFP_REPEAT inside _pgd_alloc(). Why not take it out
of the PGALLOG_GFP altogether, as v4.7 commit
a3a9a59d2067 ("x86: get
rid of superfluous __GFP_REPEAT") did? Because I think that might
make a difference to our page table memcg charging, which I'd prefer
not to interfere with at this time.
hughd adds: __alloc_pages_slowpath() in the 4.4.89-stable tree handles
__GFP_REPEAT a little differently than in prod kernel or 3.18.72-stable,
so it may not always be exactly a no-op on order:0 pages, as said above;
but I think still appropriate to omit it from Kaiser or non-Kaiser pgd.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Wed, 27 Sep 2017 01:43:07 +0000 (18:43 -0700)]
kaiser: paranoid_entry pass cr3 need to paranoid_exit
Neel Natu points out that paranoid_entry() was wrong to assume that
an entry that did not need swapgs would not need SWITCH_KERNEL_CR3:
paranoid_entry (used for debug breakpoint, int3, double fault or MCE;
though I think it's only the MCE case that is cause for concern here)
can break in at an awkward time, between cr3 switch and swapgs, but
its handling always needs kernel gs and kernel cr3.
Easy to fix in itself, but paranoid_entry() also needs to convey to
paranoid_exit() (and my reading of macro idtentry says paranoid_entry
and paranoid_exit are always paired) how to restore the prior state.
The swapgs state is already conveyed by %ebx (0 or 1), so extend that
also to convey when SWITCH_USER_CR3 will be needed (2 or 3).
(Yes, I'd much prefer that 0 meant no swapgs, whereas it's the other
way round: and a convention shared with error_entry() and error_exit(),
which I don't want to touch. Perhaps I should have inverted the bit
for switch cr3 too, but did not.)
paranoid_exit() would be straightforward, except for TRACE_IRQS: it
did TRACE_IRQS_IRETQ when doing swapgs, but TRACE_IRQS_IRETQ_DEBUG
when not: which is it supposed to use when SWITCH_USER_CR3 is split
apart from that? As best as I can determine, commit
5963e317b1e9
("ftrace/x86: Do not change stacks in DEBUG when calling lockdep")
missed the swapgs case, and should have used TRACE_IRQS_IRETQ_DEBUG
there too (the discrepancy has nothing to do with the liberal use
of _NO_STACK and _UNSAFE_STACK hereabouts: TRACE_IRQS_OFF_DEBUG has
just been used in all cases); discrepancy lovingly preserved across
several paranoid_exit() cleanups, but I'm now removing it.
Neel further indicates that to use SWITCH_USER_CR3_NO_STACK there in
paranoid_exit() is now not only unnecessary but unsafe: might corrupt
syscall entry's unsafe_stack_register_backup of %rax. Just use
SWITCH_USER_CR3: and delete SWITCH_USER_CR3_NO_STACK altogether,
before we make the mistake of using it again.
hughd adds: this commit fixes an issue in the Kaiser-without-PCIDs
part of the series, and ought to be moved earlier, if you decided
to make a release of Kaiser-without-PCIDs.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 27 Aug 2017 23:24:27 +0000 (16:24 -0700)]
kaiser: x86_cr3_pcid_noflush and x86_cr3_pcid_user
Mostly this commit is just unshouting X86_CR3_PCID_KERN_VAR and
X86_CR3_PCID_USER_VAR: we usually name variables in lower-case.
But why does x86_cr3_pcid_noflush need to be __aligned(PAGE_SIZE)?
Ah, it's a leftover from when kaiser_add_user_map() once complained
about mapping the same page twice. Make it __read_mostly instead.
(I'm a little uneasy about all the unrelated data which shares its
page getting user-mapped too, but that was so before, and not a big
deal: though we call it user-mapped, it's not mapped with _PAGE_USER.)
And there is a little change around the two calls to do_nmi().
Previously they set the NOFLUSH bit (if PCID supported) when
forcing to kernel context before do_nmi(); now they also have the
NOFLUSH bit set (if PCID supported) when restoring context after:
nothing done in do_nmi() should require a TLB to be flushed here.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sat, 9 Sep 2017 02:26:30 +0000 (19:26 -0700)]
kaiser: PCID 0 for kernel and 128 for user
Why was 4 chosen for kernel PCID and 6 for user PCID?
No good reason in a backport where PCIDs are only used for Kaiser.
If we continue with those, then we shall need to add Andy Lutomirski's
4.13 commit
6c690ee1039b ("x86/mm: Split read_cr3() into read_cr3_pa()
and __read_cr3()"), which deals with the problem of read_cr3() callers
finding stray bits in the cr3 that they expected to be page-aligned;
and for hibernation, his 4.14 commit
f34902c5c6c0 ("x86/hibernate/64:
Mask off CR3's PCID bits in the saved CR3").
But if 0 is used for kernel PCID, then there's no need to add in those
commits - whenever the kernel looks, it sees 0 in the lower bits; and
0 for kernel seems an obvious choice.
And I naughtily propose 128 for user PCID. Because there's a place
in _SWITCH_TO_USER_CR3 where it takes note of the need for TLB FLUSH,
but needs to reset that to NOFLUSH for the next occasion. Currently
it does so with a "movb $(0x80)" into the high byte of the per-cpu
quadword, but that will cause a machine without PCID support to crash.
Now, if %al just happened to have 0x80 in it at that point, on a
machine with PCID support, but 0 on a machine without PCID support...
(That will go badly wrong once the pgd can be at a physical address
above 2^56, but even with 5-level paging, physical goes up to 2^52.)
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Thu, 17 Aug 2017 22:00:37 +0000 (15:00 -0700)]
kaiser: load_new_mm_cr3() let SWITCH_USER_CR3 flush user
We have many machines (Westmere, Sandybridge, Ivybridge) supporting
PCID but not INVPCID: on these load_new_mm_cr3() simply crashed.
Flushing user context inside load_new_mm_cr3() without the use of
invpcid is difficult: momentarily switch from kernel to user context
and back to do so? I'm not sure whether that can be safely done at
all, and would risk polluting user context with kernel internals,
and kernel context with stale user externals.
Instead, follow the hint in the comment that was there: change
X86_CR3_PCID_USER_VAR to be a per-cpu variable, then load_new_mm_cr3()
can leave a note in it, for SWITCH_USER_CR3 on return to userspace to
flush user context TLB, instead of default X86_CR3_PCID_USER_NOFLUSH.
Which works well enough that there's no need to do it this way only
when invpcid is unsupported: it's a good alternative to invpcid here.
But there's a couple of inlines in asm/tlbflush.h that need to do the
same trick, so it's best to localize all this per-cpu business in
mm/kaiser.c: moving that part of the initialization from setup_pcid()
to kaiser_setup_pcid(); with kaiser_flush_tlb_on_return_to_user() the
function for noting an X86_CR3_PCID_USER_FLUSH. And let's keep a
KAISER_SHADOW_PGD_OFFSET in there, to avoid the extra OR on exit.
I did try to make the feature tests in asm/tlbflush.h more consistent
with each other: there seem to be far too many ways of performing such
tests, and I don't have a good grasp of their differences. At first
I converted them all to be static_cpu_has(): but that proved to be a
mistake, as the comment in __native_flush_tlb_single() hints; so then
I reversed and made them all this_cpu_has(). Probably all gratuitous
change, but that's the way it's working at present.
I am slightly bothered by the way non-per-cpu X86_CR3_PCID_KERN_VAR
gets re-initialized by each cpu (before and after these changes):
no problem when (as usual) all cpus on a machine have the same
features, but in principle incorrect. However, my experiment
to per-cpu-ify that one did not end well...
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Hansen [Wed, 30 Aug 2017 23:23:00 +0000 (16:23 -0700)]
kaiser: enhanced by kernel and user PCIDs
Merged performance improvements to Kaiser, using distinct kernel
and user Process Context Identifiers to minimize the TLB flushing.
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 10 Sep 2017 04:27:32 +0000 (21:27 -0700)]
kaiser: vmstat show NR_KAISERTABLE as nr_overhead
The kaiser update made an interesting choice, never to free any shadow
page tables. Contention on global spinlock was worrying, particularly
with it held across page table scans when freeing. Something had to be
done: I was going to add refcounting; but simply never to free them is
an appealing choice, minimizing contention without complicating the code
(the more a page table is found already, the less the spinlock is used).
But leaking pages in this way is also a worry: can we get away with it?
At the very least, we need a count to show how bad it actually gets:
in principle, one might end up wasting about 1/256 of memory that way
(1/512 for when direct-mapped pages have to be user-mapped, plus 1/512
for when they are user-mapped from the vmalloc area on another occasion
(but we don't have vmalloc'ed stacks, so only large ldts are vmalloc'ed).
Add per-cpu stat NR_KAISERTABLE: including 256 at startup for the
shared pgd entries, and 1 for each intermediate page table added
thereafter for user-mapping - but leave out the 1 per mm, for its
shadow pgd, because that distracts from the monotonic increase.
Shown in /proc/vmstat as nr_overhead (0 if kaiser not enabled).
In practice, it doesn't look so bad so far: more like 1/12000 after
nine hours of gtests below; and movable pageblock segregation should
tend to cluster the kaiser tables into a subset of the address space
(if not, they will be bad for compaction too). But production may
tell a different story: keep an eye on this number, and bring back
lighter freeing if it gets out of control (maybe a shrinker).
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 4 Sep 2017 01:30:43 +0000 (18:30 -0700)]
kaiser: delete KAISER_REAL_SWITCH option
We fail to see what CONFIG_KAISER_REAL_SWITCH is for: it seems to be
left over from early development, and now just obscures tricky parts
of the code. Delete it before adding PCIDs, or nokaiser boot option.
(Or if there is some good reason to keep the option, then it needs
a help text - and a "depends on KAISER", so that all those without
KAISER are not asked the question.)
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Sun, 10 Sep 2017 00:31:18 +0000 (17:31 -0700)]
kaiser: name that 0x1000 KAISER_SHADOW_PGD_OFFSET
There's a 0x1000 in various places, which looks better with a name.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Tue, 22 Aug 2017 03:11:43 +0000 (20:11 -0700)]
kaiser: cleanups while trying for gold link
While trying to get our gold link to work, four cleanups:
matched the gdt_page declaration to its definition;
in fiddling unsuccessfully with PERCPU_INPUT(), lined up backslashes;
lined up the backslashes according to convention in percpu-defs.h;
deleted the unused irq_stack_pointer addition to irq_stack_union.
Sad to report that aligning backslashes does not appear to help gold
align to 8192: but while these did not help, they are worth keeping.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 2 Oct 2017 17:57:24 +0000 (10:57 -0700)]
kaiser: kaiser_remove_mapping() move along the pgd
When removing the bogus comment from kaiser_remove_mapping(),
I really ought to have checked the extent of its bogosity: as
Neel points out, there is nothing to stop unmap_pud_range_nofree()
from continuing beyond the end of a pud (and starting in the wrong
position on the next).
Fix kaiser_remove_mapping() to constrain the extent and advance pgd
pointer correctly: use pgd_addr_end() macro as used throughout base
mm (but don't assume page-rounded start and size in this case).
But this bug was very unlikely to trigger in this backport: since
any buddy allocation is contained within a single pud extent, and
we are not using vmapped stacks (and are only mapping one page of
stack anyway): the only way to hit this bug here would be when
freeing a large modified ldt.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 4 Sep 2017 02:23:08 +0000 (19:23 -0700)]
kaiser: tidied up kaiser_add/remove_mapping slightly
Yes, unmap_pud_range_nofree()'s declaration ought to be in a
header file really, but I'm not sure we want to use it anyway:
so for now just declare it inside kaiser_remove_mapping().
And there doesn't seem to be such a thing as unmap_p4d_range(),
even in a 5-level paging tree.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 4 Sep 2017 02:18:07 +0000 (19:18 -0700)]
kaiser: tidied up asm/kaiser.h somewhat
Mainly deleting a surfeit of blank lines, and reflowing header comment.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 4 Sep 2017 01:48:02 +0000 (18:48 -0700)]
kaiser: ENOMEM if kaiser_pagetable_walk() NULL
kaiser_add_user_map() took no notice when kaiser_pagetable_walk() failed.
And avoid its might_sleep() when atomic (though atomic at present unused).
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Wed, 23 Aug 2017 21:21:14 +0000 (14:21 -0700)]
kaiser: fix perf crashes
Avoid perf crashes: place debug_store in the user-mapped per-cpu area
instead of allocating, and use page allocator plus kaiser_add_mapping()
to keep the BTS and PEBS buffers user-mapped (that is, present in the
user mapping, though visible only to kernel and hardware). The PEBS
fixup buffer does not need this treatment.
The need for a user-mapped struct debug_store showed up before doing
any conscious perf testing: in a couple of kernel paging oopses on
Westmere, implicating the debug_store offset of the per-cpu area.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Fri, 22 Sep 2017 03:39:56 +0000 (20:39 -0700)]
kaiser: fix regs to do_nmi() ifndef CONFIG_KAISER
pjt has observed that nmi's second (nmi_from_kernel) call to do_nmi()
adjusted the %rdi regs arg, rightly when CONFIG_KAISER, but wrongly
when not CONFIG_KAISER.
Although the minimal change is to add an #ifdef CONFIG_KAISER around
the addq line, that looks cluttered, and I prefer how the first call
to do_nmi() handled it: prepare args in %rdi and %rsi before getting
into the CONFIG_KAISER block, since it does not touch them at all.
And while we're here, place the "#ifdef CONFIG_KAISER" that follows
each, to enclose the "Unconditionally restore CR3" comment: matching
how the "Unconditionally use kernel CR3" comment above is enclosed.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Wed, 13 Sep 2017 21:03:10 +0000 (14:03 -0700)]
kaiser: KAISER depends on SMP
It is absurd that KAISER should depend on SMP, but apparently nobody
has tried a UP build before: which breaks on implicit declaration of
function 'per_cpu_offset' in arch/x86/mm/kaiser.c.
Now, you would expect that to be trivially fixed up; but looking at
the System.map when that block is #ifdef'ed out of kaiser_init(),
I see that in a UP build __per_cpu_user_mapped_end is precisely at
__per_cpu_user_mapped_start, and the items carefully gathered into
that section for user-mapping on SMP, dispersed elsewhere on UP.
So, some other kind of section assignment will be needed on UP,
but implementing that is not a priority: just make KAISER depend
on SMP for now.
Also inserted a blank line before the option, tidied up the
brief Kconfig help message, and added an "If unsure, Y".
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 4 Sep 2017 00:09:44 +0000 (17:09 -0700)]
kaiser: fix build and FIXME in alloc_ldt_struct()
Include linux/kaiser.h instead of asm/kaiser.h to build ldt.c without
CONFIG_KAISER. kaiser_add_mapping() does already return an error code,
so fix the FIXME.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Mon, 4 Sep 2017 01:57:03 +0000 (18:57 -0700)]
kaiser: stack map PAGE_SIZE at THREAD_SIZE-PAGE_SIZE
Kaiser only needs to map one page of the stack; and
kernel/fork.c did not build on powerpc (no __PAGE_KERNEL).
It's all cleaner if linux/kaiser.h provides kaiser_map_thread_stack()
and kaiser_unmap_thread_stack() wrappers around asm/kaiser.h's
kaiser_add_mapping() and kaiser_remove_mapping(). And use
linux/kaiser.h in init/main.c to avoid the #ifdefs there.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hugh Dickins [Tue, 5 Sep 2017 19:05:01 +0000 (12:05 -0700)]
kaiser: do not set _PAGE_NX on pgd_none
native_pgd_clear() uses native_set_pgd(), so native_set_pgd() must
avoid setting the _PAGE_NX bit on an otherwise pgd_none() entry:
usually that just generated a warning on exit, but sometimes
more mysterious and damaging failures (our production machines
could not complete booting).
The original fix to this just avoided adding _PAGE_NX to
an empty entry; but eventually more problems surfaced with kexec,
and EFI mapping expected to be a problem too. So now instead
change native_set_pgd() to update shadow only if _PAGE_USER:
A few places (kernel/machine_kexec_64.c, platform/efi/efi_64.c for sure)
use set_pgd() to set up a temporary internal virtual address space, with
physical pages remapped at what Kaiser regards as userspace addresses:
Kaiser then assumes a shadow pgd follows, which it will try to corrupt.
This appears to be responsible for the recent kexec and kdump failures;
though it's unclear how those did not manifest as a problem before.
Ah, the shadow pgd will only be assumed to "follow" if the requested
pgd is on an even-numbered page: so I suppose it was going wrong 50%
of the time all along.
What we need is a flag to set_pgd(), to tell it we're dealing with
userspace. Er, isn't that what the pgd's _PAGE_USER bit is saying?
Add a test for that. But we cannot do the same for pgd_clear()
(which may be called to clear corrupted entries - set aside the
question of "corrupt in which pgd?" until later), so there just
rely on pgd_clear() not being called in the problematic cases -
with a WARN_ON_ONCE() which should fire half the time if it is.
But this is getting too big for an inline function: move it into
arch/x86/mm/kaiser.c (which then demands a boot/compressed mod);
and de-void and de-space native_get_shadow/normal_pgd() while here.
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dave Hansen [Wed, 30 Aug 2017 23:23:00 +0000 (16:23 -0700)]
kaiser: merged update
Merged fixes and cleanups, rebased to 4.4.89 tree (no 5-level paging).
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Richard Fellner [Thu, 4 May 2017 12:26:50 +0000 (14:26 +0200)]
KAISER: Kernel Address Isolation
This patch introduces our implementation of KAISER (Kernel Address Isolation to
have Side-channels Efficiently Removed), a kernel isolation technique to close
hardware side channels on kernel address information.
More information about the patch can be found on:
https://github.com/IAIK/KAISER
From: Richard Fellner <richard.fellner@student.tugraz.at>
From: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
X-Subject: [RFC, PATCH] x86_64: KAISER - do not map kernel in user mode
Date: Thu, 4 May 2017 14:26:50 +0200
Link: http://marc.info/?l=linux-kernel&m=149390087310405&w=2
Kaiser-4.10-SHA1:
c4b1831d44c6144d3762ccc72f0c4e71a0c713e5
To: <linux-kernel@vger.kernel.org>
To: <kernel-hardening@lists.openwall.com>
Cc: <clementine.maurice@iaik.tugraz.at>
Cc: <moritz.lipp@iaik.tugraz.at>
Cc: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Cc: Richard Fellner <richard.fellner@student.tugraz.at>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: <kirill.shutemov@linux.intel.com>
Cc: <anders.fogh@gdata-adan.de>
After several recent works [1,2,3] KASLR on x86_64 was basically
considered dead by many researchers. We have been working on an
efficient but effective fix for this problem and found that not mapping
the kernel space when running in user mode is the solution to this
problem [4] (the corresponding paper [5] will be presented at ESSoS17).
With this RFC patch we allow anybody to configure their kernel with the
flag CONFIG_KAISER to add our defense mechanism.
If there are any questions we would love to answer them.
We also appreciate any comments!
Cheers,
Daniel (+ the KAISER team from Graz University of Technology)
[1] http://www.ieee-security.org/TC/SP2013/papers/
4977a191.pdf
[2] https://www.blackhat.com/docs/us-16/materials/us-16-Fogh-Using-Undocumented-CPU-Behaviour-To-See-Into-Kernel-Mode-And-Break-KASLR-In-The-Process.pdf
[3] https://www.blackhat.com/docs/us-16/materials/us-16-Jang-Breaking-Kernel-Address-Space-Layout-Randomization-KASLR-With-Intel-TSX.pdf
[4] https://github.com/IAIK/KAISER
[5] https://gruss.cc/files/kaiser.pdf
[patch based also on
https://raw.githubusercontent.com/IAIK/KAISER/master/KAISER/0001-KAISER-Kernel-Address-Isolation.patch]
Signed-off-by: Richard Fellner <richard.fellner@student.tugraz.at>
Signed-off-by: Moritz Lipp <moritz.lipp@iaik.tugraz.at>
Signed-off-by: Daniel Gruss <daniel.gruss@iaik.tugraz.at>
Signed-off-by: Michael Schwarz <michael.schwarz@iaik.tugraz.at>
Acked-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tom Lendacky [Mon, 17 Jul 2017 21:10:33 +0000 (16:10 -0500)]
x86/boot: Add early cmdline parsing for options with arguments
commit
e505371dd83963caae1a37ead9524e8d997341be upstream.
Add a cmdline_find_option() function to look for cmdline options that
take arguments. The argument is returned in a supplied buffer and the
argument length (regardless of whether it fits in the supplied buffer)
is returned, with -1 indicating not found.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brijesh Singh <brijesh.singh@amd.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Larry Woodman <lwoodman@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Matt Fleming <matt@codeblueprint.co.uk>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Toshimitsu Kani <toshi.kani@hpe.com>
Cc: kasan-dev@googlegroups.com
Cc: kvm@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Cc: linux-doc@vger.kernel.org
Cc: linux-efi@vger.kernel.org
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/36b5f97492a9745dce27682305f990fc20e5cf8a.1500319216.git.thomas.lendacky@amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Rosenberg [Tue, 2 Jan 2018 22:44:49 +0000 (14:44 -0800)]
ANDROID: sdcardfs: Add default_normal option
The default_normal option causes mounts with the gid set to
AID_SDCARD_RW to have user specific gids, as in the normal case.
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Change-Id: I9619b8ac55f41415df943484dc8db1ea986cef6f
Bug:
64672411
Daniel Rosenberg [Thu, 21 Dec 2017 00:59:11 +0000 (16:59 -0800)]
ANDROID: sdcardfs: notify lower file of opens
fsnotify_open is not called within dentry_open,
so we need to call it ourselves.
Change-Id: Ia7f323b3d615e6ca5574e114e8a5d7973fb4c119
Signed-off-by: Daniel Rosenberg <drosen@google.com>
Bug:
70706497
Greg Kroah-Hartman [Tue, 2 Jan 2018 19:58:26 +0000 (20:58 +0100)]
Merge 4.4.109 into android-4.4
Changes in 4.4.109
ACPI: APEI / ERST: Fix missing error handling in erst_reader()
crypto: mcryptd - protect the per-CPU queue with a lock
mfd: cros ec: spi: Don't send first message too soon
mfd: twl4030-audio: Fix sibling-node lookup
mfd: twl6040: Fix child-node lookup
ALSA: rawmidi: Avoid racy info ioctl via ctl device
ALSA: usb-audio: Fix the missing ctl name suffix at parsing SU
PCI / PM: Force devices to D0 in pci_pm_thaw_noirq()
parisc: Hide Diva-built-in serial aux and graphics card
spi: xilinx: Detect stall with Unknown commands
KVM: X86: Fix load RFLAGS w/o the fixed bit
kvm: x86: fix RSM when PCID is non-zero
powerpc/perf: Dereference BHRB entries safely
net: mvneta: clear interface link status on port disable
tracing: Remove extra zeroing out of the ring buffer page
tracing: Fix possible double free on failure of allocating trace buffer
tracing: Fix crash when it fails to alloc ring buffer
ring-buffer: Mask out the info bits when returning buffer page length
iw_cxgb4: Only validate the MSN for successful completions
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
ASoC: twl4030: fix child-node lookup
ALSA: hda: Drop useless WARN_ON()
ALSA: hda - fix headset mic detection issue on a Dell machine
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
x86/mm: Make flush_tlb_mm_range() more predictable
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
x86/mm: Disable PCID on 32-bit kernels
x86/mm: Add the 'nopcid' boot option to turn off PCID
x86/mm: Enable CR4.PCIDE on supported systems
x86/mm/64: Fix reboot interaction with CR4.PCIDE
kbuild: add '-fno-stack-check' to kernel build options
ipv4: igmp: guard against silly MTU values
ipv6: mcast: better catch silly mtu values
net: igmp: Use correct source address on IGMPv3 reports
netlink: Add netns check on taps
net: qmi_wwan: add Sierra EM7565 1199:9091
net: reevalulate autoflowlabel setting after sysctl setting
tcp md5sig: Use skb's saddr when replying to an incoming segment
tg3: Fix rx hang on MTU change with 5717/5719
net: ipv4: fix for a race condition in raw_sendmsg
net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case
sctp: Replace use of sockets_allocated with specified macro.
ipv4: Fix use-after-free when flushing FIB tables
net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
net: Fix double free and memory corruption in get_net_ns_by_id()
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
sock: free skb in skb_complete_tx_timestamp on error
usbip: fix usbip bind writing random string after command in match_busid
usbip: stub: stop printing kernel pointer addresses in messages
usbip: vhci: stop printing kernel pointer addresses in messages
USB: serial: ftdi_sio: add id for Airbus DS P8GR
USB: serial: qcserial: add Sierra Wireless EM7565
USB: serial: option: add support for Telit ME910 PID 0x1101
USB: serial: option: adding support for YUGA CLM920-NC5
usb: Add device quirk for Logitech HD Pro Webcam C925e
usb: add RESET_RESUME for ELSA MicroLink 56K
USB: Fix off by one in type-specific length check of BOS SSP capability
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
x86/smpboot: Remove stale TLB flush invocations
n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP
Linux 4.4.109
Signed-off-by: Greg Kroah-Hartman <gregkh@google.com>
Greg Kroah-Hartman [Tue, 2 Jan 2018 19:33:28 +0000 (20:33 +0100)]
Linux 4.4.109
Andy Lutomirski [Mon, 5 Jun 2017 14:40:25 +0000 (07:40 -0700)]
mm/vmstat: Make NR_TLB_REMOTE_FLUSH_RECEIVED available even on UP
commit
5dd0b16cdaff9b94da06074d5888b03235c0bf17 upstream.
This fixes CONFIG_SMP=n, CONFIG_DEBUG_TLBFLUSH=y without introducing
further #ifdef soup. Caught by a Kbuild bot randconfig build.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes:
ce4a4e565f52 ("x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code")
Link: http://lkml.kernel.org/r/76da9a3cc4415996f2ad2c905b93414add322021.1496673616.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Thu, 21 Dec 2017 01:57:06 +0000 (17:57 -0800)]
n_tty: fix EXTPROC vs ICANON interaction with TIOCINQ (aka FIONREAD)
commit
966031f340185eddd05affcf72b740549f056348 upstream.
We added support for EXTPROC back in 2010 in commit
26df6d13406d ("tty:
Add EXTPROC support for LINEMODE") and the intent was to allow it to
override some (all?) ICANON behavior. Quoting from that original commit
message:
There is a new bit in the termios local flag word, EXTPROC.
When this bit is set, several aspects of the terminal driver
are disabled. Input line editing, character echo, and mapping
of signals are all disabled. This allows the telnetd to turn
off these functions when in linemode, but still keep track of
what state the user wants the terminal to be in.
but the problem turns out that "several aspects of the terminal driver
are disabled" is a bit ambiguous, and you can really confuse the n_tty
layer by setting EXTPROC and then causing some of the ICANON invariants
to no longer be maintained.
This fixes at least one such case (TIOCINQ) becoming unhappy because of
the confusion over whether ICANON really means ICANON when EXTPROC is set.
This basically makes TIOCINQ match the case of read: if EXTPROC is set,
we ignore ICANON. Also, make sure to reset the ICANON state ie EXTPROC
changes, not just if ICANON changes.
Fixes:
26df6d13406d ("tty: Add EXTPROC support for LINEMODE")
Reported-by: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Reported-by: syzkaller <syzkaller@googlegroups.com>
Cc: Jiri Slaby <jslaby@suse.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Gleixner [Sat, 30 Dec 2017 21:13:53 +0000 (22:13 +0100)]
x86/smpboot: Remove stale TLB flush invocations
commit
322f8b8b340c824aef891342b0f5795d15e11562 upstream.
smpboot_setup_warm_reset_vector() and smpboot_restore_warm_reset_vector()
invoke local_flush_tlb() for no obvious reason.
Digging in history revealed that the original code in the 2.1 era added
those because the code manipulated a swapper_pg_dir pagetable entry. The
pagetable manipulation was removed long ago in the 2.3 timeframe, but the
TLB flush invocations stayed around forever.
Remove them along with the pointless pr_debug()s which come from the same 2.1
change.
Reported-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20171230211829.586548655@linutronix.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Thomas Gleixner [Fri, 22 Dec 2017 14:51:13 +0000 (15:51 +0100)]
nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
commit
5d62c183f9e9df1deeea0906d099a94e8a43047a upstream.
The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:
if ((idle_cpu(cpu) && !need_resched()) || tick_nohz_full_cpu(cpu))
If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
softirqd. need_resched() is not true because the current interrupted task
takes precedence over softirqd.
Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So it returns an immediate expiry request,
which causes the timer to fire immediately again. Lather, rinse and
repeat....
Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.
Reported-by: Sebastian Siewior <bigeasy@linutronix.d>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paul McKenney <paulmck@linux.vnet.ibm.com>
Cc: Anna-Maria Gleixner <anna-maria@linutronix.de>
Cc: Sebastian Siewior <bigeasy@linutronix.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniel Thompson [Thu, 21 Dec 2017 13:06:15 +0000 (15:06 +0200)]
usb: xhci: Add XHCI_TRUST_TX_LENGTH for Renesas uPD720201
commit
da99706689481717998d1d48edd389f339eea979 upstream.
When plugging in a USB webcam I see the following message:
xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs
XHCI_TRUST_TX_LENGTH quirk?
handle_tx_event: 913 callbacks suppressed
All is quiet again with this patch (and I've done a fair but of soak
testing with the camera since).
Signed-off-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mathias Nyman [Tue, 19 Dec 2017 09:14:42 +0000 (11:14 +0200)]
USB: Fix off by one in type-specific length check of BOS SSP capability
commit
07b9f12864d16c3a861aef4817eb1efccbc5d0e6 upstream.
USB 3.1 devices are not detected as 3.1 capable since 4.15-rc3 due to a
off by one in commit
81cf4a45360f ("USB: core: Add type-specific length
check of BOS descriptors")
It uses USB_DT_USB_SSP_CAP_SIZE() to get SSP capability size which takes
the zero based SSAC as argument, not the actual count of sublink speed
attributes.
USB3 spec 9.6.2.5 says "The number of Sublink Speed Attributes = SSAC + 1."
The type-specific length check patch was added to stable and needs to be
fixed there as well
Fixes:
81cf4a45360f ("USB: core: Add type-specific length check of BOS descriptors")
CC: Masakazu Mokuno <masakazu.mokuno@gmail.com>
Signed-off-by: Mathias Nyman <mathias.nyman@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Oliver Neukum [Tue, 12 Dec 2017 15:11:30 +0000 (16:11 +0100)]
usb: add RESET_RESUME for ELSA MicroLink 56K
commit
b9096d9f15c142574ebebe8fbb137012bb9d99c2 upstream.
This modem needs this quirk to operate. It produces timeouts when
resumed without reset.
Signed-off-by: Oliver Neukum <oneukum@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Dmitry Fleytman Dmitry Fleytman [Tue, 19 Dec 2017 04:02:04 +0000 (06:02 +0200)]
usb: Add device quirk for Logitech HD Pro Webcam C925e
commit
7f038d256c723dd390d2fca942919573995f4cfd upstream.
Commit
e0429362ab15
("usb: Add device quirk for Logitech HD Pro Webcams C920 and C930e")
introduced quirk to workaround an issue with some Logitech webcams.
There is one more model that has the same issue - C925e, so applying
the same quirk as well.
See aforementioned commit message for detailed explanation of the problem.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
SZ Lin (林上智) [Tue, 19 Dec 2017 09:40:32 +0000 (17:40 +0800)]
USB: serial: option: adding support for YUGA CLM920-NC5
commit
3920bb713038810f25770e7545b79f204685c8f2 upstream.
This patch adds support for YUGA CLM920-NC5 PID 0x9625 USB modem to option
driver.
Interface layout:
0: QCDM/DIAG
1: ADB
2: MODEM
3: AT
4: RMNET
Signed-off-by: Taiyi Wu <taiyity.wu@moxa.com>
Signed-off-by: SZ Lin (林上智) <sz.lin@moxa.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Daniele Palmas [Thu, 14 Dec 2017 15:54:45 +0000 (16:54 +0100)]
USB: serial: option: add support for Telit ME910 PID 0x1101
commit
08933099e6404f588f81c2050bfec7313e06eeaf upstream.
This patch adds support for PID 0x1101 of Telit ME910.
Signed-off-by: Daniele Palmas <dnlplm@gmail.com>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reinhard Speyerer [Thu, 14 Dec 2017 23:39:27 +0000 (00:39 +0100)]
USB: serial: qcserial: add Sierra Wireless EM7565
commit
92a18a657fb2e2ffbfa0659af32cc18fd2346516 upstream.
Sierra Wireless EM7565 devices use the QCSERIAL_SWI layout for their
serial ports
T: Bus=01 Lev=03 Prnt=29 Port=01 Cnt=02 Dev#= 31 Spd=480 MxCh= 0
D: Ver= 2.00 Cls=00(>ifc ) Sub=00 Prot=00 MxPS=64 #Cfgs= 1
P: Vendor=1199 ProdID=9091 Rev= 0.06
S: Manufacturer=Sierra Wireless, Incorporated
S: Product=Sierra Wireless EM7565 Qualcomm Snapdragon X16 LTE-A
S: SerialNumber=xxxxxxxx
C:* #Ifs= 4 Cfg#= 1 Atr=a0 MxPwr=500mA
I:* If#= 0 Alt= 0 #EPs= 2 Cls=ff(vend.) Sub=ff Prot=ff Driver=qcserial
E: Ad=81(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=01(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 2 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=qcserial
E: Ad=83(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=82(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=02(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 3 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=00 Prot=00 Driver=qcserial
E: Ad=85(I) Atr=03(Int.) MxPS= 10 Ivl=32ms
E: Ad=84(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=03(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
I:* If#= 8 Alt= 0 #EPs= 3 Cls=ff(vend.) Sub=ff Prot=ff Driver=qmi_wwan
E: Ad=86(I) Atr=03(Int.) MxPS= 8 Ivl=32ms
E: Ad=8e(I) Atr=02(Bulk) MxPS= 512 Ivl=0ms
E: Ad=0f(O) Atr=02(Bulk) MxPS= 512 Ivl=0ms
but need sendsetup = true for the NMEA port to make it work properly.
Simplify the patch compared to v1 as suggested by Bjørn Mork by taking
advantage of the fact that existing devices work with sendsetup = true
too.
Use sendsetup = true for the NMEA interface of QCSERIAL_SWI and add
DEVICE_SWI entries for the EM7565 PID 0x9091 and the EM7565 QDL PID
0x9090.
Tests with several MC73xx/MC74xx/MC77xx devices have been performed in
order to verify backward compatibility.
Signed-off-by: Reinhard Speyerer <rspmn@arcor.de>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Max Schulze [Wed, 20 Dec 2017 19:47:44 +0000 (20:47 +0100)]
USB: serial: ftdi_sio: add id for Airbus DS P8GR
commit
c6a36ad383559a60a249aa6016cebf3cb8b6c485 upstream.
Add AIRBUS_DS_P8GR device IDs to ftdi_sio driver.
Signed-off-by: Max Schulze <max.schulze@posteo.de>
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shuah Khan [Tue, 19 Dec 2017 00:24:22 +0000 (17:24 -0700)]
usbip: vhci: stop printing kernel pointer addresses in messages
commit
8272d099d05f7ab2776cf56a2ab9f9443be18907 upstream.
Remove and/or change debug, info. and error messages to not print
kernel pointer addresses.
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shuah Khan [Tue, 19 Dec 2017 00:23:37 +0000 (17:23 -0700)]
usbip: stub: stop printing kernel pointer addresses in messages
commit
248a22044366f588d46754c54dfe29ffe4f8b4df upstream.
Remove and/or change debug, info. and error messages to not print
kernel pointer addresses.
Signed-off-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Juan Zea [Fri, 15 Dec 2017 09:21:20 +0000 (10:21 +0100)]
usbip: fix usbip bind writing random string after command in match_busid
commit
544c4605acc5ae4afe7dd5914147947db182f2fb upstream.
usbip bind writes commands followed by random string when writing to
match_busid attribute in sysfs, caused by using full variable size
instead of string length.
Signed-off-by: Juan Zea <juan.zea@qindel.com>
Acked-by: Shuah Khan <shuahkh@osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Willem de Bruijn [Wed, 13 Dec 2017 19:41:06 +0000 (14:41 -0500)]
sock: free skb in skb_complete_tx_timestamp on error
[ Upstream commit
35b99dffc3f710cafceee6c8c6ac6a98eb2cb4bf ]
skb_complete_tx_timestamp must ingest the skb it is passed. Call
kfree_skb if the skb cannot be enqueued.
Fixes:
b245be1f4db1 ("net-timestamp: no-payload only sysctl")
Fixes:
9ac25fc06375 ("net: fix socket refcounting in skb_complete_tx_timestamp()")
Reported-by: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Grygorii Strashko [Thu, 21 Dec 2017 00:45:10 +0000 (18:45 -0600)]
net: phy: micrel: ksz9031: reconfigure autoneg after phy autoneg workaround
[ Upstream commit
c1a8d0a3accf64a014d605e6806ce05d1c17adf1 ]
Under some circumstances driver will perform PHY reset in
ksz9031_read_status() to fix autoneg failure case (idle error count =
0xFF). When this happens ksz9031 will not detect link status change any
more when connecting to Netgear 1G switch (link can be recovered sometimes by
restarting netdevice "ifconfig down up"). Reproduced with TI am572x board
equipped with ksz9031 PHY while connecting to Netgear 1G switch.
Fix the issue by reconfiguring autonegotiation after PHY reset in
ksz9031_read_status().
Fixes:
d2fd719bcb0e ("net/phy: micrel: Add workaround for bad autoneg")
Signed-off-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric W. Biederman [Tue, 19 Dec 2017 17:27:56 +0000 (11:27 -0600)]
net: Fix double free and memory corruption in get_net_ns_by_id()
[ Upstream commit
21b5944350052d2583e82dd59b19a9ba94a007f0 ]
(I can trivially verify that that idr_remove in cleanup_net happens
after the network namespace count has dropped to zero --EWB)
Function get_net_ns_by_id() does not check for net::count
after it has found a peer in netns_ids idr.
It may dereference a peer, after its count has already been
finaly decremented. This leads to double free and memory
corruption:
put_net(peer) rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0] ...
__put_net(peer) get_net_ns_by_id(net, id)
spin_lock(&cleanup_list_lock)
list_add(&net->cleanup_list, &cleanup_list)
spin_unlock(&cleanup_list_lock)
queue_work() peer = idr_find(&net->netns_ids, id)
| get_net(peer) [count=1]
| ...
| (use after final put)
v ...
cleanup_net() ...
spin_lock(&cleanup_list_lock) ...
list_replace_init(&cleanup_list, ..) ...
spin_unlock(&cleanup_list_lock) ...
... ...
... put_net(peer)
... atomic_dec_and_test(&peer->count) [count=0]
... spin_lock(&cleanup_list_lock)
... list_add(&net->cleanup_list, &cleanup_list)
... spin_unlock(&cleanup_list_lock)
... queue_work()
... rtnl_unlock()
rtnl_lock() ...
for_each_net(tmp) { ...
id = __peernet2id(tmp, peer) ...
spin_lock_irq(&tmp->nsid_lock) ...
idr_remove(&tmp->netns_ids, id) ...
... ...
net_drop_ns() ...
net_free(peer) ...
} ...
|
v
cleanup_net()
...
(Second free of peer)
Also, put_net() on the right cpu may reorder with left's cpu
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.
Since cleanup_net() is executed in worker thread, while
put_net(peer) can happen everywhere, there should be
enough time for concurrent get_net_ns_by_id() to pick
the peer up, and the race does not seem to be unlikely.
The patch fixes the problem in standard way.
(Also, there is possible problem in peernet2id_alloc(), which requires
check for net::count under nsid_lock and maybe_get_net(peer), but
in current stable kernel it's used under rtnl_lock() and it has to be
safe. Openswitch begun to use peernet2id_alloc(), and possibly it should
be fixed too. While this is not in stable kernel yet, so I'll send
a separate message to netdev@ later).
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes:
0c7aecd4bde4 "netns: add rtnl cmd to add and get peer netns ids"
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Nikolay Aleksandrov [Mon, 18 Dec 2017 15:35:09 +0000 (17:35 +0200)]
net: bridge: fix early call to br_stp_change_bridge_id and plug newlink leaks
[ Upstream commit
84aeb437ab98a2bce3d4b2111c79723aedfceb33 ]
The early call to br_stp_change_bridge_id in bridge's newlink can cause
a memory leak if an error occurs during the newlink because the fdb
entries are not cleaned up if a different lladdr was specified, also
another minor issue is that it generates fdb notifications with
ifindex = 0. Another unrelated memory leak is the bridge sysfs entries
which get added on NETDEV_REGISTER event, but are not cleaned up in the
newlink error path. To remove this special case the call to
br_stp_change_bridge_id is done after netdev register and we cleanup the
bridge on changelink error via br_dev_delete to plug all leaks.
This patch makes netlink bridge destruction on newlink error the same as
dellink and ioctl del which is necessary since at that point we have a
fully initialized bridge device.
To reproduce the issue:
$ ip l add br0 address 00:11:22:33:44:55 type bridge group_fwd_mask 1
RTNETLINK answers: Invalid argument
$ rmmod bridge
[ 1822.142525] =============================================================================
[ 1822.143640] BUG bridge_fdb_cache (Tainted: G O ): Objects remaining in bridge_fdb_cache on __kmem_cache_shutdown()
[ 1822.144821] -----------------------------------------------------------------------------
[ 1822.145990] Disabling lock debugging due to kernel taint
[ 1822.146732] INFO: Slab 0x0000000092a844b2 objects=32 used=2 fp=0x00000000fef011b0 flags=0x1ffff8000000100
[ 1822.147700] CPU: 2 PID: 13584 Comm: rmmod Tainted: G B O 4.15.0-rc2+ #87
[ 1822.148578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.7.5-20140531_083030-gandalf 04/01/2014
[ 1822.150008] Call Trace:
[ 1822.150510] dump_stack+0x78/0xa9
[ 1822.151156] slab_err+0xb1/0xd3
[ 1822.151834] ? __kmalloc+0x1bb/0x1ce
[ 1822.152546] __kmem_cache_shutdown+0x151/0x28b
[ 1822.153395] shutdown_cache+0x13/0x144
[ 1822.154126] kmem_cache_destroy+0x1c0/0x1fb
[ 1822.154669] SyS_delete_module+0x194/0x244
[ 1822.155199] ? trace_hardirqs_on_thunk+0x1a/0x1c
[ 1822.155773] entry_SYSCALL_64_fastpath+0x23/0x9a
[ 1822.156343] RIP: 0033:0x7f929bd38b17
[ 1822.156859] RSP: 002b:
00007ffd160e9a98 EFLAGS:
00000202 ORIG_RAX:
00000000000000b0
[ 1822.157728] RAX:
ffffffffffffffda RBX:
00005578316ba090 RCX:
00007f929bd38b17
[ 1822.158422] RDX:
00007f929bd9ec60 RSI:
0000000000000800 RDI:
00005578316ba0f0
[ 1822.159114] RBP:
0000000000000003 R08:
00007f929bff5f20 R09:
00007ffd160e8a11
[ 1822.159808] R10:
00007ffd160e9860 R11:
0000000000000202 R12:
00007ffd160e8a80
[ 1822.160513] R13:
0000000000000000 R14:
0000000000000000 R15:
00005578316ba090
[ 1822.161278] INFO: Object 0x000000007645de29 @offset=0
[ 1822.161666] INFO: Object 0x00000000d5df2ab5 @offset=128
Fixes:
30313a3d5794 ("bridge: Handle IFLA_ADDRESS correctly when creating bridge device")
Fixes:
5b8d5429daa0 ("bridge: netlink: register netdevice before executing changelink")
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Ido Schimmel [Wed, 20 Dec 2017 17:34:19 +0000 (19:34 +0200)]
ipv4: Fix use-after-free when flushing FIB tables
[ Upstream commit
b4681c2829e24943aadd1a7bb3a30d41d0a20050 ]
Since commit
0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse") the
local table uses the same trie allocated for the main table when custom
rules are not in use.
When a net namespace is dismantled, the main table is flushed and freed
(via an RCU callback) before the local table. In case the callback is
invoked before the local table is iterated, a use-after-free can occur.
Fix this by iterating over the FIB tables in reverse order, so that the
main table is always freed after the local table.
v3: Reworded comment according to Alex's suggestion.
v2: Add a comment to make the fix more explicit per Dave's and Alex's
feedback.
Fixes:
0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tonghao Zhang [Fri, 22 Dec 2017 18:15:20 +0000 (10:15 -0800)]
sctp: Replace use of sockets_allocated with specified macro.
[ Upstream commit
8cb38a602478e9f806571f6920b0a3298aabf042 ]
The patch(
180d8cd942ce) replaces all uses of struct sock fields'
memory_pressure, memory_allocated, sockets_allocated, and sysctl_mem
to accessor macros. But the sockets_allocated field of sctp sock is
not replaced at all. Then replace it now for unifying the code.
Fixes:
180d8cd942ce ("foundations of per-cgroup memory pressure controlling.")
Cc: Glauber Costa <glommer@parallels.com>
Signed-off-by: Tonghao Zhang <zhangtonghao@didichuxing.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tobias Jordan [Wed, 6 Dec 2017 14:23:23 +0000 (15:23 +0100)]
net: mvmdio: disable/unprepare clocks in EPROBE_DEFER case
[ Upstream commit
589bf32f09852041fbd3b7ce1a9e703f95c230ba ]
add appropriate calls to clk_disable_unprepare() by jumping to out_mdio
in case orion_mdio_probe() returns -EPROBE_DEFER.
Found by Linux Driver Verification project (linuxtesting.org).
Fixes:
3d604da1e954 ("net: mvmdio: get and enable optional clock")
Signed-off-by: Tobias Jordan <Tobias.Jordan@elektrobit.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Mohamed Ghannam [Sun, 10 Dec 2017 03:50:58 +0000 (03:50 +0000)]
net: ipv4: fix for a race condition in raw_sendmsg
[ Upstream commit
8f659a03a0ba9289b9aeb9b4470e6fb263d6f483 ]
inet->hdrincl is racy, and could lead to uninitialized stack pointer
usage, so its value should be read only once.
Fixes:
c008ba5bdc9f ("ipv4: Avoid reading user iov twice after raw_probe_proto_opt")
Signed-off-by: Mohamed Ghannam <simo.ghannam@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Brian King [Fri, 15 Dec 2017 21:21:50 +0000 (15:21 -0600)]
tg3: Fix rx hang on MTU change with 5717/5719
[ Upstream commit
748a240c589824e9121befb1cba5341c319885bc ]
This fixes a hang issue seen when changing the MTU size from 1500 MTU
to 9000 MTU on both 5717 and 5719 chips. In discussion with Broadcom,
they've indicated that these chipsets have the same phy as the 57766
chipset, so the same workarounds apply. This has been tested by IBM
on both Power 8 and Power 9 systems as well as by Broadcom on x86
hardware and has been confirmed to resolve the hang issue.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Christoph Paasch [Mon, 11 Dec 2017 08:05:46 +0000 (00:05 -0800)]
tcp md5sig: Use skb's saddr when replying to an incoming segment
[ Upstream commit
30791ac41927ebd3e75486f9504b6d2280463bf0 ]
The MD5-key that belongs to a connection is identified by the peer's
IP-address. When we are in tcp_v4(6)_reqsk_send_ack(), we are replying
to an incoming segment from tcp_check_req() that failed the seq-number
checks.
Thus, to find the correct key, we need to use the skb's saddr and not
the daddr.
This bug seems to have been there since quite a while, but probably got
unnoticed because the consequences are not catastrophic. We will call
tcp_v4_reqsk_send_ack only to send a challenge-ACK back to the peer,
thus the connection doesn't really fail.
Fixes:
9501f9722922 ("tcp md5sig: Let the caller pass appropriate key for tcp_v{4,6}_do_calc_md5_hash().")
Signed-off-by: Christoph Paasch <cpaasch@apple.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Shaohua Li [Wed, 20 Dec 2017 20:10:21 +0000 (12:10 -0800)]
net: reevalulate autoflowlabel setting after sysctl setting
[ Upstream commit
513674b5a2c9c7a67501506419da5c3c77ac6f08 ]
sysctl.ip6.auto_flowlabels is default 1. In our hosts, we set it to 2.
If sockopt doesn't set autoflowlabel, outcome packets from the hosts are
supposed to not include flowlabel. This is true for normal packet, but
not for reset packet.
The reason is ipv6_pinfo.autoflowlabel is set in sock creation. Later if
we change sysctl.ip6.auto_flowlabels, the ipv6_pinfo.autoflowlabel isn't
changed, so the sock will keep the old behavior in terms of auto
flowlabel. Reset packet is suffering from this problem, because reset
packet is sent from a special control socket, which is created at boot
time. Since sysctl.ipv6.auto_flowlabels is 1 by default, the control
socket will always have its ipv6_pinfo.autoflowlabel set, even after
user set sysctl.ipv6.auto_flowlabels to 1, so reset packset will always
have flowlabel. Normal sock created before sysctl setting suffers from
the same issue. We can't even turn off autoflowlabel unless we kill all
socks in the hosts.
To fix this, if IPV6_AUTOFLOWLABEL sockopt is used, we use the
autoflowlabel setting from user, otherwise we always call
ip6_default_np_autolabel() which has the new settings of sysctl.
Note, this changes behavior a little bit. Before commit
42240901f7c4
(ipv6: Implement different admin modes for automatic flow labels), the
autoflowlabel behavior of a sock isn't sticky, eg, if sysctl changes,
existing connection will change autoflowlabel behavior. After that
commit, autoflowlabel behavior is sticky in the whole life of the sock.
With this patch, the behavior isn't sticky again.
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Tom Herbert <tom@quantonium.net>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Sebastian Sjoholm [Mon, 11 Dec 2017 20:51:14 +0000 (21:51 +0100)]
net: qmi_wwan: add Sierra EM7565 1199:9091
[ Upstream commit
aceef61ee56898cfa7b6960fb60b9326c3860441 ]
Sierra Wireless EM7565 is an Qualcomm MDM9x50 based M.2 modem.
The USB id is added to qmi_wwan.c to allow QMI communication
with the EM7565.
Signed-off-by: Sebastian Sjoholm <ssjoholm@mac.com>
Acked-by: Bjørn Mork <bjorn@mork.no>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kevin Cernekee [Wed, 6 Dec 2017 20:12:27 +0000 (12:12 -0800)]
netlink: Add netns check on taps
[ Upstream commit
93c647643b48f0131f02e45da3bd367d80443291 ]
Currently, a nlmon link inside a child namespace can observe systemwide
netlink activity. Filter the traffic so that nlmon can only sniff
netlink messages from its own netns.
Test case:
vpnns -- bash -c "ip link add nlmon0 type nlmon; \
ip link set nlmon0 up; \
tcpdump -i nlmon0 -q -w /tmp/nlmon.pcap -U" &
sudo ip xfrm state add src 10.1.1.1 dst 10.1.1.2 proto esp \
spi 0x1 mode transport \
auth sha1 0x6162633132330000000000000000000000000000 \
enc aes 0x00000000000000000000000000000000
grep --binary abc123 /tmp/nlmon.pcap
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Kevin Cernekee [Mon, 11 Dec 2017 19:13:45 +0000 (11:13 -0800)]
net: igmp: Use correct source address on IGMPv3 reports
[ Upstream commit
a46182b00290839fa3fa159d54fd3237bd8669f0 ]
Closing a multicast socket after the final IPv4 address is deleted
from an interface can generate a membership report that uses the
source IP from a different interface. The following test script, run
from an isolated netns, reproduces the issue:
#!/bin/bash
ip link add dummy0 type dummy
ip link add dummy1 type dummy
ip link set dummy0 up
ip link set dummy1 up
ip addr add 10.1.1.1/24 dev dummy0
ip addr add 192.168.99.99/24 dev dummy1
tcpdump -U -i dummy0 &
socat EXEC:"sleep 2" \
UDP4-DATAGRAM:239.101.1.68:8889,ip-add-membership=239.0.1.68:10.1.1.1 &
sleep 1
ip addr del 10.1.1.1/24 dev dummy0
sleep 5
kill %tcpdump
RFC 3376 specifies that the report must be sent with a valid IP source
address from the destination subnet, or from address 0.0.0.0. Add an
extra check to make sure this is the case.
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 11 Dec 2017 15:03:38 +0000 (07:03 -0800)]
ipv6: mcast: better catch silly mtu values
[ Upstream commit
b9b312a7a451e9c098921856e7cfbc201120e1a7 ]
syzkaller reported crashes in IPv6 stack [1]
Xin Long found that lo MTU was set to silly values.
IPv6 stack reacts to changes to small MTU, by disabling itself under
RTNL.
But there is a window where threads not using RTNL can see a wrong
device mtu. This can lead to surprises, in mld code where it is assumed
the mtu is suitable.
Fix this by reading device mtu once and checking IPv6 minimal MTU.
[1]
skbuff: skb_over_panic: text:
0000000010b86b8d len:196 put:20
head:
000000003b477e60 data:
000000000e85441e tail:0xd4 end:0xc0 dev:lo
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.15.0-rc2-mm1+ #39
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:skb_panic+0x15c/0x1f0 net/core/skbuff.c:100
RSP: 0018:
ffff8801db307508 EFLAGS:
00010286
RAX:
0000000000000082 RBX:
ffff8801c517e840 RCX:
0000000000000000
RDX:
0000000000000082 RSI:
1ffff1003b660e61 RDI:
ffffed003b660e95
RBP:
ffff8801db307570 R08:
1ffff1003b660e23 R09:
0000000000000000
R10:
0000000000000000 R11:
0000000000000000 R12:
ffffffff85bd4020
R13:
ffffffff84754ed2 R14:
0000000000000014 R15:
ffff8801c4e26540
FS:
0000000000000000(0000) GS:
ffff8801db300000(0000) knlGS:
0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0:
0000000080050033
CR2:
0000000000463610 CR3:
00000001c6698000 CR4:
00000000001406e0
DR0:
0000000000000000 DR1:
0000000000000000 DR2:
0000000000000000
DR3:
0000000000000000 DR6:
00000000fffe0ff0 DR7:
0000000000000400
Call Trace:
<IRQ>
skb_over_panic net/core/skbuff.c:109 [inline]
skb_put+0x181/0x1c0 net/core/skbuff.c:1694
add_grhead.isra.24+0x42/0x3b0 net/ipv6/mcast.c:1695
add_grec+0xa55/0x1060 net/ipv6/mcast.c:1817
mld_send_cr net/ipv6/mcast.c:1903 [inline]
mld_ifc_timer_expire+0x4d2/0x770 net/ipv6/mcast.c:2448
call_timer_fn+0x23b/0x840 kernel/time/timer.c:1320
expire_timers kernel/time/timer.c:1357 [inline]
__run_timers+0x7e1/0xb60 kernel/time/timer.c:1660
run_timer_softirq+0x4c/0xb0 kernel/time/timer.c:1686
__do_softirq+0x29d/0xbb2 kernel/softirq.c:285
invoke_softirq kernel/softirq.c:365 [inline]
irq_exit+0x1d3/0x210 kernel/softirq.c:405
exiting_irq arch/x86/include/asm/apic.h:540 [inline]
smp_apic_timer_interrupt+0x16b/0x700 arch/x86/kernel/apic/apic.c:1052
apic_timer_interrupt+0xa9/0xb0 arch/x86/entry/entry_64.S:920
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Tested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Eric Dumazet [Mon, 11 Dec 2017 15:17:39 +0000 (07:17 -0800)]
ipv4: igmp: guard against silly MTU values
[ Upstream commit
b5476022bbada3764609368f03329ca287528dc8 ]
IPv4 stack reacts to changes to small MTU, by disabling itself under
RTNL.
But there is a window where threads not using RTNL can see a wrong
device mtu. This can lead to surprises, in igmp code where it is
assumed the mtu is suitable.
Fix this by reading device mtu once and checking IPv4 minimal MTU.
This patch adds missing IPV4_MIN_MTU define, to not abuse
ETH_MIN_MTUÂ anymore.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Linus Torvalds [Sat, 30 Dec 2017 01:34:43 +0000 (17:34 -0800)]
kbuild: add '-fno-stack-check' to kernel build options
commit
3ce120b16cc548472f80cf8644f90eda958cf1b6 upstream.
It appears that hardened gentoo enables "-fstack-check" by default for
gcc.
That doesn't work _at_all_ for the kernel, because the kernel stack
doesn't act like a user stack at all: it's much smaller, and it doesn't
auto-expand on use. So the extra "probe one page below the stack" code
generated by -fstack-check just breaks the kernel in horrible ways,
causing infinite double faults etc.
[ I have to say, that the particular code gcc generates looks very
stupid even for user space where it works, but that's a separate
issue. ]
Reported-and-tested-by: Alexander Tsoy <alexander@tsoy.me>
Reported-and-tested-by: Toralf Förster <toralf.foerster@gmx.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Mon, 9 Oct 2017 04:53:05 +0000 (21:53 -0700)]
x86/mm/64: Fix reboot interaction with CR4.PCIDE
commit
924c6b900cfdf376b07bccfd80e62b21914f8a5a upstream.
Trying to reboot via real mode fails with PCID on: long mode cannot
be exited while CR4.PCIDE is set. (No, I have no idea why, but the
SDM and actual CPUs are in agreement here.) The result is a GPF and
a hang instead of a reboot.
I didn't catch this in testing because neither my computer nor my VM
reboots this way. I can trigger it with reboot=bios, though.
Fixes:
660da7c9228f ("x86/mm: Enable CR4.PCIDE on supported systems")
Reported-and-tested-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Borislav Petkov <bp@alien8.de>
Link: https://lkml.kernel.org/r/f1e7d965998018450a7a70c2823873686a8b21c0.1507524746.git.luto@kernel.org
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Thu, 29 Jun 2017 15:53:21 +0000 (08:53 -0700)]
x86/mm: Enable CR4.PCIDE on supported systems
commit
660da7c9228f685b2ebe664f9fd69aaddcc420b5 upstream.
We can use PCID if the CPU has PCID and PGE and we're not on Xen.
By itself, this has no effect. A followup patch will start using PCID.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Nadav Amit <nadav.amit@gmail.com>
Reviewed-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Juergen Gross <jgross@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/6327ecd907b32f79d5aa0d466f04503bbec5df88.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Thu, 29 Jun 2017 15:53:20 +0000 (08:53 -0700)]
x86/mm: Add the 'nopcid' boot option to turn off PCID
commit
0790c9aad84901ca1bdc14746175549c8b5da215 upstream.
The parameter is only present on x86_64 systems to save a few bytes,
as PCID is always disabled on x86_32.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Nadav Amit <nadav.amit@gmail.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/8bbb2e65bcd249a5f18bfb8128b4689f08ac2b60.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Thu, 29 Jun 2017 15:53:19 +0000 (08:53 -0700)]
x86/mm: Disable PCID on 32-bit kernels
commit
cba4671af7550e008f7a7835f06df0763825bf3e upstream.
32-bit kernels on new hardware will see PCID in CPUID, but PCID can
only be used in 64-bit mode. Rather than making all PCID code
conditional, just disable the feature on 32-bit builds.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Reviewed-by: Nadav Amit <nadav.amit@gmail.com>
Reviewed-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/2e391769192a4d31b808410c383c6bf0734bc6ea.1498751203.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Sun, 28 May 2017 17:00:14 +0000 (10:00 -0700)]
x86/mm: Remove the UP asm/tlbflush.h code, always use the (formerly) SMP code
commit
ce4a4e565f5264909a18c733b864c3f74467f69e upstream.
The UP asm/tlbflush.h generates somewhat nicer code than the SMP version.
Aside from that, it's fallen quite a bit behind the SMP code:
- flush_tlb_mm_range() didn't flush individual pages if the range
was small.
- The lazy TLB code was much weaker. This usually wouldn't matter,
but, if a kernel thread flushed its lazy "active_mm" more than
once (due to reclaim or similar), it wouldn't be unlazied and
would instead pointlessly flush repeatedly.
- Tracepoints were missing.
Aside from that, simply having the UP code around was a maintanence
burden, since it means that any change to the TLB flush code had to
make sure not to break it.
Simplify everything by deleting the UP code.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Mon, 22 May 2017 22:30:01 +0000 (15:30 -0700)]
x86/mm: Reimplement flush_tlb_page() using flush_tlb_mm_range()
commit
ca6c99c0794875c6d1db6e22f246699691ab7e6b upstream.
flush_tlb_page() was very similar to flush_tlb_mm_range() except that
it had a couple of issues:
- It was missing an smp_mb() in the case where
current->active_mm != mm. (This is a longstanding bug reported by Nadav Amit)
- It was missing tracepoints and vm counter updates.
The only reason that I can see for keeping it at as a separate
function is that it could avoid a few branches that
flush_tlb_mm_range() needs to decide to flush just one page. This
hardly seems worthwhile. If we decide we want to get rid of those
branches again, a better way would be to introduce an
__flush_tlb_mm_range() helper and make both flush_tlb_page() and
flush_tlb_mm_range() use it.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bpetkov@suse.de>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <nadav.amit@gmail.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-mm@kvack.org
Link: http://lkml.kernel.org/r/3cc3847cf888d8907577569b8bac3f01992ef8f9.1495492063.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Sat, 22 Apr 2017 07:01:21 +0000 (00:01 -0700)]
x86/mm: Make flush_tlb_mm_range() more predictable
commit
ce27374fabf553153c3f53efcaa9bfab9216bd8c upstream.
I'm about to rewrite the function almost completely, but first I
want to get a functional change out of the way. Currently, if
flush_tlb_mm_range() does not flush the local TLB at all, it will
never do individual page flushes on remote CPUs. This seems to be
an accident, and preserving it will be awkward. Let's change it
first so that any regressions in the rewrite will be easier to
bisect and so that the rewrite can attempt to change no visible
behavior at all.
The fix is simple: we can simply avoid short-circuiting the
calculation of base_pages_to_flush.
As a side effect, this also eliminates a potential corner case: if
tlb_single_page_flush_ceiling == TLB_FLUSH_ALL, flush_tlb_mm_range()
could have ended up flushing the entire address space one page at a
time.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Dave Hansen <dave.hansen@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/4b29b771d9975aad7154c314534fec235618175a.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Sat, 22 Apr 2017 07:01:20 +0000 (00:01 -0700)]
x86/mm: Remove flush_tlb() and flush_tlb_current_task()
commit
29961b59a51f8c6838a26a45e871a7ed6771809b upstream.
I was trying to figure out what how flush_tlb_current_task() would
possibly work correctly if current->mm != current->active_mm, but I
realized I could spare myself the effort: it has no callers except
the unused flush_tlb() macro.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/e52d64c11690f85e9f1d69d7b48cc2269cd2e94b.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Andy Lutomirski [Sat, 22 Apr 2017 07:01:19 +0000 (00:01 -0700)]
x86/vm86/32: Switch to flush_tlb_mm_range() in mark_screen_rdonly()
commit
9ccee2373f0658f234727700e619df097ba57023 upstream.
mark_screen_rdonly() is the last remaining caller of flush_tlb().
flush_tlb_mm_range() is potentially faster and isn't obsolete.
Compile-tested only because I don't know whether software that uses
this mechanism even exists.
Signed-off-by: Andy Lutomirski <luto@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Nadav Amit <namit@vmware.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Sasha Levin <sasha.levin@oracle.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/791a644076fc3577ba7f7b7cafd643cc089baa7d.1492844372.git.luto@kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Hui Wang [Fri, 22 Dec 2017 03:17:45 +0000 (11:17 +0800)]
ALSA: hda - fix headset mic detection issue on a Dell machine
commit
285d5ddcffafa5d5e68c586f4c9eaa8b24a2897d upstream.
It has the codec alc256, and add its pin definition to pin quirk
table to let it apply ALC255_FIXUP_DELL1_MIC_NO_PRESENCE.
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Takashi Iwai [Fri, 22 Dec 2017 09:45:07 +0000 (10:45 +0100)]
ALSA: hda: Drop useless WARN_ON()
commit
a36c2638380c0a4676647a1f553b70b20d3ebce1 upstream.
Since the commit
97cc2ed27e5a ("ALSA: hda - Fix yet another i915
pointer leftover in error path") cleared hdac_acomp pointer, the
WARN_ON() non-NULL check in snd_hdac_i915_register_notifier() may give
a false-positive warning, as the function gets called no matter
whether the component is registered or not. For fixing it, let's get
rid of the spurious WARN_ON().
Fixes:
97cc2ed27e5a ("ALSA: hda - Fix yet another i915 pointer leftover in error path")
Reported-by: Kouta Okamoto <kouta.okamoto@toshiba.co.jp>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Johan Hovold [Mon, 13 Nov 2017 11:12:56 +0000 (12:12 +0100)]
ASoC: twl4030: fix child-node lookup
commit
15f8c5f2415bfac73f33a14bcd83422bcbfb5298 upstream.
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
To make things worse, the parent codec node was also prematurely freed,
while the child node was leaked.
Fixes:
2d6d649a2e0f ("ASoC: twl4030: Support for DT booted kernel")
Signed-off-by: Johan Hovold <johan@kernel.org>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Maciej S. Szmigiero [Mon, 20 Nov 2017 22:14:55 +0000 (23:14 +0100)]
ASoC: fsl_ssi: AC'97 ops need regmap, clock and cleaning up on failure
commit
695b78b548d8a26288f041e907ff17758df9e1d5 upstream.
AC'97 ops (register read / write) need SSI regmap and clock, so they have
to be set after them.
We also need to set these ops back to NULL if we fail the probe.
Signed-off-by: Maciej S. Szmigiero <mail@maciej.szmigiero.name>
Acked-by: Nicolin Chen <nicoleotsuka@gmail.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steve Wise [Mon, 18 Dec 2017 21:10:00 +0000 (13:10 -0800)]
iw_cxgb4: Only validate the MSN for successful completions
commit
f55688c45442bc863f40ad678c638785b26cdce6 upstream.
If the RECV CQE is in error, ignore the MSN check. This was causing
recvs that were flushed into the sw cq to be completed with the wrong
status (BAD_MSN instead of FLUSHED).
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Steven Rostedt (VMware) [Sat, 23 Dec 2017 01:32:35 +0000 (20:32 -0500)]
ring-buffer: Mask out the info bits when returning buffer page length
commit
45d8b80c2ac5d21cd1e2954431fb676bc2b1e099 upstream.
Two info bits were added to the "commit" part of the ring buffer data page
when returned to be consumed. This was to inform the user space readers that
events have been missed, and that the count may be stored at the end of the
page.
What wasn't handled, was the splice code that actually called a function to
return the length of the data in order to zero out the rest of the page
before sending it up to user space. These data bits were returned with the
length making the value negative, and that negative value was not checked.
It was compared to PAGE_SIZE, and only used if the size was less than
PAGE_SIZE. Luckily PAGE_SIZE is unsigned long which made the compare an
unsigned compare, meaning the negative size value did not end up causing a
large portion of memory to be randomly zeroed out.
Fixes:
66a8cb95ed040 ("ring-buffer: Add place holder recording of dropped events")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>