OSDN Git Service

uclinux-h8/linux.git
6 years agonet: dsa: return after vlan prepare phase
Vivien Didelot [Wed, 8 Nov 2017 15:50:10 +0000 (10:50 -0500)]
net: dsa: return after vlan prepare phase

The current code does not return after successfully preparing the VLAN
addition on every ports member of a it. Fix this.

Fixes: 1ca4aa9cd4cc ("net: dsa: check VLAN capability of every switch")
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: dsa: return after mdb prepare phase
Vivien Didelot [Wed, 8 Nov 2017 15:49:56 +0000 (10:49 -0500)]
net: dsa: return after mdb prepare phase

The current code does not return after successfully preparing the MDB
addition on every ports member of a multicast group. Fix this.

Fixes: a1a6b7ea7f2d ("net: dsa: add cross-chip multicast support")
Reported-by: Egil Hjelmeland <privat@egil-hjelmeland.no>
Signed-off-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'ceph-for-4.14-rc9' of git://github.com/ceph/ceph-client
Linus Torvalds [Fri, 10 Nov 2017 22:18:24 +0000 (14:18 -0800)]
Merge tag 'ceph-for-4.14-rc9' of git://github.com/ceph/ceph-client

Pull ceph gix from Ilya Dryomov:
 "Memory allocation flags fix, marked for stable"

* tag 'ceph-for-4.14-rc9' of git://github.com/ceph/ceph-client:
  rbd: use GFP_NOIO for parent stat and data requests

6 years agoMerge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input
Linus Torvalds [Fri, 10 Nov 2017 22:14:23 +0000 (14:14 -0800)]
Merge branch 'for-linus' of git://git./linux/kernel/git/dtor/input

Pull input layer updates from Dmitry Torokhov:

 - a new ACPI ID for Elan touchpad found in yet another Ideapad model

 - Synaptics RMI4 will allow binding to controllers reporting SMB
   version 3 (note that we are not adding any new ACPI IDs to the
   Synaptics PS/2 drover so unless user explicitly enables intertouch
   support there is no user-visible change)

 - a fixup to TSC 2004/5 touchscreen driver to mark input devices as
   "direct" to help userspace identify the type of device they are
   dealing with

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
  Input: synaptics-rmi4 - RMI4 can also use SMBUS version 3
  Input: tsc200x-core - set INPUT_PROP_DIRECT
  Input: elan_i2c - add ELAN060C to the ACPI table

6 years agoMerge remote-tracking branches 'spi/topic/sh-msiof', 'spi/topic/slave', 'spi/topic...
Mark Brown [Fri, 10 Nov 2017 21:33:51 +0000 (21:33 +0000)]
Merge remote-tracking branches 'spi/topic/sh-msiof', 'spi/topic/slave', 'spi/topic/spreadtrum' and 'spi/topic/tegra114' into spi-next

6 years agoMerge remote-tracking branches 'spi/topic/imx', 'spi/topic/mxs', 'spi/topic/orion...
Mark Brown [Fri, 10 Nov 2017 21:33:47 +0000 (21:33 +0000)]
Merge remote-tracking branches 'spi/topic/imx', 'spi/topic/mxs', 'spi/topic/orion', 'spi/topic/rspi' and 'spi/topic/s3c64xx' into spi-next

6 years agoMerge remote-tracking branches 'spi/topic/armada', 'spi/topic/axi', 'spi/topic/davinc...
Mark Brown [Fri, 10 Nov 2017 21:33:44 +0000 (21:33 +0000)]
Merge remote-tracking branches 'spi/topic/armada', 'spi/topic/axi', 'spi/topic/davinci' and 'spi/topic/fsl-dspi' into spi-next

6 years agoMerge remote-tracking branch 'spi/topic/core' into spi-next
Mark Brown [Fri, 10 Nov 2017 21:33:43 +0000 (21:33 +0000)]
Merge remote-tracking branch 'spi/topic/core' into spi-next

6 years agoMerge remote-tracking branches 'spi/fix/idr' and 'spi/fix/sh-msiof' into spi-linus
Mark Brown [Fri, 10 Nov 2017 21:33:40 +0000 (21:33 +0000)]
Merge remote-tracking branches 'spi/fix/idr' and 'spi/fix/sh-msiof' into spi-linus

6 years agoMerge remote-tracking branches 'regulator/topic/da9211', 'regulator/topic/pfuze100...
Mark Brown [Fri, 10 Nov 2017 21:33:23 +0000 (21:33 +0000)]
Merge remote-tracking branches 'regulator/topic/da9211', 'regulator/topic/pfuze100' and 'regulator/topic/tps65218' into regulator-next

6 years agoMerge remote-tracking branch 'regulator/topic/qcom-spmi' into regulator-next
Mark Brown [Fri, 10 Nov 2017 21:33:22 +0000 (21:33 +0000)]
Merge remote-tracking branch 'regulator/topic/qcom-spmi' into regulator-next

6 years agoMerge remote-tracking branch 'regulator/topic/axp20x' into regulator-next
Mark Brown [Fri, 10 Nov 2017 21:33:20 +0000 (21:33 +0000)]
Merge remote-tracking branch 'regulator/topic/axp20x' into regulator-next

6 years agoMerge remote-tracking branch 'regulator/fix/qcom-spmi' into regulator-linus
Mark Brown [Fri, 10 Nov 2017 21:33:18 +0000 (21:33 +0000)]
Merge remote-tracking branch 'regulator/fix/qcom-spmi' into regulator-linus

6 years agospi: imx: Don't require platform data chipselect array
Trent Piepho [Tue, 31 Oct 2017 19:49:06 +0000 (12:49 -0700)]
spi: imx: Don't require platform data chipselect array

If the array is not present, assume all chip selects are native.  This
is the standard behavior for SPI masters configured via the device
tree and the behavior of this driver as well when it is configured via
device tree.

This reduces platform data vs DT differences and allows most of the
platform data based boards to remove their chip select arrays.

CC: Shawn Guo <shawnguo@kernel.org>
CC: Sascha Hauer <kernel@pengutronix.de>
CC: Fabio Estevam <fabio.estevam@nxp.com>
CC: Mark Brown <broonie@kernel.org>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
6 years agospi: imx: Fix failure path leak on GPIO request error
Trent Piepho [Tue, 31 Oct 2017 19:49:05 +0000 (12:49 -0700)]
spi: imx: Fix failure path leak on GPIO request error

If the code that requests any chip select GPIOs fails, the cleanup of
spi_bitbang_start() by calling spi_bitbang_stop() is not done.  Add this
to the failure path.

Note that spi_bitbang_start() has to be called before requesting GPIOs
because the GPIO data in the spi master is populated when the master is
registed, and that doesn't happen until spi_bitbang_start() is called.

CC: Shawn Guo <shawnguo@kernel.org>
CC: Sascha Hauer <kernel@pengutronix.de>
CC: Fabio Estevam <fabio.estevam@nxp.com>
CC: Mark Brown <broonie@kernel.org>
CC: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
6 years agospi: imx: GPIO based chip selects should not be required
Trent Piepho [Tue, 31 Oct 2017 19:49:04 +0000 (12:49 -0700)]
spi: imx: GPIO based chip selects should not be required

The driver will fail to load if no gpio chip selects are specified,
this patch changes this so that it no longer fails.

It's possible to use all native chip selects, in which case there is
no reason to have a gpio chip select array.  This is what happens if
the *optional* device tree property "cs-gpios" is omitted.

The spi core already checks for the absence of gpio chip selects in
the master and assigns any slaves the gpio_cs value of -ENOENT.

Also have the driver respect the standard SPI device tree property "num-cs"
to allow setting the number of chip selects without using cs-gpios.

CC: Mark Brown <broonie@kernel.org>
CC: Shawn Guo <shawnguo@kernel.org>
CC: Sascha Hauer <kernel@pengutronix.de>
CC: Fabio Estevam <fabio.estevam@nxp.com>
CC: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Trent Piepho <tpiepho@impinj.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
6 years agoMerge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Linus Torvalds [Fri, 10 Nov 2017 20:24:42 +0000 (12:24 -0800)]
Merge tag 'for-linus' of git://git./virt/kvm/kvm

Pull KVM fix from Radim Krčmář:
 "Fix PPC HV host crash that can occur as a result of resizing the guest
  hashed page table"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: PPC: Book3S HV: Fix exclusion between HPT resizing and other HPT updates

6 years agoMerge tag 'mips_fixes_4.14_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan...
Linus Torvalds [Fri, 10 Nov 2017 20:21:15 +0000 (12:21 -0800)]
Merge tag 'mips_fixes_4.14_2' of git://git./linux/kernel/git/jhogan/mips

Pull MIPS fixes from James Hogan:
 "A final few MIPS fixes for 4.14:

   - fix BMIPS NULL pointer dereference (4.7)

   - fix AR7 early GPIO init allocation failure (3.19)

   - fix dead serial output on certain AR7 platforms (2.6.35)"

* tag 'mips_fixes_4.14_2' of git://git.kernel.org/pub/scm/linux/kernel/git/jhogan/mips:
  MIPS: AR7: Ensure that serial ports are properly set up
  MIPS: AR7: Defer registration of GPIO
  MIPS: BMIPS: Fix missing cbr address

6 years ago.mailmap: Add Maciej W. Rozycki's Imagination e-mail address
Maciej W. Rozycki [Fri, 10 Nov 2017 20:05:24 +0000 (20:05 +0000)]
.mailmap: Add Maciej W. Rozycki's Imagination e-mail address

Following my recent transition from Imagination Technologies to the=20
reincarnated MIPS company add a .mailmap mapping for my work address,
so that `scripts/get_maintainer.pl' gets it right for past commits.

Signed-off-by: Maciej W. Rozycki <macro@mips.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoRevert "x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo"
Linus Torvalds [Fri, 10 Nov 2017 19:19:11 +0000 (11:19 -0800)]
Revert "x86: CPU: Fix up "cpu MHz" in /proc/cpuinfo"

This reverts commit 941f5f0f6ef5338814145cf2b813cf1f98873e2f.

Sadly, it turns out that we really can't just do the cross-CPU IPI to
all CPU's to get their proper frequencies, because it's much too
expensive on systems with lots of cores.

So we'll have to revert this for now, and revisit it using a smarter
model (probably doing one system-wide IPI at open time, and doing all
the frequency calculations in parallel).

Reported-by: WANG Chao <chao.wang@ucloud.cn>
Reported-by: Ingo Molnar <mingo@kernel.org>
Cc: Rafael J Wysocki <rafael.j.wysocki@intel.com>
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge tag 'drm-fixes-for-v4.14-rc9' of git://people.freedesktop.org/~airlied/linux
Linus Torvalds [Fri, 10 Nov 2017 17:59:41 +0000 (09:59 -0800)]
Merge tag 'drm-fixes-for-v4.14-rc9' of git://people.freedesktop.org/~airlied/linux

Pull drm fixes from Dave Airlie:
 "Last few patches to wrap up.

  Two i915 fixes that are on their way to stable, one vmware black
  screen bug, and one const patch that I was going to drop, but it was
  clearly a pretty safe one liner"

* tag 'drm-fixes-for-v4.14-rc9' of git://people.freedesktop.org/~airlied/linux:
  drm/i915: Deconstruct struct sgt_dma initialiser
  drm/i915: Reject unknown syncobj flags
  drm/vmwgfx: Fix Ubuntu 17.10 Wayland black screen issue
  drm/vmwgfx: constify vmw_fence_ops

6 years agoMAINTAINERS: add virtio-ccw.h to virtio/s390 section
Cornelia Huck [Fri, 10 Nov 2017 14:56:11 +0000 (15:56 +0100)]
MAINTAINERS: add virtio-ccw.h to virtio/s390 section

The file arch/s390/include/uapi/asm/virtio-ccw.h belongs to the
s390 virtio drivers as well.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agos390/noexec: execute kexec datamover without DAT
Heiko Carstens [Thu, 9 Nov 2017 22:00:14 +0000 (23:00 +0100)]
s390/noexec: execute kexec datamover without DAT

Rebooting into a new kernel with kexec fails (system dies) if tried on
a machine that has no-execute support. Reason for this is that the so
called datamover code gets executed with DAT on (MMU is active) and
the page that contains the datamover is marked as non-executable.
Therefore when branching into the datamover an unexpected program
check happens and afterwards the machine is dead.

This can be simply avoided by disabling DAT, which also disables any
no-execute checks, just before the datamover gets executed.

In fact the first thing done by the datamover is to disable DAT. The
code in the datamover that disables DAT can be removed as well.

Thanks to Michael Holzheu and Gerald Schaefer for tracking this down.

Reviewed-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Reviewed-by: Philipp Rudo <prudo@linux.vnet.ibm.com>
Cc: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Fixes: 57d7f939e7bd ("s390: add no-execute support")
Cc: <stable@vger.kernel.org> # v4.11+
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agos390: fix transactional execution control register handling
Heiko Carstens [Thu, 9 Nov 2017 11:29:34 +0000 (12:29 +0100)]
s390: fix transactional execution control register handling

Dan Horák reported the following crash related to transactional execution:

User process fault: interruption code 0013 ilc:3 in libpthread-2.26.so[3ff93c00000+1b000]
CPU: 2 PID: 1 Comm: /init Not tainted 4.13.4-300.fc27.s390x #1
Hardware name: IBM 2827 H43 400 (z/VM 6.4.0)
task: 00000000fafc8000 task.stack: 00000000fafc4000
User PSW : 0705200180000000 000003ff93c14e70
           R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:1 AS:0 CC:2 PM:0 RI:0 EA:3
User GPRS: 0000000000000077 000003ff00000000 000003ff93144d48 000003ff93144d5e
           0000000000000000 0000000000000002 0000000000000000 000003ff00000000
           0000000000000000 0000000000000418 0000000000000000 000003ffcc9fe770
           000003ff93d28f50 000003ff9310acf0 000003ff92b0319a 000003ffcc9fe6d0
User Code: 000003ff93c14e6260e0b030            std     %f14,48(%r11)
           000003ff93c14e6660f0b038            std     %f15,56(%r11)
          #000003ff93c14e6ae5600000ff0e        tbegin  0,65294
          >000003ff93c14e70a7740006            brc     7,3ff93c14e7c
           000003ff93c14e74a7080000            lhi     %r0,0
           000003ff93c14e78a7f40023            brc     15,3ff93c14ebe
           000003ff93c14e7cb2220000            ipm     %r0
           000003ff93c14e808800001c            srl     %r0,28

There are several bugs with control register handling with respect to
transactional execution:

- on task switch update_per_regs() is only called if the next task has
  an mm (is not a kernel thread). This however is incorrect. This
  breaks e.g. for user mode helper handling, where the kernel creates
  a kernel thread and then execve's a user space program. Control
  register contents related to transactional execution won't be
  updated on execve. If the previous task ran with transactional
  execution disabled then the new task will also run with
  transactional execution disabled, which is incorrect. Therefore call
  update_per_regs() unconditionally within switch_to().

- on startup the transactional execution facility is not enabled for
  the idle thread. This is not really a bug, but an inconsistency to
  other facilities. Therefore enable the facility if it is available.

- on fork the new thread's per_flags field is not cleared. This means
  that a child process inherits the PER_FLAG_NO_TE flag. This flag can
  be set with a ptrace request to disable transactional execution for
  the current process. It should not be inherited by new child
  processes in order to be consistent with the handling of all other
  PER related debugging options. Therefore clear the per_flags field in
  copy_thread_tls().

Reported-and-tested-by: Dan Horák <dan@danny.cz>
Fixes: d35339a42dd1 ("s390: add support for transactional memory")
Cc: <stable@vger.kernel.org> # v3.7+
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agos390/bpf: take advantage of stack_depth tracking
Michael Holzheu [Tue, 7 Nov 2017 18:16:25 +0000 (19:16 +0100)]
s390/bpf: take advantage of stack_depth tracking

Make use of the "stack_depth" tracking feature introduced with
commit 8726679a0fa31 ("bpf: teach verifier to track stack depth") for the
s390 JIT, so that stack usage can be reduced.

Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agoMerge tag 'vfio-ccw-20171109' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms39...
Heiko Carstens [Fri, 10 Nov 2017 17:54:11 +0000 (18:54 +0100)]
Merge tag 'vfio-ccw-20171109' of git://git./linux/kernel/git/kvms390/vfio-ccw into features

Pull vfio-ccw update from Cornelia Huck:
"A vfio-ccw bugfix: avoid freeing that which should not be freed."

6 years agolocking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
Michael S. Tsirkin [Fri, 27 Oct 2017 16:14:31 +0000 (19:14 +0300)]
locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE

MFENCE appears to be way slower than a locked instruction - let's use
LOCK ADD unconditionally, as we always did on old 32-bit.

Performance testing results:

  perf stat -r 10 -- ./virtio_ring_0_9 --sleep --host-affinity 0 --guest-affinity 0
  Before:
         0.922565990 seconds time elapsed                                          ( +-  1.15% )
  After:
         0.578667024 seconds time elapsed                                          ( +-  1.21% )

i.e. about ~60% faster.

Just poking at SP would be the most natural, but if we then read the
value from SP, we get a false dependency which will slow us down.

This was noted in this article:

  http://shipilev.net/blog/2014/on-the-fence-with-dependencies/

And is easy to reproduce by sticking a barrier in a small non-inline
function.

So let's use a negative offset - which avoids this problem since we
build with the red zone disabled.

For userspace, use an address just below the redzone.

The one difference between LOCK ADD and MFENCE is that LOCK ADD does
not affect CLFLUSH, previous patches converted all uses of CLFLUSH to
call mb(), such that changes to smp_mb() won't affect it.

Update mb/rmb/wmb() on 32-bit to use the negative offset, too, for
consistency.

As a follow-up, it might be worth considering switching users
of CLFLUSH to another API (e.g. clflush_mb()?) - we will
then be able to convert mb() to smp_mb() again.

Also arguably, GCC should switch to use LOCK ADD for __sync_synchronize().
This might be worth pursuing separately.

Suggested-by: Andy Lutomirski <luto@amacapital.net>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: qemu-devel@nongnu.org
Cc: virtualization@lists.linux-foundation.org
Link: http://lkml.kernel.org/r/1509118355-4890-1-git-send-email-mst@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoregulator: tps65218: remove unused tps_info structure
Keerthy [Fri, 10 Nov 2017 11:52:44 +0000 (17:22 +0530)]
regulator: tps65218: remove unused tps_info structure

remove unused tps_info structure.

Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
6 years agoregulator: tps65218: Fix strobe assignment
Keerthy [Fri, 10 Nov 2017 11:52:43 +0000 (17:22 +0530)]
regulator: tps65218: Fix strobe assignment

Currentlly tps_info structure is no longer used. So use the
strobes parameter in tps65218 structure to capture the info.

Fixes: 2dc4940360d4c0c (regulator: tps65218: Remove all the compatibles)
Signed-off-by: Keerthy <j-keerthy@ti.com>
Signed-off-by: Mark Brown <broonie@kernel.org>
6 years agocan: ifi: Fix transmitter delay calculation
Marek Vasut [Fri, 10 Nov 2017 10:22:39 +0000 (11:22 +0100)]
can: ifi: Fix transmitter delay calculation

The CANFD transmitter delay calculation formula was updated in the
latest software drop from IFI and improves the behavior of the IFI
CANFD core during bitrate switching. Use the new formula to improve
stability of the CANFD operation.

Signed-off-by: Marek Vasut <marex@denx.de>
Cc: Markus Marb <markus@marb.org>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agotcp: fix tcp_fastretrans_alert warning
Yuchung Cheng [Tue, 7 Nov 2017 23:33:43 +0000 (15:33 -0800)]
tcp: fix tcp_fastretrans_alert warning

This patch fixes the cause of an WARNING indicatng TCP has pending
retransmission in Open state in tcp_fastretrans_alert().

The root cause is a bad interaction between path mtu probing,
if enabled, and the RACK loss detection. Upong receiving a SACK
above the sequence of the MTU probing packet, RACK could mark the
probe packet lost in tcp_fastretrans_alert(), prior to calling
tcp_simple_retransmit().

tcp_simple_retransmit() only enters Loss state if it newly marks
the probe packet lost. If the probe packet is already identified as
lost by RACK, the sender remains in Open state with some packets
marked lost and retransmitted. Then the next SACK would trigger
the warning. The likely scenario is that the probe packet was
lost due to its size or network congestion. The actual impact of
this warning is small by potentially entering fast recovery an
ACK later.

The simple fix is always entering recovery (Loss) state if some
packet is marked lost during path MTU probing.

Fixes: a0370b3f3f2c ("tcp: enable RACK loss detection to trigger recovery")
Reported-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Reported-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Reported-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotcp: gso: avoid refcount_t warning from tcp_gso_segment()
Eric Dumazet [Tue, 7 Nov 2017 23:15:04 +0000 (15:15 -0800)]
tcp: gso: avoid refcount_t warning from tcp_gso_segment()

When a GSO skb of truesize O is segmented into 2 new skbs of truesize N1
and N2, we want to transfer socket ownership to the new fresh skbs.

In order to avoid expensive atomic operations on a cache line subject to
cache bouncing, we replace the sequence :

refcount_add(N1, &sk->sk_wmem_alloc);
refcount_add(N2, &sk->sk_wmem_alloc); // repeated by number of segments

refcount_sub(O, &sk->sk_wmem_alloc);

by a single

refcount_add(sum_of(N) - O, &sk->sk_wmem_alloc);

Problem is :

In some pathological cases, sum(N) - O might be a negative number, and
syzkaller bot was apparently able to trigger this trace [1]

atomic_t was ok with this construct, but we need to take care of the
negative delta with refcount_t

[1]
refcount_t: saturated; leaking memory.
------------[ cut here ]------------
WARNING: CPU: 0 PID: 8404 at lib/refcount.c:77 refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 8404 Comm: syz-executor2 Not tainted 4.14.0-rc5-mm1+ #20
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:16 [inline]
 dump_stack+0x194/0x257 lib/dump_stack.c:52
 panic+0x1e4/0x41c kernel/panic.c:183
 __warn+0x1c4/0x1e0 kernel/panic.c:546
 report_bug+0x211/0x2d0 lib/bug.c:183
 fixup_bug+0x40/0x90 arch/x86/kernel/traps.c:177
 do_trap_no_signal arch/x86/kernel/traps.c:211 [inline]
 do_trap+0x260/0x390 arch/x86/kernel/traps.c:260
 do_error_trap+0x120/0x390 arch/x86/kernel/traps.c:297
 do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:310
 invalid_op+0x18/0x20 arch/x86/entry/entry_64.S:905
RIP: 0010:refcount_add_not_zero+0x198/0x200 lib/refcount.c:77
RSP: 0018:ffff8801c606e3a0 EFLAGS: 00010282
RAX: 0000000000000026 RBX: 0000000000001401 RCX: 0000000000000000
RDX: 0000000000000026 RSI: ffffc900036fc000 RDI: ffffed0038c0dc68
RBP: ffff8801c606e430 R08: 0000000000000001 R09: 0000000000000000
R10: ffff8801d97f5eba R11: 0000000000000000 R12: ffff8801d5acf73c
R13: 1ffff10038c0dc75 R14: 00000000ffffffff R15: 00000000fffff72f
 refcount_add+0x1b/0x60 lib/refcount.c:101
 tcp_gso_segment+0x10d0/0x16b0 net/ipv4/tcp_offload.c:155
 tcp4_gso_segment+0xd4/0x310 net/ipv4/tcp_offload.c:51
 inet_gso_segment+0x60c/0x11c0 net/ipv4/af_inet.c:1271
 skb_mac_gso_segment+0x33f/0x660 net/core/dev.c:2749
 __skb_gso_segment+0x35f/0x7f0 net/core/dev.c:2821
 skb_gso_segment include/linux/netdevice.h:3971 [inline]
 validate_xmit_skb+0x4ba/0xb20 net/core/dev.c:3074
 __dev_queue_xmit+0xe49/0x2070 net/core/dev.c:3497
 dev_queue_xmit+0x17/0x20 net/core/dev.c:3538
 neigh_hh_output include/net/neighbour.h:471 [inline]
 neigh_output include/net/neighbour.h:479 [inline]
 ip_finish_output2+0xece/0x1460 net/ipv4/ip_output.c:229
 ip_finish_output+0x85e/0xd10 net/ipv4/ip_output.c:317
 NF_HOOK_COND include/linux/netfilter.h:238 [inline]
 ip_output+0x1cc/0x860 net/ipv4/ip_output.c:405
 dst_output include/net/dst.h:459 [inline]
 ip_local_out+0x95/0x160 net/ipv4/ip_output.c:124
 ip_queue_xmit+0x8c6/0x18e0 net/ipv4/ip_output.c:504
 tcp_transmit_skb+0x1ab7/0x3840 net/ipv4/tcp_output.c:1137
 tcp_write_xmit+0x663/0x4de0 net/ipv4/tcp_output.c:2341
 __tcp_push_pending_frames+0xa0/0x250 net/ipv4/tcp_output.c:2513
 tcp_push_pending_frames include/net/tcp.h:1722 [inline]
 tcp_data_snd_check net/ipv4/tcp_input.c:5050 [inline]
 tcp_rcv_established+0x8c7/0x18a0 net/ipv4/tcp_input.c:5497
 tcp_v4_do_rcv+0x2ab/0x7d0 net/ipv4/tcp_ipv4.c:1460
 sk_backlog_rcv include/net/sock.h:909 [inline]
 __release_sock+0x124/0x360 net/core/sock.c:2264
 release_sock+0xa4/0x2a0 net/core/sock.c:2776
 tcp_sendmsg+0x3a/0x50 net/ipv4/tcp.c:1462
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
 sock_sendmsg_nosec net/socket.c:632 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:642
 ___sys_sendmsg+0x31c/0x890 net/socket.c:2048
 __sys_sendmmsg+0x1e6/0x5f0 net/socket.c:2138

Fixes: 14afee4b6092 ("net: convert sock.sk_wmem_alloc from atomic_t to refcount_t")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agox86/virt/xen: Use guest_late_init to detect Xen PVH guest
Juergen Gross [Thu, 9 Nov 2017 13:27:39 +0000 (14:27 +0100)]
x86/virt/xen: Use guest_late_init to detect Xen PVH guest

In case we are booted via the default boot entry by a generic loader
like grub or OVMF it is necessary to distinguish between a HVM guest
with a device model supporting legacy devices and a PVH guest without
device model.

PVH guests will always have x86_platform.legacy.no_vga set and
x86_platform.legacy.rtc cleared, while both won't be true for HVM
guests.

Test for both conditions in the guest_late_init hook and set xen_pvh
to true if they are met.

Move some of the early PVH initializations to the new hook in order
to avoid duplicated code.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: boris.ostrovsky@oracle.com
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-6-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/virt, x86/platform: Add ->guest_late_init() callback to hypervisor_x86 structure
Juergen Gross [Thu, 9 Nov 2017 13:27:38 +0000 (14:27 +0100)]
x86/virt, x86/platform: Add ->guest_late_init() callback to hypervisor_x86 structure

Add a new guest_late_init callback to the hypervisor_x86 structure. It
will replace the current kvm_guest_init() call which is changed to
make use of the new callback.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kvm@vger.kernel.org
Cc: rkrcmar@redhat.com
Link: http://lkml.kernel.org/r/20171109132739.23465-5-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/virt, x86/acpi: Add test for ACPI_FADT_NO_VGA
Juergen Gross [Thu, 9 Nov 2017 13:27:37 +0000 (14:27 +0100)]
x86/virt, x86/acpi: Add test for ACPI_FADT_NO_VGA

Add a test for ACPI_FADT_NO_VGA when scanning the FADT and set the new
flag x86_platform.legacy.no_vga accordingly.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: len.brown@intel.com
Cc: linux-pm@vger.kernel.org
Cc: pavel@ucw.cz
Cc: rjw@rjwysocki.net
Link: http://lkml.kernel.org/r/20171109132739.23465-4-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/virt: Add enum for hypervisors to replace x86_hyper
Juergen Gross [Thu, 9 Nov 2017 13:27:36 +0000 (14:27 +0100)]
x86/virt: Add enum for hypervisors to replace x86_hyper

The x86_hyper pointer is only used for checking whether a virtual
device is supporting the hypervisor the system is running on.

Use an enum for that purpose instead and drop the x86_hyper pointer.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Xavier Deguillard <xdeguillard@vmware.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akataria@vmware.com
Cc: arnd@arndb.de
Cc: boris.ostrovsky@oracle.com
Cc: devel@linuxdriverproject.org
Cc: dmitry.torokhov@gmail.com
Cc: gregkh@linuxfoundation.org
Cc: haiyangz@microsoft.com
Cc: kvm@vger.kernel.org
Cc: kys@microsoft.com
Cc: linux-graphics-maintainer@vmware.com
Cc: linux-input@vger.kernel.org
Cc: moltmann@vmware.com
Cc: pbonzini@redhat.com
Cc: pv-drivers@vmware.com
Cc: rkrcmar@redhat.com
Cc: sthemmin@microsoft.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-3-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and ...
Juergen Gross [Thu, 9 Nov 2017 13:27:35 +0000 (14:27 +0100)]
x86/virt, x86/platform: Merge 'struct x86_hyper' into 'struct x86_platform' and 'struct x86_init'

Instead of x86_hyper being either NULL on bare metal or a pointer to a
struct hypervisor_x86 in case of the kernel running as a guest merge
the struct into x86_platform and x86_init.

This will remove the need for wrappers making it hard to find out what
is being called. With dummy functions added for all callbacks testing
for a NULL function pointer can be removed, too.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: akataria@vmware.com
Cc: boris.ostrovsky@oracle.com
Cc: devel@linuxdriverproject.org
Cc: haiyangz@microsoft.com
Cc: kvm@vger.kernel.org
Cc: kys@microsoft.com
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: rusty@rustcorp.com.au
Cc: sthemmin@microsoft.com
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/20171109132739.23465-2-jgross@suse.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/tsc: Mark cyc2ns_init() and detect_art() __init
Dou Liyang [Wed, 8 Nov 2017 10:09:52 +0000 (18:09 +0800)]
x86/tsc: Mark cyc2ns_init() and detect_art() __init

These two functions are only called by tsc_init(), which is an __init
function during boot time, so mark them __init as well.

Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1510135792-17429-1-git-send-email-douly.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agocan: peak: Add support for new PCIe/M2 CAN FD interfaces
Stephane Grosjean [Thu, 9 Nov 2017 13:42:14 +0000 (14:42 +0100)]
can: peak: Add support for new PCIe/M2 CAN FD interfaces

This adds support for the following PEAK-System CAN FD interfaces:

PCAN-cPCIe FD         CAN FD Interface for cPCI Serial (2 or 4 channels)
PCAN-PCIe/104-Express CAN FD Interface for PCIe/104-Express (1, 2 or 4 ch.)
PCAN-miniPCIe FD      CAN FD Interface for PCIe Mini (1, 2 or 4 channels)
PCAN-PCIe FD OEM      CAN FD Interface for PCIe OEM version (1, 2 or 4 ch.)
PCAN-M.2              CAN FD Interface for M.2 (1 or 2 channels)

Like the PCAN-PCIe FD interface, all of these boards run the same IP Core
that is able to handle CAN FD (see also http://www.peak-system.com).

Signed-off-by: Stephane Grosjean <s.grosjean@peak-system.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: sun4i: handle overrun in RX FIFO
Gerhard Bertelsmann [Mon, 6 Nov 2017 17:16:56 +0000 (18:16 +0100)]
can: sun4i: handle overrun in RX FIFO

SUN4Is CAN IP has a 64 byte deep FIFO buffer. If the buffer is not
drained fast enough (overrun) it's getting mangled. Already received
frames are dropped - the data can't be restored.

Signed-off-by: Gerhard Bertelsmann <info@gerhard-bertelsmann.de>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agocan: c_can: don't indicate triple sampling support for D_CAN
Richard Schütz [Sun, 29 Oct 2017 12:03:22 +0000 (13:03 +0100)]
can: c_can: don't indicate triple sampling support for D_CAN

The D_CAN controller doesn't provide a triple sampling mode, so don't set
the CAN_CTRLMODE_3_SAMPLES flag in ctrlmode_supported. Currently enabling
triple sampling is a no-op.

Signed-off-by: Richard Schütz <rschuetz@uni-koblenz.de>
Cc: linux-stable <stable@vger.kernel.org> # >= v3.6
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
6 years agoMerge branch 'linus' into x86/platform, to refresh the branch
Ingo Molnar [Fri, 10 Nov 2017 07:21:08 +0000 (08:21 +0100)]
Merge branch 'linus' into x86/platform, to refresh the branch

Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoMerge branch 'linus' into x86/asm, to resolve conflict
Ingo Molnar [Fri, 10 Nov 2017 07:06:47 +0000 (08:06 +0100)]
Merge branch 'linus' into x86/asm, to resolve conflict

 Conflicts:
arch/x86/mm/mem_encrypt.c

Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoMerge branch 'x86/mm' into x86/asm, to merge branches
Ingo Molnar [Fri, 10 Nov 2017 07:05:30 +0000 (08:05 +0100)]
Merge branch 'x86/mm' into x86/asm, to merge branches

Most of x86/mm is already in x86/asm, so merge the rest too.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/debug: Handle warnings before the notifier chain, to fix KGDB crash
Alexander Shishkin [Mon, 24 Jul 2017 10:04:28 +0000 (13:04 +0300)]
x86/debug: Handle warnings before the notifier chain, to fix KGDB crash

Commit:

  9a93848fe787 ("x86/debug: Implement __WARN() using UD0")

turned warnings into UD0, but the fixup code only runs after the
notify_die() chain. This is a problem, in particular, with kgdb,
which kicks in as if it was a BUG().

Fix this by running the fixup code before the notifier chain in
the invalid op handler path.

Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Tested-by: Ilya Dryomov <idryomov@gmail.com>
Acked-by: Daniel Thompson <daniel.thompson@linaro.org>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Wessel <jason.wessel@windriver.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Richard Weinberger <richard.weinberger@gmail.com>
Cc: <stable@vger.kernel.org> # v4.12+
Link: http://lkml.kernel.org/r/20170724100428.19173-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agonet/mlx5e: Increase Striding RQ minimum size limit to 4 multi-packet WQEs
Eugenia Emantayev [Thu, 12 Jan 2017 15:11:45 +0000 (17:11 +0200)]
net/mlx5e: Increase Striding RQ minimum size limit to 4 multi-packet WQEs

This is to prevent the case of working with a single MPWQE
(1 WQE is always reserved as RQ is linked-list).
When the WQE is fully consumed, HW should still have available buffer
in order not to drop packets.

Fixes: 461017cb006a ("net/mlx5e: Support RX multi-packet WQE (Striding RQ)")
Signed-off-by: Eugenia Emantayev <eugenia@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Set page to null in case dma mapping fails
Inbar Karmy [Sun, 15 Oct 2017 14:30:59 +0000 (17:30 +0300)]
net/mlx5e: Set page to null in case dma mapping fails

Currently, when dma mapping fails, put_page is called,
but the page is not set to null. Later, in the page_reuse treatment in
mlx5e_free_rx_descs(), mlx5e_page_release() is called for the second time,
improperly doing dma_unmap (for a non-mapped address) and an extra put_page.
Prevent this by nullifying the page pointer when dma_map fails.

Fixes: accd58833237 ("net/mlx5e: Introduce RX Page-Reuse")
Signed-off-by: Inbar Karmy <inbark@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5e: Fix napi poll with zero budget
Saeed Mahameed [Tue, 31 Oct 2017 22:34:00 +0000 (15:34 -0700)]
net/mlx5e: Fix napi poll with zero budget

napi->poll can be called with budget 0, e.g. in netpoll scenarios
where the caller only wants to poll TX rings
(poll_one_napi@net/core/netpoll.c).

The below commit changed RX polling from "while" loop to "do {} while",
which caused to ignore the initial budget and handle at least one RX
packet.

This fixes the following warning:
[ 2852.049194] mlx5e_napi_poll+0x0/0x260 [mlx5_core] exceeded budget in poll
[ 2852.049195] ------------[ cut here ]------------
[ 2852.049195] WARNING: CPU: 0 PID: 25691 at net/core/netpoll.c:171 netpoll_poll_dev+0x18a/0x1a0

Fixes: 4b7dfc992514 ("net/mlx5e: Early-return on empty completion queues")
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
Reviewed-by: Tariq Toukan <tariqt@mellanox.com>
Reported-by: Martin KaFai Lau <kafai@fb.com>
Tested-by: Martin KaFai Lau <kafai@fb.com>
Cc: kernel-team@fb.com
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Cancel health poll before sending panic teardown command
Huy Nguyen [Tue, 26 Sep 2017 20:11:56 +0000 (15:11 -0500)]
net/mlx5: Cancel health poll before sending panic teardown command

After the panic teardown firmware command, health_care detects the error
in PCI bus and calls the mlx5_pci_err_detected. This health_care flow is
no longer needed because the panic teardown firmware command will bring
down the PCI bus communication with the HCA.

The solution is to cancel the health care timer and its pending
workqueue request before sending panic teardown firmware command.

Kernel trace:
mlx5_core 0033:01:00.0: Shutdown was called
mlx5_core 0033:01:00.0: health_care:154:(pid 9304): handling bad device here
mlx5_core 0033:01:00.0: mlx5_handle_bad_state:114:(pid 9304): NIC state 1
mlx5_core 0033:01:00.0: mlx5_pci_err_detected was called
mlx5_core 0033:01:00.0: mlx5_enter_error_state:96:(pid 9304): start
mlx5_3:mlx5_ib_event:3061:(pid 9304): warning: event on port 0
mlx5_core 0033:01:00.0: mlx5_enter_error_state:104:(pid 9304): end
Unable to handle kernel paging request for data at address 0x0000003f
Faulting instruction address: 0xc0080000434b8c80

Fixes: 8812c24d28f4 ('net/mlx5: Add fast unload support in shutdown flow')
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agonet/mlx5: Loop over temp list to release delay events
Huy Nguyen [Mon, 30 Oct 2017 03:40:56 +0000 (22:40 -0500)]
net/mlx5: Loop over temp list to release delay events

list_splice_init initializing waiting_events_list after splicing it to
temp list, therefore we should loop over temp list to fire the events.

Fixes: 4ca637a20a52 ("net/mlx5: Delay events till mlx5 interface's add complete for pci resume")
Signed-off-by: Huy Nguyen <huyn@mellanox.com>
Signed-off-by: Feras Daoud <ferasda@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
6 years agords: ib: Fix NULL pointer dereference in debug code
Håkon Bugge [Tue, 7 Nov 2017 15:33:34 +0000 (16:33 +0100)]
rds: ib: Fix NULL pointer dereference in debug code

rds_ib_recv_refill() is a function that refills an IB receive
queue. It can be called from both the CQE handler (tasklet) and a
worker thread.

Just after the call to ib_post_recv(), a debug message is printed with
rdsdebug():

            ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
            rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
                     recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
                     (long) ib_sg_dma_address(
                            ic->i_cm_id->device,
                            &recv->r_frag->f_sg),
                    ret);

Now consider an invocation of rds_ib_recv_refill() from the worker
thread, which is preemptible. Further, assume that the worker thread
is preempted between the ib_post_recv() and rdsdebug() statements.

Then, if the preemption is due to a receive CQE event, the
rds_ib_recv_cqe_handler() will be invoked. This function processes
receive completions, including freeing up data structures, such as the
recv->r_frag.

In this scenario, rds_ib_recv_cqe_handler() will process the receive
WR posted above. That implies, that the recv->r_frag has been freed
before the above rdsdebug() statement has been executed. When it is
later executed, we will have a NULL pointer dereference:

[ 4088.068008] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[ 4088.076754] IP: rds_ib_recv_refill+0x87/0x620 [rds_rdma]
[ 4088.082686] PGD 0 P4D 0
[ 4088.085515] Oops: 0000 [#1] SMP
[ 4088.089015] Modules linked in: rds_rdma(OE) rds(OE) rpcsec_gss_krb5(E) nfsv4(E) dns_resolver(E) nfs(E) fscache(E) mlx4_ib(E) ib_ipoib(E) rdma_ucm(E) ib_ucm(E) ib_uverbs(E) ib_umad(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_core(E) binfmt_misc(E) sb_edac(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) crc32_pclmul(E) ghash_clmulni_intel(E) pcbc(E) aesni_intel(E) crypto_simd(E) iTCO_wdt(E) glue_helper(E) iTCO_vendor_support(E) sg(E) cryptd(E) pcspkr(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) shpchp(E) ioatdma(E) i2c_i801(E) wmi(E) lpc_ich(E) mei_me(E) mei(E) mfd_core(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) ip_tables(E) ext4(E) mbcache(E) jbd2(E) fscrypto(E) mgag200(E) i2c_algo_bit(E) drm_kms_helper(E) syscopyarea(E) sysfillrect(E) sysimgblt(E)
[ 4088.168486]  fb_sys_fops(E) ahci(E) ixgbe(E) libahci(E) ttm(E) mdio(E) ptp(E) pps_core(E) drm(E) sd_mod(E) libata(E) crc32c_intel(E) mlx4_core(E) i2c_core(E) dca(E) megaraid_sas(E) dm_mirror(E) dm_region_hash(E) dm_log(E) dm_mod(E) [last unloaded: rds]
[ 4088.193442] CPU: 20 PID: 1244 Comm: kworker/20:2 Tainted: G           OE   4.14.0-rc7.master.20171105.ol7.x86_64 #1
[ 4088.205097] Hardware name: Oracle Corporation ORACLE SERVER X5-2L/ASM,MOBO TRAY,2U, BIOS 31110000 03/03/2017
[ 4088.216074] Workqueue: ib_cm cm_work_handler [ib_cm]
[ 4088.221614] task: ffff885fa11d0000 task.stack: ffffc9000e598000
[ 4088.228224] RIP: 0010:rds_ib_recv_refill+0x87/0x620 [rds_rdma]
[ 4088.234736] RSP: 0018:ffffc9000e59bb68 EFLAGS: 00010286
[ 4088.240568] RAX: 0000000000000000 RBX: ffffc9002115d050 RCX: ffffc9002115d050
[ 4088.248535] RDX: ffffffffa0521380 RSI: ffffffffa0522158 RDI: ffffffffa0525580
[ 4088.256498] RBP: ffffc9000e59bbf8 R08: 0000000000000005 R09: 0000000000000000
[ 4088.264465] R10: 0000000000000339 R11: 0000000000000001 R12: 0000000000000000
[ 4088.272433] R13: ffff885f8c9d8000 R14: ffffffff81a0a060 R15: ffff884676268000
[ 4088.280397] FS:  0000000000000000(0000) GS:ffff885fbec80000(0000) knlGS:0000000000000000
[ 4088.289434] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4088.295846] CR2: 0000000000000020 CR3: 0000000001e09005 CR4: 00000000001606e0
[ 4088.303816] Call Trace:
[ 4088.306557]  rds_ib_cm_connect_complete+0xe0/0x220 [rds_rdma]
[ 4088.312982]  ? __dynamic_pr_debug+0x8c/0xb0
[ 4088.317664]  ? __queue_work+0x142/0x3c0
[ 4088.321944]  rds_rdma_cm_event_handler+0x19e/0x250 [rds_rdma]
[ 4088.328370]  cma_ib_handler+0xcd/0x280 [rdma_cm]
[ 4088.333522]  cm_process_work+0x25/0x120 [ib_cm]
[ 4088.338580]  cm_work_handler+0xd6b/0x17aa [ib_cm]
[ 4088.343832]  process_one_work+0x149/0x360
[ 4088.348307]  worker_thread+0x4d/0x3e0
[ 4088.352397]  kthread+0x109/0x140
[ 4088.355996]  ? rescuer_thread+0x380/0x380
[ 4088.360467]  ? kthread_park+0x60/0x60
[ 4088.364563]  ret_from_fork+0x25/0x30
[ 4088.368548] Code: 48 89 45 90 48 89 45 98 eb 4d 0f 1f 44 00 00 48 8b 43 08 48 89 d9 48 c7 c2 80 13 52 a0 48 c7 c6 58 21 52 a0 48 c7 c7 80 55 52 a0 <4c> 8b 48 20 44 89 64 24 08 48 8b 40 30 49 83 e1 fc 48 89 04 24
[ 4088.389612] RIP: rds_ib_recv_refill+0x87/0x620 [rds_rdma] RSP: ffffc9000e59bb68
[ 4088.397772] CR2: 0000000000000020
[ 4088.401505] ---[ end trace fe922e6ccf004431 ]---

This bug was provoked by compiling rds out-of-tree with
EXTRA_CFLAGS="-DRDS_DEBUG -DDEBUG" and inserting an artificial delay
between the rdsdebug() and ib_ib_port_recv() statements:

           /* XXX when can this fail? */
       ret = ib_post_recv(ic->i_cm_id->qp, &recv->r_wr, &failed_wr);
+ if (can_wait)
+ usleep_range(1000, 5000);
       rdsdebug("recv %p ibinc %p page %p addr %lu ret %d\n", recv,
recv->r_ibinc, sg_page(&recv->r_frag->f_sg),
(long) ib_sg_dma_address(

The fix is simply to move the rdsdebug() statement up before the
ib_post_recv() and remove the printing of ret, which is taken care of
anyway by the non-debug code.

Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com>
Reviewed-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'akpm' (patches from Andrew)
Linus Torvalds [Fri, 10 Nov 2017 02:26:51 +0000 (18:26 -0800)]
Merge branch 'akpm' (patches from Andrew)

Merge misc fixes from Andrew Morton:
 "2 fixes"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  MAINTAINERS: update TPM driver infrastructure changes
  sysctl: add register_sysctl() dummy helper

6 years agoMAINTAINERS: update TPM driver infrastructure changes
Jarkko Sakkinen [Thu, 9 Nov 2017 21:38:21 +0000 (13:38 -0800)]
MAINTAINERS: update TPM driver infrastructure changes

[akpm@linux-foundation.org: alpha-sort CREDITS, per Randy]
Link: http://lkml.kernel.org/r/20170915223811.21368-1-jarkko.sakkinen@linux.intel.com
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Marcel Selhorst <tpmdd@selhorst.net>
Cc: Ashley Lai <ashleydlai@gmail.com>
Cc: Mimi Zohar <zohar@linux.vnet.ibm.com>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: Boris Brezillon <boris.brezillon@free-electrons.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Håvard Skinnemoen <hskinnemoen@gmail.com>
Cc: Martin Kepplinger <martink@posteo.de>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Hans-Christian Noren Egtvedt <egtvedt@samfundet.no>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Gertjan van Wingerde <gwingerde@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agosysctl: add register_sysctl() dummy helper
Arnd Bergmann [Thu, 9 Nov 2017 21:38:18 +0000 (13:38 -0800)]
sysctl: add register_sysctl() dummy helper

register_sysctl() has been around for five years with commit
fea478d4101a ("sysctl: Add register_sysctl for normal sysctl users") but
now that arm64 started using it, I ran into a compile error:

  arch/arm64/kernel/armv8_deprecated.c: In function 'register_insn_emulation_sysctl':
  arch/arm64/kernel/armv8_deprecated.c:257:2: error: implicit declaration of function 'register_sysctl'

This adds a inline function like we already have for
register_sysctl_paths() and register_sysctl_table().

Link: http://lkml.kernel.org/r/20171106133700.558647-1-arnd@arndb.de
Fixes: 38b9aeb32fa7 ("arm64: Port deprecated instruction emulation to new sysctl interface")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Dave Martin <Dave.Martin@arm.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>
Cc: Alex Benne <alex.bennee@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
6 years agoMerge tag 'pci-v4.14-fixes-7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaa...
Linus Torvalds [Fri, 10 Nov 2017 01:43:27 +0000 (17:43 -0800)]
Merge tag 'pci-v4.14-fixes-7' of git://git./linux/kernel/git/helgaas/pci

Pull PCI maintainership updates from Bjorn Helgaas:
 "Update MAINTAINERS for HiSilicon, Microsemi Switchtec, and native host
  bridge drivers (Gabriele Paoloni, Sebastian Andrzej Siewior).

  Note that starting with changes intended for v4.16, Lorenzo Pieralisi
  will maintain the drivers/pci/{dwc,endpoint,host} directories. My
  intent is to continue to merge those changes via my tree, so this
  should be transparent to you"

* tag 'pci-v4.14-fixes-7' of git://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci:
  MAINTAINERS: Add Lorenzo Pieralisi for PCI host bridge drivers
  MAINTAINERS: Remove Gabriele Paoloni as HiSilicon PCI maintainer
  MAINTAINERS: Remove Stephen Bates as Microsemi Switchtec maintainer

6 years agoMerge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm
Linus Torvalds [Fri, 10 Nov 2017 01:41:39 +0000 (17:41 -0800)]
Merge branch 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm

Pull ARM fix from Russell King:
 "Last ARM fix for 4.14.

  This plugs a hole in dump_instr(), which, with certain conditions
  satisfied, can dump instructions from kernel space"

* 'fixes' of git://git.armlinux.org.uk/~rmk/linux-arm:
  ARM: 8720/1: ensure dump_instr() checks addr_limit

6 years agom68k/defconfig: Update defconfigs for v4.14-rc7
Geert Uytterhoeven [Mon, 30 Oct 2017 07:13:32 +0000 (08:13 +0100)]
m68k/defconfig: Update defconfigs for v4.14-rc7

Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
6 years agom68k/mac: Add mutual exclusion for IOP interrupt polling
Finn Thain [Fri, 27 Oct 2017 02:45:24 +0000 (22:45 -0400)]
m68k/mac: Add mutual exclusion for IOP interrupt polling

The IOP interrupt handler iop_ism_irq() is used by the adb-iop
driver to poll for ADB request completion. Unfortunately, it is not
re-entrant. Fix the race condition by adding an iop_ism_irq_poll()
function with suitable mutual exclusion.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
6 years agom68k/mac: Disentangle VIA/RBV and NuBus initialization
Finn Thain [Fri, 27 Oct 2017 02:45:24 +0000 (22:45 -0400)]
m68k/mac: Disentangle VIA/RBV and NuBus initialization

The Nubus subsystem should not be concerned with differences between VIA,
RBV and OSS platforms. It should be portable across Macs and PowerMacs.
This goal has implications for the initialization code relating to bus
locking and slot interrupts.

During Nubus initialization, bus transactions are "unlocked": on VIA2 and
RBV machines, via_nubus_init() sets a bit in the via2[gBufB] register to
allow bus-mastering Nubus cards to arbitrate for the bus. This happens
upon subsys_initcall(nubus_init). But because nubus_init() has no effect
on card state, this sequence is arbitrary.

Moreover, when Penguin is used to boot Linux, the bus is already unlocked
when Linux starts. On OSS machines there's no attempt to unlock Nubus
transactions at all. (Maybe there's no benefit on that platform or maybe
no-one knows how.)

All of this demonstrates that there's no benefit in locking out
bus-mastering cards, as yet. (If the need arises, we could lock the bus
for the duration of a timing-critical operation.) NetBSD unlocks the
Nubus early (at VIA initialization) and we can do the same.

via_nubus_init() is also responsible for some VIA interrupt setup that
should happen earlier than subsys_initcall(nubus_init). And actually, the
Nubus subsystem need not be involved with slot interrupts: SLOT2IRQ
works fine because Nubus slot IRQs are geographically assigned
(regardless of platform).

For certain platforms with PDS slots, some Nubus IRQs may be platform
IRQs and this is not something that the NuBus subsystem should worry
about. So let's invoke via_nubus_init() earlier and make the platform
responsible for bus unlocking and interrupt setup instead of the NuBus
subsystem.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
6 years agom68k/mac: Disentangle VIA and OSS initialization
Finn Thain [Fri, 27 Oct 2017 02:45:24 +0000 (22:45 -0400)]
m68k/mac: Disentangle VIA and OSS initialization

macintosh_config->via_type is meaningless on Mac IIfx (i.e. the only
model with OSS chip), so skip the via_type switch statement.

Call oss_init() before via_init() because it is more important and
because that is the right place to initialize the oss_present flag.

On this model, bringing forward oss_init() and delaying via_init()
is no problem because those functions are independent.

The only requirement here is that oss_register_interrupts() happens
after via_init(). That is, mac_init_IRQ() happens after config_mac().

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
6 years agom68k/mac: More printk modernization
Finn Thain [Fri, 27 Oct 2017 02:45:24 +0000 (22:45 -0400)]
m68k/mac: More printk modernization

Log message fragments used to be printed on one line but now get split up.
Fix this. Also, suppress log spam that merely prints known pointer values.

Tested-by: Stan Johnson <userm57@yahoo.com>
Signed-off-by: Finn Thain <fthain@telegraphics.com.au>
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
6 years agoMerge tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael...
Linus Torvalds [Thu, 9 Nov 2017 19:16:28 +0000 (11:16 -0800)]
Merge tag 'pm-final-4.14' of git://git./linux/kernel/git/rafael/linux-pm

Pull final power management fixes from Rafael Wysocki:
 "These fix a regression in the schedutil cpufreq governor introduced by
  a recent change and blacklist Dell XPS13 9360 from using the Low Power
  S0 Idle _DSM interface which triggers serious problems on one of these
  machines.

  Specifics:

   - Prevent the schedutil cpufreq governor from using the utilization
     of a wrong CPU in some cases which started to happen after one of
     the recent changes in it (Chris Redpath).

   - Blacklist Dell XPS13 9360 from using the Low Power S0 Idle _DSM
     interface as that causes serious issue (related to NVMe) to appear
     on one of these machines, even though the other Dells XPS13 9360 in
     somewhat different HW configurations behave correctly (Rafael
     Wysocki)"

* tag 'pm-final-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  ACPI / PM: Blacklist Low Power S0 Idle _DSM for Dell XPS13 9360
  cpufreq: schedutil: Examine the correct CPU when we update util

6 years agoMerge tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Linus Torvalds [Thu, 9 Nov 2017 17:58:11 +0000 (09:58 -0800)]
Merge tag 'sound-4.14' of git://git./linux/kernel/git/tiwai/sound

Pull sound fixes from Takashi Iwai:
 "The amount of the changes isn't as quite small as wished, nevertheless
  they are straight fixes that deserve merging to 4.14 final.

  Most of fixes are about ALSA core bugs spotted by fuzzer: a follow-up
  fix for the previous nested rwsem patch, a fix to avoid the resource
  hogs due to too many concurrent ALSA timer invocations, and a fix for
  a crash with SYSEX MIDI transfer over OSS sequencer emulation that is
  used by none but fuzzer.

  The rest are usual HD-audio and USB-audio device-specific quirks,
  which are safe to apply"

* tag 'sound-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
  ALSA: hda - fix headset mic problem for Dell machines with alc274
  ALSA: seq: Fix OSS sysex delivery in OSS emulation
  ALSA: seq: Avoid invalid lockdep class warning
  ALSA: timer: Limit max instances per timer
  ALSA: usb-audio: support new Amanero Combo384 firmware version

6 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Thu, 9 Nov 2017 17:31:34 +0000 (09:31 -0800)]
Merge git://git./linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix use-after-free in IPSEC input parsing, desintation address
    pointer was loaded before pskb_may_pull() which can change the SKB
    data pointers. From Florian Westphal.

 2) Stack out-of-bounds read in xfrm_state_find(), from Steffen
    Klassert.

 3) IPVS state of SKB is not properly reset when moving between
    namespaces, from Ye Yin.

 4) Fix crash in asix driver suspend and resume, from Andrey Konovalov.

 5) Don't deliver ipv6 l2tp tunnel packets to ipv4 l2tp tunnels, and
    vice versa, from Guillaume Nault.

 6) Fix DSACK undo on non-dup ACKs, from Priyaranjan Jha.

 7) Fix regression in bond_xmit_hash()'s behavior after the TCP port
    selection changes back in 4.2, from Hangbin Liu.

 8) Two divide by zero bugs in USB networking drivers when parsing
    descriptors, from Bjorn Mork.

 9) Fix bonding slaves being stuck in BOND_LINK_FAIL state, from Jay
    Vosburgh.

10) Missing skb_reset_mac_header() in qmi_wwan, from Kristian Evensen.

11) Fix the destruction of tc action object races properly, from Cong
    Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (31 commits)
  cls_u32: use tcf_exts_get_net() before call_rcu()
  cls_tcindex: use tcf_exts_get_net() before call_rcu()
  cls_rsvp: use tcf_exts_get_net() before call_rcu()
  cls_route: use tcf_exts_get_net() before call_rcu()
  cls_matchall: use tcf_exts_get_net() before call_rcu()
  cls_fw: use tcf_exts_get_net() before call_rcu()
  cls_flower: use tcf_exts_get_net() before call_rcu()
  cls_flow: use tcf_exts_get_net() before call_rcu()
  cls_cgroup: use tcf_exts_get_net() before call_rcu()
  cls_bpf: use tcf_exts_get_net() before call_rcu()
  cls_basic: use tcf_exts_get_net() before call_rcu()
  net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
  Revert "net_sched: hold netns refcnt for each action"
  net: usb: asix: fill null-ptr-deref in asix_suspend
  Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"
  qmi_wwan: Add missing skb_reset_mac_header-call
  bonding: fix slave stuck in BOND_LINK_FAIL state
  qrtr: Move to postcore_initcall
  net: qmi_wwan: fix divide by 0 on bad descriptors
  net: cdc_ether: fix divide by 0 on bad descriptors
  ...

6 years agox86/mm: Fix ELF_ET_DYN_BASE for 5-level paging
Kirill A. Shutemov [Tue, 7 Nov 2017 10:38:04 +0000 (13:38 +0300)]
x86/mm: Fix ELF_ET_DYN_BASE for 5-level paging

On machines with 5-level paging we don't want to allocate mapping above
47-bit unless user explicitly asked for it. See b569bab78d8d ("x86/mm:
Prepare to expose larger address space to userspace") for details.

c715b72c1ba4 ("mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base
changes") broke the behaviour. After the commit elf binary and heap got
mapped above 47-bits.

Use DEFAULT_MAP_WINDOW instead of TASK_SIZE to determine ELF_ET_DYN_BASE so
it's forced to be below 47-bits unconditionally.

Fixes: c715b72c1ba4 ("mm: revert x86_64 and arm64 ELF_ET_DYN_BASE base changes")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Link: https://lkml.kernel.org/r/20171107103804.47341-1-kirill.shutemov@linux.intel.com
6 years agos390: simplify transactional execution elf hwcap handling
Heiko Carstens [Thu, 9 Nov 2017 12:20:12 +0000 (13:20 +0100)]
s390: simplify transactional execution elf hwcap handling

Just use MACHINE_HAS_TE to decide if HWCAP_S390_TE needs
to be added to elf_hwcap.

Suggested-by: Dan Horák <dan@danny.cz>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agos390/zcrypt: Rework struct ap_qact_ap_info.
Harald Freudenberger [Mon, 30 Oct 2017 11:10:54 +0000 (12:10 +0100)]
s390/zcrypt: Rework struct ap_qact_ap_info.

The ap_qact_ap_info struct can get more easy handled when the fields
in there can be accessed by their names but also the struct as a whole
with just an unsigned long value. This patch reworks this struct to be
a union and adapt the using code accordingly.

Signed-off-by: Harald Freudenberger <freude@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agos390/virtio: remove unused header file kvm_virtio.h
Christian Borntraeger [Thu, 9 Nov 2017 07:57:47 +0000 (08:57 +0100)]
s390/virtio: remove unused header file kvm_virtio.h

With commit 7fb2b2d51244 ("s390/virtio: remove the old KVM virtio
transport") the pre-ccw virtio transport for s390 was removed. To
complete the removal the uapi header file that contains the related data
structures must also be removed.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
6 years agoperf trace: Call machine__exit() at exit
Andrei Vagin [Wed, 8 Nov 2017 00:22:45 +0000 (16:22 -0800)]
perf trace: Call machine__exit() at exit

Otherwise 'perf trace' leaves a temporary file /tmp/perf-vdso.so-XXXXXX.

  $ perf trace -o log true
  $ ls -l /tmp/perf-vdso.*
  -rw------- 1 root root 8192 Nov  8 03:08 /tmp/perf-vdso.so-5bCpD0

Signed-off-by: Andrei Vagin <avagin@openvz.org>
Reviewed-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Vasily Averin <vvs@virtuozzo.com>
Link: http://lkml.kernel.org/r/20171108002246.8924-1-avagin@openvz.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf tools: Fix eBPF event specification parsing
Jiri Olsa [Thu, 9 Nov 2017 09:02:10 +0000 (10:02 +0100)]
perf tools: Fix eBPF event specification parsing

Looks like I've reached the new level of stupidity, adding missing braces.

Committer testing:

Given the following eBPF C filter, that will add a record when it
returns true, i.e. when the tv_nsec variable is > 2000ns, should be
built and installed via sys_bpf(), but fails to do so before this patch:

  # cat filter.c
  #include <uapi/linux/bpf.h>
  #define SEC(NAME) __attribute__((section(NAME), used))

  SEC("func=hrtimer_nanosleep rqtp->tv_nsec")
  int func(void *ctx, int err, long nsec)
  {
  return nsec > 1000;
  }
  char _license[] SEC("license") = "GPL";
  int _version SEC("version") = LINUX_VERSION_CODE;
  #

  # perf trace -e nanosleep,filter.c usleep 1
  invalid or unsupported event: 'filter.c'
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

And works again after it is applied, the nothing is inserted when the co

  # perf trace -e *sleep,filter.c usleep 1
     0.000 ( 0.066 ms): usleep/23994 nanosleep(rqtp: 0x7ffead94a0d0) = 0
  # perf trace -e *sleep,filter.c usleep 2
     0.000 ( 0.008 ms): usleep/24378 nanosleep(rqtp: 0x7fffa021ba50) ...
     0.008 (         ): perf_bpf_probe:func:(ffffffffb410cb30) tv_nsec=2000)
     0.000 ( 0.066 ms): usleep/24378  ... [continued]: nanosleep()) = 0
  #

The intent of 9445464bb831 is kept:

  # perf stat -e 'cpu/uops_executed.core,krava/'  true
  event syntax error: '..cuted.core,krava/'
                                    \___ unknown term

  valid terms: cmask,pc,event,edge,in_tx,any,ldlat,inv,umask,in_tx_cp,offcore_rsp,config,config1,config2,name,period
  Run 'perf list' for a list of valid events

   Usage: perf stat [<options>] [<command>]

      -e, --event <event>   event selector. use 'perf list' to list available events
  #
  # perf stat -e 'cpu/uops_executed.core,period=1/'  true

   Performance counter stats for 'true':

           808,332      cpu/uops_executed.core,period=1/

       0.002997237 seconds time elapsed

  #

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb831 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-diea0ihbwpxfw6938huv3whj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoperf tools: Add "reject" option for parse-events.l
Jiri Olsa [Wed, 8 Nov 2017 15:43:09 +0000 (16:43 +0100)]
perf tools: Add "reject" option for parse-events.l

Arnaldo reported broken builds in some distros using a newer flex
release, 2.6.4, found in Alpine Linux 3.6 and Edge, with flex not
spotting the REJECT macro:

  CC       /tmp/build/perf/util/parse-events-flex.o
  util/parse-events.l: In function 'parse_events_lex':
  /tmp/build/perf/util/parse-events-flex.c:4734:16: error: \
  'reject_used_but_not_detected' undeclared (first use in this function)

It's happening because we put the REJECT under another USER_REJECT macro
in following commit:

  9445464bb831 perf tools: Unwind properly location after REJECT

Fortunately flex provides option for force it to use REJECT, adding it
to parse-events.l.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: Andi Kleen <andi@firstfloor.org>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Fixes: 9445464bb831 ("perf tools: Unwind properly location after REJECT")
Link: http://lkml.kernel.org/n/tip-7kdont984mw12ijk7rji6b8p@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
6 years agoMerge tag 'irqchip-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm...
Thomas Gleixner [Thu, 9 Nov 2017 11:57:46 +0000 (12:57 +0100)]
Merge tag 'irqchip-4.15-3' of git://git./linux/kernel/git/maz/arm-platforms into irq/core

Pull irqchip updates for 4.15, take #3 from Marc Zyngier:

 - New Socionext Synquacer EXIU driver
 - stm32 new platform support and fixes
 - One GICv4 bugfix
 - A couple of MIPS GIC cleanups

6 years agoirqchip: mips-gic: Print warning if inherited GIC base is used
Matt Redfearn [Thu, 9 Nov 2017 11:02:45 +0000 (11:02 +0000)]
irqchip: mips-gic: Print warning if inherited GIC base is used

If the physical address of the GIC resource cannot be read from device
tree, then the code falls back to reading it from the gcr_gic_base
register. Hopefully this has been set to a sane value by the bootloader
or some platform code, but is defined by the hardware manual to have
"undefined" reset state. Using it as the address at which the GIC will
be mapped into physical memory space can therefore be risky if it has
not been initialised, since it may result in the GIC being mapped to an
effectively random address anywhere in physical memory, where it might
conflict with peripherals or RAM and lead to weird crashes.

Since a "sane value" is very platform specific because it is particular
to the platform's memory map, it is difficult to test for. At the very
least, a warning message should be printed in the case that we trust the
inherited value.

Reported-by: Amit Kama <amit.kama@satixfy.com>
Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoirqchip/mips-gic: Add pr_fmt and reword pr_* messages
Matt Redfearn [Thu, 9 Nov 2017 11:02:44 +0000 (11:02 +0000)]
irqchip/mips-gic: Add pr_fmt and reword pr_* messages

Several messages from the MIPS GIC driver include the text "GIC", but
the format is not standard. Add a pr_fmt of "irq-mips-gic: " and reword
the messages now that they will be prefixed with the driver name.

Signed-off-by: Matt Redfearn <matt.redfearn@mips.com>
Reviewed-by: Paul Burton <paul.burton@mips.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
6 years agoMerge tag 'timers-conversion-next5' of git://git.kernel.org/pub/scm/linux/kernel...
Thomas Gleixner [Thu, 9 Nov 2017 10:42:52 +0000 (11:42 +0100)]
Merge tag 'timers-conversion-next5' of git://git./linux/kernel/git/kees/linux into timers/core

Pull the 5th batch of timer conversions from Kees Cook

 - qla2xxx patches have passed testing
 - ipvs patches have been Acked
 - prepare for more tree-wide DEFINE_TIMER() changes

6 years agorbd: use GFP_NOIO for parent stat and data requests
Ilya Dryomov [Mon, 6 Nov 2017 10:33:36 +0000 (11:33 +0100)]
rbd: use GFP_NOIO for parent stat and data requests

rbd_img_obj_exists_submit() and rbd_img_obj_parent_read_full() are on
the writeback path for cloned images -- we attempt a stat on the parent
object to see if it exists and potentially read it in to call copyup.
GFP_NOIO should be used instead of GFP_KERNEL here.

Cc: stable@vger.kernel.org
Link: http://tracker.ceph.com/issues/22014
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: David Disseldorp <ddiss@suse.de>
6 years agoALSA: hda - fix headset mic problem for Dell machines with alc274
Hui Wang [Thu, 9 Nov 2017 00:48:08 +0000 (08:48 +0800)]
ALSA: hda - fix headset mic problem for Dell machines with alc274

Confirmed with Kailang of Realtek, the pin 0x19 is for Headset Mic, and
the pin 0x1a is for Headphone Mic, he suggested to apply
ALC269_FIXUP_DELL1_MIC_NO_PRESENCE to fix this problem. And we
verified applying this FIXUP can fix this problem.

Cc: <stable@vger.kernel.org>
Cc: Kailang Yang <kailang@realtek.com>
Signed-off-by: Hui Wang <hui.wang@canonical.com>
Signed-off-by: Takashi Iwai <tiwai@suse.de>
6 years agosched/core: Optimize sched_feat() for !CONFIG_SCHED_DEBUG builds
Patrick Bellasi [Wed, 8 Nov 2017 18:41:01 +0000 (18:41 +0000)]
sched/core: Optimize sched_feat() for !CONFIG_SCHED_DEBUG builds

When the kernel is compiled with !CONFIG_SCHED_DEBUG support, we expect that
all SCHED_FEAT are turned into compile time constants being propagated
to support compiler optimizations.

Specifically, we expect that code blocks like this:

   if (sched_feat(FEATURE_NAME) [&& <other_conditions>]) {
/* FEATURE CODE */
   }

are turned into dead-code in case FEATURE_NAME defaults to FALSE, and thus
being removed by the compiler from the finale image.

For this mechanism to properly work it's required for the compiler to
have full access, from each translation unit, to whatever is the value
defined by the sched_feat macro. This macro is defined as:

   #define sched_feat(x) (sysctl_sched_features & (1UL << __SCHED_FEAT_##x))

and thus, the compiler can optimize that code only if the value of
sysctl_sched_features is visible within each translation unit.

Since:

   029632fbb ("sched: Make separate sched*.c translation units")

the scheduler code has been split into separate translation units
however the definition of sysctl_sched_features is part of
kernel/sched/core.c while, for all the other scheduler modules, it is
visible only via kernel/sched/sched.h as an:

   extern const_debug unsigned int sysctl_sched_features

Unfortunately, an extern reference does not allow the compiler to apply
constants propagation. Thus, on !CONFIG_SCHED_DEBUG kernel we still end up
with code to load a memory reference and (eventually) doing an unconditional
jump of a chunk of code.

This mechanism is unavoidable when sched_features can be turned on and off at
run-time. However, this is not the case for "production" kernels compiled with
!CONFIG_SCHED_DEBUG. In this case, sysctl_sched_features is just a constant value
which cannot be changed at run-time and thus memory loads and jumps can be
avoided altogether.

This patch fixes the case of !CONFIG_SCHED_DEBUG kernel by declaring a local version
of the sysctl_sched_features constant for each translation unit. This will
ultimately allow the compiler to perform constants propagation and dead-code
pruning.

Tests have been done, with !CONFIG_SCHED_DEBUG on a v4.14-rc8 with and without
the patch, by running 30 iterations of:

   perf bench sched messaging --pipe --thread --group 4 --loop 50000

on a 40 cores Intel(R) Xeon(R) CPU E5-2690 v2 @ 3.00GHz using the
powersave governor to rule out variations due to frequency scaling.

Statistics on the reported completion time:

                   count     mean       std     min       99%     max
  v4.14-rc8         30.0  15.7831  0.176032  15.442  16.01226  16.014
  v4.14-rc8+patch   30.0  15.5033  0.189681  15.232  15.93938  15.962

... show a 1.8% speedup on average completion time and 0.5% speedup in the
99 percentile.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Signed-off-by: Chris Redpath <chris.redpath@arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
Reviewed-by: Brendan Jackman <brendan.jackman@arm.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Morten Rasmussen <morten.rasmussen@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Link: http://lkml.kernel.org/r/20171108184101.16006-1-patrick.bellasi@arm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agox86/build: Make the boot image generation less verbose
Changbin Du [Thu, 9 Nov 2017 06:09:11 +0000 (14:09 +0800)]
x86/build: Make the boot image generation less verbose

This change suppresses the 'dd' output and adds the '-quiet' parameter
to mkisofs tool. It also removes the 'Using ...' messages, as none of the
messages matter to the user normally.

"make V=1" can still be used for a more verbose build.

The new build messages are now a streamlined set of:

  $ make isoimage
  ...
  Kernel: arch/x86/boot/bzImage is ready  (#75)
    GENIMAGE arch/x86/boot/image.iso
  Kernel: arch/x86/boot/image.iso is ready

Signed-off-by: Changbin Du <changbin.du@intel.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1510207751-22166-1-git-send-email-changbin.du@intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
6 years agoMerge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec
David S. Miller [Thu, 9 Nov 2017 01:58:35 +0000 (10:58 +0900)]
Merge branch 'master' of git://git./linux/kernel/git/klassert/ipsec

Steffen Klassert says:

====================
pull request (net): ipsec 2017-11-09

1) Fix a use after free due to a reallocated skb head.
   From Florian Westphal.

2) Fix sporadic lookup failures on labeled IPSEC.
   From Florian Westphal.

3) Fix a stack out of bounds when a socket policy is applied
   to an IPv6 socket that sends IPv4 packets.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge tag 'drm-intel-fixes-2017-11-08' of git://anongit.freedesktop.org/drm/drm-intel...
Dave Airlie [Thu, 9 Nov 2017 01:17:32 +0000 (11:17 +1000)]
Merge tag 'drm-intel-fixes-2017-11-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-fixes

- Fix possible NULL dereference (Chris).
- Avoid miss usage of syncobj by rejecting unknown flags (Tvrtko).

* tag 'drm-intel-fixes-2017-11-08' of git://anongit.freedesktop.org/drm/drm-intel:
  drm/i915: Deconstruct struct sgt_dma initialiser
  drm/i915: Reject unknown syncobj flags

6 years agoMerge branch 'net-sched-race-fix'
David S. Miller [Thu, 9 Nov 2017 01:03:10 +0000 (10:03 +0900)]
Merge branch 'net-sched-race-fix'

Cong Wang says:

====================
net_sched: close the race between call_rcu() and cleanup_net()

This patchset tries to fix the race between call_rcu() and
cleanup_net() again. Without holding the netns refcnt the
tc_action_net_exit() in netns workqueue could be called before
filter destroy works in tc filter workqueue. This patchset
moves the netns refcnt from tc actions to tcf_exts, without
breaking per-netns tc actions.

Patch 1 reverts the previous fix, patch 2 introduces two new
API's to help to address the bug and the rest patches switch
to the new API's. Please see each patch for details.

I was not able to reproduce this bug, but now after adding
some delay in filter destroy work I manage to trigger the
crash. After this patchset, the crash is not reproducible
any more and the debugging printk's show the order is expected
too.
====================

Fixes: ddf97ccdd7cb ("net_sched: add network namespace support for tc actions")
Reported-by: Lucas Bates <lucasb@mojatatu.com>
Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_u32: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:30 +0000 (13:47 -0800)]
cls_u32: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_tcindex: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:29 +0000 (13:47 -0800)]
cls_tcindex: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_rsvp: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:28 +0000 (13:47 -0800)]
cls_rsvp: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_route: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:27 +0000 (13:47 -0800)]
cls_route: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_matchall: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:26 +0000 (13:47 -0800)]
cls_matchall: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_fw: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:25 +0000 (13:47 -0800)]
cls_fw: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_flower: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:24 +0000 (13:47 -0800)]
cls_flower: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_flow: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:23 +0000 (13:47 -0800)]
cls_flow: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_cgroup: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:22 +0000 (13:47 -0800)]
cls_cgroup: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_bpf: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:21 +0000 (13:47 -0800)]
cls_bpf: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocls_basic: use tcf_exts_get_net() before call_rcu()
Cong Wang [Mon, 6 Nov 2017 21:47:20 +0000 (13:47 -0800)]
cls_basic: use tcf_exts_get_net() before call_rcu()

Hold netns refcnt before call_rcu() and release it after
the tcf_exts_destroy() is done.

Note, on ->destroy() path we have to respect the return value
of tcf_exts_get_net(), on other paths it should always return
true, so we don't need to care.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()
Cong Wang [Mon, 6 Nov 2017 21:47:19 +0000 (13:47 -0800)]
net_sched: introduce tcf_exts_get_net() and tcf_exts_put_net()

Instead of holding netns refcnt in tc actions, we can minimize
the holding time by saving it in struct tcf_exts instead. This
means we can just hold netns refcnt right before call_rcu() and
release it after tcf_exts_destroy() is done.

However, because on netns cleanup path we call tcf_proto_destroy()
too, obviously we can not hold netns for a zero refcnt, in this
case we have to do cleanup synchronously. It is fine for RCU too,
the caller cleanup_net() already waits for a grace period.

For other cases, refcnt is non-zero and we can safely grab it as
normal and release it after we are done.

This patch provides two new API for each filter to use:
tcf_exts_get_net() and tcf_exts_put_net(). And all filters now can
use the following pattern:

void __destroy_filter() {
  tcf_exts_destroy();
  tcf_exts_put_net();  // <== release netns refcnt
  kfree();
}
void some_work() {
  rtnl_lock();
  __destroy_filter();
  rtnl_unlock();
}
void some_rcu_callback() {
  tcf_queue_work(some_work);
}

if (tcf_exts_get_net())  // <== hold netns refcnt
  call_rcu(some_rcu_callback);
else
  __destroy_filter();

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoRevert "net_sched: hold netns refcnt for each action"
Cong Wang [Mon, 6 Nov 2017 21:47:18 +0000 (13:47 -0800)]
Revert "net_sched: hold netns refcnt for each action"

This reverts commit ceffcc5e254b450e6159f173e4538215cebf1b59.
If we hold that refcnt, the netns can never be destroyed until
all actions are destroyed by user, this breaks our netns design
which we expect all actions are destroyed when we destroy the
whole netns.

Cc: Lucas Bates <lucasb@mojatatu.com>
Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: usb: asix: fill null-ptr-deref in asix_suspend
Andrey Konovalov [Mon, 6 Nov 2017 12:26:46 +0000 (13:26 +0100)]
net: usb: asix: fill null-ptr-deref in asix_suspend

When asix_suspend() is called dev->driver_priv might not have been
assigned a value, so we need to check that it's not NULL.

Similar issue is present in asix_resume(), this patch fixes it as well.

Found by syzkaller.

kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] PREEMPT SMP KASAN
Modules linked in:
CPU: 0 PID: 24 Comm: kworker/0:1 Not tainted 4.14.0-rc4-43422-geccacdd69a8c #400
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Workqueue: usb_hub_wq hub_event
task: ffff88006bb36300 task.stack: ffff88006bba8000
RIP: 0010:asix_suspend+0x76/0xc0 drivers/net/usb/asix_devices.c:629
RSP: 0018:ffff88006bbae718 EFLAGS: 00010202
RAX: dffffc0000000000 RBX: ffff880061ba3b80 RCX: 1ffff1000c34d644
RDX: 0000000000000001 RSI: 0000000000000402 RDI: 0000000000000008
RBP: ffff88006bbae738 R08: 1ffff1000d775cad R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800630a8b40
R13: 0000000000000000 R14: 0000000000000402 R15: ffff880061ba3b80
FS:  0000000000000000(0000) GS:ffff88006c600000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff33cf89000 CR3: 0000000061c0a000 CR4: 00000000000006f0
Call Trace:
 usb_suspend_interface drivers/usb/core/driver.c:1209
 usb_suspend_both+0x27f/0x7e0 drivers/usb/core/driver.c:1314
 usb_runtime_suspend+0x41/0x120 drivers/usb/core/driver.c:1852
 __rpm_callback+0x339/0xb60 drivers/base/power/runtime.c:334
 rpm_callback+0x106/0x220 drivers/base/power/runtime.c:461
 rpm_suspend+0x465/0x1980 drivers/base/power/runtime.c:596
 __pm_runtime_suspend+0x11e/0x230 drivers/base/power/runtime.c:1009
 pm_runtime_put_sync_autosuspend ./include/linux/pm_runtime.h:251
 usb_new_device+0xa37/0x1020 drivers/usb/core/hub.c:2487
 hub_port_connect drivers/usb/core/hub.c:4903
 hub_port_connect_change drivers/usb/core/hub.c:5009
 port_event drivers/usb/core/hub.c:5115
 hub_event+0x194d/0x3740 drivers/usb/core/hub.c:5195
 process_one_work+0xc7f/0x1db0 kernel/workqueue.c:2119
 worker_thread+0x221/0x1850 kernel/workqueue.c:2253
 kthread+0x3a1/0x470 kernel/kthread.c:231
 ret_from_fork+0x2a/0x40 arch/x86/entry/entry_64.S:431
Code: 8d 7c 24 20 48 89 fa 48 c1 ea 03 80 3c 02 00 75 5b 48 b8 00 00
00 00 00 fc ff df 4d 8b 6c 24 20 49 8d 7d 08 48 89 fa 48 c1 ea 03 <80>
3c 02 00 75 34 4d 8b 6d 08 4d 85 ed 74 0b e8 26 2b 51 fd 4c
RIP: asix_suspend+0x76/0xc0 RSP: ffff88006bbae718
---[ end trace dfc4f5649284342c ]---

Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoRevert "net: usb: asix: fill null-ptr-deref in asix_suspend"
David S. Miller [Thu, 9 Nov 2017 00:21:44 +0000 (09:21 +0900)]
Revert "net: usb: asix: fill null-ptr-deref in asix_suspend"

This reverts commit baedf68a068ca29624f241426843635920f16e1d.

There is an updated version of this fix which covers
the problem more thoroughly.

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotimer: Prepare to change all DEFINE_TIMER() callbacks
Kees Cook [Tue, 10 Oct 2017 04:52:09 +0000 (21:52 -0700)]
timer: Prepare to change all DEFINE_TIMER() callbacks

Before we can globally change the function prototype of all timer callbacks,
we have to change those set up by DEFINE_TIMER(). Prepare for this by
casting the callbacks until the prototype changes globally.

Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kees Cook <keescook@chromium.org>
6 years agonetfilter: ipvs: Convert timers to use timer_setup()
Kees Cook [Sat, 21 Oct 2017 05:24:18 +0000 (22:24 -0700)]
netfilter: ipvs: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: Wensong Zhang <wensong@linux-vs.org>
Cc: Simon Horman <horms@verge.net.au>
Cc: Julian Anastasov <ja@ssi.bg>
Cc: Pablo Neira Ayuso <pablo@netfilter.org>
Cc: Jozsef Kadlecsik <kadlec@blackhole.kfki.hu>
Cc: Florian Westphal <fw@strlen.de>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Cc: lvs-devel@vger.kernel.org
Cc: netfilter-devel@vger.kernel.org
Cc: coreteam@netfilter.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@verge.net.au>
6 years agoscsi: qla2xxx: Convert timers to use timer_setup()
Kees Cook [Sun, 3 Sep 2017 20:23:32 +0000 (13:23 -0700)]
scsi: qla2xxx: Convert timers to use timer_setup()

In preparation for unconditionally passing the struct timer_list pointer to
all timer callbacks, switch to using the new timer_setup() and from_timer()
to pass the timer pointer explicitly.

Cc: qla2xxx-upstream@qlogic.com
Cc: "James E.J. Bottomley" <jejb@linux.vnet.ibm.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: linux-scsi@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Tested-by: Bart Van Assche <Bart.VanAssche@wdc.com>