OSDN Git Service

uclinux-h8/linux.git
5 years ago  i40e: Limiting RSS queues to CPUs
Aleksandr Loktionov [Wed, 19 Dec 2018 14:45:37 +0000 (06:45 -0800)]
i40e: Limiting RSS queues to CPUs

Limit the number of RSS queues to the number of online CPUs in order to
avoid creating misconfigured RSS queues.
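
As a rough illustration of the clamp described above, here is a Python
sketch (not the driver's C code). The function name and parameters are
hypothetical, and os.cpu_count() is only a stand-in for the kernel's count
of online CPUs:

```python
import os


def limit_rss_queues(requested_queues, online_cpus=None):
    """Clamp the RSS queue count to the number of online CPUs (minimum 1)."""
    if online_cpus is None:
        # Stand-in for the kernel's online CPU count; cpu_count() may be None.
        online_cpus = os.cpu_count() or 1
    return max(1, min(requested_queues, online_cpus))


# More queues requested than CPUs: the request is cut down to the CPU count.
print(limit_rss_queues(64, online_cpus=8))
# Fewer queues requested than CPUs: the request is kept as-is.
print(limit_rss_queues(4, online_cpus=8))
```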

Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years ago  i40e: Remove umem from VSI
Jan Sokolowski [Tue, 18 Dec 2018 13:45:14 +0000 (13:45 +0000)]
i40e: Remove umem from VSI

As the current netdev implementation already contains and provides
umems for us, we no longer need to keep these structures in
i40e_vsi.

Refactor the code to operate on netdev-provided umems.

Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years ago  xsk: export xdp_get_umem_from_qid
Jan Sokolowski [Tue, 18 Dec 2018 13:45:13 +0000 (13:45 +0000)]
xsk: export xdp_get_umem_from_qid

Export xdp_get_umem_from_qid for other modules to use.

Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Acked-by: Björn Töpel <bjorn.topel@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
5 years ago  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
David S. Miller [Mon, 21 Jan 2019 22:41:32 +0000 (14:41 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Completely minor snmp doc conflict.

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  Linux 5.0-rc3 v5.0-rc3
Linus Torvalds [Mon, 21 Jan 2019 00:14:44 +0000 (13:14 +1300)]
Linux 5.0-rc3

5 years ago  Merge tag 'pstore-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees...
Linus Torvalds [Mon, 21 Jan 2019 00:12:03 +0000 (13:12 +1300)]
Merge tag 'pstore-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull pstore fixes from Kees Cook:

 - Fix console ramoops to show the previous boot logs (Sai Prakash
   Ranjan)

 - Avoid allocation and leak of platform data

* tag 'pstore-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  pstore/ram: Avoid allocation and leak of platform data
  pstore/ram: Fix console ramoops to show the previous boot logs

5 years ago  Merge tag 'gcc-plugins-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git...
Linus Torvalds [Mon, 21 Jan 2019 00:07:03 +0000 (13:07 +1300)]
Merge tag 'gcc-plugins-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull gcc-plugins fixes from Kees Cook:
 "Fix ARM per-task stack protector plugin under GCC 9 (Ard Biesheuvel)"

* tag 'gcc-plugins-v5.0-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  gcc-plugins: arm_ssp_per_task_plugin: fix for GCC 9+
  gcc-plugins: arm_ssp_per_task_plugin: sign extend the SP mask

5 years ago  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Linus Torvalds [Sun, 20 Jan 2019 23:52:31 +0000 (12:52 +1300)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net

Pull networking fixes from David Miller:

 1) Fix endless loop in nf_tables, from Phil Sutter.

 2) Fix cross namespace ip6_gre tunnel hash list corruption, from
    Olivier Matz.

 3) Don't be too strict in phy_start_aneg() otherwise we might not allow
    restarting auto negotiation. From Heiner Kallweit.

 4) Fix various KMSAN uninitialized value cases in tipc, from Ying Xue.

 5) Memory leak in act_tunnel_key, from Davide Caratti.

 6) Handle chip errata of mv88e6390 PHY, from Andrew Lunn.

 7) Remove linear SKB assumption in fou/fou6, from Eric Dumazet.

 8) Missing udplite rehash callbacks, from Alexey Kodanev.

 9) Log dirty pages properly in vhost, from Jason Wang.

10) Use consume_skb() in neigh_probe() as this is a normal free not a
    drop, from Yang Wei. Likewise in macvlan_process_broadcast().

11) Missing device_del() in mdiobus_register() error paths, from Thomas
    Petazzoni.

12) Fix checksum handling of short packets in mlx5, from Cong Wang.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (96 commits)
  bpf: in __bpf_redirect_no_mac pull mac only if present
  virtio_net: bulk free tx skbs
  net: phy: phy driver features are mandatory
  isdn: avm: Fix string plus integer warning from Clang
  net/mlx5e: Fix cb_ident duplicate in indirect block register
  net/mlx5e: Fix wrong (zero) TX drop counter indication for representor
  net/mlx5e: Fix wrong error code return on FEC query failure
  net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
  tools: bpftool: Cleanup license mess
  bpf: fix inner map masking to prevent oob under speculation
  bpf: pull in pkt_sched.h header for tooling to fix bpftool build
  selftests: forwarding: Add a test case for externally learned FDB entries
  selftests: mlxsw: Test FDB offload indication
  mlxsw: spectrum_switchdev: Do not treat static FDB entries as sticky
  net: bridge: Mark FDB entries that were added by user as such
  mlxsw: spectrum_fid: Update dummy FID index
  mlxsw: pci: Return error on PCI reset timeout
  mlxsw: pci: Increase PCI SW reset timeout
  mlxsw: pci: Ring CQ's doorbell before RDQ's
  MAINTAINERS: update email addresses of liquidio driver maintainers
  ...

5 years ago  pstore/ram: Avoid allocation and leak of platform data
Kees Cook [Sun, 20 Jan 2019 22:33:34 +0000 (14:33 -0800)]
pstore/ram: Avoid allocation and leak of platform data

Yue Hu noticed that when parsing device tree the allocated platform data
was never freed. Since it's not used beyond the function scope, this
switches to using a stack variable instead.

Reported-by: Yue Hu <huyue2@yulong.com>
Fixes: 35da60941e44 ("pstore/ram: add Device Tree bindings")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
5 years ago  gcc-plugins: arm_ssp_per_task_plugin: fix for GCC 9+
Ard Biesheuvel [Fri, 18 Jan 2019 10:58:07 +0000 (11:58 +0100)]
gcc-plugins: arm_ssp_per_task_plugin: fix for GCC 9+

GCC 9 reworks the way the references to the stack canary are
emitted, to prevent the value from being spilled to the stack
before the final comparison in the epilogue, defeating the
purpose, given that the spill slot is under control of the
attacker that we are protecting ourselves from.

Since our canary value address is obtained without accessing
memory (as opposed to pre-v7 code that will obtain it from a
literal pool), it is unlikely (although not guaranteed) that
the compiler will spill the canary value in the same way, so
let's just disable this improvement when building with GCC9+.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
5 years ago  gcc-plugins: arm_ssp_per_task_plugin: sign extend the SP mask
Ard Biesheuvel [Fri, 18 Jan 2019 10:58:06 +0000 (11:58 +0100)]
gcc-plugins: arm_ssp_per_task_plugin: sign extend the SP mask

The ARM per-task stack protector GCC plugin hits an assert in
the compiler in some cases, due to the fact that the SP mask
expression is not sign-extended as it should be. So fix that.

Suggested-by: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
5 years ago  Merge branch 'mlxsw-spectrum_router-Add-GRE-tunnel-support-for-Spectrum-2'
David S. Miller [Sun, 20 Jan 2019 19:12:59 +0000 (11:12 -0800)]
Merge branch 'mlxsw-spectrum_router-Add-GRE-tunnel-support-for-Spectrum-2'

Ido Schimmel says:

====================
mlxsw: spectrum_router: Add GRE tunnel support for Spectrum-2

Nir says:

In Spectrum-2, HW implementation of layer 3 tunnels differs from
Spectrum-1 when it comes to the underlay routing table selection.
Spectrum-2 uses a dedicated RIF that points to the virtual router used
for forwarding the encapsulated packets, while Spectrum-1 explicitly
specifies the virtual router itself.

Patches #1 and #2 add additional fields in RITR - Router interface table
register and RTDP - Routing tunnel decap properties respectively, the
fields are required for the new underlay RIF needed for Spectrum-2.

Patches #3-4 allow a different set of RIF operations per ASIC type. The
first patch splits the operations and the following patch sets RIF ops
according to ASIC type.

Patches #5-9 introduce small changes to existing code to allow existence
of a dedicated underlay RIF along with the underlay virtual router, and
to support that new type of RIF that has no device.

Patch #10 takes care of updating the tunnel decap properties egress
underlay RIF required for Spectrum-2.

Patch #11 adds the implementation of Spectrum-2 specific RIF operations
and essentially enables layer 3 GRE tunnels on Spectrum-2.

Finally patches #12-18 add tests for GRE IP-in-IP tunnels, both in flat
and hierarchical topologies.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP-in-IP GRE hierarchical topology with keys test
Nir Dotan [Sun, 20 Jan 2019 06:50:58 +0000 (06:50 +0000)]
selftests: forwarding: Add IP-in-IP GRE hierarchical topology with keys test

Add a test that checks IP-in-IP GRE tunneling and MTU change of tunnel,
where an ikey/okey pair is set. This test is based on hierarchical topology
described in file ipip_lib.sh.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP-in-IP GRE hierarchical topology with key test
Nir Dotan [Sun, 20 Jan 2019 06:50:57 +0000 (06:50 +0000)]
selftests: forwarding: Add IP-in-IP GRE hierarchical topology with key test

Add a test that checks IP-in-IP GRE tunneling and MTU change of tunnel,
where a key is set. This test is based on hierarchical topology described
in file ipip_lib.sh.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP-in-IP GRE hierarchical topology test
Nir Dotan [Sun, 20 Jan 2019 06:50:56 +0000 (06:50 +0000)]
selftests: forwarding: Add IP-in-IP GRE hierarchical topology test

Add a test that checks IP-in-IP GRE tunneling and MTU change of tunnel,
based on hierarchical topology described in file ipip_lib.sh.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP-in-IP GRE flat topology with keys test
Nir Dotan [Sun, 20 Jan 2019 06:50:55 +0000 (06:50 +0000)]
selftests: forwarding: Add IP-in-IP GRE flat topology with keys test

Add a test that checks IP-in-IP GRE tunneling and MTU change of tunnel,
where an ikey/okey pair is set. This test is based on flat topology
described in file ipip_lib.sh.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP-in-IP GRE flat topology with key test
Nir Dotan [Sun, 20 Jan 2019 06:50:54 +0000 (06:50 +0000)]
selftests: forwarding: Add IP-in-IP GRE flat topology with key test

Add a test that checks IP-in-IP GRE tunneling and MTU change of tunnel,
where a key is set. This test is based on flat topology described in file
ipip_lib.sh.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP-in-IP GRE flat topology test
Nir Dotan [Sun, 20 Jan 2019 06:50:53 +0000 (06:50 +0000)]
selftests: forwarding: Add IP-in-IP GRE flat topology test

Add a test that checks IP-in-IP GRE tunneling and MTU change of tunnel,
based on flat topology described in file ipip_lib.sh.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  selftests: forwarding: Add IP tunneling lib
Nir Dotan [Sun, 20 Jan 2019 06:50:52 +0000 (06:50 +0000)]
selftests: forwarding: Add IP tunneling lib

Add a library with helper functions, to be used in testing IP-in-IP and GRE
tunnels, both in flat and in hierarchical topologies.
The topologies used in this library cover the three scenarios of tunnels -
a tunnel with no bound device, a tunnel with a bound device in the same VRF
and a tunnel with a bound device in a different VRF.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Add GRE tunnel support for Spectrum-2
Nir Dotan [Sun, 20 Jan 2019 06:50:51 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Add GRE tunnel support for Spectrum-2

Spectrum-2 GRE tunnel implementation requires a specific underlay RIF that
points to the virtual router used for forwarding the encapsulated packet.

Add Spectrum-2 specific loopback router interface creation methods which
may create or reuse the dedicated underlay RIF.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Update tunnel decap properties
Nir Dotan [Sun, 20 Jan 2019 06:50:50 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Update tunnel decap properties

Spectrum-2 requires the egress RIF to be specified when setting tunnel decap
properties. Add a method for accessing the underlay RIF index and then use
it when setting decap properties.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Support RIF without device
Nir Dotan [Sun, 20 Jan 2019 06:50:49 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Support RIF without device

Spectrum-2 underlay RIF is merely an auxiliary RIF that points to the
virtual router used for encapsulated packets lookup. It exists only when
its overlay RIF exists but may be shared with other overlay RIFs.
Hence it is undesirable to mark any device as related to it.

Therefore allow usage of NULL device when allocating RIF.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Change mlxsw_sp_ipip_lb_ul_vr_id()
Nir Dotan [Sun, 20 Jan 2019 06:50:48 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Change mlxsw_sp_ipip_lb_ul_vr_id()

For the sake of Spectrum-2 GRE support, as ul_vr_id field is reserved for
Spectrum-2, change the mlxsw_sp_ipip_lb_ul_vr_id() implementation not to use
the reserved field.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Add underlay RIF ID support
Nir Dotan [Sun, 20 Jan 2019 06:50:47 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Add underlay RIF ID support

Spectrum-2 GRE tunnels underlay should be given not only the virtual router
information for an encapsulated packet lookup, but also an underlay RIF
object which belongs to a virtual router.

Therefore add ul_rif_id field in struct mlxsw_sp_rif_ipip_lb, to be used
later in Spectrum-2 underlay RIF implementation. This field complements
ul_vr_id field, already present and defined as reserved for Spectrum-2.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Mark RIF index as taken before creation
Nir Dotan [Sun, 20 Jan 2019 06:50:46 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Mark RIF index as taken before creation

The presence of an allocated RIF in mlxsw_sp->router->rifs[rif_index] marks
that rif_index as taken.
Set the marking of a taken RIF to happen before calling ops->create in
order to allow creation of a GRE underlay RIF, which may be allocated and
created as part of an overlay RIF creation.
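
The ordering issue can be sketched with a toy model. This is Python, not
the mlxsw code: ToyRouter, rif_create, overlay_create and the
mark_before_create flag are all hypothetical names. It only shows why a
nested allocation needs the outer index to be marked as taken first:

```python
class ToyRouter:
    """Toy stand-in for the router's RIF table; not the mlxsw structures."""

    def __init__(self, max_rifs=4):
        self.rifs = [None] * max_rifs  # a non-None slot means "index taken"

    def _free_index(self):
        for index, rif in enumerate(self.rifs):
            if rif is None:
                return index
        raise RuntimeError("no free RIF index")

    def rif_create(self, name, create_op=None, mark_before_create=True):
        index = self._free_index()
        rif = {"name": name, "index": index}
        if mark_before_create:
            self.rifs[index] = rif  # reserve the slot before create_op runs
        if create_op is not None:
            create_op(self, rif, mark_before_create)
        self.rifs[index] = rif
        return rif


def overlay_create(router, overlay, mark_before_create):
    # Creating the overlay RIF allocates its underlay RIF as a side effect.
    overlay["underlay"] = router.rif_create(
        "underlay", mark_before_create=mark_before_create)


good = ToyRouter().rif_create("overlay", overlay_create, mark_before_create=True)
bad = ToyRouter().rif_create("overlay", overlay_create, mark_before_create=False)
print(good["index"], good["underlay"]["index"])  # two distinct indices
print(bad["index"], bad["underlay"]["index"])    # same index handed out twice
```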

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Adjust loopback RIF configuration
Nir Dotan [Sun, 20 Jan 2019 06:50:42 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Adjust loopback RIF configuration

In Spectrum-2, the underlay routing table is pointed to by an underlay router
interface, in contrast to Spectrum, where only an underlay virtual router
should be set. That makes the underlay virtual router field in RITR
reserved for Spectrum-2.

Change the loopback RIF creation function to support the new underlay RIF
field, but leave this field reserved for Spectrum-1 updates.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum: Set RIF ops per ASIC type
Nir Dotan [Sun, 20 Jan 2019 06:50:41 +0000 (06:50 +0000)]
mlxsw: spectrum: Set RIF ops per ASIC type

Set RIF ops array as member of mlxsw_sp in order to control which RIF
operations callbacks are called per ASIC type. This is needed to control
per ASIC handling of loopback RIF configurations.
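
A minimal sketch of per-ASIC dispatch, in Python rather than the driver's
C. The table layout, ToySwitch, and the loopback_configure_* helpers are
hypothetical; only the idea of picking an ops set once per ASIC type comes
from the description above:

```python
def loopback_configure_sp1(rif):
    # Spectrum-1 style: name the underlay virtual router directly.
    return {"ul_vr_id": rif["ul_vr_id"]}


def loopback_configure_sp2(rif):
    # Spectrum-2 style: point at a dedicated underlay RIF instead.
    return {"ul_rif_id": rif["ul_rif_id"]}


# One ops table per ASIC type (keys are illustrative names).
RIF_OPS = {
    "spectrum1": {"loopback_configure": loopback_configure_sp1},
    "spectrum2": {"loopback_configure": loopback_configure_sp2},
}


class ToySwitch:
    def __init__(self, asic):
        self.rif_ops = RIF_OPS[asic]  # chosen once, at initialisation

    def configure_loopback(self, rif):
        # Callers do not branch on ASIC type; the ops table does it for them.
        return self.rif_ops["loopback_configure"](rif)


rif = {"ul_vr_id": 3, "ul_rif_id": 7}
print(ToySwitch("spectrum1").configure_loopback(rif))
print(ToySwitch("spectrum2").configure_loopback(rif))
```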

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: spectrum_router: Split RIF ops array for Spectrum-2 support
Nir Dotan [Sun, 20 Jan 2019 06:50:40 +0000 (06:50 +0000)]
mlxsw: spectrum_router: Split RIF ops array for Spectrum-2 support

Split RIF ops array for Spectrum-1 and Spectrum-2 callbacks in order to
support different sets of operations for loopback RIF handling, as
underlying implementation differs between the ASICs.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: reg: Add underlay egress RIF field in RTDP register
Ido Schimmel [Sun, 20 Jan 2019 06:50:39 +0000 (06:50 +0000)]
mlxsw: reg: Add underlay egress RIF field in RTDP register

In Spectrum-2 we need to specify the underlay egress router interface
when performing IP-in-IP and NVE packet decapsulation in the underlay
router.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  mlxsw: reg: Add fields to RITR - Router Interface Table Register
Nir Dotan [Sun, 20 Jan 2019 06:50:39 +0000 (06:50 +0000)]
mlxsw: reg: Add fields to RITR - Router Interface Table Register

Add fields relevant for Spectrum-2 Loopback IPinIP router interface
creation. Add additional Loopback RIF protocol value - Generic, used for
creation of an explicit underlay RIF, and also add a field named
underlay_rif used for specifying the underlay RIF of a tunnel.

Signed-off-by: Nir Dotan <nird@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost
Linus Torvalds [Sun, 20 Jan 2019 18:37:16 +0000 (07:37 +1300)]
Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio/vhost fixes and cleanups from Michael Tsirkin:
 "Fixes and cleanups all over the place"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost/scsi: Use copy_to_iter() to send control queue response
  vhost: return EINVAL if iovecs size does not match the message size
  virtio-balloon: tweak config_changed implementation
  virtio: don't allocate vqs when names[i] = NULL
  virtio_pci: use queue idx instead of array idx to set up the vq
  virtio: document virtio_config_ops restrictions
  virtio: fix virtio_config_ops description

5 years ago  Merge tag 'for-5.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave...
Linus Torvalds [Sun, 20 Jan 2019 18:35:26 +0000 (07:35 +1300)]
Merge tag 'for-5.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:
 "A handful of fixes (some of them in testing for a long time):

   - fix some test failures regarding cleanup after transaction abort

   - revert of a patch that could cause a deadlock

   - delayed iput fixes, that can help in an ENOSPC situation when there's
     low space and a lot of data to write"

* tag 'for-5.0-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: wakeup cleaner thread when adding delayed iput
  btrfs: run delayed iputs before committing
  btrfs: wait on ordered extents on abort cleanup
  btrfs: handle delayed ref head accounting cleanup in abort
  Revert "btrfs: balance dirty metadata pages in btrfs_finish_ordered_io"

5 years ago  Merge tags 'compiler-attributes-for-linus-v5.0-rc3' and 'clang-format-for-linus-v5...
Linus Torvalds [Sun, 20 Jan 2019 18:23:42 +0000 (07:23 +1300)]
Merge tags 'compiler-attributes-for-linus-v5.0-rc3' and 'clang-format-for-linus-v5.0-rc3' of git://github.com/ojeda/linux

Pull misc clang fixes from Miguel Ojeda:

  - A fix for OPTIMIZER_HIDE_VAR from Michael S Tsirkin

  - Update clang-format with the latest for_each macro list from Jason
    Gunthorpe

* tag 'compiler-attributes-for-linus-v5.0-rc3' of git://github.com/ojeda/linux:
  include/linux/compiler*.h: fix OPTIMIZER_HIDE_VAR

* tag 'clang-format-for-linus-v5.0-rc3' of git://github.com/ojeda/linux:
  clang-format: Update .clang-format with the latest for_each macro list

5 years ago  fix int_sqrt64() for very large numbers
Florian La Roche [Sat, 19 Jan 2019 15:14:50 +0000 (16:14 +0100)]
fix int_sqrt64() for very large numbers

If an input number x for int_sqrt64() has the highest bit set, then
fls64(x) is 64.  (1UL << 64) is an overflow and breaks the algorithm.

Subtracting 1 is a better guess for the initial value of m anyway and
that's also what is done in int_sqrt() implicitly [*].

[*] Note how int_sqrt() uses __fls() with two underscores, which already
    returns the proper raw bit number.

    In contrast, int_sqrt64() used fls64(), and that returns bit numbers
    illogically starting at 1, because of error handling for the "no
    bits set" case. Will points out that the bug probably is due to a
    copy-and-paste error from the regular int_sqrt() case.
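
The off-by-one can be reproduced in a small model. The sketch below is
Python rather than kernel C, so it rests on two assumptions that are not in
the commit: fls64() is emulated with int.bit_length(), and the overflowing
shift is emulated by truncating to 64 bits (in C, shifting a 64-bit value by
64 is undefined behaviour, so real results may differ). The `fixed` flag is
a hypothetical switch for comparing the two variants:

```python
import math

MASK64 = (1 << 64) - 1


def fls64(x):
    """1-based index of the highest set bit; 0 when no bit is set."""
    return x.bit_length()


def int_sqrt64(x, fixed=True):
    """Digit-by-digit integer square root of a 64-bit value."""
    if x <= 1:
        return x
    # The fix subtracts 1 so that the shift is the raw (0-based) bit number.
    top = fls64(x) - 1 if fixed else fls64(x)
    m = (1 << (top & ~1)) & MASK64  # truncation models the 64-bit overflow
    y = 0
    while m != 0:
        b = y + m
        y >>= 1
        if x >= b:
            x -= b
            y += m
        m >>= 2
    return y


big = (1 << 64) - 1  # highest bit set, so fls64(big) is 64
print(int_sqrt64(big) == math.isqrt(big))               # fixed variant
print(int_sqrt64(big, fixed=False) == math.isqrt(big))  # unfixed variant
```

With the subtraction, the initial m is the largest power of four that does
not exceed x, which is what the loop expects as its starting point.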

Signed-off-by: Florian La Roche <Florian.LaRoche@googlemail.com>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago  x86: uaccess: Inhibit speculation past access_ok() in user_access_begin()
Will Deacon [Sat, 19 Jan 2019 21:56:05 +0000 (21:56 +0000)]
x86: uaccess: Inhibit speculation past access_ok() in user_access_begin()

Commit 594cc251fdd0 ("make 'user_access_begin()' do 'access_ok()'")
makes the access_ok() check part of the user_access_begin() preceding a
series of 'unsafe' accesses.  This has the desirable effect of ensuring
that all 'unsafe' accesses have been range-checked, without having to
pick through all of the callsites to verify whether the appropriate
checking has been made.

However, the consolidated range check does not inhibit speculation, so
it is still up to the caller to ensure that they are not susceptible to
any speculative side-channel attacks for user addresses that ultimately
fail the access_ok() check.

This is an oversight, so use __uaccess_begin_nospec() to ensure that
speculation is inhibited until the access_ok() check has passed.

Reported-by: Julien Thierry <julien.thierry@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
5 years ago  Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Linus Torvalds [Sun, 20 Jan 2019 03:27:59 +0000 (15:27 +1200)]
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 fixes from Will Deacon:
 "Three arm64 fixes for -rc3.

  We've plugged a couple of nasty issues involving KASLR-enabled
  kernels, and removed a redundant #define that was introduced as part
  of the KHWASAN fixes from akpm at -rc2.

   - Fix broken kpti page-table rewrite in bizarre KASLR configuration

   - Fix module loading with KASLR

   - Remove redundant definition of ARCH_SLAB_MINALIGN"

* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
  kasan, arm64: remove redundant ARCH_SLAB_MINALIGN define
  arm64: kaslr: ensure randomized quantities are clean to the PoC
  arm64: kpti: Update arm64_kernel_use_ng_mappings() when forced on

5 years ago  Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
David S. Miller [Sun, 20 Jan 2019 00:38:12 +0000 (16:38 -0800)]
Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2019-01-20

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) Fix an out-of-bounds access in __bpf_redirect_no_mac, from Willem.

2) Fix bpf_setsockopt to reset sock dst on SO_MARK changes, from Peter.

3) Fix map in map masking to prevent out-of-bounds access under
   speculative execution, from Daniel.

4) Fix bpf_setsockopt's SO_MAX_PACING_RATE to support TCP internal
   pacing, from Yuchung.

5) Fix json writer license in bpftool, from Thomas.

6) Fix AF_XDP to check if a queue actually exists during umem
   setup, from Krzysztof.

7) Several fixes to BPF stackmap's build id handling. Another fix
   for bpftool build to account for libbfd variations wrt linking
   requirements, from Stanislav.

8) Fix BPF samples build with clang by working around missing asm
   goto, from Yonghong.

9) Fix libbpf to retry program load on signal interrupt, from Lorenz.

10) Various minor compile warning fixes in BPF code, from Mathieu.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  bpf: in __bpf_redirect_no_mac pull mac only if present
Willem de Bruijn [Wed, 16 Jan 2019 01:19:22 +0000 (20:19 -0500)]
bpf: in __bpf_redirect_no_mac pull mac only if present

Syzkaller was able to construct a packet of negative length by
redirecting from bpf_prog_test_run_skb with BPF_PROG_TYPE_LWT_XMIT:

    BUG: KASAN: slab-out-of-bounds in memcpy include/linux/string.h:345 [inline]
    BUG: KASAN: slab-out-of-bounds in skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline]
    BUG: KASAN: slab-out-of-bounds in __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395
    Read of size 4294967282 at addr ffff8801d798009c by task syz-executor2/12942

    kasan_report.cold.9+0x242/0x309 mm/kasan/report.c:412
    check_memory_region_inline mm/kasan/kasan.c:260 [inline]
    check_memory_region+0x13e/0x1b0 mm/kasan/kasan.c:267
    memcpy+0x23/0x50 mm/kasan/kasan.c:302
    memcpy include/linux/string.h:345 [inline]
    skb_copy_from_linear_data include/linux/skbuff.h:3421 [inline]
    __pskb_copy_fclone+0x2dd/0xeb0 net/core/skbuff.c:1395
    __pskb_copy include/linux/skbuff.h:1053 [inline]
    pskb_copy include/linux/skbuff.h:2904 [inline]
    skb_realloc_headroom+0xe7/0x120 net/core/skbuff.c:1539
    ipip6_tunnel_xmit net/ipv6/sit.c:965 [inline]
    sit_tunnel_xmit+0xe1b/0x30d0 net/ipv6/sit.c:1029
    __netdev_start_xmit include/linux/netdevice.h:4325 [inline]
    netdev_start_xmit include/linux/netdevice.h:4334 [inline]
    xmit_one net/core/dev.c:3219 [inline]
    dev_hard_start_xmit+0x295/0xc90 net/core/dev.c:3235
    __dev_queue_xmit+0x2f0d/0x3950 net/core/dev.c:3805
    dev_queue_xmit+0x17/0x20 net/core/dev.c:3838
    __bpf_tx_skb net/core/filter.c:2016 [inline]
    __bpf_redirect_common net/core/filter.c:2054 [inline]
    __bpf_redirect+0x5cf/0xb20 net/core/filter.c:2061
    ____bpf_clone_redirect net/core/filter.c:2094 [inline]
    bpf_clone_redirect+0x2f6/0x490 net/core/filter.c:2066
    bpf_prog_41f2bcae09cd4ac3+0xb25/0x1000

The generated test constructs a packet with mac header, network
header, skb->data pointing to network header and skb->len 0.

Redirecting to a sit0 through __bpf_redirect_no_mac pulls the
mac length, even though skb->data already is at skb->network_header.
bpf_prog_test_run_skb has already pulled it as LWT_XMIT !is_l2.

Update the offset calculation to pull only if skb->data differs
from skb->network_header, which is not true in this case.
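
The length underflow can be modelled in a few lines. This is a Python toy,
not the code in net/core/filter.c: ToySkb and the two redirect_* helpers are
hypothetical, and the 14-byte mac length is an assumption (an Ethernet
header). Under that assumption the wrapped length is 2**32 - 14, which
matches the read size in the KASAN report above:

```python
from dataclasses import dataclass

ETH_HLEN = 14          # assumed mac header length (Ethernet)
U32_MASK = 0xFFFFFFFF  # the length is modelled as an unsigned 32-bit value


@dataclass
class ToySkb:
    """Offsets are relative to the start of the buffer (toy model)."""
    data: int            # where the data pointer currently sits
    network_header: int  # offset of the network header
    mac_len: int         # recorded length of the mac header
    len: int             # bytes remaining from data onwards


def skb_pull(skb, nbytes):
    """Advance data by nbytes; len wraps like an unsigned 32-bit counter."""
    skb.data += nbytes
    skb.len = (skb.len - nbytes) & U32_MASK


def redirect_no_mac_before(skb):
    skb_pull(skb, skb.mac_len)  # unconditional pull of the mac length
    return skb.len


def redirect_no_mac_after(skb):
    mlen = skb.network_header - skb.data  # zero if data is already there
    if mlen:
        skb_pull(skb, mlen)
    return skb.len


# Packet from the report: data already at the network header, len 0.
print(redirect_no_mac_before(ToySkb(14, 14, ETH_HLEN, 0)))  # 4294967282
print(redirect_no_mac_after(ToySkb(14, 14, ETH_HLEN, 0)))   # 0
```

For a packet whose data pointer still sits at the mac header, both variants
pull the same 14 bytes, so the change only affects the already-pulled case.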

The test itself can be run only from commit 1cf1cae963c2 ("bpf:
introduce BPF_PROG_TEST_RUN command"), but the same type of packets
with skb at network header could already be built from lwt xmit hooks,
so this fix is more relevant to that commit.

Also set the mac header on redirect from LWT_XMIT, as even after this
change to __bpf_redirect_no_mac that field is expected to be set, but
is not yet in ip_finish_output2.

Fixes: 3a0af8fd61f9 ("bpf: BPF for lightweight tunnel infrastructure")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
5 years ago  Merge branch 'r8169-series-with-smaller-improvements'
David S. Miller [Sun, 20 Jan 2019 00:09:14 +0000 (16:09 -0800)]
Merge branch 'r8169-series-with-smaller-improvements'

Heiner Kallweit says:

====================
r8169: series with smaller improvements

Series with smaller improvements.

v2:
- fixed a small copy & paste error in patch 4
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  r8169: factor out getting ether_clk
Heiner Kallweit [Sat, 19 Jan 2019 21:07:34 +0000 (22:07 +0100)]
r8169: factor out getting ether_clk

rtl_init_one() is complex enough, so we better factor out getting the
ether_clk.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  r8169: replace mii_bus member with phy_device member in struct rtl8169_private
Heiner Kallweit [Sat, 19 Jan 2019 21:07:05 +0000 (22:07 +0100)]
r8169: replace mii_bus member with phy_device member in struct rtl8169_private

Accessing the phy_device indirectly via the netdevice causes a few issues:
- Accessing the phy_device when it's not attached may cause an NPE.
- If we have to access the phy_device when it's not attached we have
  to use mdiobus_get_phy() to get a reference to the phy_device.

Therefore store a phy_device reference in struct rtl8169_private directly.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago  r8169: reset chip synchronously in __rtl8169_resume
Heiner Kallweit [Sat, 19 Jan 2019 21:06:25 +0000 (22:06 +0100)]
r8169: reset chip synchronously in __rtl8169_resume

Triggering an asynchronous reset is problematic for the following
reasons, therefore reset the chip synchronously.

- The reset routine resets registers and parameters behind our back,
  which may collide with code executed after triggering the reset.

- __rtl8169_resume() is called as part of pm_runtime_get_sync() and
  callers expect that the chip is fully resumed afterwards.

In the context of this driver, triggering an asynchronous reset should be
considered an emergency procedure.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: add helpers for locking / unlocking the config registers
Heiner Kallweit [Sat, 19 Jan 2019 21:05:48 +0000 (22:05 +0100)]
r8169: add helpers for locking / unlocking the config registers

Add helpers for locking / unlocking the config registers.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: improve rtl_pcie_state_l2l3_enable
Heiner Kallweit [Sat, 19 Jan 2019 21:05:14 +0000 (22:05 +0100)]
r8169: improve rtl_pcie_state_l2l3_enable

All calls to this function have the enable parameter set to false.
So we can replace the function with a disable-only version.

v2:
- fixed copy & paste error

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: initialize task workqueue only once
Heiner Kallweit [Sat, 19 Jan 2019 21:03:49 +0000 (22:03 +0100)]
r8169: initialize task workqueue only once

It's sufficient to initialize the workqueue once, therefore remove the
additional initialization whenever rtl_open() is called.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: remove unneeded call in pcierr
Heiner Kallweit [Sat, 19 Jan 2019 21:03:13 +0000 (22:03 +0100)]
r8169: remove unneeded call in pcierr

rtl8169_hw_reset() is called as part of the reset routine which is
scheduled in the line after. So we can remove the call to
rtl8169_hw_reset() here.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agor8169: remove rtl_get_events
Heiner Kallweit [Sat, 19 Jan 2019 21:02:40 +0000 (22:02 +0100)]
r8169: remove rtl_get_events

This helper is used only once, so remove it.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovirtio_net: bulk free tx skbs
Michael S. Tsirkin [Fri, 18 Jan 2019 04:20:07 +0000 (23:20 -0500)]
virtio_net: bulk free tx skbs

Use napi_consume_skb() to get bulk free.  Note that napi_consume_skb is
safe to call in a non-napi context as long as the napi_budget flag is
correct.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet_sched: add performance counters for basic filter
Cong Wang [Fri, 18 Jan 2019 01:14:01 +0000 (17:14 -0800)]
net_sched: add performance counters for basic filter

Similar to u32 filter, it is useful to know how many times
we reach each basic filter and how many times we pass the
ematch attached to it.

Sample output:

filter protocol arp pref 49152 basic chain 0
filter protocol arp pref 49152 basic chain 0 handle 0x1  (rule hit 3 success 3)
action order 1: gact action pass
 random type none pass val 0
 index 1 ref 1 bind 1 installed 81 sec used 4 sec
Action statistics:
Sent 126 bytes 3 pkt (dropped 0, overlimits 0 requeues 0)
backlog 0b 0p requeues 0

Cc: Jamal Hadi Salim <jhs@mojatatu.com>
Cc: Jiri Pirko <jiri@resnulli.us>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge tag 'mips_fixes_5.0_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips...
Linus Torvalds [Sat, 19 Jan 2019 22:33:18 +0000 (10:33 +1200)]
Merge tag 'mips_fixes_5.0_2' of git://git./linux/kernel/git/mips/linux

Pull MIPS fixes from Paul Burton:

 - Fix IPI handling for Lantiq SoCs, which was broken by changes made
   back in v4.12.

 - Enable OF/DT serial support in ath79_defconfig to give us working
   serial by default.

 - Fix 64b builds for the Jazz platform.

 - Set up a struct device for the BCM47xx SoC to allow BCM47xx drivers
   to perform DMA again following the major DMA mapping changes made in
   v4.19.

 - Disable MSI on Cavium Octeon systems when the pcie_disable command
   line parameter introduced in v3.3 is used, in order to avoid
   inadvertently accessing PCIe controller registers despite the command
   line.

 - Fix a build failure for Cavium Octeon kernels with kexec enabled,
   introduced in v4.20.

 - Fix a regression in the behaviour of semctl/shmctl/msgctl IPC
   syscalls for kernels including n32 support but not o32 support caused
   by some cleanup in v3.19.

* tag 'mips_fixes_5.0_2' of git://git.kernel.org/pub/scm/linux/kernel/git/mips/linux:
  MIPS: OCTEON: fix kexec support
  mips: fix n32 compat_ipc_parse_version
  Disable MSI also when pcie-octeon.pcie_disable on
  MIPS: BCM47XX: Setup struct device for the SoC
  MIPS: jazz: fix 64bit build
  MIPS: ath79: Enable OF serial ports in the default config
  MIPS: lantiq: Use CP0_LEGACY_COMPARE_IRQ
  MIPS: lantiq: Fix IPI interrupt handling

5 years agoMerge tag 'devicetree-fixes-for-5.0-2' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 19 Jan 2019 22:28:46 +0000 (10:28 +1200)]
Merge tag 'devicetree-fixes-for-5.0-2' of git://git./linux/kernel/git/robh/linux

Pull Devicetree fix from Rob Herring:
 "A single build fix for powerpc due to device_node.type removal"

* tag 'devicetree-fixes-for-5.0-2' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux:
  powerpc: chrp: Use of_node_is_type to access device_type

5 years agoMerge tag 'libnvdimm-fixes-5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel...
Linus Torvalds [Sat, 19 Jan 2019 22:24:30 +0000 (10:24 +1200)]
Merge tag 'libnvdimm-fixes-5.0-rc3' of git://git./linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm fixes from Dan Williams:
 "A crash fix, a build warning fix, and miscellaneous small cleanups.

  In case anyone is looking for them, there was a regression caught by
  testing that caused two patches to be dropped from this update.  Those
  patches have been reworked and will soak for another week / re-target
  5.0-rc4.

   - Fix driver initialization crash due to the inability to report an
     'error' state for a DIMM's security capability.

   - Build warning fix for little-endian ARM64 builds

   - Fix a potential race between the EDAC driver's usage of the NFIT
     SMBIOS id for a DIMM and the driver shutdown path.

   - A small collection of one-line benign cleanups for duplicate
     variable assignments, a duplicate header include and a mis-typed
     function argument"

* tag 'libnvdimm-fixes-5.0-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  libnvdimm/security: Fix nvdimm_security_state() state request selection
  acpi/nfit: Remove duplicate set nd_set in acpi_nfit_init_interleave_set()
  acpi/nfit: Fix race accessing memdev in nfit_get_smbios_id()
  libnvdimm/dimm: Fix security capability detection for non-Intel NVDIMMs
  nfit: Mark some functions as __maybe_unused
  ACPI/nfit: delete the function to_acpi_nfit_desc
  ACPI/nfit: delete the redundant header file

5 years agoMerge tag 'linux-watchdog-5.0-rc-fixes' of git://www.linux-watchdog.org/linux-watchdog
Linus Torvalds [Sat, 19 Jan 2019 21:58:52 +0000 (09:58 +1200)]
Merge tag 'linux-watchdog-5.0-rc-fixes' of git://linux-watchdog.org/linux-watchdog

Pull watchdog fixes from Wim Van Sebroeck:

 - mt7621_wdt/rt2880_wdt: Fix compilation problem

 - tqmx86: Fix a couple IS_ERR() vs NULL bugs

* tag 'linux-watchdog-5.0-rc-fixes' of git://www.linux-watchdog.org/linux-watchdog:
  watchdog: tqmx86: Fix a couple IS_ERR() vs NULL bugs
  watchdog: mt7621_wdt/rt2880_wdt: Fix compilation problem

5 years agoMerge tag 'nfs-for-5.0-2' of git://git.linux-nfs.org/projects/anna/linux-nfs
Linus Torvalds [Sat, 19 Jan 2019 21:27:38 +0000 (09:27 +1200)]
Merge tag 'nfs-for-5.0-2' of git://git.linux-nfs.org/projects/anna/linux-nfs

Pull NFS client fixes from Anna Schumaker:
 "These are mostly fixes for SUNRPC bugs, with a single v4.2
  copy_file_range() fix mixed in.

  Stable bugfixes:
   - Fix TCP receive code on archs with flush_dcache_page()

  Other bugfixes:
   - Fix error code in rpcrdma_buffer_create()
   - Fix a double free in rpcrdma_send_ctxs_create()
   - Fix kernel BUG at kernel/cred.c:825
   - Fix unnecessary retry in nfs42_proc_copy_file_range()
   - Ensure rq_bytes_sent is reset before request transmission
   - Ensure we respect the RPCSEC_GSS sequence number limit
   - Address Kerberos performance/behavior regression"

* tag 'nfs-for-5.0-2' of git://git.linux-nfs.org/projects/anna/linux-nfs:
  SUNRPC: Address Kerberos performance/behavior regression
  SUNRPC: Ensure we respect the RPCSEC_GSS sequence number limit
  SUNRPC: Ensure rq_bytes_sent is reset before request transmission
  NFSv4.2 fix unnecessary retry in nfs4_copy_file_range
  sunrpc: kernel BUG at kernel/cred.c:825!
  SUNRPC: Fix TCP receive code on archs with flush_dcache_page()
  xprtrdma: Double free in rpcrdma_sendctxs_create()
  xprtrdma: Fix error code in rpcrdma_buffer_create()

5 years agoMerge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Linus Torvalds [Sat, 19 Jan 2019 21:15:04 +0000 (09:15 +1200)]
Merge tag 'scsi-fixes' of git://git./linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "A set of 17 fixes. Most of these are minor or trivial.

  The one fix that may be serious is the isci one: the bug can cause hba
  parameters to be set from uninitialized memory. I don't think it's
  exploitable, but you never know"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: cxgb4i: add wait_for_completion()
  scsi: qla1280: set 64bit coherent mask
  scsi: ufs: Fix geometry descriptor size
  scsi: megaraid_sas: Retry reads of outbound_intr_status reg
  scsi: qedi: Add ep_state for login completion on un-reachable targets
  scsi: ufs: Fix system suspend status
  scsi: qla2xxx: Use correct number of vectors for online CPUs
  scsi: hisi_sas: Set protection parameters prior to adding SCSI host
  scsi: tcmu: avoid cmd/qfull timers updated whenever a new cmd comes
  scsi: isci: initialize shost fully before calling scsi_add_host()
  scsi: lpfc: lpfc_sli: Mark expected switch fall-throughs
  scsi: smartpqi_init: fix boolean expression in pqi_device_remove_start
  scsi: core: Synchronize request queue PM status only on successful resume
  scsi: pm80xx: reduce indentation
  scsi: qla4xxx: check return code of qla4xxx_copy_from_fwddb_param
  scsi: megaraid_sas: correct an info message
  scsi: target/iscsi: fix error msg typo when create lio_qr_cache failed
  scsi: sd: Fix cache_type_store()

5 years agoMerge tag 'for-linus-20190118' of git://git.kernel.dk/linux-block
Linus Torvalds [Sat, 19 Jan 2019 21:12:50 +0000 (09:12 +1200)]
Merge tag 'for-linus-20190118' of git://git.kernel.dk/linux-block

Pull block fixes from Jens Axboe:

 - block size setting fixes for loop/nbd (Jan Kara)

 - md bio_alloc_mddev() cleanup (Marcos)

 - Ensure we don't lose the REQ_INTEGRITY flag (Ming)

 - Two NVMe fixes by way of Christoph:
    - Fix NVMe IRQ calculation (Ming)
    - Uninitialized variable in nvmet-tcp (Sagi)

 - BFQ comment fix (Paolo)

 - License cleanup for recently added blk-mq-debugfs-zoned (Thomas)

* tag 'for-linus-20190118' of git://git.kernel.dk/linux-block:
  block: Cleanup license notice
  nvme-pci: fix nvme_setup_irqs()
  nvmet-tcp: fix uninitialized variable access
  block: don't lose track of REQ_INTEGRITY flag
  blockdev: Fix livelocks on loop device
  nbd: Use set_blocksize() to set device blocksize
  md: Make bio_alloc_mddev use bio_alloc_bioset
  block, bfq: fix comments on __bfq_deactivate_entity

5 years agonet: sock: do not set sk_cookie in sk_clone_lock()
Yafang Shao [Fri, 18 Jan 2019 05:00:51 +0000 (13:00 +0800)]
net: sock: do not set sk_cookie in sk_clone_lock()

The only call site of sk_clone_lock is in inet_csk_clone_lock,
and sk_cookie will be set there.
So we don't need to set sk_cookie in sk_clone_lock().

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn: remove unneeded semicolon
YueHaibing [Fri, 18 Jan 2019 03:05:11 +0000 (11:05 +0800)]
isdn: remove unneeded semicolon

remove unneeded semicolon

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: usb: rtl8150: remove set but not used variable 'rx_stat'
Yue Haibing [Fri, 18 Jan 2019 02:06:49 +0000 (02:06 +0000)]
net: usb: rtl8150: remove set but not used variable 'rx_stat'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/usb/rtl8150.c: In function 'read_bulk_callback':
drivers/net/usb/rtl8150.c:391:6: warning:
 variable 'rx_stat' set but not used [-Wunused-but-set-variable]

Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'dpaa2-eth-add-debugfs-statistics'
David S. Miller [Sat, 19 Jan 2019 18:28:43 +0000 (10:28 -0800)]
Merge branch 'dpaa2-eth-add-debugfs-statistics'

Ioana Ciornei says:

====================
dpaa2-eth: add debugfs statistics

This patch set exports detailed driver counters through debugfs.
Counters which are already available through ethtool are now
presented in a structured manner (per-core, per-FQ and
per-channel) in debugfs.

The first patch is changing the dpaa2_eth_queue_count into a macro
(in order to avoid a warning) while the second one is adding the
debugfs support.

Changes in v2:
  - remove the _exit annotation of dpaa2_eth_dbg_exit
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: add debugfs statistics
Ioana Radulescu [Fri, 18 Jan 2019 16:16:00 +0000 (16:16 +0000)]
dpaa2-eth: add debugfs statistics

Export detailed driver counters through debugfs.

Statistics already available in ethtool are presented in a
structured manner. Includes per-core, per-FQ and per-channel statistics.

Also transition from module_fsl_mc_driver to explicit module_init/exit
in order to create the debugfs directory besides registering the driver.

Signed-off-by: Ioana Radulescu <ruxandra.radulescu@nxp.com>
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agodpaa2-eth: transform dpaa2_eth_queue_count into a macro
Ioana Ciornei [Fri, 18 Jan 2019 16:15:59 +0000 (16:15 +0000)]
dpaa2-eth: transform dpaa2_eth_queue_count into a macro

Transform dpaa2_eth_queue_count into a macro to follow the
convention used by dpaa2_eth_fs_count and other functions.

Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoclang-format: Update .clang-format with the latest for_each macro list
Jason Gunthorpe [Fri, 18 Jan 2019 22:57:04 +0000 (22:57 +0000)]
clang-format: Update .clang-format with the latest for_each macro list

Re-run the shell fragment that generated the original list. In particular
this adds the missing xarray related functions.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
5 years agoMerge branch 'net-use-strict-checks-in-doit-handlers'
David S. Miller [Sat, 19 Jan 2019 18:09:59 +0000 (10:09 -0800)]
Merge branch 'net-use-strict-checks-in-doit-handlers'

Jakub Kicinski says:

====================
net: use strict checks in doit handlers

This series extends strict argument checking to doit handlers
of the GET* nature.  This is a bit tricky since the strict checking
flag has already been released.

iproute2 did not have a release with strict checks enabled,
and it will only need a minor one-liner to pass strict checks
after all the work that DaveA has already done.

Big thanks to Dave Ahern for help and guidance.

v2:
 - remove unnecessary check in patch 5 (Nicolas);
 - add patch 7 (DaveA);
 - improve messages in patch 8 (DaveA).
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mpls: netconf: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:26 +0000 (10:46 -0800)]
net: mpls: netconf: perform strict checks also for doit handlers

Make RTM_GETNETCONF's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: mpls: route: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:25 +0000 (10:46 -0800)]
net: mpls: route: perform strict checks also for doit handlers

Make RTM_GETROUTE's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv6: route: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:24 +0000 (10:46 -0800)]
net: ipv6: route: perform strict checks also for doit handlers

Make RTM_GETROUTE's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv6: addrlabel: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:23 +0000 (10:46 -0800)]
net: ipv6: addrlabel: perform strict checks also for doit handlers

Make RTM_GETADDRLABEL's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv6: netconf: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:22 +0000 (10:46 -0800)]
net: ipv6: netconf: perform strict checks also for doit handlers

Make RTM_GETNETCONF's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv6: addr: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:21 +0000 (10:46 -0800)]
net: ipv6: addr: perform strict checks also for doit handlers

Make RTM_GETADDR's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv4: ipmr: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:20 +0000 (10:46 -0800)]
net: ipv4: ipmr: perform strict checks also for doit handlers

Make RTM_GETROUTE's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

v2: - improve extack messages (DaveA).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv4: route: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:19 +0000 (10:46 -0800)]
net: ipv4: route: perform strict checks also for doit handlers

Make RTM_GETROUTE's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

v2: - new patch (DaveA).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: ipv4: netconf: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:18 +0000 (10:46 -0800)]
net: ipv4: netconf: perform strict checks also for doit handlers

Make RTM_GETNETCONF's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: namespace: perform strict checks also for doit handlers
Jakub Kicinski [Fri, 18 Jan 2019 18:46:17 +0000 (10:46 -0800)]
net: namespace: perform strict checks also for doit handlers

Make RTM_GETNSID's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

v2: - don't check size >= sizeof(struct rtgenmsg) (Nicolas).

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agortnetlink: ifinfo: perform strict checks also for doit handler
Jakub Kicinski [Fri, 18 Jan 2019 18:46:16 +0000 (10:46 -0800)]
rtnetlink: ifinfo: perform strict checks also for doit handler

Make RTM_GETLINK's doit handler use strict checks when
NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agortnetlink: stats: reject requests for unknown stats
Jakub Kicinski [Fri, 18 Jan 2019 18:46:15 +0000 (10:46 -0800)]
rtnetlink: stats: reject requests for unknown stats

In the spirit of strict checks reject requests of stats the kernel
does not support when NETLINK_F_STRICT_CHK is set.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agortnetlink: stats: validate attributes in get as well as dumps
Jakub Kicinski [Fri, 18 Jan 2019 18:46:14 +0000 (10:46 -0800)]
rtnetlink: stats: validate attributes in get as well as dumps

Make sure NETLINK_GET_STRICT_CHK influences both GETSTATS doit
as well as the dump.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: netlink: add helper to retrieve NETLINK_F_STRICT_CHK
Jakub Kicinski [Fri, 18 Jan 2019 18:46:13 +0000 (10:46 -0800)]
net: netlink: add helper to retrieve NETLINK_F_STRICT_CHK

Dumps can read state of the NETLINK_F_STRICT_CHK flag from
a field in the callback structure.  For non-dump GET requests
we need a way to access the state of that flag from a socket.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovirtio-net: per-queue RPS config
Willem de Bruijn [Fri, 18 Jan 2019 01:08:53 +0000 (20:08 -0500)]
virtio-net: per-queue RPS config

On multiqueue network devices, RPS maps are configured independently
for each receive queue through /sys/class/net/$DEV/queues/rx-*.

On virtio-net currently all packets use the map from rx-0, because the
real rx queue is not known at time of map lookup by get_rps_cpu.

Call skb_record_rx_queue in the driver rx path to make lookup work.

Recording the receive queue has ramifications beyond RPS, such as in
sticky load balancing decisions for sockets (skb_tx_hash) and XPS.
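
The effect of recording the queue can be sketched in plain userspace C. This is only a model, assuming the convention that the queue index is stored plus one so that zero means "nothing recorded". The names `pkt`, `record_rx_queue` and `pick_rps_map` are illustrative stand-ins, not the kernel's definitions, and the out-of-range fallback is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for the one packet field that matters here. */
struct pkt {
    uint16_t queue_mapping; /* 0 = no rx queue recorded, else index + 1 */
};

/* Models what recording the rx queue does: store index + 1. */
void record_rx_queue(struct pkt *p, uint16_t rx_queue)
{
    p->queue_mapping = (uint16_t)(rx_queue + 1);
}

bool rx_queue_recorded(const struct pkt *p)
{
    return p->queue_mapping != 0;
}

/* Models the map lookup: without a recorded queue, rx-0 is used. */
uint16_t pick_rps_map(const struct pkt *p, uint16_t num_rx_queues)
{
    assert(num_rx_queues > 0);
    if (!rx_queue_recorded(p))
        return 0;
    uint16_t index = (uint16_t)(p->queue_mapping - 1);
    /* Simplification: treat an out-of-range index like "not recorded". */
    return index < num_rx_queues ? index : 0;
}
```

In this model, a packet that never passes through `record_rx_queue` always selects the map of queue 0, which matches the behaviour described above for virtio-net before the change.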

Reported-by: Mark Hlady <mhlady@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: phy: phy driver features are mandatory
Camelia Groza [Thu, 17 Jan 2019 12:22:36 +0000 (14:22 +0200)]
net: phy: phy driver features are mandatory

Since phy driver features became a link_mode bitmap, phy drivers that
don't have a list of features configured will cause the kernel to crash
when probed.

Prevent the phy driver from registering if the features field is missing.
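
A rough userspace sketch of that kind of registration guard follows. The struct, the function name and the choice of `-EINVAL` are assumptions made for illustration and are not taken from the patch itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, trimmed-down model of a PHY driver description. */
struct phy_driver_model {
    const char *name;
    const unsigned long *features; /* link-mode bitmap, NULL if missing */
};

/* Refuse registration early instead of crashing later at probe time. */
int register_phy_driver_model(const struct phy_driver_model *drv)
{
    assert(drv != NULL);
    if (drv->features == NULL)
        return -EINVAL; /* assumed error code for this sketch */
    return 0;
}
```

Failing at registration keeps the error close to its cause, whereas a crash at probe time only shows up once matching hardware is present.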

Fixes: 719655a14971 ("net: phy: Replace phy driver features u32 with link_mode bitmap")
Reported-by: Scott Wood <oss@buserror.net>
Signed-off-by: Camelia Groza <camelia.groza@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn: avm: Fix string plus integer warning from Clang
Nathan Chancellor [Thu, 10 Jan 2019 05:41:08 +0000 (22:41 -0700)]
isdn: avm: Fix string plus integer warning from Clang

A recent commit in Clang expanded the -Wstring-plus-int warning, showing
some odd behavior in this file.

drivers/isdn/hardware/avm/b1.c:426:30: warning: adding 'int' to a string does not append to the string [-Wstring-plus-int]
                cinfo->version[j] = "\0\0" + 1;
                                    ~~~~~~~^~~
drivers/isdn/hardware/avm/b1.c:426:30: note: use array indexing to silence this warning
                cinfo->version[j] = "\0\0" + 1;
                                           ^
                                    &      [  ]
1 warning generated.

This is equivalent to just "\0". Nick pointed out that it is smarter to
use "" instead of "\0" because "" is used elsewhere in the kernel and
can be deduplicated at the linking stage.
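
The equivalence can be checked with a small standalone program fragment. The function names are made up for this sketch, and the flagged expression is written in the array-indexing form that Clang suggests, so the fragment itself compiles without the warning.

```c
#include <assert.h>
#include <string.h>

/* Same value as "\0\0" + 1: a pointer to the second NUL of the literal. */
const char *flagged_form(void)
{
    return &"\0\0"[1];
}

/* The replacement: an ordinary empty string literal. */
const char *suggested_form(void)
{
    return "";
}

/* Both pointers refer to a string of length zero. */
void check_equivalence(void)
{
    assert(strlen(flagged_form()) == 0);
    assert(strcmp(flagged_form(), suggested_form()) == 0);
}
```

Adding an integer to a string literal is pointer arithmetic, not concatenation, which is why the result is simply an empty string.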

Link: https://github.com/ClangBuiltLinux/linux/issues/309
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agosch_api: Change signature of qdisc_tree_reduce_backlog() to use ints
Toke Høiland-Jørgensen [Wed, 9 Jan 2019 16:10:57 +0000 (17:10 +0100)]
sch_api: Change signature of qdisc_tree_reduce_backlog() to use ints

There are now several places where qdisc_tree_reduce_backlog() is called
with a negative number of packets (to signal an increase in number of
packets in the queue). Rather than rely on overflow behaviour, change the
function signature to use signed integers to communicate this usage to
people reading the code.
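
To illustrate the difference, here is a minimal userspace sketch. `queue_model` and both functions are hypothetical and only mirror the idea of the signature change, not the kernel function itself.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical queue-length counter standing in for a qdisc's qlen. */
struct queue_model {
    unsigned int qlen;
};

/* Old style: unsigned parameter, callers pass (unsigned)-1 and rely on
 * modular wraparound to get an increase. */
void reduce_backlog_unsigned(struct queue_model *q, unsigned int n)
{
    q->qlen -= n;
}

/* New style: signed parameter, a negative n visibly means "increase". */
void reduce_backlog_signed(struct queue_model *q, int n)
{
    assert(n <= 0 || q->qlen >= (unsigned int)n); /* no underflow */
    if (n < 0)
        q->qlen += 0u - (unsigned int)n; /* magnitude of n */
    else
        q->qlen -= (unsigned int)n;
}
```

Both variants end up with the same counter value. The signed version only makes the intent readable at the call site.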

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agopowerpc: chrp: Use of_node_is_type to access device_type
Rob Herring [Fri, 18 Jan 2019 14:12:10 +0000 (08:12 -0600)]
powerpc: chrp: Use of_node_is_type to access device_type

Commit 8ce5f8415753 ("of: Remove struct device_node.type pointer")
removed struct device_node.type pointer, but the conversion to use
of_node_is_type() accessor was missed in chrp_init_IRQ().

Fixes: 8ce5f8415753 ("of: Remove struct device_node.type pointer")
Reported-by: kbuild test robot <lkp@intel.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: linuxppc-dev@lists.ozlabs.org
Acked-by: Michael Ellerman <mpe@ellerman.id.au>
Signed-off-by: Rob Herring <robh@kernel.org>
5 years agoMerge tag 'mlx5-fixes-2019-01-18' of git://git.kernel.org/pub/scm/linux/kernel/git...
David S. Miller [Sat, 19 Jan 2019 02:23:23 +0000 (18:23 -0800)]
Merge tag 'mlx5-fixes-2019-01-18' of git://git./linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
Mellanox, mlx5 fixes 2019-01-18

This series introduces some fixes to mlx5 driver.

Please pull and let me know if there is any problem.

For -stable v4.18
('net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames')

The patch doesn't apply cleanly to 4.18.y, but it is very simple to
resolve. What should the procedure be here?
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx5e: Fix cb_ident duplicate in indirect block register
Eli Britstein [Wed, 19 Dec 2018 05:36:51 +0000 (07:36 +0200)]
net/mlx5e: Fix cb_ident duplicate in indirect block register

Previously the identifier used for indirect block callback registry
and for block rule cb registry (when done via indirect blocks) was the
pointer to the tunnel netdev we were interested in receiving updates on.
This worked fine if a single PF existed that registered one callback for
the tunnel netdev of interest. However, if multiple PFs are in place then
the 2nd PF tries to register with the same tunnel netdev identifier. This
leads to EEXIST errors and/or incorrect cb deletions.

Prevent this conflict by using the rpriv pointer as the identifier for
netdev indirect block cb registry, allowing each PF to register a unique
callback per tunnel netdev. For block cb registry, the same PF may
register multiple cbs to the same block if using TC shared blocks.
Instead of the rpriv, use the pointer to the allocated indr_priv data as
the identifier here. This means that there can be a unique block callback
for each PF/tunnel netdev combo.
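
The conflict can be modelled with a toy registry keyed by an opaque identifier pointer. Everything below (`cb_registry`, `cb_register`, the fixed capacity) is a simplified assumption for illustration. The real registries also take the callback and device into account.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_CBS 8

/* Hypothetical registry that allows each identifier only once. */
struct cb_registry {
    const void *ident[MAX_CBS];
    size_t count;
};

int cb_register(struct cb_registry *reg, const void *ident)
{
    assert(reg != NULL && ident != NULL);
    for (size_t i = 0; i < reg->count; i++)
        if (reg->ident[i] == ident)
            return -EEXIST; /* identifier already taken */
    if (reg->count == MAX_CBS)
        return -ENOSPC;
    reg->ident[reg->count++] = ident;
    return 0;
}
```

If two PFs both pass the address of the same tunnel netdev, the second call fails with `-EEXIST`. If each passes a pointer to its own private data, both registrations succeed.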

Fixes: f5bc2c5de101 ("net/mlx5e: Support TC indirect block notifications for eswitch uplink reprs")
Signed-off-by: Eli Britstein <elibr@mellanox.com>
Reviewed-by: Oz Shlomo <ozsh@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Fix wrong (zero) TX drop counter indication for representor
Tariq Toukan [Thu, 8 Nov 2018 10:06:53 +0000 (12:06 +0200)]
net/mlx5e: Fix wrong (zero) TX drop counter indication for representor

For representors, the TX dropped counter is not folded from the
per-ring counters. Fix it.

Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Fix wrong error code return on FEC query failure
Shay Agroskin [Sun, 9 Dec 2018 10:00:13 +0000 (12:00 +0200)]
net/mlx5e: Fix wrong error code return on FEC query failure

Advertised and configured FEC query failure resulted in printing
wrong error code.

Fixes: 6cfa94605091 ("net/mlx5e: Ethtool driver callback for query/set FEC policy")
Signed-off-by: Shay Agroskin <shayag@mellanox.com>
Reported-by: Or Gerlitz <ogerlitz@mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agonet/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames
Cong Wang [Tue, 4 Dec 2018 06:14:04 +0000 (22:14 -0800)]
net/mlx5e: Force CHECKSUM_UNNECESSARY for short ethernet frames

When an ethernet frame is padded to meet the minimum ethernet frame
size, the padding octets are not covered by the hardware checksum.
Fortunately the padding octets are usually zeros, which don't affect the
checksum. However, we have a switch which pads with non-zero octets, and
this repeatedly triggers kernel hardware checksum faults.

Prior to:
commit '88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE ...")'
skb checksum was forced to be CHECKSUM_NONE when padding is detected.
After it, we need to keep skb->csum updated, like what we do for RXFCS.
However, fixing up CHECKSUM_COMPLETE requires verifying and parsing IP
headers, which is not worth the effort as the packets are so small that
CHECKSUM_COMPLETE can't save anything.
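
Why zero padding is harmless while other padding generally is not can be seen with a plain 16-bit ones' complement sum. The helper below is a generic RFC 1071 style sketch written for this illustration, not the driver's or the kernel's checksum code, and it is only meant for frame-sized buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 16-bit ones' complement sum over a buffer (not inverted). */
uint16_t csum16(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2)
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    if (len & 1)
        sum += (uint32_t)(data[len - 1] << 8); /* odd trailing byte */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);    /* fold carries back in */
    return (uint16_t)sum;
}
```

Appending zero octets adds nothing to the sum, so a checksum computed without the padding still matches. Padding with other values usually shifts the sum, and a sum computed over the unpadded data then no longer agrees with one taken over the whole frame.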

Fixes: 88078d98d1bb ("net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends")
Cc: Eric Dumazet <edumazet@google.com>
Cc: Tariq Toukan <tariqt@mellanox.com>
Cc: Nikola Ciprich <nikola.ciprich@linuxbox.cz>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
5 years agotools: bpftool: Cleanup license mess
Thomas Gleixner [Thu, 17 Jan 2019 23:14:24 +0000 (00:14 +0100)]
tools: bpftool: Cleanup license mess

Precise and non-ambiguous license information is important. The recent
relicensing of the bpftools introduced a license conflict.

The files now have:

     SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)

and

     * This program is free software; you can redistribute it and/or
     * modify it under the terms of the GNU General Public License
     * as published by the Free Software Foundation; either version
     * 2 of the License, or (at your option) any later version

Amazingly about 20 people acked that change and neither they nor the
committer noticed. Oh well.

Digging deeper: The files were imported from the iproute2 repository with
the GPL V2 or later boiler plate text in commit b66e907cfee2 ("tools:
bpftool: copy JSON writer from iproute2 repository")

Looking at the iproute2 repository at

  git://git.kernel.org/pub/scm/network/iproute2/iproute2.git

the following commit is the equivalent:

  commit d9d8c839 ("json_writer: add SPDX Identifier (GPL-2/BSD-2)")

That commit explicitly removes the boiler plate and relicenses the code
under GPL-2.0-only and BSD-2-Clause. As Steven wrote the original code and
also the relicensing commit, it's assumed that the relicensing was intended
to do exactly that. Just the kernel side update failed to remove the boiler
plate. Do so now.

Fixes: 907b22365115 ("tools: bpftool: dual license all files")
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Jakub Kicinski <jakub.kicinski@netronome.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Sean Young <sean@mess.org>
Cc: Jiri Benc <jbenc@redhat.com>
Cc: David Calavera <david.calavera@gmail.com>
Cc: Andrey Ignatov <rdna@fb.com>
Cc: Joe Stringer <joe@wand.net.nz>
Cc: David Ahern <dsahern@gmail.com>
Cc: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Cc: Petar Penkov <ppenkov@stanford.edu>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Quentin Monnet <quentin.monnet@netronome.com>
CC: okash.khawaja@gmail.com
Cc: netdev@vger.kernel.org
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years ago bpf: fix inner map masking to prevent oob under speculation
Daniel Borkmann [Thu, 17 Jan 2019 15:34:45 +0000 (16:34 +0100)]
bpf: fix inner map masking to prevent oob under speculation

During review I noticed that the inner meta map setup for map in
map is buggy in that it does not propagate all needed data
from the reference map which the verifier later accesses.

In particular, one such case is index masking to prevent out of
bounds access under speculative execution: the map's
unpriv_array/index_mask fields are not propagated. Fix this such
that the verifier generates the correct code for inlined
lookups in case of unprivileged use.
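
To illustrate why the mask matters, here is a small model of an inlined array lookup. It is a sketch under stated assumptions, not the verifier's code: the function names are hypothetical, the mask is assumed to be the entry count rounded up to a power of two minus one, and a mispredict flag stands in for a CPU that speculatively ignores the bounds check.

```python
def roundup_pow_of_two(n: int) -> int:
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def index_mask_for(max_entries: int) -> int:
    # Assumption: the mask covers the array size rounded up to a power of two.
    return roundup_pow_of_two(max_entries) - 1


def lookup_offset(index, max_entries, elem_size=8, masked=True, mispredict=False):
    """Return the byte offset the lookup would touch, or None for a miss.

    mispredict=True imitates speculation past the bounds check.
    """
    if index >= max_entries and not mispredict:
        return None  # architectural path: out-of-range lookup yields NULL
    if masked:
        index &= index_mask_for(max_entries)  # clamp even under speculation
    return index * elem_size
```

For a one-entry map the mask is 0, which corresponds to the `(u32) r0 &= (u32) 0` instruction in the dumps of this commit message: with masking, a speculated index of 5 still resolves to offset 0, while without it the access lands 40 bytes past the start of the array.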

Before patch (test_verifier's 'map in map access' dump):

  # bpftool prog dump xla id 3
     0: (62) *(u32 *)(r10 -4) = 0
     1: (bf) r2 = r10
     2: (07) r2 += -4
     3: (18) r1 = map[id:4]
     5: (07) r1 += 272                |
     6: (61) r0 = *(u32 *)(r2 +0)     |
     7: (35) if r0 >= 0x1 goto pc+6   | Inlined map in map lookup
     8: (54) (u32) r0 &= (u32) 0      | with index masking for
     9: (67) r0 <<= 3                 | map->unpriv_array.
    10: (0f) r0 += r1                 |
    11: (79) r0 = *(u64 *)(r0 +0)     |
    12: (15) if r0 == 0x0 goto pc+1   |
    13: (05) goto pc+1                |
    14: (b7) r0 = 0                   |
    15: (15) if r0 == 0x0 goto pc+11
    16: (62) *(u32 *)(r10 -4) = 0
    17: (bf) r2 = r10
    18: (07) r2 += -4
    19: (bf) r1 = r0
    20: (07) r1 += 272                |
    21: (61) r0 = *(u32 *)(r2 +0)     | Index masking missing (!)
    22: (35) if r0 >= 0x1 goto pc+3   | for inner map despite
    23: (67) r0 <<= 3                 | map->unpriv_array set.
    24: (0f) r0 += r1                 |
    25: (05) goto pc+1                |
    26: (b7) r0 = 0                   |
    27: (b7) r0 = 0
    28: (95) exit

After patch:

  # bpftool prog dump xla id 1
     0: (62) *(u32 *)(r10 -4) = 0
     1: (bf) r2 = r10
     2: (07) r2 += -4
     3: (18) r1 = map[id:2]
     5: (07) r1 += 272                |
     6: (61) r0 = *(u32 *)(r2 +0)     |
     7: (35) if r0 >= 0x1 goto pc+6   | Same inlined map in map lookup
     8: (54) (u32) r0 &= (u32) 0      | with index masking due to
     9: (67) r0 <<= 3                 | map->unpriv_array.
    10: (0f) r0 += r1                 |
    11: (79) r0 = *(u64 *)(r0 +0)     |
    12: (15) if r0 == 0x0 goto pc+1   |
    13: (05) goto pc+1                |
    14: (b7) r0 = 0                   |
    15: (15) if r0 == 0x0 goto pc+12
    16: (62) *(u32 *)(r10 -4) = 0
    17: (bf) r2 = r10
    18: (07) r2 += -4
    19: (bf) r1 = r0
    20: (07) r1 += 272                |
    21: (61) r0 = *(u32 *)(r2 +0)     |
    22: (35) if r0 >= 0x1 goto pc+4   | Now fixed inlined inner map
    23: (54) (u32) r0 &= (u32) 0      | lookup with proper index masking
    24: (67) r0 <<= 3                 | for map->unpriv_array.
    25: (0f) r0 += r1                 |
    26: (05) goto pc+1                |
    27: (b7) r0 = 0                   |
    28: (b7) r0 = 0
    29: (95) exit

Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years ago bpf: pull in pkt_sched.h header for tooling to fix bpftool build
Daniel Borkmann [Thu, 17 Jan 2019 15:15:09 +0000 (16:15 +0100)]
bpf: pull in pkt_sched.h header for tooling to fix bpftool build

Dan reported that bpftool does not compile for him:

  $ make tools/bpf
    DESCEND  bpf

  Auto-detecting system features:
  ..                        libbfd: [ on  ]
  ..        disassembler-four-args: [ OFF ]

    DESCEND  bpftool

  Auto-detecting system features:
  ..                        libbfd: [ on  ]
  ..        disassembler-four-args: [ OFF ]

    CC       /opt/linux.git/tools/bpf/bpftool/net.o
  In file included from /opt/linux.git/tools/include/uapi/linux/pkt_cls.h:6:0,
                 from /opt/linux.git/tools/include/uapi/linux/tc_act/tc_bpf.h:14,
                 from net.c:13:
  net.c: In function 'show_dev_tc_bpf':
  net.c:164:21: error: 'TC_H_CLSACT' undeclared (first use in this function)
    handle = TC_H_MAKE(TC_H_CLSACT, TC_H_MIN_INGRESS);
  [...]

Fix it by importing pkt_sched.h header copy into tooling
infrastructure.
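
For reference, the failing expression only combines two constants that live in pkt_sched.h. The sketch below reproduces the arithmetic in Python; the numeric values are transcribed from the uapi header as best recalled and should be checked against the header itself before being relied upon.

```python
# Values believed to match include/uapi/linux/pkt_sched.h (an assumption
# worth verifying against the header in the tree being built).
TC_H_MAJ_MASK = 0xFFFF0000
TC_H_MIN_MASK = 0x0000FFFF
TC_H_INGRESS = 0xFFFFFFF1
TC_H_CLSACT = TC_H_INGRESS
TC_H_MIN_INGRESS = 0xFFF2
TC_H_MIN_EGRESS = 0xFFF3


def tc_h_make(major: int, minor: int) -> int:
    """Python rendering of the TC_H_MAKE(maj, min) macro."""
    return (major & TC_H_MAJ_MASK) | (minor & TC_H_MIN_MASK)


# The handle that net.c builds for the clsact ingress hook.
ingress_handle = tc_h_make(TC_H_CLSACT, TC_H_MIN_INGRESS)
```

Without the header copy in the tooling include path, none of these names are visible to net.c, which is what the compiler error above reports.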

Fixes: 49a249c38726 ("tools/bpftool: copy a few net uapi headers to tools directory")
Fixes: f6f3bac08ff9 ("tools/bpf: bpftool: add net support")
Reported-by: Dan Gilson <dan_gilson@yahoo.com>
Reference: https://bugzilla.kernel.org/show_bug.cgi?id=202315
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
5 years ago Merge branch 'mlxsw-fixes'
David S. Miller [Fri, 18 Jan 2019 23:12:16 +0000 (15:12 -0800)]
Merge branch 'mlxsw-fixes'

Ido Schimmel says:

====================
mlxsw: Various fixes

This patchset contains small fixes in mlxsw and one fix in the bridge
driver.

Patches #1-#4 perform small adjustments in PCI and FID code following
recent tests that were performed on the Spectrum-2 ASIC.

Patch #5 fixes the bridge driver to mark FDB entries that were added by
user as such. Otherwise, these entries will be ignored by underlying
switch drivers.

Patch #6 fixes a long standing issue in mlxsw where the driver
incorrectly programmed static FDB entries as both static and sticky.

Patches #7-#8 add test cases for the above-mentioned bugs.

Please consider patches #1, #2 and #4 for stable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago selftests: forwarding: Add a test case for externally learned FDB entries
Ido Schimmel [Fri, 18 Jan 2019 15:58:03 +0000 (15:58 +0000)]
selftests: forwarding: Add a test case for externally learned FDB entries

Test that externally learned FDB entries can roam, but not age out.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago selftests: mlxsw: Test FDB offload indication
Ido Schimmel [Fri, 18 Jan 2019 15:58:02 +0000 (15:58 +0000)]
selftests: mlxsw: Test FDB offload indication

Test that externally learned FDB entries added from user space are
marked as offloaded.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum_switchdev: Do not treat static FDB entries as sticky
Ido Schimmel [Fri, 18 Jan 2019 15:58:01 +0000 (15:58 +0000)]
mlxsw: spectrum_switchdev: Do not treat static FDB entries as sticky

The driver currently treats static FDB entries as both static and
sticky. This is incorrect and prevents such entries from being roamed to
a different port via learning.

Fix this by configuring static entries with ageing disabled and roaming
enabled.
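
The intended semantics (static means no ageing, sticky means no roaming) can be modelled in a few lines. This is a toy sketch, not driver code: the class FdbModel, its methods, and the port names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FdbEntry:
    port: str
    static: bool = False  # static: never aged out
    sticky: bool = False  # sticky: never moved to another port
    idle: int = 0         # seconds since the entry was last refreshed


class FdbModel:
    """Toy forwarding database; names and structure are illustrative only."""

    def __init__(self, ageing_time: int):
        self.ageing_time = ageing_time
        self.entries = {}

    def add_static(self, mac: str, port: str, sticky: bool = False) -> None:
        self.entries[mac] = FdbEntry(port, static=True, sticky=sticky)

    def learn(self, mac: str, port: str) -> None:
        entry = self.entries.get(mac)
        if entry is None:
            self.entries[mac] = FdbEntry(port)
        elif entry.port == port:
            entry.idle = 0
        elif not entry.sticky:
            entry.port = port  # roam to the port the MAC was just seen on
            entry.idle = 0

    def age(self, seconds: int) -> None:
        for mac, entry in list(self.entries.items()):
            if entry.static:
                continue  # ageing is disabled for static entries
            entry.idle += seconds
            if entry.idle >= self.ageing_time:
                del self.entries[mac]
```

In this model a static entry added on one port moves when the same MAC is learned on another port, yet survives any amount of ageing. Treating static as sticky, as the driver did before this fix, corresponds to always passing sticky=True in add_static.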

In net-next we can add proper support for the newly introduced 'sticky'
flag.

Fixes: 56ade8fe3fe1 ("mlxsw: spectrum: Add initial support for Spectrum ASIC")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: bridge: Mark FDB entries that were added by user as such
Ido Schimmel [Fri, 18 Jan 2019 15:58:00 +0000 (15:58 +0000)]
net: bridge: Mark FDB entries that were added by user as such

Externally learned entries can be added by a user or by a switch driver
that is notifying the bridge driver about entries that were learned in
hardware.

In the first case, the entries are not marked with the 'added_by_user'
flag, which causes switch drivers to ignore them and not offload them.

The 'added_by_user' flag can be set on externally learned FDB entries
based on the 'swdev_notify' parameter in br_fdb_external_learn_add(),
which effectively indicates whether the created / updated FDB entry was
added by a user.
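
The logic can be summarised with a simplified model. This is a sketch, not the bridge code: the Python function only borrows the name idea from br_fdb_external_learn_add(), and driver_offloads is a hypothetical stand-in for the check a switch driver is assumed to make.

```python
from dataclasses import dataclass


@dataclass
class FdbEntry:
    added_by_external_learn: bool = False
    added_by_user: bool = False


def external_learn_add(table: dict, mac: str, swdev_notify: bool) -> FdbEntry:
    """Create or update an externally learned entry (simplified model)."""
    entry = table.setdefault(mac, FdbEntry())
    entry.added_by_external_learn = True
    if swdev_notify:
        # The request came from user space, so switch drivers are notified
        # and must be able to recognise the entry as user-added.
        entry.added_by_user = True
    return entry


def driver_offloads(entry: FdbEntry) -> bool:
    # Assumption: a switch driver only offloads entries flagged as user-added.
    return entry.added_by_user
```

An entry added with swdev_notify set is offloaded in this model, whereas one reported by a switch driver (swdev_notify clear) is left alone, since the hardware already knows about it.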

Fixes: 816a3bed9549 ("switchdev: Add fdb.added_by_user to switchdev notifications")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Alexander Petrovskiy <alexpe@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Cc: Roopa Prabhu <roopa@cumulusnetworks.com>
Cc: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Cc: bridge@lists.linux-foundation.org
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum_fid: Update dummy FID index
Nir Dotan [Fri, 18 Jan 2019 15:57:59 +0000 (15:57 +0000)]
mlxsw: spectrum_fid: Update dummy FID index

When using a tc flower action of egress mirred redirect, the driver adds
an implicit FID setting action. This implicit action sets a dummy FID to
the packet and is used as part of a design for trapping unmatched flows
in OVS.  While this implicit FID setting action is supposed to be a NOP
when a redirect action is added, in Spectrum-2 the FID record is
consulted as the dummy FID index is an 802.1D FID index and the packet
is dropped instead of being redirected.

Set the dummy FID index value to be within 802.1Q range. This satisfies
both Spectrum-1 which ignores the FID and Spectrum-2 which identifies it
as an 802.1Q FID and will then follow the redirect action.

Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: pci: Return error on PCI reset timeout
Nir Dotan [Fri, 18 Jan 2019 15:57:57 +0000 (15:57 +0000)]
mlxsw: pci: Return error on PCI reset timeout

Return an appropriate error when the driver times out waiting for the
firmware to come out of PCI reset.
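
The pattern is a bounded polling loop that reports failure instead of silently continuing. The following is a minimal sketch in Python rather than the driver's code: wait_for_fw_ready and its parameters are hypothetical, and the specific error value (-EBUSY) is an assumption.

```python
import errno


def wait_for_fw_ready(fw_ready, timeout_ms: int, poll_ms: int = 1,
                      sleep=lambda ms: None) -> int:
    """Poll fw_ready() until it returns True or timeout_ms has elapsed.

    Returns 0 on success and a negative errno value on timeout, mirroring
    the kernel convention. The default sleep is a no-op so the sketch runs
    instantly; a real caller would pass a function that actually waits.
    """
    waited = 0
    while True:
        if fw_ready():
            return 0
        if waited >= timeout_ms:
            return -errno.EBUSY  # report the timeout instead of carrying on
        sleep(poll_ms)
        waited += poll_ms
```

The caller can then abort initialization when the return value is negative, rather than proceeding with a device that never left reset.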

Fixes: 233fa44bd67a ("mlxsw: pci: Implement reset done check")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: pci: Increase PCI SW reset timeout
Nir Dotan [Fri, 18 Jan 2019 15:57:56 +0000 (15:57 +0000)]
mlxsw: pci: Increase PCI SW reset timeout

The Spectrum-2 PHY layer introduces a calibration period which is a part of the
Spectrum-2 firmware boot process. Hence increase the SW timeout waiting for
the firmware to come out of boot. This does not increase system boot time
in cases where the firmware PHY calibration process is done quickly.

Fixes: c3ab435466d5 ("mlxsw: spectrum: Extend to support Spectrum-2 ASIC")
Signed-off-by: Nir Dotan <nird@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: pci: Ring CQ's doorbell before RDQ's
Ido Schimmel [Fri, 18 Jan 2019 15:57:55 +0000 (15:57 +0000)]
mlxsw: pci: Ring CQ's doorbell before RDQ's

When a packet should be trapped to the CPU the device consumes a WQE
(work queue element) from an RDQ (receive descriptor queue) and copies
the packet to the address specified in the WQE. The device then tries to
post a CQE (completion queue element) that contains various metadata
(e.g., ingress port) about the packet to a CQ (completion queue).

In case the device managed to consume a WQE, but did not manage to post
the corresponding CQE, it will get stuck. This unlikely situation can be
triggered due to the scheme the driver is currently using to process
CQEs.

The driver will consume up to 512 CQEs at a time and after processing
each corresponding WQE it will ring the RDQ's doorbell, letting the
device know that a new WQE was posted for it to consume. Only after
processing all the CQEs (up to 512), the driver will ring the CQ's
doorbell, letting the device know that new ones can be posted.

Fix this by having the driver ring the CQ's doorbell for every processed
CQE, but before ringing the RDQ's doorbell. This guarantees that
whenever we post a new WQE, there is a corresponding CQE available. Copy
the currently processed CQE to prevent the device from overwriting it
with a new CQE after ringing the doorbell.
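
The effect of the doorbell ordering can be checked with a small bookkeeping model. This is a sketch under simplifying assumptions, not driver code: the function name is hypothetical, the batch starts from the worst case (CQ full, every WQE consumed), and the model only tracks what the device has been told through doorbells.

```python
def worst_case_deficit(pending_cqes: int, cq_first: bool) -> int:
    """Replay one driver batch and return the largest value of
    (WQEs visible to the device) - (free CQE slots visible to the device).

    A positive result means the device could consume a WQE at a moment
    when it has no CQE slot to post the completion to.
    """
    free_cqe_slots = 0  # assumed start: CQ is full of unprocessed CQEs
    posted_wqes = 0     # assumed start: the device has consumed every WQE
    worst = 0
    for _ in range(pending_cqes):
        if cq_first:
            free_cqe_slots += 1  # new scheme: ring the CQ doorbell per CQE
        posted_wqes += 1         # ring the RDQ doorbell for the refilled WQE
        worst = max(worst, posted_wqes - free_cqe_slots)
    if not cq_first:
        free_cqe_slots += pending_cqes  # old scheme: one CQ doorbell at the end
    return worst
```

With a batch of 512 CQEs, the old ordering lets the deficit grow to 512, while ringing the CQ's doorbell first keeps it at 0, so every WQE the device sees has a CQE slot behind it.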

Note that the driver still arms the CQ only after processing all the
pending CQEs, so that interrupts for this CQ will only be delivered
after the driver finished its processing.

Before commit 8404f6f2e8ed ("mlxsw: pci: Allow to use CQEs of version 1
and version 2") the issue was virtually impossible to trigger since the
number of CQEs was twice the number of WQEs and the number of CQEs
processed at a time was equal to the number of available WQEs.

Fixes: 8404f6f2e8ed ("mlxsw: pci: Allow to use CQEs of version 1 and version 2")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Semion Lisyansky <semionl@mellanox.com>
Tested-by: Semion Lisyansky <semionl@mellanox.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>