OSDN Git Service

uclinux-h8/linux.git
6 years ago net: Convert mpls_net_ops
Kirill Tkhai [Thu, 15 Mar 2018 09:11:06 +0000 (12:11 +0300)]
net: Convert mpls_net_ops

These pernet_operations register and unregister a sysctl table.
The exit method frees platform_labels from net::mpls::platform_label.
Everything is per-net, and they look safe to be marked async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: Convert l2tp_net_ops
Kirill Tkhai [Thu, 15 Mar 2018 09:10:57 +0000 (12:10 +0300)]
net: Convert l2tp_net_ops

The init method is rather simple. The exit method queues del_work
for every tunnel on the per-net list. This seems safe to be
marked async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Guillaume Nault <g.nault@alphalink.fr>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net-tcp_bbr: set tp->snd_ssthresh to BDP upon STARTUP exit
Yousuk Seung [Fri, 16 Mar 2018 17:51:49 +0000 (10:51 -0700)]
net-tcp_bbr: set tp->snd_ssthresh to BDP upon STARTUP exit

Set tp->snd_ssthresh to BDP upon STARTUP exit. This allows us to
check, via SCM_TIMESTAMPING_OPT_STATS, whether a BBR flow has exited
STARTUP and what the BDP was at the time of STARTUP exit. Since BBR
does not use snd_ssthresh, this fix has no impact on BBR's behavior.
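
As a rough illustration of the quantity being stored in snd_ssthresh, the sketch below computes a bandwidth-delay product in packets from a bandwidth and a minimum RTT. It is a user-space approximation, not the kernel's BBR code; the function name bdp_packets and its parameters are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bandwidth-delay product in whole packets, rounded up.
 * bw_bytes_per_sec and min_rtt_us stand in for BBR's bandwidth and min-RTT
 * estimates; mss is the segment size in bytes. */
uint32_t bdp_packets(uint64_t bw_bytes_per_sec, uint32_t min_rtt_us,
                     uint32_t mss)
{
    assert(mss > 0);
    /* bytes in flight needed to fill the pipe for one min RTT */
    uint64_t bdp_bytes = bw_bytes_per_sec * min_rtt_us / 1000000ULL;
    return (uint32_t)((bdp_bytes + mss - 1) / mss);
}
```

For example, a 100 Mbit/s path (12,500,000 bytes/s) with a 40 ms minimum RTT and a 1448-byte MSS gives a BDP of 500,000 bytes, which rounds up to 346 packets.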

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago tcp: add snd_ssthresh stat in SCM_TIMESTAMPING_OPT_STATS
Yousuk Seung [Fri, 16 Mar 2018 17:51:07 +0000 (10:51 -0700)]
tcp: add snd_ssthresh stat in SCM_TIMESTAMPING_OPT_STATS

This patch adds a TCP_NLA_SND_SSTHRESH stat to SCM_TIMESTAMPING_OPT_STATS
that reports tcp_sock.snd_ssthresh.

Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Priyaranjan Jha <priyarjha@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: Yuchung Cheng <ycheng@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago selftests/txtimestamp: Add more configurable parameters
Vinicius Costa Gomes [Fri, 16 Mar 2018 17:41:14 +0000 (10:41 -0700)]
selftests/txtimestamp: Add more configurable parameters

Add a way to configure whether poll() should wait forever for an event,
the number of packets that should be sent for each test, and whether
there should be any delay between packets.
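
The "wait forever" option maps naturally onto poll()'s timeout argument, where a negative value means no timeout. The sketch below shows that idea only; wait_readable and its parameters are hypothetical names for this example, and it polls for POLLIN for simplicity rather than reproducing what the selftest polls for.

```c
#include <assert.h>
#include <poll.h>

/* Hypothetical helper: wait until fd is readable.
 * If wait_forever is nonzero, poll() blocks with no timeout (-1);
 * otherwise it gives up after timeout_ms milliseconds.
 * Returns 1 if POLLIN is set, 0 if it is not (for example on timeout),
 * and -1 on error. */
int wait_readable(int fd, int wait_forever, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret;

    assert(wait_forever || timeout_ms >= 0);
    ret = poll(&pfd, 1, wait_forever ? -1 : timeout_ms);
    if (ret < 0)
        return -1;
    return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```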

Signed-off-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago liquidio: Simplified napi poll
Intiyaz Basha [Fri, 16 Mar 2018 17:21:31 +0000 (10:21 -0700)]
liquidio: Simplified napi poll

1) Moved interrupt enable related code from octeon_process_droq_poll_cmd()
   to a separate function, octeon_enable_irq().
2) Removed wrapper function octeon_process_droq_poll_cmd(); callers now use
   octeon_droq_process_poll_pkts() directly.
3) Removed unused macros POLL_EVENT_XXX.

Signed-off-by: Intiyaz Basha <intiyaz.basha@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch 'net-smc-IPv6-support'
David S. Miller [Fri, 16 Mar 2018 18:57:26 +0000 (14:57 -0400)]
Merge branch 'net-smc-IPv6-support'

Ursula Braun says:

====================
net/smc: IPv6 support

These smc patches for the net-next tree add IPv6 support.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net/smc: enable ipv6 support for smc
Karsten Graul [Fri, 16 Mar 2018 14:06:41 +0000 (15:06 +0100)]
net/smc: enable ipv6 support for smc

Add ipv6 support to the smc socket layer functions. Make use of the
updated clc layer functions to retrieve and match ipv6 information.
The indicator for ipv4 or ipv6 is the protocol constant that is provided
in the socket() call with address family AF_SMC.
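
From user space, the choice between ipv4 and ipv6 would then look roughly like the sketch below. The numeric values for AF_SMC, SMCPROTO_SMC and SMCPROTO_SMC6 are given as I understand the kernel headers to define them and should be treated as assumptions; smc_proto_for_family and smc_socket are hypothetical helper names.

```c
#include <assert.h>
#include <sys/socket.h>

/* Assumed values; redefined here in case the libc headers lack them. */
#ifndef AF_SMC
#define AF_SMC 43
#endif
#ifndef SMCPROTO_SMC
#define SMCPROTO_SMC  0   /* SMC over IPv4 */
#endif
#ifndef SMCPROTO_SMC6
#define SMCPROTO_SMC6 1   /* SMC over IPv6 */
#endif

/* Hypothetical helper: pick the SMC protocol constant for an IP family. */
int smc_proto_for_family(int family)
{
    assert(family == AF_INET || family == AF_INET6);
    return family == AF_INET6 ? SMCPROTO_SMC6 : SMCPROTO_SMC;
}

/* Open an SMC socket for the given IP family; returns the fd, or -1
 * (for example when the running kernel has no SMC support). */
int smc_socket(int family)
{
    return socket(AF_SMC, SOCK_STREAM, smc_proto_for_family(family));
}
```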

Based-on-patch-by: Takanori Ueda <tkueda@jp.ibm.com>
Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net/smc: add ipv6 support to CLC layer
Karsten Graul [Fri, 16 Mar 2018 14:06:40 +0000 (15:06 +0100)]
net/smc: add ipv6 support to CLC layer

The CLC layer is updated to support ipv6 proposal messages from peers and
to match incoming proposal messages against the ipv6 addresses of the net
device. struct smc_clc_ipv6_prefix is updated to provide the space for an
ipv6 address (struct was not used before). SMC_CLC_MAX_LEN is updated to
include the size of the proposal prefix. Existing code in net is not
affected; the previous SMC_CLC_MAX_LEN value is large enough to hold ipv4
proposal messages.

Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net/smc: restructure netinfo for CLC proposal msgs
Karsten Graul [Fri, 16 Mar 2018 14:06:39 +0000 (15:06 +0100)]
net/smc: restructure netinfo for CLC proposal msgs

Introduce functions smc_clc_prfx_set to retrieve IP information for the
CLC proposal msg and smc_clc_prfx_match to match the contents of a
proposal message against the IP addresses of the net device. The new
functions replace the functionality provided by smc_clc_netinfo_by_tcpsk,
which is removed by this patch. The match functionality is extended to
scan all ipv4 addresses of the net device for a match against the
ipv4 subnet from the proposal msg.
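
The extended match can be pictured as a loop over every ipv4 address of the device. The following is a simplified user-space sketch, not the smc code itself; struct ifaddr4, mask4 and prfx_match4 are hypothetical names, and addresses are kept in host byte order for readability.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one IPv4 address configured on a net device. */
struct ifaddr4 {
    uint32_t addr;        /* host byte order, for simplicity */
    uint8_t  prefix_len;
};

/* Netmask for a prefix length of 0..32. */
uint32_t mask4(uint8_t prefix_len)
{
    return prefix_len ? ~(uint32_t)0 << (32 - prefix_len) : 0;
}

/* Return 1 if any address on the device has the proposed prefix length
 * and lies in the proposed subnet, else 0. */
int prfx_match4(const struct ifaddr4 *addrs, size_t n,
                uint32_t subnet, uint8_t prefix_len)
{
    assert(prefix_len <= 32);
    for (size_t i = 0; i < n; i++) {
        if (addrs[i].prefix_len == prefix_len &&
            (addrs[i].addr & mask4(prefix_len)) ==
            (subnet & mask4(prefix_len)))
            return 1;
    }
    return 0;
}
```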

Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago cxgb4: notify fatal error to uld drivers
Ganesh Goudar [Fri, 16 Mar 2018 08:52:57 +0000 (14:22 +0530)]
cxgb4: notify fatal error to uld drivers

Notify uld drivers if the adapter encounters a fatal error.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch 'rtnl_lock_killable'
David S. Miller [Fri, 16 Mar 2018 16:31:19 +0000 (12:31 -0400)]
Merge branch 'rtnl_lock_killable'

Kirill Tkhai says:

====================
Introduce rtnl_lock_killable()

rtnl_lock() is a widely used mutex in the kernel. Some kernel code
does memory allocations under it. In case of a memory deficit this
may invoke the OOM killer, but the problem is that a killed task can't
exit if it's waiting for the mutex. This may be a cause of deadlock
and panic.

This patchset adds a new primitive that responds to SIGKILL, so it
can be used in places where we don't want to sleep forever. The
first such place is also converted to use it.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: Use rtnl_lock_killable() in register_netdev()
Kirill Tkhai [Wed, 14 Mar 2018 19:17:28 +0000 (22:17 +0300)]
net: Use rtnl_lock_killable() in register_netdev()

This patch uses rtnl_lock_killable() in one of the hot paths
that take rtnl_lock().

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: Add rtnl_lock_killable()
Kirill Tkhai [Wed, 14 Mar 2018 19:17:20 +0000 (22:17 +0300)]
net: Add rtnl_lock_killable()

rtnl_lock() is a widely used mutex in the kernel. Some kernel code
does memory allocations under it. In case of a memory deficit this
may invoke the OOM killer, but the problem is that a killed task can't
exit if it's waiting for the mutex. This may be a cause of deadlock
and panic.

This patch adds a new primitive that responds to SIGKILL, so it
can be used in places where we don't want to sleep
forever.
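
As a user-space analogy for the killable-lock idea, the sketch below keeps trying to take a lock but gives up with -EINTR once a flag standing in for a pending SIGKILL is set. It is not the kernel implementation; lock_killable, lock_release and the fatal_pending flag are names assumed for this example.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdatomic.h>

/* Hypothetical analogue of a "killable" lock: retry until the lock is
 * taken, but return -EINTR as soon as *fatal_pending is nonzero while
 * we are still waiting. Returns 0 when the lock is held.
 * A real implementation would sleep rather than spin. */
int lock_killable(atomic_flag *lock,
                  const volatile sig_atomic_t *fatal_pending)
{
    assert(lock && fatal_pending);
    while (atomic_flag_test_and_set(lock)) {
        if (*fatal_pending)
            return -EINTR;   /* waiter was "killed": stop waiting */
    }
    return 0;
}

/* Release a lock taken with lock_killable(). */
void lock_release(atomic_flag *lock)
{
    atomic_flag_clear(lock);
}
```

In the kernel, a caller would similarly check the return value of rtnl_lock_killable() and return an error instead of sleeping forever.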

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago doc: Change the udp/sctp rmem/wmem default value.
Tonghao Zhang [Wed, 14 Mar 2018 04:57:17 +0000 (21:57 -0700)]
doc: Change the udp/sctp rmem/wmem default value.

The SK_MEM_QUANTUM was changed from PAGE_SIZE to 4096.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago udp: Move the udp sysctl to namespace.
Tonghao Zhang [Wed, 14 Mar 2018 04:57:16 +0000 (21:57 -0700)]
udp: Move the udp sysctl to namespace.

This patch moves udp_rmem_min and udp_wmem_min into the network
namespace and initializes udp_l3mdev_accept explicitly.

udp_rmem_min/udp_wmem_min affect the udp rx/tx queues;
with this patch, namespaces can set them differently.
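
The effect can be sketched with a small user-space model in which each namespace carries its own copy of the two values. struct netns_udp and netns_udp_init are hypothetical names, and the assumption that both defaults equal SK_MEM_QUANTUM (4096) is made for this example only.

```c
#include <assert.h>

#define SK_MEM_QUANTUM 4096   /* assumed default for this sketch */

/* Hypothetical, heavily simplified stand-in for per-namespace UDP settings. */
struct netns_udp {
    int rmem_min;
    int wmem_min;
};

/* Give a new namespace its own copy of the defaults, so that later
 * changes in one namespace do not leak into another. */
void netns_udp_init(struct netns_udp *ns)
{
    assert(ns);
    ns->rmem_min = SK_MEM_QUANTUM;
    ns->wmem_min = SK_MEM_QUANTUM;
}
```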

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch 'net-ipv6-Address-checks-need-to-consider-the-L3-domain'
David S. Miller [Fri, 16 Mar 2018 15:28:40 +0000 (11:28 -0400)]
Merge branch 'net-ipv6-Address-checks-need-to-consider-the-L3-domain'

David Ahern says:

====================
net/ipv6: Address checks need to consider the L3 domain

IPv6 prohibits a local address from being used as a gateway for a route.
However, it is ok for the gateway to be a local address in a different L3
domain (e.g., VRF). This allows, for example, veth pairs to connect VRFs.

ip6_route_info_create calls ipv6_chk_addr_and_flags for gateway addresses
to determine if the address is a local one, but ipv6_chk_addr_and_flags
does not currently consider L3 domains. As a result routes can not be
added in one VRF with a nexthop that points to a local address in a
second VRF.

Resolve by comparing the l3mdev for the passed in device and requiring an
l3mdev match with the device containing an address. The intent of checking
for an address on the specified device versus any device in the domain is
maintained by a new argument to skip the check between the passed in device
and the device with the address.

Patch 1 moves the gateway validation from ip6_route_info_create into a
helper; the function is long enough and refactoring drops the indent
level.

Patch 2 adds a skip_dev_check argument to ipv6_chk_addr_and_flags to
allow a device to always be passed yet skip the device check when
looking at addresses and fixes up a few ipv6_chk_addr callers that
pass a NULL device.

Patch 3 adds l3mdev checks to ipv6_chk_addr_and_flags.

Patches 4 and 5 do some refactoring to the fib_tests script and then
patch 6 adds nexthop validation tests.

v4
- separated l3mdev check into a separate patch (patch 3 of this set)
  as suggested by Kirill
- consolidated dev and ipv6_chk_addr_and_flags call into 1 if (Kirill)
- added a temp variable for gw type (Kirill)

v3
- set skip_dev_check in ipv6_chk_addr based on dev == NULL (per
  comment from Ido)

v2
- handle 2 variations of route spec with sane error path
- add test cases
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago selftests: fib_tests: Add IPv6 nexthop spec tests
David Ahern [Tue, 13 Mar 2018 15:29:41 +0000 (08:29 -0700)]
selftests: fib_tests: Add IPv6 nexthop spec tests

Add series of tests for valid and invalid nexthop specs for IPv6.

$ TEST=fib_nexthop_test ./fib_tests.sh
...
IPv6 nexthop tests
    TEST: Directly connected nexthop, unicast address              [ OK ]
    TEST: Directly connected nexthop, unicast address with device  [ OK ]
    TEST: Gateway is linklocal address                             [ OK ]
    TEST: Gateway is linklocal address, no device                  [ OK ]
    TEST: Gateway can not be local unicast address                 [ OK ]
    TEST: Gateway can not be local unicast address, with device    [ OK ]
    TEST: Gateway can not be a local linklocal address             [ OK ]
    TEST: Gateway can be local address in a VRF                    [ OK ]
    TEST: Gateway can be local address in a VRF, with device       [ OK ]
    TEST: Gateway can be local linklocal address in a VRF          [ OK ]
    TEST: Redirect to VRF lookup                                   [ OK ]
    TEST: VRF route, gateway can be local address in default VRF   [ OK ]
    TEST: VRF route, gateway can not be a local address            [ OK ]
    TEST: VRF route, gateway can not be a local addr with device   [ OK ]

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago selftests: fib_tests: Allow user to run a specific test
David Ahern [Tue, 13 Mar 2018 15:29:40 +0000 (08:29 -0700)]
selftests: fib_tests: Allow user to run a specific test

Allow a user to run just a specific fib test by setting the TEST
environment variable.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago selftests: fib_tests: Use an alias for ip command
David Ahern [Tue, 13 Mar 2018 15:29:39 +0000 (08:29 -0700)]
selftests: fib_tests: Use an alias for ip command

Replace 'ip -netns testns' with the alias IP. Shortens the line lengths
and makes running the commands manually a bit easier.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net/ipv6: Add l3mdev check to ipv6_chk_addr_and_flags
David Ahern [Tue, 13 Mar 2018 15:29:38 +0000 (08:29 -0700)]
net/ipv6: Add l3mdev check to ipv6_chk_addr_and_flags

Look up the L3 master device for the passed in device. Only consider
addresses on netdevs with the same master device. If the device is
not enslaved or is NULL, then the l3mdev is NULL, which means only
devices not enslaved (i.e., in the default domain) are considered.
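
The rule can be modelled in user space as follows. This is a simplified sketch, not ipv6_chk_addr_and_flags itself: struct dev, struct local_addr and chk_addr are hypothetical, the L3 master is reduced to an integer (0 meaning "not enslaved"), and addresses are compared as strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model: l3_master is the ifindex of the L3 master device
 * (for example a VRF), or 0 when the device is not enslaved. */
struct dev {
    int ifindex;
    int l3_master;
};

struct local_addr {
    const char *addr;        /* textual address, for illustration only */
    const struct dev *dev;   /* device the address is configured on */
};

/* Return 1 if addr is local within the L3 domain of dev, else 0.
 * A NULL dev means the default domain. Unless skip_dev_check is set,
 * the address must also sit on dev itself. */
int chk_addr(const struct local_addr *tbl, size_t n, const char *addr,
             const struct dev *dev, int skip_dev_check)
{
    int master = dev ? dev->l3_master : 0;

    assert(addr);
    for (size_t i = 0; i < n; i++) {
        if (tbl[i].dev->l3_master != master)
            continue;                 /* address is in another L3 domain */
        if (strcmp(tbl[i].addr, addr) != 0)
            continue;
        if (skip_dev_check || dev == NULL || tbl[i].dev == dev)
            return 1;
    }
    return 0;
}
```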

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net/ipv6: Change address check to always take a device argument
David Ahern [Tue, 13 Mar 2018 15:29:37 +0000 (08:29 -0700)]
net/ipv6: Change address check to always take a device argument

ipv6_chk_addr_and_flags determines if an address is a local address and
optionally if it is an address on a specific device. For example, it is
called by ip6_route_info_create to determine if a given gateway address
is a local address. The address check currently does not consider L3
domains and as a result does not allow a route to be added in one VRF
if the nexthop points to an address in a second VRF. e.g.,

    $ ip route add 2001:db8:1::/64 vrf r2 via 2001:db8:102::23
    Error: Invalid gateway address.

where 2001:db8:102::23 is an address on an interface in vrf r1.

ipv6_chk_addr_and_flags needs to allow callers to always pass in a device
with a separate argument to not limit the address to the specific device.
The device is used to determine the L3 domain of interest.

To that end add an argument to skip the device check and update callers
to always pass a device where possible and use the new argument to mean
any address in the domain.

Update a handful of users of ipv6_chk_addr with a NULL dev argument. This
patch handles the change to these callers without adding the domain check.

ip6_validate_gw needs to handle 2 cases - one where the device is given
as part of the nexthop spec and the other where the device is resolved.
There is at least 1 VRF case where deferring the check to only after
the route lookup has resolved the device fails with an unintuitive error
"RTNETLINK answers: No route to host" as opposed to the preferred
"Error: Gateway can not be a local address." The 'no route to host'
error is because of the fallback to a full lookup. The check is done
twice to avoid this error.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net/ipv6: Refactor gateway validation on route add
David Ahern [Tue, 13 Mar 2018 15:29:36 +0000 (08:29 -0700)]
net/ipv6: Refactor gateway validation on route add

Move gateway validation code from ip6_route_info_create into
ip6_validate_gw. Code move plus adjustments to handle the potential
reset of dev and idev and to make checkpatch happy.

Signed-off-by: David Ahern <dsahern@gmail.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch 'macb-Introduce-phy-handle-DT-functionality'
David S. Miller [Fri, 16 Mar 2018 15:14:34 +0000 (11:14 -0400)]
Merge branch 'macb-Introduce-phy-handle-DT-functionality'

Brad Mouring says:

====================
net: macb: Introduce phy-handle DT functionality

Consider the situation where a macb netdev is connected through
a phydev that sits on a mii bus other than the one provided to
this particular netdev. This is the situation this patchset aims
to support through the existing phy-handle optional binding.

This optional binding (as described in the ethernet DT bindings doc)
directs the netdev to the phydev to use. This is precisely the
situation this patchset aims to solve, so it makes sense to introduce
the functionality to this driver (where the physical layout discussed
was encountered).

The devicetree snippet would look something like this:

...
   ethernet@feedf00d {
           ...
           phy-handle = <&phy0>; // the first netdev is physically wired to phy0
           ...
           phy0: phy@0 {
                   ...
                   reg = <0x0>; // MDIO address 0
                   ...
           };
           phy1: phy@1 {
                   ...
                   reg = <0x1>; // MDIO address 1
                   ...
           };
           ...
   };

   ethernet@deadbeef {
           ...
           phy-handle = <&phy1>; // tells the driver to use phy1 on the
                                 // first mac's mdio bus (it's wired thusly)
           ...
   };
...

The work done to add the phy_node in the first place (dacdbb4dfc1a1:
"net: macb: add fixed-link node support") will consume the
device_node (if found).

v2: Reorganization of mii probe/init functions, suggested by Andrew Lunn
v3: Moved some of the bus init code back into init (erroneously moved to probe),
    some style issues, and an uninitialized variable warning addressed.
v4: Add Reviewed-by: tags
    Skip fallback code if phy-handle phandle is found
v5: Cleanup formatting issues
    Fix compile failure introduced in 1/4 "net: macb: Reorganize macb_mii
        bringup"
    Fix typo in "Documentation: macb: Document phy-handle binding"
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Documentation: macb: Document phy-handle binding
Brad Mouring [Tue, 13 Mar 2018 21:32:16 +0000 (16:32 -0500)]
Documentation: macb: Document phy-handle binding

Document the existence of the optional binding, directing to the
general ethernet document that describes this binding.

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: macb: Add phy-handle DT support
Brad Mouring [Tue, 13 Mar 2018 21:32:15 +0000 (16:32 -0500)]
net: macb: Add phy-handle DT support

This optional binding (as described in the ethernet DT bindings doc)
directs the netdev to the phydev to use. This is useful for a phy
chip that has >1 phy in it, where two netdevs are using the same phy
chip (i.e. the second mac's phy lives on the first mac's MDIO bus).

The devicetree snippet would look something like this:

ethernet@feedf00d {
        ...
        phy-handle = <&phy0>; // the first netdev is physically wired to phy0
        ...
        phy0: phy@0 {
                ...
                reg = <0x0>; // MDIO address 0
                ...
        };
        phy1: phy@1 {
                ...
                reg = <0x1>; // MDIO address 1
                ...
        };
        ...
};

ethernet@deadbeef {
        ...
        phy-handle = <&phy1>; // tells the driver to use phy1 on the
                              // first mac's mdio bus (it's wired thusly)
        ...
};

The work done to add the phy_node in the first place (dacdbb4dfc1a1:
"net: macb: add fixed-link node support") will consume the
device_node (if found).

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: macb: Remove redundant poll irq assignment
Brad Mouring [Tue, 13 Mar 2018 21:32:14 +0000 (16:32 -0500)]
net: macb: Remove redundant poll irq assignment

In phy_device's general probe, this device will already be set for
phy register polling, rendering this code redundant.

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: macb: Reorganize macb_mii bringup
Brad Mouring [Tue, 13 Mar 2018 21:32:13 +0000 (16:32 -0500)]
net: macb: Reorganize macb_mii bringup

The macb mii setup (mii_probe() and mii_init()) previously was
somewhat interspersed, likely a result of organic growth and hacking.

This change moves mii bus registration into mii_init and probing the
bus for devices into mii_probe.

Signed-off-by: Brad Mouring <brad.mouring@ni.com>
Suggested-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago doc: remove out of date links and info from packet mmap
Stephen Hemminger [Tue, 13 Mar 2018 19:24:19 +0000 (12:24 -0700)]
doc: remove out of date links and info from packet mmap

The packet_mmap documentation had links to web sites that no longer
exist; replace them with another site that has a similar example.

Support for packet mmap has been in mainline versions of libpcap
for several years.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago enic: drop IP proto check for vxlan tunnel delete
Govindarajulu Varadarajan [Tue, 13 Mar 2018 12:24:33 +0000 (05:24 -0700)]
enic: drop IP proto check for vxlan tunnel delete

Commit d11790941dd3 ("enic: Add vxlan offload support for IPv6 pkts")
added vxlan offload support for IPv6 pkts. The required change in
enic_udp_tunnel_del was not made. This creates a bug where, once a user
adds an IPv6 tunnel, the hw offload for it cannot be deleted.

This patch removes the check for IP proto in the tunnel delete path. The
driver need not check for IP proto since the same UDP port cannot be used
to create two tunnels.
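
The reasoning can be illustrated with a small model in which the delete path matches on the UDP port alone. This is a hypothetical sketch, not the enic driver code; struct vxlan_offload and tunnel_del are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified offload state: one VXLAN UDP port at a time. */
struct vxlan_offload {
    uint16_t port;   /* 0 means no tunnel offload configured */
};

/* Remove the offload if the port matches. The IP protocol of the tunnel is
 * deliberately not checked: one UDP port cannot back two tunnels.
 * Returns 1 if an offload was removed, 0 otherwise. */
int tunnel_del(struct vxlan_offload *st, uint16_t port)
{
    assert(st);
    if (st->port == 0 || st->port != port)
        return 0;
    st->port = 0;
    return 1;
}
```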

Fixes: d11790941dd3 ("enic: Add vxlan offload support for IPv6 pkts")
Signed-off-by: Govindarajulu Varadarajan <gvaradar@cisco.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago rxrpc: remove redundant initialization of variable 'len'
Colin Ian King [Mon, 12 Mar 2018 17:25:38 +0000 (17:25 +0000)]
rxrpc: remove redundant initialization of variable 'len'

The variable 'len' is being initialized with a value that is never
read and it is re-assigned later, hence the initialization is redundant
and can be removed.

Cleans up clang warning:
net/rxrpc/recvmsg.c:275:15: warning: Value stored to 'len' during its
initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago sctp: Fix double free in sctp_sendmsg_to_asoc
Neil Horman [Mon, 12 Mar 2018 18:15:25 +0000 (14:15 -0400)]
sctp: Fix double free in sctp_sendmsg_to_asoc

syzbot/kasan detected a double free in sctp_sendmsg_to_asoc:
BUG: KASAN: use-after-free in sctp_association_free+0x7b7/0x930
net/sctp/associola.c:332
Read of size 8 at addr ffff8801d8006ae0 by task syzkaller914861/4202

CPU: 1 PID: 4202 Comm: syzkaller914861 Not tainted 4.16.0-rc4+ #258
Hardware name: Google Google Compute Engine/Google Compute Engine
01/01/2011
Call Trace:
 __dump_stack lib/dump_stack.c:17 [inline]
 dump_stack+0x194/0x24d lib/dump_stack.c:53
 print_address_description+0x73/0x250 mm/kasan/report.c:256
 kasan_report_error mm/kasan/report.c:354 [inline]
 kasan_report+0x23c/0x360 mm/kasan/report.c:412
 __asan_report_load8_noabort+0x14/0x20 mm/kasan/report.c:433
 sctp_association_free+0x7b7/0x930 net/sctp/associola.c:332
 sctp_sendmsg+0xc67/0x1a80 net/sctp/socket.c:2075
 inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:763
 sock_sendmsg_nosec net/socket.c:629 [inline]
 sock_sendmsg+0xca/0x110 net/socket.c:639
 SYSC_sendto+0x361/0x5c0 net/socket.c:1748
 SyS_sendto+0x40/0x50 net/socket.c:1716
 do_syscall_64+0x281/0x940 arch/x86/entry/common.c:287
 entry_SYSCALL_64_after_hwframe+0x42/0xb7

This was introduced by commit:
f84af33 sctp: factor out sctp_sendmsg_to_asoc from sctp_sendmsg

The newly refactored function moved the wait_for_sndbuf call to a
point after the association was connected, allowing peeloff events
to occur. That in turn caused wait_for_sndbuf to return -EPIPE, which
was not caught by the logic that determines if an association should be
freed or not.

Fix it the easy way by returning the ordering of
sctp_primitive_ASSOCIATE and sctp_wait_for_sndbuf to the old order, to
ensure that -EPIPE will not happen.

Tested by myself using the syzbot reproducers, with positive results.

Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: davem@davemloft.net
CC: Xin Long <lucien.xin@gmail.com>
Reported-by: syzbot+a4e4112c3aff00c8cfd8@syzkaller.appspotmail.com
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: drivers/net: Remove unnecessary skb_copy_expand OOM messages
Joe Perches [Mon, 12 Mar 2018 15:07:12 +0000 (08:07 -0700)]
net: drivers/net: Remove unnecessary skb_copy_expand OOM messages

skb_copy_expand without __GFP_NOWARN already does a dump_stack
on OOM so these messages are redundant.

Signed-off-by: Joe Perches <joe@perches.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Thu, 15 Mar 2018 18:04:57 +0000 (14:04 -0400)]
Merge branch '40GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
40GbE Intel Wired LAN Driver Updates 2018-03-14

This series contains updates to i40e and i40evf only.

Corentin Labbe cleans up the left over FCoE files in the i40e driver.

Gustavo A R Silva fixes a cut and paste error.

Paweł fixes a race condition when the VF driver is loaded on a host and
virsh is trying to attach it to the virtual machine and set a MAC
address.  Resolve the issue by adding polling in i40e_ndo_set_vf_mac()
when the VF is in reset mode.

Jake cleans up i40e_vlan_rx_register(): since it is only used in a single
location, just inline the contents of the function.  Created a helper
function to properly update the per-filter statistics when we delete a
filter.  Factored out the re-enabling of ATR and SB rules.  Fixed an issue
where, when re-enabling ATR after the last TCPv4 filter is removed and ntuple
is still active, we were not restoring the TCPv4 filter input set.

Filip modifies the permission check function to ensure that it knows how
many filters are being requested, which allows the check to ensure that
the total number of filters in a single request does not cause us to go
over the limit.

Mariusz fixed an issue where the wrong calculation of partition id was
being done on OCP PHY mezzanine cards, which in turn caused wake on LAN
to be disabled on certain ports.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago i40e: restore TCPv4 input set when re-enabling ATR
Jacob Keller [Thu, 8 Mar 2018 22:52:11 +0000 (14:52 -0800)]
i40e: restore TCPv4 input set when re-enabling ATR

When we re-enable ATR we need to restore the input set for TCPv4
filters, in order for ATR to function correctly. We already do this for
the normal case of re-enabling ATR when disabling ntuple support.
However, when re-enabling ATR after the last TCPv4 filter is removed (but
when ntuple support is still active), we did not restore the TCPv4
filter input set.

This can cause problems if the TCPv4 filters from FDir had changed the
input set, as ATR will no longer behave as expected.

When clearing the ATR auto-disable flag, make sure we restore the TCPv4
input set to avoid this.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago i40e: fix for wrong partition id calculation on OCP mezz cards
Mariusz Stachura [Thu, 8 Mar 2018 22:52:10 +0000 (14:52 -0800)]
i40e: fix for wrong partition id calculation on OCP mezz cards

This patch overwrites the number of ports for X722 devices with support
for OCP PHY mezzanine.
The old method of checking whether a port is disabled in the PRTGEN_CNF
register cannot be used in this case. When the OCP is removed, ports
were seen as disabled, which resulted in a wrong calculation of the
partition id, which in turn caused WoL to be disabled on certain ports.

Signed-off-by: Mariusz Stachura <mariusz.stachura@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago i40e: factor out re-enable functions for ATR and SB
Jacob Keller [Thu, 8 Mar 2018 22:52:09 +0000 (14:52 -0800)]
i40e: factor out re-enable functions for ATR and SB

A future patch needs to expand on the logic for re-enabling ATR. Doing
so would cause some code to break the 80-character line limit.

To reduce the level of indentation, factor out helper functions for
re-enabling ATR and SB rules.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago i40e: track filter type statistics when deleting invalid filters
Jacob Keller [Thu, 8 Mar 2018 22:52:08 +0000 (14:52 -0800)]
i40e: track filter type statistics when deleting invalid filters

When hardware has trouble with a particular filter, we delete it from
the list. Unfortunately, we did not properly update the per-filter
statistic when doing so.

Create a helper function to handle this, and properly reduce the
necessary counter so that it tracks the number of active filters
properly.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago i40e: Fix permission check for VF MAC filters
Filip Sadowski [Thu, 8 Mar 2018 22:52:07 +0000 (14:52 -0800)]
i40e: Fix permission check for VF MAC filters

When a VF requests the addition of MAC filters, the check is done against
the number of MAC filters already present, without counting the ones being
added in the same request. This makes it possible to add a bunch of filters
at once, possibly exceeding the acceptable limit of
I40E_VC_MAX_MAC_ADDR_PER_VF filters.

This happens because when checking vf->num_mac, we do not check how many
filters are being requested at once. Modify the check function to ensure
that it knows how many filters are being requested. This allows the
check to ensure that the total number of filters in a single request
does not cause us to go over the limit.

Additionally, move the check to within the lock to ensure that the
vf->num_mac is checked while holding the lock to maintain consistency.
We could have simply moved the call to i40e_vf_check_permission to
within the loop, but this could cause a request to be non-atomic, and
add some but not all the addresses, while reporting an error code. We
want to avoid this behavior so that users are not confused about which
filters have or have not been added.
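
The all-or-nothing check described above can be sketched as a single comparison made before any filter is added. This is an illustrative model, not the driver's i40e_vf_check_permission; check_mac_request is a hypothetical name and MAX_MAC_PER_VF is a placeholder value standing in for I40E_VC_MAX_MAC_ADDR_PER_VF.

```c
#include <assert.h>

/* Placeholder limit for this sketch; the driver uses
 * I40E_VC_MAX_MAC_ADDR_PER_VF, whose value is not assumed here. */
#define MAX_MAC_PER_VF 16

/* Return 0 if a request for `requested` new filters fits under the limit
 * given `num_mac` filters already present, or -1 to reject the whole
 * request so that it is never applied partially. */
int check_mac_request(int num_mac, int requested)
{
    assert(num_mac >= 0 && requested >= 0);
    if (num_mac + requested > MAX_MAC_PER_VF)
        return -1;
    return 0;
}
```

In the driver this check would run while holding the lock that protects vf->num_mac, so the count cannot change between the check and the additions.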

Signed-off-by: Filip Sadowski <filip.sadowski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago i40e: Cleanup i40e_vlan_rx_register
Jacob Keller [Thu, 8 Mar 2018 22:52:06 +0000 (14:52 -0800)]
i40e: Cleanup i40e_vlan_rx_register

We used to use the function i40e_vlan_rx_register as a way to hook
into the now defunct .ndo_vlan_rx_register netdev hook. This was
removed but we kept the function around because we still used it
internally to control enabling or disabling of VLAN stripping.

As pointed out in upstream review, VLAN stripping is only used in a
single location and the previous function is quite small, so just inline
it into i40e_restore_vlan() rather than carrying the function
separately.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: Fix attach VF to VM issue
Paweł Jabłoński [Thu, 8 Mar 2018 22:52:05 +0000 (14:52 -0800)]
i40e: Fix attach VF to VM issue

Fix for "Resource temporarily unavailable" problem when virsh is
trying to attach a device to VM. When the VF driver is loaded on
host and virsh is trying to attach it to the VM and set a MAC
address, it ends with a race condition between i40e_reset_vf and
i40e_ndo_set_vf_mac functions. The bug is fixed by adding polling
in the i40e_ndo_set_vf_mac function for when the VF is in reset mode.

Signed-off-by: Paweł Jabłoński <pawel.jablonski@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower
Gustavo A R Silva [Thu, 15 Feb 2018 17:44:35 +0000 (11:44 -0600)]
i40evf/i40evf_main: Fix variable assignment in i40evf_parse_cls_flower

It seems this is a copy-paste error and that the proper variable to use
in this particular case is _src_ instead of _dst_.

Addresses-Coverity-ID: 1465282 ("Copy-paste error")
Fixes: 0075fa0fadd0 ("i40evf: Add support to apply cloud filters")
Signed-off-by: Gustavo A R Silva <garsilva@embeddedor.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoi40e: remove i40e_fcoe files
Corentin Labbe [Sun, 28 Jan 2018 20:22:30 +0000 (20:22 +0000)]
i40e: remove i40e_fcoe files

i40e_fcoe support was removed via commit 9eed69a9147c ("i40e: Drop FCoE code from core driver files"),
but this left the files in place even though they can no longer be compiled.
Let's finish the cleanup.

Signed-off-by: Corentin Labbe <clabbe@baylibre.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoMerge branch 'sctp-add-support-for-some-sctp-auth-APIs-from-RFC6458'
David S. Miller [Wed, 14 Mar 2018 17:48:28 +0000 (13:48 -0400)]
Merge branch 'sctp-add-support-for-some-sctp-auth-APIs-from-RFC6458'

Xin Long says:

====================
sctp: add support for some sctp auth APIs from RFC6458

This patchset mainly adds support for SCTP AUTH Information for sendmsg,
described in RFC6458:

    5.3.8.  SCTP AUTH Information Structure (SCTP_AUTHINFO)

and also adds a sockopt described in RFC6458:

    8.3.4.  Deactivate a Shared Key (SCTP_AUTH_DEACTIVATE_KEY)

and two types of events for AUTHENTICATION_EVENT described in RFC6458:

    6.1.8.  SCTP_AUTHENTICATION_EVENT:
             - SCTP_AUTH_NO_AUTH
             - SCTP_AUTH_FREE_KEY

After this patchset, we have full support for sctp_sendv in the kernel.

Note that this patchset won't touch that sctp options merge conflict.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: add SCTP_AUTH_NO_AUTH type for AUTHENTICATION_EVENT
Xin Long [Wed, 14 Mar 2018 11:05:34 +0000 (19:05 +0800)]
sctp: add SCTP_AUTH_NO_AUTH type for AUTHENTICATION_EVENT

This patch is to add SCTP_AUTH_NO_AUTH type for AUTHENTICATION_EVENT,
as described in section 6.1.8 of RFC6458.

      SCTP_AUTH_NO_AUTH:  This report indicates that the peer does not
         support SCTP authentication as defined in [RFC4895].

Note that the implementation is quite similar to that of
SCTP_ADAPTATION_INDICATION.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: add SCTP_AUTH_FREE_KEY type for AUTHENTICATION_EVENT
Xin Long [Wed, 14 Mar 2018 11:05:33 +0000 (19:05 +0800)]
sctp: add SCTP_AUTH_FREE_KEY type for AUTHENTICATION_EVENT

This patch is to add SCTP_AUTH_FREE_KEY type for AUTHENTICATION_EVENT,
as described in section 6.1.8 of RFC6458.

      SCTP_AUTH_FREE_KEY:  This report indicates that the SCTP
         implementation will no longer use the key identifier specified
         in auth_keynumber.

After a key is deactivated, it is never used again, which means
its refcnt can't be held/increased by new chunks. But there may be
some chunks in the outqueue still using it. So only when refcnt is 1,
which means no chunk in the outqueue is using/holding this key either,
is this EVENT sent.

When users receive this notification, they can use the DEL_KEY sockopt to
remove this shkey, and also tell the peer that this key will no longer be
used in any chunk, so that the peer can safely remove it as well.
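
A minimal sketch of the refcnt rule above, assuming the key list itself
holds one reference and every queued chunk holds one more. The struct and
helper names are hypothetical and are not the kernel's.

```c
#include <stdbool.h>

/* Hypothetical shared-key record. */
struct shkey {
    int refcnt;       /* 1 means only the key list holds the key */
    bool deactivated; /* set once the key has been deactivated */
};

/* A queued chunk takes a reference on the key it will be sent with. */
static void shkey_hold(struct shkey *k)
{
    k->refcnt++;
}

/*
 * Deactivate the key. Returns true if the FREE_KEY event can be sent
 * right away, i.e. no queued chunk still holds the key.
 */
static bool shkey_deactivate(struct shkey *k)
{
    k->deactivated = true;
    return k->refcnt == 1;
}

/*
 * A chunk left the queue and drops its reference. Returns true if this
 * was the last chunk using a deactivated key, so the event is now due.
 */
static bool shkey_release(struct shkey *k)
{
    k->refcnt--;
    return k->deactivated && k->refcnt == 1;
}
```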

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: add sockopt SCTP_AUTH_DEACTIVATE_KEY
Xin Long [Wed, 14 Mar 2018 11:05:32 +0000 (19:05 +0800)]
sctp: add sockopt SCTP_AUTH_DEACTIVATE_KEY

This patch is to add sockopt SCTP_AUTH_DEACTIVATE_KEY, as described in
section 8.3.4 of RFC6458.

This set option indicates that the application will no longer send user
messages using the indicated key identifier.

Note that the RFC requires that only deactivated keys that are no longer
used by an association can be deleted, but for backward compatibility the
deactivated state is not checked when deleting or replacing a sh_key.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: add support for SCTP AUTH Information for sendmsg
Xin Long [Wed, 14 Mar 2018 11:05:31 +0000 (19:05 +0800)]
sctp: add support for SCTP AUTH Information for sendmsg

This patch is to add support for SCTP AUTH Information for sendmsg,
as described in section 5.3.8 of RFC6458.

With this option, you can provide the shared key identifier used for
sending the user message.

It's also a necessary send info for sctp_sendv.

Note that it reuses sinfo->sinfo_tsn to indicate if this option is
set and sinfo->sinfo_ssn to save the shkey ID which can be 0.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: add refcnt support for sh_key
Xin Long [Wed, 14 Mar 2018 11:05:30 +0000 (19:05 +0800)]
sctp: add refcnt support for sh_key

With refcnt support for sh_key, the auth sh_key of a chunk can be decided
before the chunk is enqueued. Changing the active key later will not
affect the chunks already enqueued.

Furthermore, this is necessary when adding the support for authinfo
for sendmsg in the next patch.

Note that struct sctp_chunk can't be grown because of the performance
drop issue on slow CPUs, so it just reuses head_skb memory for shkey
in sctp_chunk.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'smc-fixes'
David S. Miller [Wed, 14 Mar 2018 17:40:44 +0000 (13:40 -0400)]
Merge branch 'smc-fixes'

Ursula Braun says:

====================
net/smc: fixes 2018-03-14

here are smc changes for the net-next tree.
The first patch enables SMC to work with mlx5-RoCE-devices.
Patches 2 and 3 deal with link group freeing.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: schedule free_work when link group is terminated
Karsten Graul [Wed, 14 Mar 2018 10:01:02 +0000 (11:01 +0100)]
net/smc: schedule free_work when link group is terminated

The free_work worker must be scheduled when the link group is
abnormally terminated.

Signed-off-by: Karsten Graul <kgraul@linux.vnet.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: free link group without pending free_work only
Ursula Braun [Wed, 14 Mar 2018 10:01:01 +0000 (11:01 +0100)]
net/smc: free link group without pending free_work only

Make sure there is no pending or running free_work worker for the link
group when freeing the link group.

Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet/smc: pay attention to MAX_ORDER for CQ entries
Ursula Braun [Wed, 14 Mar 2018 10:01:00 +0000 (11:01 +0100)]
net/smc: pay attention to MAX_ORDER for CQ entries

smc allocates a certain number of CQ entries for used RoCE devices. For
mlx5 devices the chosen constant number results in a large allocation
causing this warning:

[13355.124656] WARNING: CPU: 3 PID: 16535 at mm/page_alloc.c:3883 __alloc_pages_nodemask+0x2be/0x10c0
[13355.124657] Modules linked in: smc_diag(O) smc(O) xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ip6table_filter ip6_tables iptable_filter mlx5_ib ib_core sunrpc mlx5_core s390_trng rng_core ghash_s390 prng aes_s390 des_s390 des_generic sha512_s390 sha256_s390 sha1_s390 sha_common ptp pps_core eadm_sch dm_multipath dm_mod vhost_net tun vhost tap sch_fq_codel kvm ip_tables x_tables autofs4 [last unloaded: smc]
[13355.124672] CPU: 3 PID: 16535 Comm: kworker/3:0 Tainted: G           O    4.14.0uschi #1
[13355.124673] Hardware name: IBM 3906 M04 704 (LPAR)
[13355.124675] Workqueue: events smc_listen_work [smc]
[13355.124677] task: 00000000e2f22100 task.stack: 0000000084720000
[13355.124678] Krnl PSW : 0704c00180000000 000000000029da76 (__alloc_pages_nodemask+0x2be/0x10c0)
[13355.124681]            R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3
[13355.124682] Krnl GPRS: 0000000000000000 00550e00014080c0 0000000000000000 0000000000000001
[13355.124684]            000000000029d8b6 00000000f3bfd710 0000000000000000 00000000014080c0
[13355.124685]            0000000000000009 00000000ec277a00 0000000000200000 0000000000000000
[13355.124686]            0000000000000000 00000000000001ff 000000000029d8b6 0000000084723720
[13355.124708] Krnl Code: 000000000029da6aa7110200 tmll %r1,512
                          000000000029da6ea774ff29 brc 7,29d8c0
                         #000000000029da72a7f40001 brc 15,29da74
                         >000000000029da76a7f4ff25 brc 15,29d8c0
                          000000000029da7aa7380000 lhi %r3,0
                          000000000029da7ea7f4fef1 brc 15,29d860
                          000000000029da825820f0c4 l %r2,196(%r15)
                          000000000029da86a53e0048 llilh %r3,72
[13355.124720] Call Trace:
[13355.124722] ([<000000000029d8b6>] __alloc_pages_nodemask+0xfe/0x10c0)
[13355.124724]  [<000000000013bd1e>] s390_dma_alloc+0x6e/0x148
[13355.124733]  [<000003ff802eeba6>] mlx5_dma_zalloc_coherent_node+0x8e/0xe0 [mlx5_core]
[13355.124740]  [<000003ff802eee18>] mlx5_buf_alloc_node+0x70/0x108 [mlx5_core]
[13355.124744]  [<000003ff804eb410>] mlx5_ib_create_cq+0x558/0x898 [mlx5_ib]
[13355.124749]  [<000003ff80407d40>] ib_create_cq+0x48/0x88 [ib_core]
[13355.124751]  [<000003ff80109fba>] smc_ib_setup_per_ibdev+0x52/0x118 [smc]
[13355.124753]  [<000003ff8010bcb6>] smc_conn_create+0x65e/0x728 [smc]
[13355.124755]  [<000003ff801081a2>] smc_listen_work+0x2d2/0x540 [smc]
[13355.124756]  [<0000000000162c66>] process_one_work+0x1be/0x440
[13355.124758]  [<0000000000162f40>] worker_thread+0x58/0x458
[13355.124759]  [<0000000000169e7e>] kthread+0x14e/0x168
[13355.124760]  [<00000000009ce8be>] kernel_thread_starter+0x6/0xc
[13355.124762]  [<00000000009ce8b8>] kernel_thread_starter+0x0/0xc
[13355.124762] Last Breaking-Event-Address:
[13355.124764]  [<000000000029da72>] __alloc_pages_nodemask+0x2ba/0x10c0
[13355.124764] ---[ end trace 34be38b581c0b585 ]---

This patch reduces the smc constant for the maximum number of allocated
completion queue entries SMC_MAX_CQE by 2 to avoid high round up values
in the mlx5 code, and reduces the number of allocated completion queue
entries even more, if the final allocation for an mlx5 device hits the
MAX_ORDER limit.
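
A minimal sketch of why a slightly smaller constant helps, under stated
assumptions: the device driver is assumed to add one entry and round up to
a power of two before allocating, and the entry size and allocation limit
used below are hypothetical. The helper names are illustrative, and the
halving loop is a simplification rather than the computation in the patch.

```c
#include <stddef.h>

/* Round n up to the next power of two (n >= 1). */
static size_t roundup_pow2(size_t n)
{
    size_t p = 1;

    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Halve the requested entry count until the rounded-up allocation
 * (entries + extra, rounded to a power of two, times entry_size)
 * fits into max_alloc bytes.
 */
static size_t clamp_cqe(size_t wanted, size_t extra,
                        size_t entry_size, size_t max_alloc)
{
    size_t cqe = wanted;

    while (cqe > 1 && roundup_pow2(cqe + extra) * entry_size > max_alloc)
        cqe /= 2;
    return cqe;
}
```

With these assumptions, 32768 entries plus one round up to 65536, while
32766 entries plus one round up to only 32768, which halves the allocation.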

Reported-by: Ihnken Menssen <menssen@de.ibm.com>
Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoliquidio: Add support for liquidio 10GBase-T NIC
Veerasenareddy Burru [Wed, 14 Mar 2018 05:04:45 +0000 (22:04 -0700)]
liquidio: Add support for liquidio 10GBase-T NIC

Added ethtool changes to show port type as TP (Twisted Pair) for
10GBASE-T ports. The same driver and firmware work for liquidio NICs with
SFP+ ports or TP ports.

Signed-off-by: Veerasenareddy Burru <veerasenareddy.burru@cavium.com>
Signed-off-by: Felix Manlunas <felix.manlunas@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotuntap: XDP_TX can use native XDP
Jason Wang [Wed, 14 Mar 2018 03:23:40 +0000 (11:23 +0800)]
tuntap: XDP_TX can use native XDP

Now we have ndo_xdp_xmit, switch to use it instead of the slow generic
XDP TX routine. XDP_TX on TAP gets ~20% improvements from ~1.5Mpps to
~1.8Mpps on 2.60GHz Core(TM) i7-5600U.

Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'sfc-support-FEC-configuration'
David S. Miller [Wed, 14 Mar 2018 17:12:15 +0000 (13:12 -0400)]
Merge branch 'sfc-support-FEC-configuration'

Edward Cree says:

====================
sfc: support FEC configuration

Implements the ethtool get & set fecparam operations.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosfc: support FEC configuration through ethtool
Edward Cree [Wed, 14 Mar 2018 14:21:26 +0000 (14:21 +0000)]
sfc: support FEC configuration through ethtool

As well as 'auto' and the forced 'off', 'rs' and 'baser' states, we also
 handle combinations of settings (since the fecparam->fec field is a
 bitmask), where auto|rs and auto|baser specify a preferred FEC mode but
 will fall back to the other if the cable or link partner doesn't support
 it.  rs|baser (with or without auto bit) means prefer FEC even where
 auto wouldn't use it, but let FW choose which encoding to use.
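
A minimal sketch of how the bitmask combinations above could be decoded.
The bit values and the enum are hypothetical and are not the ethtool UAPI
constants; treating 'off' combined with any other bit as invalid is also an
assumption.

```c
/* Hypothetical bit values for illustration only. */
#define FEC_AUTO  (1u << 0)
#define FEC_OFF   (1u << 1)
#define FEC_RS    (1u << 2)
#define FEC_BASER (1u << 3)

enum fec_policy {
    FEC_POLICY_INVALID,
    FEC_POLICY_AUTO,         /* let auto decide whether to use FEC */
    FEC_POLICY_OFF,          /* forced off */
    FEC_POLICY_FORCE_RS,     /* forced rs */
    FEC_POLICY_FORCE_BASER,  /* forced baser */
    FEC_POLICY_PREFER_RS,    /* rs preferred, fall back to baser */
    FEC_POLICY_PREFER_BASER, /* baser preferred, fall back to rs */
    FEC_POLICY_PREFER_ANY    /* prefer FEC, firmware picks the encoding */
};

/* Map a requested bitmask to the behaviour described in the message. */
static enum fec_policy fec_decode(unsigned int fec)
{
    switch (fec) {
    case FEC_AUTO:
        return FEC_POLICY_AUTO;
    case FEC_OFF:
        return FEC_POLICY_OFF;
    case FEC_RS:
        return FEC_POLICY_FORCE_RS;
    case FEC_BASER:
        return FEC_POLICY_FORCE_BASER;
    case FEC_AUTO | FEC_RS:
        return FEC_POLICY_PREFER_RS;
    case FEC_AUTO | FEC_BASER:
        return FEC_POLICY_PREFER_BASER;
    case FEC_RS | FEC_BASER:
    case FEC_AUTO | FEC_RS | FEC_BASER:
        return FEC_POLICY_PREFER_ANY;
    default:
        return FEC_POLICY_INVALID;
    }
}
```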

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosfc: update MCDI protocol headers
Edward Cree [Wed, 14 Mar 2018 14:21:00 +0000 (14:21 +0000)]
sfc: update MCDI protocol headers

Signed-off-by: Edward Cree <ecree@solarflare.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Fix reset return from closed state
John Allen [Wed, 14 Mar 2018 15:41:29 +0000 (10:41 -0500)]
ibmvnic: Fix reset return from closed state

The case in which we handle a reset from the state where the device is
closed seems to be bugged for all types of reset. For most types of reset
we currently exit the reset routine correctly, but don't set the state to
indicate that we are back in the "closed" state. For some specific cases,
we don't exit the reset routine at all and resetting will cause a closed
device to be opened.

This patch fixes the problem by unconditionally checking the reset_state
and correctly setting the adapter state before returning.

Signed-off-by: John Allen <jallen@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosock: remove zerocopy sockopt restriction on closed tcp state
Willem de Bruijn [Wed, 14 Mar 2018 16:49:19 +0000 (12:49 -0400)]
sock: remove zerocopy sockopt restriction on closed tcp state

Socket option SO_ZEROCOPY determines whether the kernel ignores or
processes flag MSG_ZEROCOPY on subsequent send calls. This is to avoid
changing behavior for legacy processes.

Limiting the state change to closed sockets is annoying with passive
sockets and not necessary for correctness. Once created, zerocopy skbs
are processed based on their private state, not this socket flag.

Remove the constraint.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agopktgen: Fix memory leak in pktgen_if_write
Gustavo A. R. Silva [Wed, 14 Mar 2018 08:07:27 +0000 (03:07 -0500)]
pktgen: Fix memory leak in pktgen_if_write

_buf_ is an array and the one that must be freed is _tp_ instead.

Fixes: a870a02cc963 ("pktgen: use dynamic allocation for debug print buffer")
Reported-by: Wang Jian <jianjian.wang1@gmail.com>
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agopktgen: use dynamic allocation for debug print buffer
Arnd Bergmann [Tue, 13 Mar 2018 20:58:39 +0000 (21:58 +0100)]
pktgen: use dynamic allocation for debug print buffer

After the removal of the VLA, we get a harmless warning about a large
stack frame:

net/core/pktgen.c: In function 'pktgen_if_write':
net/core/pktgen.c:1710:1: error: the frame size of 1076 bytes is larger than 1024 bytes [-Werror=frame-larger-than=]

The function was previously shown to be safe despite hitting
the 1024 byte warning level. To get rid of the annoying warning,
while keeping it readable, this changes it to use strndup_user().

Obviously this is not a fast path, so the kmalloc() overhead
can be disregarded.

Fixes: 35951393bbff ("pktgen: Remove VLA usage")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: fix sysctl_fb_tunnels_only_for_init_net link error
Arnd Bergmann [Tue, 13 Mar 2018 11:44:53 +0000 (12:44 +0100)]
net: fix sysctl_fb_tunnels_only_for_init_net link error

The new variable is only available when CONFIG_SYSCTL is enabled,
otherwise we get a link error:

net/ipv4/ip_tunnel.o: In function `ip_tunnel_init_net':
ip_tunnel.c:(.text+0x278b): undefined reference to `sysctl_fb_tunnels_only_for_init_net'
net/ipv6/sit.o: In function `sit_init_net':
sit.c:(.init.text+0x4c): undefined reference to `sysctl_fb_tunnels_only_for_init_net'
net/ipv6/ip6_tunnel.o: In function `ip6_tnl_init_net':
ip6_tunnel.c:(.init.text+0x39): undefined reference to `sysctl_fb_tunnels_only_for_init_net'

This adds an extra condition, keeping the traditional behavior when
CONFIG_SYSCTL is disabled.

Fixes: 79134e6ce2c9 ("net: do not create fallback tunnels for non-default namespaces")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Add comment about pernet_operations methods and synchronization
Kirill Tkhai [Tue, 13 Mar 2018 10:55:55 +0000 (13:55 +0300)]
net: Add comment about pernet_operations methods and synchronization

Make the locking scheme visible to users, and add a comment
explaining why exit_batch() methods are needed
and when they should be used.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agocxgb4: Add HMA support
Arjun Vynipadath [Tue, 13 Mar 2018 10:54:45 +0000 (16:24 +0530)]
cxgb4: Add HMA support

HMA (Host Memory Access) maps a part of host memory for T6-SO memfree cards.

This commit does the following:
- Query FW to check if we have HMA support. If yes, the params will
  return HMA size configured in FW. We will dma map memory based
  on this size.
- Also contains changes to get HMA memory information via debugfs.

Signed-off-by: Arjun Vynipadath <arjun@chelsio.com>
Signed-off-by: Santosh Rastapur <santosh@chelsio.com>
Signed-off-by: Michael Werner <werner@chelsio.com>
Signed-off-by: Ganesh GR <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'pernet-convert-part6'
David S. Miller [Tue, 13 Mar 2018 15:24:57 +0000 (11:24 -0400)]
Merge branch 'pernet-convert-part6'

Kirill Tkhai says:

====================
Converting pernet_operations (part #6)

This series continues to review and convert pernet_operations
so that they can be executed in parallel for several
net namespaces at the same time. There are sctp, tipc and rds
in this series.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Convert rds_tcp_net_ops
Kirill Tkhai [Tue, 13 Mar 2018 10:37:21 +0000 (13:37 +0300)]
net: Convert rds_tcp_net_ops

These pernet_operations create and destroy sysctl table
and listen socket. Also, exit method flushes global
workqueue and work. Everything looks per-net safe,
so we can mark them async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Convert tipc_net_ops
Kirill Tkhai [Tue, 13 Mar 2018 10:37:11 +0000 (13:37 +0300)]
net: Convert tipc_net_ops

TIPC looks self-contained, and other pernet_operations
do not seem to touch its entities.

tipc_net_ops look pernet-divided, and they should be safe to
be executed in parallel for several net namespaces at the same time.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Convert sctp_ctrlsock_ops
Kirill Tkhai [Tue, 13 Mar 2018 10:37:02 +0000 (13:37 +0300)]
net: Convert sctp_ctrlsock_ops

These pernet_operations create and destroy net::sctp::ctl_sock.
Since pernet_operations do not send sctp packets to each other,
they look safe to be marked as async.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: Convert sctp_defaults_ops
Kirill Tkhai [Tue, 13 Mar 2018 10:36:51 +0000 (13:36 +0300)]
net: Convert sctp_defaults_ops

These pernet_operations deal with sysctl, /proc
entries and statistics. Also, the exit method frees the
net::sctp::addr_waitq queue and net::sctp::local_addr_list.
All of them look pernet-divided, and it
seems these items are only interesting for sctp_defaults_ops,
which are safe to be executed in parallel.

Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agosctp: fix error return code in sctp_sendmsg_new_asoc()
Wei Yongjun [Tue, 13 Mar 2018 03:03:30 +0000 (03:03 +0000)]
sctp: fix error return code in sctp_sendmsg_new_asoc()

Return error code -EINVAL in the address len check error handling
case since 'err' can be overwritten to 0 by 'err = sctp_verify_addr()'
in the for loop.
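
A minimal sketch of the bug pattern and its fix, using a hypothetical item
type and verifier in place of the SCTP structures. The point is that the
length check must set the error code itself, because a successful verify
call in an earlier iteration has already left 'err' at 0.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for one address item in a control message. */
struct addr_item {
    size_t len; /* length of the supplied address */
    int valid;  /* whether the verifier accepts it */
};

/* Hypothetical verifier: returns 0 on success, -EINVAL otherwise. */
static int verify_item(const struct addr_item *it)
{
    return it->valid ? 0 : -EINVAL;
}

static int process_items(const struct addr_item *items, size_t n,
                         size_t min_len)
{
    int err = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (items[i].len < min_len) {
            /* Without this assignment a stale 0 would be returned. */
            err = -EINVAL;
            goto out;
        }
        err = verify_item(&items[i]);
        if (err)
            goto out;
    }
out:
    return err;
}
```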

Fixes: 2c0dbaa0c43d ("sctp: add support for SCTP_DSTADDRV4/6 Information for sendmsg")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Fix recent errata commit
Thomas Falcon [Tue, 13 Mar 2018 02:05:26 +0000 (21:05 -0500)]
ibmvnic: Fix recent errata commit

Sorry, one of the patches I sent in an earlier series
has some dumb mistakes. One was that I had changed the
parameter for the errata workaround function but forgot
to make that change in the code that called it.

The second mistake was a forgotten return value at the end
of the function in case the workaround was not needed.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch 'ibmvnic-Fix-VLAN-and-other-device-errata'
David S. Miller [Tue, 13 Mar 2018 14:09:46 +0000 (10:09 -0400)]
Merge branch 'ibmvnic-Fix-VLAN-and-other-device-errata'

Thomas Falcon says:

====================
ibmvnic: Fix VLAN and other device errata

This patch series contains fixes for VLAN and other backing hardware
errata. The VLAN fixes are mostly to account for the additional four
bytes VLAN header in TX descriptors and buffers, when applicable.

The other fixes for device errata are to pad small packets to avoid a
possible connection error that can occur when some devices attempt to
transmit small packets. The other fixes are GSO related. Some devices
cannot handle a smaller MSS or a packet with a single segment, so
disable GSO in those cases.

v2: Fix style mistake (unneeded brackets) in patch 3/4
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Handle TSO backing device errata
Thomas Falcon [Mon, 12 Mar 2018 16:51:05 +0000 (11:51 -0500)]
ibmvnic: Handle TSO backing device errata

TSO packets with one segment or with an MSS less than 224 can
cause errors on some backing devices, so disable GSO in those cases.
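
A minimal sketch of the condition above. The threshold of 224 comes from
the message; the feature bit and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MIN_TSO_MSS 224        /* threshold stated in the message */
#define FEATURE_GSO (1u << 0)  /* hypothetical feature bit */

/* True when the packet would trigger the backing device errata. */
static bool tso_hits_errata(unsigned int gso_segs, unsigned int gso_size)
{
    return gso_segs == 1 || gso_size < MIN_TSO_MSS;
}

/* Clear the GSO feature for packets the backing device cannot handle. */
static uint32_t fix_tx_features(uint32_t features, unsigned int gso_segs,
                                unsigned int gso_size)
{
    if (tso_hits_errata(gso_segs, gso_size))
        features &= ~FEATURE_GSO;
    return features;
}
```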

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Pad small packets to minimum MTU size
Thomas Falcon [Mon, 12 Mar 2018 16:51:04 +0000 (11:51 -0500)]
ibmvnic: Pad small packets to minimum MTU size

Some backing devices cannot handle small packets well,
so pad any small packets to avoid that. It was recommended
that the VNIC driver should not send packets smaller than the
minimum MTU value provided by firmware, so pad small packets
to be at least that long.
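
A minimal sketch of the padding step on a plain byte buffer, assuming
zero padding and a caller-supplied minimum length. The function name and
parameters are hypothetical; the driver works on socket buffers instead.

```c
#include <stddef.h>
#include <string.h>

/*
 * Pad a frame in place with zero bytes up to min_len.
 * Returns the resulting length, or 0 if the buffer capacity is too
 * small to hold the padded frame.
 */
static size_t pad_frame(unsigned char *buf, size_t len, size_t cap,
                        size_t min_len)
{
    if (len >= min_len)
        return len; /* already long enough, nothing to do */
    if (min_len > cap)
        return 0;   /* cannot pad without overrunning the buffer */
    memset(buf + len, 0, min_len - len);
    return min_len;
}
```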

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Account for VLAN header length in TX buffers
Thomas Falcon [Mon, 12 Mar 2018 16:51:03 +0000 (11:51 -0500)]
ibmvnic: Account for VLAN header length in TX buffers

The extra four bytes of a VLAN packet were throwing off
TX buffer entry values used by the driver. Account for those
bytes in buffer size and buffer entry calculations.

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoibmvnic: Account for VLAN tag in L2 Header descriptor
Thomas Falcon [Mon, 12 Mar 2018 16:51:02 +0000 (11:51 -0500)]
ibmvnic: Account for VLAN tag in L2 Header descriptor

If a VLAN tag is present in the Ethernet header, account
for that when providing the L2 header to firmware.
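
A minimal sketch of the length calculation, assuming a standard Ethernet
header of 14 bytes and a single 802.1Q tag (EtherType 0x8100) of 4 bytes.
The function name is hypothetical, and stacked tags are not handled.

```c
#include <stddef.h>
#include <stdint.h>

#define ETH_HDR_LEN     14     /* dst MAC + src MAC + EtherType */
#define VLAN_TAG_LEN    4      /* one 802.1Q tag */
#define ETHERTYPE_8021Q 0x8100

/*
 * Return the L2 header length of a frame: 14 bytes, or 18 when a single
 * 802.1Q tag is present. Returns 0 if the frame is too short.
 */
static size_t l2_header_len(const uint8_t *frame, size_t len)
{
    uint16_t ethertype;

    if (len < ETH_HDR_LEN)
        return 0;
    /* The EtherType sits in bytes 12..13, in network byte order. */
    ethertype = (uint16_t)((frame[12] << 8) | frame[13]);
    if (ethertype != ETHERTYPE_8021Q)
        return ETH_HDR_LEN;
    if (len < ETH_HDR_LEN + VLAN_TAG_LEN)
        return 0;
    return ETH_HDR_LEN + VLAN_TAG_LEN;
}
```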

Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc-testing: updated gact tests with batch test cases
Roman Mashak [Mon, 12 Mar 2018 20:07:02 +0000 (16:07 -0400)]
tc-testing: updated gact tests with batch test cases

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agotc-testing: add TC vlan action tests
Roman Mashak [Mon, 12 Mar 2018 20:06:37 +0000 (16:06 -0400)]
tc-testing: add TC vlan action tests

Signed-off-by: Roman Mashak <mrv@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: phy: set link state to down when creating the phy_device
Heiner Kallweit [Sun, 11 Mar 2018 14:00:37 +0000 (15:00 +0100)]
net: phy: set link state to down when creating the phy_device

Currently the link state is initialized to "up" when the phy_device is
being created. This is not consistent with the phy state being
initialized to PHY_DOWN.

Usually this doesn't do any harm because the link state is updated
once the PHY reaches state PHY_AN. However e.g. if a LAN port isn't
used and the PHY remains down this inconsistency remains and calls
to functions like phy_print_status() give false results.
Therefore change the initialization to link being down.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoMerge branch '10GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/jkirsher/next...
David S. Miller [Mon, 12 Mar 2018 21:30:02 +0000 (17:30 -0400)]
Merge branch '10GbE' of git://git./linux/kernel/git/jkirsher/next-queue

Jeff Kirsher says:

====================
10GbE Intel Wired LAN Driver Updates 2018-03-12

This series contains updates to ixgbe and ixgbevf only.

Shannon Nelson provides three fixes to the ipsec portion of ixgbe.  Make
sure we are using 128-bit authentication, since it is the only size
supported for hardware offload.  Fixed the transmit trailer length
calculation for ipsec by finding the padding value and adding it to the
authentication length, then save it off so that we can put it in the
transmit descriptor to tell the device where to stop the checksum
calculation.  Lastly, cleaned up useless and dead code.

Tonghao Zhang adds an ethtool stat for receive length errors, since the
driver was already collecting this counter.

Arnd Bergmann fixed a warning about an unused variable by "rephrasing" the
code so that the compiler can see the use of the variable in question.

Paul fixes an issue where "HIDE_VLAN" was being cleared on VF reset, so
make sure "HIDE_VLAN" is set when port VLAN is enabled after a VF reset.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoixgbe: fix disabling hide VLAN on VF reset
Paul Greenwalt [Thu, 8 Mar 2018 12:26:08 +0000 (07:26 -0500)]
ixgbe: fix disabling hide VLAN on VF reset

If port VLAN is enabled, set PFQDE.HIDE_VLAN during VF reset.

Setting only PFQDE.PFQDE during VF reset was clearing PFQDE.HIDE_VLAN.

Signed-off-by: Paul Greenwalt <paul.greenwalt@intel.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agonet: rds: drop VLA in rds_walk_conn_path_info()
Salvatore Mesoraca [Sun, 11 Mar 2018 21:07:50 +0000 (22:07 +0100)]
net: rds: drop VLA in rds_walk_conn_path_info()

Avoid VLA[1] by using an already allocated buffer passed
by the caller.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agonet: rds: drop VLA in rds_for_each_conn_info()
Salvatore Mesoraca [Sun, 11 Mar 2018 21:07:49 +0000 (22:07 +0100)]
net: rds: drop VLA in rds_for_each_conn_info()

Avoid VLA[1] by using an already allocated buffer passed
by the caller.

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years agoixgbevf: fix unused variable warning
Arnd Bergmann [Wed, 28 Feb 2018 23:17:36 +0000 (00:17 +0100)]
ixgbevf: fix unused variable warning

The new ixgbevf_set_rx_buffer_len() function causes a harmless warning
in configurations with large page size:

drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c: In function 'ixgbevf_set_rx_buffer_len':
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c:1758:15: error: unused variable 'max_frame' [-Werror=unused-variable]

This rephrases the code so that the compiler can see the use of that
variable, making it slightly easier to read in the process.

Fixes: f15c5ba5b6cd ("ixgbevf: add support for using order 1 pages to receive large frames")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: Add receive length error counter
Tonghao Zhang [Wed, 28 Feb 2018 11:59:09 +0000 (03:59 -0800)]
ixgbe: Add receive length error counter

ixgbe already enables the rlec counter and uses it for rx_error.
We can export the counter directly via ethtool -S ethX.

Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: remove unneeded ipsec state free callback
Shannon Nelson [Thu, 22 Feb 2018 19:09:57 +0000 (11:09 -0800)]
ixgbe: remove unneeded ipsec state free callback

With commit 7f05b467a735 ("xfrm: check for xdo_dev_state_free")
we no longer need to add an empty callback function
to the driver, so now let's remove the useless code.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years agoixgbe: fix ipsec trailer length
Shannon Nelson [Thu, 22 Feb 2018 19:09:56 +0000 (11:09 -0800)]
ixgbe: fix ipsec trailer length

Fix up the Tx trailer length calculation.  We can't believe the
trailer len from the xstate information because it was calculated
before the packet was put together and padding added.  This bit
of code finds the padding value in the trailer, adds it to the
authentication length, and saves it so later we can put it into
the Tx descriptor to tell the device where to stop the checksum
calculation.
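
A minimal sketch of the calculation described above, working on a plain byte
buffer rather than an skb. It assumes the standard ESP tail layout (padding,
one pad-length byte, one next-header byte, then the ICV) and a 16-byte ICV;
AUTH_LEN and esp_trailer_len() are hypothetical names, not the driver's.

```c
#include <stddef.h>
#include <stdint.h>

#define AUTH_LEN 16 /* assumed ICV size in bytes (128-bit authentication) */

/* Returns the ESP trailer length of a fully built packet, or -1 if the
 * packet is too short to contain a trailer.
 */
int esp_trailer_len(const uint8_t *pkt, size_t len)
{
	uint8_t padlen;

	if (len < AUTH_LEN + 2)
		return -1;

	/* Tail layout: padding | pad length | next header | ICV.
	 * The pad length byte sits AUTH_LEN + 2 bytes before the end.
	 */
	padlen = pkt[len - (AUTH_LEN + 2)];

	/* ICV, the two one-byte fields, and the padding itself. */
	return AUTH_LEN + 2 + padlen;
}
```

The point of reading the byte from the finished packet is that the padding is
only known once the packet has been assembled.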

Fixes: 592594704761 ("ixgbe: process the Tx ipsec offload")
Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago ixgbe: check for 128-bit authentication
Shannon Nelson [Thu, 22 Feb 2018 19:09:55 +0000 (11:09 -0800)]
ixgbe: check for 128-bit authentication

Make sure the Security Association is using
a 128-bit authentication, since that's the only
size that the hardware offload supports.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Tested-by: Andrew Bowers <andrewx.bowers@intel.com>
Signed-off-by: Jeff Kirsher <jeffrey.t.kirsher@intel.com>
6 years ago mlxsw: spectrum_kvdl: Make some functions static
Wei Yongjun [Mon, 12 Mar 2018 12:25:24 +0000 (12:25 +0000)]
mlxsw: spectrum_kvdl: Make some functions static

Fixes the following sparse warnings:

drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:371:5: warning:
 symbol 'mlxsw_sp_kvdl_single_occ_get' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:384:5: warning:
 symbol 'mlxsw_sp_kvdl_chunks_occ_get' was not declared. Should it be static?
drivers/net/ethernet/mellanox/mlxsw/spectrum_kvdl.c:397:5: warning:
 symbol 'mlxsw_sp_kvdl_large_chunks_occ_get' was not declared. Should it be static?

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: Make RX-FCS and HW GRO mutually exclusive
Gal Pressman [Mon, 12 Mar 2018 09:48:49 +0000 (11:48 +0200)]
net: Make RX-FCS and HW GRO mutually exclusive

Same as LRO, hardware GRO cannot be enabled with RX-FCS.
When both are requested, hardware GRO will be dropped.
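
The rule can be sketched as a feature fix-up step along these lines. The
F_* bits and fix_features() are hypothetical stand-ins for the kernel's
NETIF_F_RXFCS, NETIF_F_LRO and NETIF_F_GRO_HW flags and its feature fix-up
code; this is an illustration of the policy, not the patch itself.

```c
#include <stdint.h>

/* Hypothetical feature bits standing in for the kernel's NETIF_F_* flags. */
#define F_RXFCS  (1u << 0)
#define F_LRO    (1u << 1)
#define F_GRO_HW (1u << 2)

/* Resolve conflicting requests: when RX-FCS is requested, both receive
 * aggregation offloads are dropped from the feature set.
 */
uint32_t fix_features(uint32_t features)
{
	if (features & F_RXFCS)
		features &= ~(F_LRO | F_GRO_HW);

	return features;
}
```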

Suggested-by: David Miller <davem@davemloft.net>
Signed-off-by: Gal Pressman <galp@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: llc: drop VLA in llc_sap_mcast()
Salvatore Mesoraca [Sun, 11 Mar 2018 21:12:04 +0000 (22:12 +0100)]
net: llc: drop VLA in llc_sap_mcast()

Avoid a VLA[1] by using a real constant expression instead of a variable.
The compiler should be able to optimize the original code and avoid using
an actual VLA. Anyway, this change is useful because it avoids a false
positive with -Wvla, and it might also help the compiler generate better
code.
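
To illustrate the difference, here is a small sketch under the assumption
that the array bound was previously held in a local variable. STACK_SLOTS
and stack_capacity() are hypothetical names used only for this example.

```c
#include <stddef.h>

struct sock; /* opaque for the purposes of this sketch */

/* Before (sketch): the bound is a variable, so the array is formally a
 * VLA even though its value never changes:
 *
 *     int count = 256 / sizeof(struct sock *);
 *     struct sock *stack[count];
 */

/* After: the bound is a constant expression, so -Wvla stays quiet. */
#define STACK_SLOTS (256 / sizeof(struct sock *))

size_t stack_capacity(void)
{
	struct sock *stack[STACK_SLOTS];

	return sizeof(stack) / sizeof(stack[0]);
}
```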

[1] https://lkml.org/lkml/2018/3/7/621

Signed-off-by: Salvatore Mesoraca <s.mesoraca16@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago lan743x: make functions lan743x_csr_read and lan743x_csr_write static
Colin Ian King [Sun, 11 Mar 2018 16:55:47 +0000 (17:55 +0100)]
lan743x: make functions lan743x_csr_read and lan743x_csr_write static

Functions lan743x_csr_read and lan743x_csr_write are local to the source
and do not need to be in global scope, so make them static.

Cleans up sparse warning:
drivers/net/ethernet/microchip/lan743x_main.c:56:5: warning: symbol
'lan743x_csr_read' was not declared. Should it be static?
drivers/net/ethernet/microchip/lan743x_main.c:61:6: warning: symbol
'lan743x_csr_write' was not declared. Should it be static?
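
Sparse raises this warning for a function with external linkage that has no
prior declaration. The sketch below shows the general shape of the fix with
hypothetical names (struct adapter, csr_read, csr_write, csr_set_bits); it is
not the driver's code.

```c
#include <stdint.h>

/* Hypothetical register file standing in for memory-mapped CSRs. */
struct adapter {
	uint32_t regs[16];
};

/* File-local helpers: 'static' gives them internal linkage, so no header
 * declaration is expected and the symbols stay out of the global namespace.
 */
static uint32_t csr_read(struct adapter *a, int idx)
{
	return a->regs[idx];
}

static void csr_write(struct adapter *a, int idx, uint32_t val)
{
	a->regs[idx] = val;
}

/* The only entry point meant to be visible outside this file. */
uint32_t csr_set_bits(struct adapter *a, int idx, uint32_t bits)
{
	csr_write(a, idx, csr_read(a, idx) | bits);
	return csr_read(a, idx);
}
```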

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago lan743x: remove some redundant variables and assignments
Colin Ian King [Sun, 11 Mar 2018 16:42:33 +0000 (17:42 +0100)]
lan743x: remove some redundant variables and assignments

Function lan743x_phy_init assigns pointer 'netdev' but this is never read
and hence it can be removed. The return error code handling can also be
cleaned up to remove the variable 'ret'.

Function lan743x_phy_link_status_change assigns pointer 'phy' twice and
this is never read, so it also can be removed.

Finally, function lan743x_tx_napi_poll initializes pointer 'adapter'
and then re-assigns the same value into this pointer a little later on
so this second assignment is redundant and can also be removed.

Cleans up clang warnings:
drivers/net/ethernet/microchip/lan743x_main.c:951:2: warning: Value
stored to 'netdev' is never read
drivers/net/ethernet/microchip/lan743x_main.c:971:3: warning: Value
stored to 'phy' is never read
drivers/net/ethernet/microchip/lan743x_main.c:1583:26: warning: Value
stored to 'adapter' during its initialization is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago rds: remove redundant variable 'sg_off'
Colin Ian King [Sun, 11 Mar 2018 16:27:56 +0000 (17:27 +0100)]
rds: remove redundant variable 'sg_off'

Variable sg_off is assigned a value but it is never read, hence it is
redundant and can be removed.

Cleans up clang warning:
net/rds/message.c:373:2: warning: Value stored to 'sg_off' is never read

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com>
Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago ipv6: Use ip6_multipath_hash_policy() in rt6_multipath_hash().
David S. Miller [Mon, 12 Mar 2018 15:09:33 +0000 (11:09 -0400)]
ipv6: Use ip6_multipath_hash_policy() in rt6_multipath_hash().

Make use of the new helper.

Suggested-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago Merge branch 'mlxsw-Removing-dependency-of-mlxsw-on-GRE'
David S. Miller [Mon, 12 Mar 2018 15:07:16 +0000 (11:07 -0400)]
Merge branch 'mlxsw-Removing-dependency-of-mlxsw-on-GRE'

Ido Schimmel says:

====================
mlxsw: Removing dependency of mlxsw on GRE

Petr says:

mlxsw_spectrum supports offloading of a tc action mirred egress mirror
to a gretap or ip6gretap netdevice, which necessitates calls to
functions defined in ip_gre, ip6_gre and ip6_tunnel modules. Previously
this was enabled by introducing a hard dependency of MLXSW_SPECTRUM on
NET_IPGRE and IPV6_GRE. However the rest of mlxsw is careful about
picking which modules are absolutely required, and therefore the better
approach is to make mlxsw_spectrum tolerant of absence of one or both of
the GRE flavors.

One way this might be resolved is by keeping the code in mlxsw_spectrum
intact, and defining defaults for functions that mlxsw_spectrum depends
on. The downsides are that other modules end up littered with these
do-nothing defaults; that the driver ends up carrying quite a bit of
dead code; and that the driver ends up having to explicitly depend on
IPV6_TUNNEL to prevent configurations where mlxsw_spectrum is compiled
in and ip6_tunnel is a module, something that it currently can treat
as an implementation detail of the IPV6_GRE dependency.

Alternatively, the driver should just bite the bullet and ifdef-out the
code that handles configurations that are not supported. Since that's
what we are doing for IPv6 dependency, let's do the same for the GRE
flavors.

Patch #1 introduces a wrapper function for determining the value of
ipv6.sysctl.multipath_hash_policy, which defaults to 0 on non-IPv6
builds. That function is then used from spectrum_router.c, instead of
the direct variable reference that was introduced there during the short
window when the Spectrum driver had a hard dependency on IPv6.

Patch #2 moves one function to keep together in one block all the
callbacks for handling (IPv4) gretap mirroring.

Patch #3 then introduces the ifdefs to hide the irrelevant code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago mlxsw: spectrum: Don't depend on ip_gre and ip6_gre
Petr Machata [Sun, 11 Mar 2018 07:45:49 +0000 (09:45 +0200)]
mlxsw: spectrum: Don't depend on ip_gre and ip6_gre

mlxsw_spectrum supports offloading of a tc action mirred egress mirror
to a gretap or an ip6gretap netdevice, which necessitates calls to
functions defined in ip_gre, ip6_gre and ip6_tunnel modules. Previously
this was enabled by introducing a hard dependency of MLXSW_SPECTRUM on
NET_IPGRE and IPV6_GRE. However the rest of mlxsw is careful about
picking which modules are absolutely required, and therefore the better
approach is to make mlxsw_spectrum tolerant of absence of one or both of
the GRE flavors.

Hence rework the NET_IPGRE and IPV6_GRE dependencies to just guard
matching modularity, and hide the corresponding code in spectrum_span.c
in an #if IS_ENABLED. Mark mlxsw_sp_span_entry_tunnel_parms_common as
maybe unused, to muffle warnings if neither GRE flavor is selected,
which seems cleaner than introducing a composite #if.
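
The pattern can be sketched roughly as follows. HAVE_GRETAP4 and
HAVE_GRETAP6 are hypothetical switches standing in for
IS_ENABLED(CONFIG_NET_IPGRE) and IS_ENABLED(CONFIG_IPV6_GRE), the function
names are invented for the example, and __attribute__((unused)) is what the
kernel's __maybe_unused annotation expands to on GCC and Clang.

```c
/* Hypothetical config switches; in the kernel these come from Kconfig. */
#define HAVE_GRETAP4 1
#define HAVE_GRETAP6 0

/* Shared helper. With both flavors disabled nothing calls it, so it is
 * marked as possibly unused instead of being wrapped in a composite #if.
 */
__attribute__((unused))
static int tunnel_parms_common(int ttl)
{
	return ttl ? ttl : 64;
}

#if HAVE_GRETAP4
static int gretap4_parms(int ttl)
{
	return tunnel_parms_common(ttl);
}
#endif

#if HAVE_GRETAP6
static int gretap6_parms(int ttl)
{
	return tunnel_parms_common(ttl);
}
#endif

/* Counts the tunnel flavors compiled into this build. */
int span_flavors_supported(void)
{
	int n = 0;

#if HAVE_GRETAP4
	n += gretap4_parms(0) > 0;
#endif
#if HAVE_GRETAP6
	n += gretap6_parms(0) > 0;
#endif
	return n;
}
```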

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago mlxsw: spectrum: Move mlxsw_sp_span_gretap4_route()
Petr Machata [Sun, 11 Mar 2018 07:45:48 +0000 (09:45 +0200)]
mlxsw: spectrum: Move mlxsw_sp_span_gretap4_route()

Move the function next to the rest of gretap4 functions. Thus the
generic functions shared between gretap4 and gretap6 are in one block at
the beginning, followed by a gretap4 block, followed by a gretap6 block.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
6 years ago net: ipv6: Introduce ip6_multipath_hash_policy()
Petr Machata [Sun, 11 Mar 2018 07:45:47 +0000 (09:45 +0200)]
net: ipv6: Introduce ip6_multipath_hash_policy()

In order to abstract away access to the
ipv6.sysctl.multipath_hash_policy variable, which is not available on
systems compiled without IPv6 support, introduce a wrapper function
ip6_multipath_hash_policy() that falls back to 0 on non-IPv6 systems.

Use this wrapper from mlxsw/spectrum_router instead of a direct
reference.
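
A self-contained sketch of such a wrapper, assuming a simplified struct net.
SKETCH_IPV6 is a hypothetical switch standing in for
IS_ENABLED(CONFIG_IPV6), and router_hash_policy() is an invented caller; the
real structure layout is more involved.

```c
/* Hypothetical switch standing in for IS_ENABLED(CONFIG_IPV6). */
#define SKETCH_IPV6 0

/* Simplified stand-in for the kernel's struct net. */
struct net {
#if SKETCH_IPV6
	struct {
		int multipath_hash_policy;
	} ipv6_sysctl;
#endif
	int placeholder; /* keeps the struct non-empty without IPv6 */
};

#if SKETCH_IPV6
static inline int ip6_multipath_hash_policy(const struct net *net)
{
	return net->ipv6_sysctl.multipath_hash_policy;
}
#else
/* Without IPv6 the field does not exist, so fall back to policy 0. */
static inline int ip6_multipath_hash_policy(const struct net *net)
{
	(void)net;
	return 0;
}
#endif

/* Callers use the wrapper and never touch the field directly, so they
 * compile unchanged in both configurations.
 */
int router_hash_policy(const struct net *net)
{
	return ip6_multipath_hash_policy(net);
}
```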

Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Acked-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>