OSDN Git Service

tomoyo/tomoyo-test1.git
5 years ago net: bridge: add support for per-port vlan stats
Nikolay Aleksandrov [Fri, 12 Oct 2018 10:41:16 +0000 (13:41 +0300)]
net: bridge: add support for per-port vlan stats

This patch adds an option to have per-port vlan stats instead of the
default global stats. The option can be set only when there are no port
vlans in the bridge since we need to allocate the stats if it is set
when vlans are being added to ports (and respectively free them
when being deleted). Also bump RTNL_MAX_TYPE as the bridge is the
largest user of options. The current stats design allows us to add
these without any changes to the fast-path, it all comes down to
the per-vlan stats pointer which, if this option is enabled, will
be allocated for each port vlan instead of using the global bridge-wide
one.
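
The pointer-based design described above can be sketched in user-space C. This is only an illustration under stated assumptions: the struct and function names (vlan_stats, vlan_entry, port_vlan_init_stats, and so on) are hypothetical and are not the bridge code's real identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the bridge structures; not the kernel's types. */
struct vlan_stats {
    uint64_t rx_packets;
    uint64_t tx_packets;
};

struct vlan_entry {
    uint16_t vid;
    struct vlan_stats *stats; /* the fast path only ever dereferences this */
    bool owns_stats;          /* true when the stats were allocated per port */
};

/* Attach stats to a port vlan: private when the option is on, shared otherwise. */
int port_vlan_init_stats(struct vlan_entry *port_vlan,
                         struct vlan_stats *global_stats,
                         bool per_port_stats)
{
    assert(port_vlan && global_stats);
    if (per_port_stats) {
        port_vlan->stats = calloc(1, sizeof(*port_vlan->stats));
        if (!port_vlan->stats)
            return -1;
        port_vlan->owns_stats = true;
    } else {
        port_vlan->stats = global_stats;
        port_vlan->owns_stats = false;
    }
    return 0;
}

/* Release stats on vlan deletion; the shared global block is never freed here. */
void port_vlan_free_stats(struct vlan_entry *port_vlan)
{
    if (port_vlan->owns_stats)
        free(port_vlan->stats);
    port_vlan->stats = NULL;
    port_vlan->owns_stats = false;
}

/* Fast path: the same code runs regardless of which mode is active. */
void count_rx(struct vlan_entry *v)
{
    v->stats->rx_packets++;
}
```

Because count_rx() only follows the pointer, switching between global and per-port stats changes where the counter lands without touching the fast path, which is the property the commit message relies on.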

CC: bridge@lists.linux-foundation.org
CC: Roopa Prabhu <roopa@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago fore200e: fix sbus compile
Christoph Hellwig [Fri, 12 Oct 2018 08:17:51 +0000 (10:17 +0200)]
fore200e: fix sbus compile

Fix a stupid typo introduced in the refactoring.

Fixes: 0efe5523 ("fore200e: simplify fore200e_bus usage")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: Evict neighbor entries on carrier down
David Ahern [Fri, 12 Oct 2018 03:33:49 +0000 (20:33 -0700)]
net: Evict neighbor entries on carrier down

When a link's carrier goes down it could be a sign of the port changing
networks. If the new network has overlapping addresses with the old one,
then the kernel will continue trying to use neighbor entries established
based on the old network until the entries finally age out - meaning a
potentially long delay with communications not working.

This patch evicts neighbor entries on carrier down with the exception of
those marked permanent. Permanent entries are managed by userspace (either
an admin or a routing daemon such as FRR).
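
The eviction rule can be illustrated with a small user-space model. The table layout, the state flags and the function name evict_on_carrier_down are assumptions made for this sketch; they only loosely mirror the kernel's neighbour states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state flags, loosely modelled on neighbour states. */
enum neigh_state {
    NEIGH_REACHABLE = 1,
    NEIGH_STALE     = 2,
    NEIGH_PERMANENT = 4,
};

struct neigh_entry {
    unsigned int ifindex;   /* interface the entry was learned on */
    enum neigh_state state;
    int valid;              /* set to 0 once the entry is evicted */
};

/* Evict every non-permanent entry on the given interface.
 * Returns the number of entries that were dropped. */
size_t evict_on_carrier_down(struct neigh_entry *tbl, size_t n,
                             unsigned int ifindex)
{
    size_t evicted = 0;

    assert(tbl || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!tbl[i].valid || tbl[i].ifindex != ifindex)
            continue;
        if (tbl[i].state == NEIGH_PERMANENT)
            continue; /* managed by userspace, so left alone */
        tbl[i].valid = 0;
        evicted++;
    }
    return evicted;
}
```

Entries on other interfaces and permanent entries survive the call; everything else on the affected link is dropped immediately instead of waiting to age out.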

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net/ipv6: Add knob to skip DELROUTE message on device down
David Ahern [Fri, 12 Oct 2018 03:17:21 +0000 (20:17 -0700)]
net/ipv6: Add knob to skip DELROUTE message on device down

Another difference between IPv4 and IPv6 is the generation of RTM_DELROUTE
notifications when a device is taken down (admin down) or deleted. IPv4
does not generate a message for routes evicted by the down or delete;
IPv6 does. A NOS at scale really needs to avoid these messages and have
IPv4 and IPv6 behave similarly, relying on userspace to handle link
notifications and evict the routes.

At this point existing user behavior needs to be preserved. Since
notifications are a global action (not per app) the only way to preserve
existing behavior and allow the messages to be skipped is to add a new
sysctl (net/ipv6/route/skip_notify_on_dev_down) which can be set to
disable the notifications.

IPv6 route code already supports the option to skip the message (it is
used for multipath routes for example). Besides the new sysctl we need
to pass the skip_notify setting through the generic fib6_clean and
fib6_walk functions to fib6_clean_node and to set skip_notify on calls
to __ip_del_rt for the addrconf_ifdown path.
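
As a rough illustration of threading a skip_notify flag through a generic walker down to the per-node cleanup, here is a user-space sketch. All names (route, walk_ctx, clean_node, clean_routes) are hypothetical, and the "notification" is just a counter standing in for an RTM_DELROUTE message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct route {
    unsigned int ifindex;
    bool present;
};

/* Context handed to the walker; skip_notify travels with it. */
struct walk_ctx {
    unsigned int ifindex; /* device going down */
    bool skip_notify;     /* models the sysctl setting */
    size_t deleted;
    size_t notified;      /* stands in for RTM_DELROUTE messages sent */
};

/* Per-node callback: delete matching routes, notify unless told to skip. */
void clean_node(struct route *rt, struct walk_ctx *ctx)
{
    if (!rt->present || rt->ifindex != ctx->ifindex)
        return;
    rt->present = false;
    ctx->deleted++;
    if (!ctx->skip_notify)
        ctx->notified++;
}

/* Generic walker: it does not interpret skip_notify, it only passes it on. */
void clean_routes(struct route *tbl, size_t n, struct walk_ctx *ctx)
{
    assert(ctx);
    for (size_t i = 0; i < n; i++)
        clean_node(&tbl[i], ctx);
}
```

With skip_notify set, the routes are still removed but no per-route message is generated, which matches the IPv4-like behavior the commit is after.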

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: fddi: skfp: Remove unused macros 'PNMI_GET_ID' and 'PNMI_SET_ID'
YueHaibing [Fri, 12 Oct 2018 02:37:41 +0000 (10:37 +0800)]
net: fddi: skfp: Remove unused macros 'PNMI_GET_ID' and 'PNMI_SET_ID'

The two PNMI macros are never used.

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: cdc_ncm: remove set but not used variable 'ctx'
YueHaibing [Fri, 12 Oct 2018 01:49:13 +0000 (01:49 +0000)]
net: cdc_ncm: remove set but not used variable 'ctx'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/usb/cdc_ncm.c: In function 'cdc_ncm_status':
drivers/net/usb/cdc_ncm.c:1603:22: warning:
 variable 'ctx' set but not used [-Wunused-but-set-variable]
  struct cdc_ncm_ctx *ctx;

It is not used any more after
commit fa83dbeee558 ("net: cdc_ncm: remove redundant "disconnected" flag")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago nfp: replace long license headers with SPDX
Jakub Kicinski [Thu, 11 Oct 2018 15:57:42 +0000 (08:57 -0700)]
nfp: replace long license headers with SPDX

Replace the repeated license text with SPDX identifiers.
While at it bump the Copyright dates for files we touched
this year.

Signed-off-by: Edwin Peer <edwin.peer@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Nic Viljoen <nick.viljoen@netronome.com>
Reviewed-by: Simon Horman <simon.horman@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: cdc_ncm: use tasklet_init() for tasklet_struct init
Ben Dooks [Thu, 11 Oct 2018 13:03:32 +0000 (14:03 +0100)]
net: cdc_ncm: use tasklet_init() for tasklet_struct init

The tasklet initialisation would be better done by tasklet_init()
instead of assuming all the fields are in an ok state by default.

This does not fix any actual known bug.

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago selftests: use posix-style redirection in ip_defrag.sh
Paolo Abeni [Thu, 11 Oct 2018 09:17:37 +0000 (11:17 +0200)]
selftests: use posix-style redirection in ip_defrag.sh

The ip_defrag.sh script requires bash-style output redirection but
uses the default shell. This may cause random failures if the default
shell is not bash.
Address the above by using POSIX-compliant output redirection.

Fixes: 02c7f38b7ace ("selftests/net: add ip_defrag selftest")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago cxgb4: fix thermal configuration dependencies
Arnd Bergmann [Thu, 11 Oct 2018 08:57:57 +0000 (10:57 +0200)]
cxgb4: fix thermal configuration dependencies

With CONFIG_THERMAL=m, we get a build error:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.c: In function 'cxgb4_thermal_get_trip_type':
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.c:48:11: error: 'struct adapter' has no member named 'ch_thermal'

Once that is fixed by using IS_ENABLED() checks, we get a link error
against the thermal subsystem when cxgb4 is built-in:

drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_init':
cxgb4_thermal.c:(.text+0x180): undefined reference to `thermal_zone_device_register'
drivers/net/ethernet/chelsio/cxgb4/cxgb4_thermal.o: In function `cxgb4_thermal_remove':
cxgb4_thermal.c:(.text+0x1e0): undefined reference to `thermal_zone_device_unregister'

Finally, since CONFIG_THERMAL can be =m, the Makefile fails to pick up the
extra file into built-in.a, and we get another link failure against the
cxgb4_thermal_init/cxgb4_thermal_remove files, so the Makefile has to
be adapted as well to work for both CONFIG_THERMAL=y and =m.

Fixes: b18719157762 ("cxgb4: Add thermal zone support")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago Merge branch 'ena-next'
David S. Miller [Thu, 11 Oct 2018 17:13:52 +0000 (10:13 -0700)]
Merge branch 'ena-next'

Arthur Kiyanovski says:

====================
Improving performance and reducing latencies, by using latest capabilities exposed in ENA device

This patchset introduces the following:
1. A new placement policy of Tx headers and descriptors, which takes
advantage of an option to place headers + descriptors in device memory
space. This is sometimes referred to as LLQ - low latency queue.
The patch set defines the admin capability, maps the device memory as
write-combined, and adds a mode in transmit datapath to do header +
descriptor placement on the device.
2. Support for RX checksum offloading
3. Miscellaneous small improvements and code cleanups

Note: V1 of this patchset was created as if patches e2a322a 248ab77
from net were applied to net-next before applying the patchset. This V2
version does not assume this, and should be applied directly on net-next
without the aforementioned patches.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: fix indentations in ena_defs for better readability
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:27 +0000 (11:26 +0300)]
net: ena: fix indentations in ena_defs for better readability

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: update driver version to 2.0.1
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:26 +0000 (11:26 +0300)]
net: ena: update driver version to 2.0.1

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: remove redundant parameter in ena_com_admin_init()
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:25 +0000 (11:26 +0300)]
net: ena: remove redundant parameter in ena_com_admin_init()

Remove redundant spinlock acquire parameter from ena_com_admin_init()

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: change rx copybreak default to reduce kernel memory pressure
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:24 +0000 (11:26 +0300)]
net: ena: change rx copybreak default to reduce kernel memory pressure

Improves socket memory utilization when receiving packets larger
than 128 bytes (the previous rx copybreak) and smaller than 256 bytes.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: limit refill Rx threshold to 256 to avoid latency issues
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:23 +0000 (11:26 +0300)]
net: ena: limit refill Rx threshold to 256 to avoid latency issues

Currently Rx refill is done when the number of required descriptors is
above 1/8 queue size. With a default of 1024 entries per queue the
threshold is 128 descriptors.
There is an intention to increase the queue size to 8192 entries.
In this case a threshold of 1024 descriptors is too large and can hurt
latency.
Add another limitation to Rx threshold to be at most 256 descriptors.
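
The arithmetic can be checked with a short sketch. The macro and function names are hypothetical; only the 1/8 ratio and the 256-descriptor cap come from the text above.

```c
#include <assert.h>
#include <stdint.h>

#define RX_REFILL_THRESH_DIVIDER 8u   /* refill once 1/8 of the ring is needed */
#define RX_REFILL_THRESH_MAX     256u /* upper bound added by this patch */

/* Number of missing descriptors above which the Rx ring is refilled. */
uint32_t rx_refill_threshold(uint32_t ring_size)
{
    uint32_t thresh;

    assert(ring_size > 0);
    thresh = ring_size / RX_REFILL_THRESH_DIVIDER;
    return thresh < RX_REFILL_THRESH_MAX ? thresh : RX_REFILL_THRESH_MAX;
}
```

For the default 1024-entry ring the threshold stays at 128, so existing setups are unaffected; the cap only takes effect once the ring grows beyond 2048 entries.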

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: explicit casting and initialization, and clearer error handling
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:22 +0000 (11:26 +0300)]
net: ena: explicit casting and initialization, and clearer error handling

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: use CSUM_CHECKED device indication to report skb's checksum status
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:21 +0000 (11:26 +0300)]
net: ena: use CSUM_CHECKED device indication to report skb's checksum status

Set skb->ip_summed to the correct value as reported by the device.
Add counter for the case where rx csum offload is enabled but
device didn't check it.
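
One way to picture the decision is the following user-space sketch. It is an assumption-laden model, not the driver's code: the enum, struct and counter names are invented here, and the order of the checks is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mirror of the two relevant ip_summed outcomes. */
enum csum_status {
    CSUM_STATUS_NONE,        /* stack must verify the checksum itself */
    CSUM_STATUS_UNNECESSARY, /* device verified it, stack can skip the check */
};

struct rx_desc_info {
    bool csum_checked; /* device says it actually validated the checksum */
    bool csum_error;   /* device found a bad checksum */
};

struct rx_stats {
    uint64_t csum_unchecked; /* offload on, but device did not check */
    uint64_t csum_bad;
};

/* Decide what to report for one received packet. */
enum csum_status rx_csum_status(bool offload_enabled,
                                const struct rx_desc_info *d,
                                struct rx_stats *stats)
{
    assert(d && stats);
    if (!offload_enabled)
        return CSUM_STATUS_NONE;
    if (!d->csum_checked) {
        stats->csum_unchecked++;
        return CSUM_STATUS_NONE;
    }
    if (d->csum_error) {
        stats->csum_bad++;
        return CSUM_STATUS_NONE;
    }
    return CSUM_STATUS_UNNECESSARY;
}
```

The point of the extra counter is visibility: a packet the device did not check is handed to the stack for software verification, and the event is recorded rather than silently treated as verified.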

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: add functions for handling Low Latency Queues in ena_netdev
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:20 +0000 (11:26 +0300)]
net: ena: add functions for handling Low Latency Queues in ena_netdev

This patch includes all code changes necessary in ena_netdev to enable
packet sending via the LLQ placement mode.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: add functions for handling Low Latency Queues in ena_com
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:19 +0000 (11:26 +0300)]
net: ena: add functions for handling Low Latency Queues in ena_com

This patch introduces APIs for detection, initialization, configuration
and actual usage of low latency queues (LLQ). It extends the transmit API
with creation of LLQ descriptors in device memory (which include host
buffer descriptors as well as the packet header).

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: introduce Low Latency Queues data structures according to ENA spec
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:18 +0000 (11:26 +0300)]
net: ena: introduce Low Latency Queues data structures according to ENA spec

Low Latency Queues (LLQ) allow usage of device's memory for descriptors
and headers. Such queues decrease processing time since data is already
located on the device when driver rings the doorbell.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: complete host info to match latest ENA spec
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:17 +0000 (11:26 +0300)]
net: ena: complete host info to match latest ENA spec

Add new fields and definitions to host info and fill them
according to the latest ENA spec version.

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: ena: minor performance improvement
Arthur Kiyanovski [Thu, 11 Oct 2018 08:26:16 +0000 (11:26 +0300)]
net: ena: minor performance improvement

Reduce fastpath overhead by making ena_com_tx_comp_req_id_get() inline.
Also move it to ena_eth_com.h file with its dependency function
ena_com_cq_inc_head().

Signed-off-by: Arthur Kiyanovski <akiyano@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago Merge branch 'mlxsw-Preparations-for-VxLAN-support'
David S. Miller [Thu, 11 Oct 2018 17:08:24 +0000 (10:08 -0700)]
Merge branch 'mlxsw-Preparations-for-VxLAN-support'

Ido Schimmel says:

====================
mlxsw: Preparations for VxLAN support

This patchset prepares mlxsw for VxLAN support. It contains small and
mostly non-functional changes.

The first eight patches perform small changes in the code to make it
more receptive towards the actual VxLAN changes in the next patchset.

Patches 9-17 add the registers used to configure the device for VxLAN
offload.

Last two patches add the required resources and trap IDs.

The next patchset is available here [1].

1. https://github.com/idosch/linux/tree/vxlan
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum: Add NVE packet traps
Ido Schimmel [Thu, 11 Oct 2018 07:48:11 +0000 (07:48 +0000)]
mlxsw: spectrum: Add NVE packet traps

The DECAP_ECN0 trap will be used to trap packets where the overlay
packet is marked with Non-ECT, but the underlay packet is marked with
either ECT(0), ECT(1) or CE. When trapped, such packets will be counted
as errors by the VxLAN driver and thus provide better visibility.
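
The DECAP_ECN0 condition can be written out as a predicate over the two ECN fields. This is a sketch: the function name is hypothetical, and the codepoint values are the standard ECN encoding from RFC 3168 (the low two bits of the IP TOS / traffic class byte).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* ECN codepoints as defined by RFC 3168. */
enum ecn_codepoint {
    ECN_NOT_ECT = 0x0,
    ECN_ECT_1   = 0x1,
    ECN_ECT_0   = 0x2,
    ECN_CE      = 0x3,
};

#define ECN_MASK 0x3u

/* True when the overlay (inner) packet is Non-ECT while the underlay
 * (outer) packet carries ECT(0), ECT(1) or CE. */
bool decap_ecn_should_trap(uint8_t inner_tos, uint8_t outer_tos)
{
    unsigned int inner = inner_tos & ECN_MASK;
    unsigned int outer = outer_tos & ECN_MASK;

    assert(inner <= ECN_CE && outer <= ECN_CE);
    return inner == ECN_NOT_ECT && outer != ECN_NOT_ECT;
}
```

The DSCP bits are masked off before comparing, so only the ECN mismatch between overlay and underlay decides whether the packet would be trapped.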

The NVE_ENCAP_ARP trap will be used to trap ARP packets undergoing NVE
encapsulation. This is needed in order to support E-VPN ARP suppression,
where the Linux bridge does not flood ARP packets through tunnel ports
in case it can answer the ARP request itself.

Note that all the packets trapped via these traps are marked with
'offload_fwd_mark', so as to not be re-flooded by the Linux bridge
through the ASIC ports.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: resources: Add NVE resources
Ido Schimmel [Thu, 11 Oct 2018 07:48:09 +0000 (07:48 +0000)]
mlxsw: resources: Add NVE resources

Add the following resources to be used by the NVE code:
* Number of IPv4 underlay destination IPs in a single TNUMT record
* Number of IPv6 underlay destination IPs in a single TNUMT record

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Monitoring Parsing State Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:08 +0000 (07:48 +0000)]
mlxsw: reg: Add Monitoring Parsing State Register

This register is used for setting up the parsing for hash, policy-engine
and routing.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add definition of unicast tunnel record for SFD register
Ido Schimmel [Thu, 11 Oct 2018 07:48:07 +0000 (07:48 +0000)]
mlxsw: reg: Add definition of unicast tunnel record for SFD register

Will be used to program the device with FDB records pointing to an NVE
tunnel.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunneling NVE QoS Default Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:06 +0000 (07:48 +0000)]
mlxsw: reg: Add Tunneling NVE QoS Default Register

The TNQDR register configures the default QoS settings for NVE
encapsulation.

It will be used to set the default DSCP of each port to 0, so that when
DSCP is set to inherit and the overlay packet does not have an IP header
the outer DSCP will be set to 0, in accordance with the software data
path.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunneling NVE QoS Configuration Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:04 +0000 (07:48 +0000)]
mlxsw: reg: Add Tunneling NVE QoS Configuration Register

The register configures how QoS is set in Encapsulation into the
underlay network.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunneling NVE Decapsulation ECN Mapping Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:03 +0000 (07:48 +0000)]
mlxsw: reg: Add Tunneling NVE Decapsulation ECN Mapping Register

This register configures the actions that are done during NVE
decapsulation based on the ECN bits.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunneling NVE Encapsulation ECN Mapping Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:02 +0000 (07:48 +0000)]
mlxsw: reg: Add Tunneling NVE Encapsulation ECN Mapping Register

This register performs mapping from overlay ECN to underlay ECN during
NVE encapsulation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunneling NVE Underlay Multicast Table Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:01 +0000 (07:48 +0000)]
mlxsw: reg: Add Tunneling NVE Underlay Multicast Table Register

This register builds the linked list of underlay destination IPs used
for BUM traffic on the overlay.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunnel Port Configuration Register
Ido Schimmel [Thu, 11 Oct 2018 07:48:00 +0000 (07:48 +0000)]
mlxsw: reg: Add Tunnel Port Configuration Register

This register enables / disables learning on different types of tunnel
ports (e.g., NVE, VPLS).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Add Tunneling NVE General Configuration Register
Ido Schimmel [Thu, 11 Oct 2018 07:47:59 +0000 (07:47 +0000)]
mlxsw: reg: Add Tunneling NVE General Configuration Register

This register configures global NVE configuration such as source IP of
the NVE tunnel and UDP source port calculation.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum: Seed LAG hash function
Ido Schimmel [Thu, 11 Oct 2018 07:47:57 +0000 (07:47 +0000)]
mlxsw: spectrum: Seed LAG hash function

Currently, the seed of the LAG hash function is always set to 0, which
means it is identical across all switches. Instead, use a random number.

This is especially important now that VxLAN is supported, as the LAG
hash function is used to calculate the UDP source port of the
encapsulated packet.
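
To show why the seed matters, here is a sketch using a seeded FNV-1a hash as a stand-in. The device's actual LAG hash function is not described in the source, so the algorithm, the function name and the flow key layout are all assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FNV_OFFSET_BASIS 2166136261u
#define FNV_PRIME        16777619u

/* Seeded FNV-1a over the flow key; a stand-in for the hardware hash. */
uint32_t lag_hash(const uint8_t *key, size_t len, uint32_t seed)
{
    uint32_t h = FNV_OFFSET_BASIS ^ seed;

    assert(key || len == 0);
    for (size_t i = 0; i < len; i++) {
        h ^= key[i];
        h *= FNV_PRIME;
    }
    return h;
}
```

With a seed of 0 on every switch, the same flow produces the same hash everywhere, so each hop makes the same member choice and the derived UDP source ports line up as well. Giving each switch its own random seed decorrelates those decisions.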

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: reg: Extend FDB flush types for NVE
Ido Schimmel [Thu, 11 Oct 2018 07:47:56 +0000 (07:47 +0000)]
mlxsw: reg: Extend FDB flush types for NVE

The device has the ability to flush all the FDB records that perform NVE
encapsulation or only a subset of these with a specific filtering
identifier (FID).

Expose these types so that they could be used by subsequent patches
where we need to flush the FDB records when an NVE device is unlinked
from a bridge (FID).

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum: Add a new type of KVD linear record
Ido Schimmel [Thu, 11 Oct 2018 07:47:55 +0000 (07:47 +0000)]
mlxsw: spectrum: Add a new type of KVD linear record

When the device needs to flood an overlay packet to remote VTEPs it
retrieves a pointer to the head of a linked-list of records that store
the IP addresses of these VTEPs.

These records are stored in the KVD linear memory and configured via the
Tunneling NVE Underlay Multicast Table (TNUMT) register.

Add a new KVD linear entry type for these records, so that we will be
able to allocate and free them.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum: Move L3 protocol and address definitions to global header file
Ido Schimmel [Thu, 11 Oct 2018 07:47:54 +0000 (07:47 +0000)]
mlxsw: spectrum: Move L3 protocol and address definitions to global header file

The L3 protocol and address definitions are going to be used by the NVE
code, so move them to the global header file from the one private to the
router.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum_switchdev: Do not assume notifier information type
Ido Schimmel [Thu, 11 Oct 2018 07:47:53 +0000 (07:47 +0000)]
mlxsw: spectrum_switchdev: Do not assume notifier information type

VxLAN notifications are going to use a different notifier information
type, so cast to the correct type based on the received event.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum_switchdev: Check notification relevance based on upper device
Ido Schimmel [Thu, 11 Oct 2018 07:47:52 +0000 (07:47 +0000)]
mlxsw: spectrum_switchdev: Check notification relevance based on upper device

VxLAN FDB updates are sent with the VxLAN device which is not our upper
and will therefore be ignored by current code.

Solve this by checking whether the upper device (bridge) is our upper.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum_switchdev: Prepare for VxLAN FDB notifications
Ido Schimmel [Thu, 11 Oct 2018 07:47:50 +0000 (07:47 +0000)]
mlxsw: spectrum_switchdev: Prepare for VxLAN FDB notifications

VxLAN FDB notifications need to be handled differently than bridge FDB
notifications, so initialize the work item based on the received
notification and rename the invoked function accordingly.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago mlxsw: spectrum: Remove misuses of private header file
Ido Schimmel [Thu, 11 Oct 2018 07:47:49 +0000 (07:47 +0000)]
mlxsw: spectrum: Remove misuses of private header file

The spectrum_router.h header file is private to the router block and
should only be included by direct consumers of it, such as dpipe and the
multicast routing code.

Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reviewed-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago octeontx2-af: Remove set but not used variable 'dev'
YueHaibing [Thu, 11 Oct 2018 07:37:07 +0000 (07:37 +0000)]
octeontx2-af: Remove set but not used variable 'dev'

Fixes gcc '-Wunused-but-set-variable' warning:

drivers/net/ethernet/marvell/octeontx2/af/cgx.c: In function 'cgx_fwi_event_handler':
drivers/net/ethernet/marvell/octeontx2/af/cgx.c:257:17: warning:
 variable 'dev' set but not used [-Wunused-but-set-variable]

It has never been used since its introduction in
commit 1463f382f58d ("octeontx2-af: Add support for CGX link management")

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: mscc: allow extracting the FCS into the skb
Antoine Tenart [Thu, 11 Oct 2018 07:12:24 +0000 (09:12 +0200)]
net: mscc: allow extracting the FCS into the skb

This patch adds support for the NETIF_F_RXFCS feature in the Mscc
Ethernet driver. This feature is disabled by default and allows a user
to request the driver not to drop the FCS and to extract it into the skb
for debugging purposes.
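
The effect on the received length can be sketched as follows. The function and macro names are hypothetical; the only facts relied on are that the Ethernet FCS is 4 bytes long and that it is dropped unless the feature is requested.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FCS_LEN 4u /* Ethernet frame check sequence is a 32-bit CRC */

/* Length handed up the stack for a frame of frame_len bytes (FCS included),
 * depending on whether the user asked to keep the FCS. */
size_t rx_deliver_len(size_t frame_len, bool rxfcs_enabled)
{
    assert(frame_len >= FCS_LEN);
    return rxfcs_enabled ? frame_len : frame_len - FCS_LEN;
}
```

With the feature off (the default), a minimum-size 64-byte frame is delivered as 60 bytes; with it on, all 64 bytes, including the trailing CRC, end up in the buffer for inspection.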

Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com>
Reviewed-by: Alexandre Belloni <alexandre.belloni@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago Merge branch 'hns3-next'
David S. Miller [Thu, 11 Oct 2018 05:59:08 +0000 (22:59 -0700)]
Merge branch 'hns3-next'

Salil Mehta says:

====================
Adds support of RSS to HNS3 Driver for Rev 2(=0x21) H/W

This patch-set mainly adds new additions related to RSS for the new
hardware Revision 0x21. It also adds support to use RSS hash value
provided by the hardware along with descriptor.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: hns3: Add HW RSS hash information to RX skb
Peng Li [Wed, 10 Oct 2018 19:05:37 +0000 (20:05 +0100)]
net: hns3: Add HW RSS hash information to RX skb

Drivers should call skb_set_hash to set the hash and its type
in an skbuff.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: hns3: Add RSS tuples support for VF
Jian Shen [Wed, 10 Oct 2018 19:05:36 +0000 (20:05 +0100)]
net: hns3: Add RSS tuples support for VF

This patch adds RSS tuple support for VF in revision
0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: hns3: Add RSS general configuration support for VF
Jian Shen [Wed, 10 Oct 2018 19:05:35 +0000 (20:05 +0100)]
net: hns3: Add RSS general configuration support for VF

This patch adds RSS key, hash algorithm configuration support
for VF in revision 0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: hns3: Add new RSS hash algorithm support for PF
Jian Shen [Wed, 10 Oct 2018 19:05:34 +0000 (20:05 +0100)]
net: hns3: Add new RSS hash algorithm support for PF

This patch adds ETH_RSS_HASH_XOR hash algorithm support, which
is supported by hw revision 0x21.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Salil Mehta <salil.mehta@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago phy: phy-ocelot-serdes: fix return value check in serdes_probe()
Wei Yongjun [Wed, 10 Oct 2018 02:00:24 +0000 (02:00 +0000)]
phy: phy-ocelot-serdes: fix return value check in serdes_probe()

In case of error, the function syscon_node_to_regmap() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check should
be replaced with IS_ERR().
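
The difference between the two checks can be demonstrated with a user-space model of the kernel's error-pointer helpers. ERR_PTR(), PTR_ERR() and IS_ERR() below are re-implemented for illustration only; fake_node_to_regmap() and the two probe functions are hypothetical stand-ins, not the driver's code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* User-space model: an errno is encoded in the top 4095 pointer values. */
static inline void *ERR_PTR(long error)
{
    return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

static inline bool IS_ERR(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Stand-in for a lookup that reports failure via ERR_PTR(), never NULL. */
void *fake_node_to_regmap(bool ok)
{
    static int regmap;

    return ok ? (void *)&regmap : ERR_PTR(-ENODEV);
}

/* Buggy pattern: a NULL test never fires, so the error goes unnoticed. */
int probe_with_null_check(bool ok)
{
    void *map = fake_node_to_regmap(ok);

    if (!map)
        return -ENODEV;
    return 0;
}

/* Fixed pattern: IS_ERR() catches the encoded error and propagates it. */
int probe_with_is_err(bool ok)
{
    void *map = fake_node_to_regmap(ok);

    if (IS_ERR(map))
        return (int)PTR_ERR(map);
    return 0;
}
```

In the failing case the NULL-check variant returns success and would go on to use an invalid pointer, while the IS_ERR() variant returns the original error code.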

Fixes: 51f6b410fc22 ("phy: add driver for Microsemi Ocelot SerDes muxing")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Reviewed-by: Quentin Schulz <quentin.schulz@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes'
David S. Miller [Thu, 11 Oct 2018 05:53:03 +0000 (22:53 -0700)]
Merge branch 'net-dsa-bcm_sf2-Couple-of-fixes'

Florian Fainelli says:

====================
net: dsa: bcm_sf2: Couple of fixes

Here are two fixes for the bcm_sf2 driver that were found during testing
unbind and analysing another issue during system suspend/resume.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: dsa: bcm_sf2: Call setup during switch resume
Florian Fainelli [Tue, 9 Oct 2018 23:48:58 +0000 (16:48 -0700)]
net: dsa: bcm_sf2: Call setup during switch resume

There is no reason to open code what the switch setup function does. In
fact, because we just issued a switch reset, all the registers return to
their default values, which includes, for instance, unused ports being
enabled again, wasting power and leading to an inappropriate switch core
clock being selected.

Fixes: 8cfa94984c9c ("net: dsa: bcm_sf2: add suspend/resume callbacks")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: dsa: bcm_sf2: Fix unbind ordering
Florian Fainelli [Tue, 9 Oct 2018 23:48:57 +0000 (16:48 -0700)]
net: dsa: bcm_sf2: Fix unbind ordering

The order in which we release resources is unfortunately leading to bus
errors while dismantling the port. This is because we set
priv->wol_ports_mask to 0 to tell bcm_sf2_sw_suspend() that it is now
permissible to clock gate the switch. Later on, when dsa_slave_destroy()
comes in from dsa_unregister_switch() and calls
dsa_switch_ops::port_disable, we perform the same dismantling again, and
this time we hit registers that are clock gated.

Make sure that dsa_unregister_switch() is the first thing that happens,
which takes care of releasing all user visible resources, then proceed
with clock gating hardware. We still need to set priv->wol_ports_mask to
0 to make sure that an enabled port properly gets disabled in case it
was previously used as part of Wake-on-LAN.

Fixes: d9338023fb8e ("net: dsa: bcm_sf2: Make it a real platform device driver")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net: sched: avoid writing on noop_qdisc
Eric Dumazet [Tue, 9 Oct 2018 22:20:50 +0000 (15:20 -0700)]
net: sched: avoid writing on noop_qdisc

While noop_qdisc.gso_skb and noop_qdisc.skb_bad_txq are not used
in other places, it seems not correct to overwrite their fields
in dev_init_scheduler_queue().

noop_qdisc is essentially a shared and read-only object, even if
it is not marked as const because of some implementation detail.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years ago net/mpls: Implement handler for strict data checking on dumps
David Ahern [Tue, 9 Oct 2018 18:10:43 +0000 (11:10 -0700)]
net/mpls: Implement handler for strict data checking on dumps

Without CONFIG_INET enabled compiles fail with:

net/mpls/af_mpls.o: In function `mpls_dump_routes':
af_mpls.c:(.text+0xed0): undefined reference to `ip_valid_fib_dump_req'

The preference is for MPLS to use the same handler as ipv4 and ipv6
to allow consistency when doing a dump for AF_UNSPEC which walks
all address families invoking the route dump handler. If INET is
disabled, then fall back to an MPLS version which can be tighter on
the data checks.

Fixes: e8ba330ac0c5 ("rtnetlink: Update fib dumps for strict data checking")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'fore200e-DMA-cleanups-and-fixes'
David S. Miller [Thu, 11 Oct 2018 05:38:50 +0000 (22:38 -0700)]
Merge branch 'fore200e-DMA-cleanups-and-fixes'

Christoph Hellwig says:

====================
fore200e DMA cleanups and fixes

The fore200e driver came up during some dma-related audits, so
here is the fallout.  Compile tested (x86 & sparc) only.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: check for dma mapping failures
Christoph Hellwig [Tue, 9 Oct 2018 14:57:20 +0000 (16:57 +0200)]
fore200e: check for dma mapping failures

The driver was lacking any handling for failures from the DMA mapping
routines.  With an iommu or swiotlb this can be fatal.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: don't use GFP_DMA
Christoph Hellwig [Tue, 9 Oct 2018 14:57:19 +0000 (16:57 +0200)]
fore200e: don't use GFP_DMA

The driver properly uses the DMA mapping API, so it should not
pointlessly dip into the GFP_DMA pool, which is only 16MB on x86.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: devirtualize dma alloc calls
Christoph Hellwig [Tue, 9 Oct 2018 14:57:18 +0000 (16:57 +0200)]
fore200e: devirtualize dma alloc calls

There is no need for an indirection before calling the dma alloc
routines now that we store a struct device in struct fore200e.

Also remove the pointless GFP_ATOMIC for the sbus case, and fix up
the error handling by removing the 0 dma_addr test - some iommus
can return 0 as a perfectly valid bus address.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: devirtualize dma mapping calls
Christoph Hellwig [Tue, 9 Oct 2018 14:57:17 +0000 (16:57 +0200)]
fore200e: devirtualize dma mapping calls

There is no need for an indirection before calling the dma mapping
routines now that we store a struct device in struct fore200e.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: remove the align_size field of struct chunk
Christoph Hellwig [Tue, 9 Oct 2018 14:57:16 +0000 (16:57 +0200)]
fore200e: remove the align_size field of struct chunk

There is no need for this field, as the only user of it can just use
the local size variable instead.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: store a struct device in struct fore200e
Christoph Hellwig [Tue, 9 Oct 2018 14:57:15 +0000 (16:57 +0200)]
fore200e: store a struct device in struct fore200e

This can be used much better than the untyped void pointer containing
either a PCI or platform device.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agofore200e: simplify fore200e_bus usage
Christoph Hellwig [Tue, 9 Oct 2018 14:57:14 +0000 (16:57 +0200)]
fore200e: simplify fore200e_bus usage

There is no need to have a global array of the ops, instead PCI and sbus
can have their own instances assigned in *_probe.  Also switch to C99
initializers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
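As an illustration of the change described above, here is a minimal sketch of per-bus ops instances declared with C99 designated initializers and assigned in each probe routine. The struct layout, member names, and values are hypothetical and much simpler than the driver's real struct fore200e_bus.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified ops table (not the driver's real layout). */
struct bus_ops {
    const char *model_name;
    int descr_alignment;
    int (*configure)(void);
};

static int pci_configure(void)  { return 0; }
static int sbus_configure(void) { return 0; }

/* One instance per bus type instead of a global array of ops.
 * Designated initializers name each member explicitly. */
static const struct bus_ops pci_ops = {
    .model_name      = "example-pci",
    .descr_alignment = 32,
    .configure       = pci_configure,
};

static const struct bus_ops sbus_ops = {
    .model_name      = "example-sbus",
    .descr_alignment = 64,
    .configure       = sbus_configure,
};

struct adapter {
    const struct bus_ops *bus;
};

/* Each probe routine assigns its own ops instance. */
static void pci_probe(struct adapter *a)
{
    assert(a != NULL);
    a->bus = &pci_ops;
}

static void sbus_probe(struct adapter *a)
{
    assert(a != NULL);
    a->bus = &sbus_ops;
}
```

With designated initializers, reordering or adding members to the struct does not silently shift values into the wrong slot, which is the usual motivation for the switch.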
5 years agonet: tun: remove useless codes of tun_automq_select_queue
Wang Li [Tue, 9 Oct 2018 02:32:04 +0000 (10:32 +0800)]
net: tun: remove useless codes of tun_automq_select_queue

The removed code is useless because the function
__skb_get_hash_symmetric always returns non-zero.

Signed-off-by: Zhang Yu <zhangyu31@baidu.com>
Signed-off-by: Wang Li <wangli39@baidu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agovirtio_net: ethtool tx napi configuration
Jason Wang [Tue, 9 Oct 2018 02:06:26 +0000 (10:06 +0800)]
virtio_net: ethtool tx napi configuration

Implement ethtool .set_coalesce (-C) and .get_coalesce (-c) handlers.
Interrupt moderation is currently not supported, so these accept and
display the default settings of 0 usec and 1 frame.

Toggle tx napi through setting tx-frames. So as to not interfere
with possible future interrupt moderation, value 1 means tx napi while
value 0 means not.

Only allow the switching when the device is down, for simplicity.

Link: https://patchwork.ozlabs.org/patch/948149/
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
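The rules described above (tx-frames 1 means tx napi, 0 means no tx napi, switching only while the device is down) could be sketched roughly as follows. This is a userspace illustration under assumptions: the struct, the function name, and the choice of error codes are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state relevant to this setting. */
struct vnet_dev {
    bool running;   /* device is up */
    bool tx_napi;   /* tx napi currently enabled */
};

/* Interpret the ethtool tx-frames value as a tx napi toggle. */
static int set_tx_napi(struct vnet_dev *dev, unsigned int tx_frames)
{
    bool want;

    assert(dev != NULL);

    /* Only 0 (off) and 1 (on) are meaningful without interrupt moderation. */
    if (tx_frames > 1)
        return -EINVAL;

    want = (tx_frames == 1);
    if (want == dev->tx_napi)
        return 0;           /* nothing to change */

    /* Switching is only allowed while the device is down. */
    if (dev->running)
        return -EBUSY;

    dev->tx_napi = want;
    return 0;
}
```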
5 years agoMerge branch 'nfp-flower-speed-up-stats-update-loop'
David S. Miller [Thu, 11 Oct 2018 05:32:45 +0000 (22:32 -0700)]
Merge branch 'nfp-flower-speed-up-stats-update-loop'

Jakub Kicinski says:

====================
nfp: flower: speed up stats update loop

This set from Pieter improves performance of processing FW stats
update notifications.  The FW seems to send those at relatively
high rate (roughly ten per second per flow), therefore if we want
to approach the million flows mark we have to be very careful
about our data structures.

We tried rhashtable for stat updates, but according to our experiments
rhashtable lookup on a u32 takes roughly 60ns on a Xeon E5-2670 v3,
which translates to a hard limit of 16M lookups per second on this CPU,
and, according to perf record, jhash and memcmp account for 60% of CPU
usage on the core handling the updates.

Given that our statistic IDs are already array indices, and considering
each statistic is only 24B in size, we decided to forego the use
of hashtables and use a directly indexed array.  The CPU savings are
considerable.

With the recent improvements in TC core and with our own bottlenecks
out of the way Pieter removes the artificial limit of 128 flows, and
allows the driver to install as many flows as FW supports.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
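As a rough check of the figures in the cover letter above: the inputs (60ns per lookup, about ten updates per second per flow, a million flows, 24B per statistic) are the cover letter's own approximations, while the ratio and the array size are derived here and are not stated in the original.

```latex
\begin{align*}
\text{lookup rate} &\approx \frac{1}{60\,\text{ns}}
  \approx 1.67 \times 10^{7}\ \text{lookups/s}
  \quad (\text{quoted as 16M}) \\
\text{update rate} &\approx 10^{6}\ \text{flows} \times 10\ \text{updates/s per flow}
  = 10^{7}\ \text{updates/s} \\
\frac{\text{update rate}}{\text{lookup rate}} &\approx \frac{10^{7}}{1.67 \times 10^{7}} \approx 0.6 \\
\text{array size} &\approx 10^{6} \times 24\,\text{B} = 2.4 \times 10^{7}\,\text{B} \approx 24\,\text{MB}
\end{align*}
```

Under these assumptions, hash lookups alone would use more than half of one core's lookup budget at a million flows, whereas a directly indexed array costs on the order of tens of megabytes.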
5 years agonfp: flower: use host context count provided by firmware
Pieter Jansen van Vuuren [Tue, 9 Oct 2018 01:57:36 +0000 (18:57 -0700)]
nfp: flower: use host context count provided by firmware

Read the host context count symbols provided by firmware and use
them to determine the number of allocated stats ids. Previously it
was not possible to offload more than 2^17 filters even if FW was
able to do so.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonfp: flower: use stats array instead of storing stats per flow
Pieter Jansen van Vuuren [Tue, 9 Oct 2018 01:57:35 +0000 (18:57 -0700)]
nfp: flower: use stats array instead of storing stats per flow

Make use of a stats array instead of storing stats per flow, which
would require a hash lookup at critical times.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
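A minimal userspace sketch of the idea above: when the stats id is already an array index, an update is a bounds check plus a direct array access, with no hashing or key comparison. The struct and function names are hypothetical, and the three 64-bit fields are an assumption, not the driver's confirmed layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-flow statistics entry. */
struct flow_stats {
    uint64_t pkts;
    uint64_t bytes;
    uint64_t used;   /* timestamp of last update */
};

struct stats_table {
    struct flow_stats *entries;
    uint32_t count;  /* number of stats ids the table can hold */
};

static int stats_table_init(struct stats_table *t, uint32_t count)
{
    assert(t != NULL);
    t->entries = calloc(count, sizeof(*t->entries));
    if (!t->entries)
        return -1;
    t->count = count;
    return 0;
}

/* Direct index: the stats id is the array position. */
static int stats_update(struct stats_table *t, uint32_t id,
                        uint64_t pkts, uint64_t bytes, uint64_t now)
{
    if (id >= t->count)
        return -1;
    t->entries[id].pkts += pkts;
    t->entries[id].bytes += bytes;
    t->entries[id].used = now;
    return 0;
}

static void stats_table_free(struct stats_table *t)
{
    free(t->entries);
    t->entries = NULL;
    t->count = 0;
}
```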
5 years agonfp: flower: use rhashtable for flow caching
Pieter Jansen van Vuuren [Tue, 9 Oct 2018 01:57:34 +0000 (18:57 -0700)]
nfp: flower: use rhashtable for flow caching

Make use of relativistic hash tables for tracking flows instead
of fixed sized hash tables.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansenvanvuuren@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoisdn/hisax: amd7930_fn: Remove unnecessary parentheses
Nathan Chancellor [Mon, 8 Oct 2018 22:59:05 +0000 (15:59 -0700)]
isdn/hisax: amd7930_fn: Remove unnecessary parentheses

Clang warns when multiple sets of parentheses are used for a single
conditional statement.

drivers/isdn/hisax/amd7930_fn.c:628:32: warning: equality comparison
with extraneous parentheses [-Wparentheses-equality]
                if ((cs->dc.amd7930.ph_state == 8)) {
                     ~~~~~~~~~~~~~~~~~~~~~~~~^~~~
drivers/isdn/hisax/amd7930_fn.c:628:32: note: remove extraneous
parentheses around the comparison to silence this warning
                if ((cs->dc.amd7930.ph_state == 8)) {
                    ~                        ^   ~
drivers/isdn/hisax/amd7930_fn.c:628:32: note: use '=' to turn this
equality comparison into an assignment
                if ((cs->dc.amd7930.ph_state == 8)) {
                                             ^~
                                             =
1 warning generated.

Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agotcp: refactor DCTCP ECN ACK handling
Yuchung Cheng [Mon, 8 Oct 2018 22:32:20 +0000 (15:32 -0700)]
tcp: refactor DCTCP ECN ACK handling

DCTCP has two parts - a new ECN signalling mechanism and the response
function to it. The first part can be used by other congestion
control modules for DCTCP-ECN deployed networks. This patch moves that
part into a separate tcp_dctcp.h to be used by other congestion control
modules (like how YeAH uses Vegas algorithms). For example, BBR is
currently experimenting with such an ECN signal:
https://tinyurl.com/ietf-102-iccrg-bbr2

Signed-off-by: Yuchung Cheng <ycheng@google.com>
Signed-off-by: Yousuk Seung <ysseung@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/ipv6: Make ipv6_route_table_template static
David Ahern [Mon, 8 Oct 2018 21:06:34 +0000 (14:06 -0700)]
net/ipv6: Make ipv6_route_table_template static

ipv6_route_table_template is exported but there are no users outside
of route.c. Make it static.

Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agortnetlink: Update comment in rtnl_stats_dump regarding strict data checking
David Ahern [Mon, 8 Oct 2018 20:58:07 +0000 (13:58 -0700)]
rtnetlink: Update comment in rtnl_stats_dump regarding strict data checking

The NLM_F_DUMP_PROPER_HDR netlink flag was replaced by a setsockopt.
Update the comment in rtnl_stats_dump.

Fixes: 841891ec0c65 ("rtnetlink: Update rtnl_stats_dump for strict data checking")
Reported-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agortnetlink: Move ifm in valid_fdb_dump_legacy to closer to use
David Ahern [Mon, 8 Oct 2018 20:57:24 +0000 (13:57 -0700)]
rtnetlink: Move ifm in valid_fdb_dump_legacy to closer to use

Move setting of local variable ifm to after the message parsing in
valid_fdb_dump_legacy. Avoid potential future use of unchecked variable.

Fixes: 8dfbda19a21b ("rtnetlink: Move input checking for rtnl_fdb_dump to helper")
Reported-by: Christian Brauner <christian@brauner.io>
Signed-off-by: David Ahern <dsahern@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'mlxsw-selftests-Few-small-updates'
David S. Miller [Thu, 11 Oct 2018 05:22:40 +0000 (22:22 -0700)]
Merge branch 'mlxsw-selftests-Few-small-updates'

Ido Schimmel says:

====================
mlxsw: selftests: Few small updates

First patch fixes a typo in mlxsw.

Second patch fixes a race in a recent test.

Third patch makes a recent test executable.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: mlxsw: qos_mc_aware: Make executable
Petr Machata [Mon, 8 Oct 2018 18:50:42 +0000 (18:50 +0000)]
selftests: mlxsw: qos_mc_aware: Make executable

This is a self-standing test and as such should be itself executable.

Fixes: b5638d46c90a ("selftests: mlxsw: Add a test for UC behavior under MC flood")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoselftests: forwarding: Have lldpad_app_wait_set() wait for unknown, too
Petr Machata [Mon, 8 Oct 2018 18:50:41 +0000 (18:50 +0000)]
selftests: forwarding: Have lldpad_app_wait_set() wait for unknown, too

Immediately after mlxsw module is probed and lldpad started, added APP
entries are briefly in "unknown" state before becoming "pending". That's
the state that lldpad_app_wait_set() typically sees, and since there are
no pending entries at that time, it bails out. However the entries have
not been pushed to the kernel yet at that point, and thus the test case
fails.

Fix by waiting for both unknown and pending entries to disappear before
proceeding.

Fixes: d159261f3662 ("selftests: mlxsw: Add test for trust-DSCP")
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agomlxsw: pci: Fix a typo
Nir Dotan [Mon, 8 Oct 2018 18:50:40 +0000 (18:50 +0000)]
mlxsw: pci: Fix a typo

Signed-off-by: Nir Dotan <nird@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet: aquantia: remove some redundant variable initializations
Colin Ian King [Mon, 8 Oct 2018 13:35:58 +0000 (14:35 +0100)]
net: aquantia: remove some redundant variable initializations

There are several variables being initialized that are being set later
and hence the initialization is redundant and can be removed. Remove
them.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge branch 'octeontx2-af-Add-RVU-Admin-Function-driver'
David S. Miller [Wed, 10 Oct 2018 17:06:03 +0000 (10:06 -0700)]
Merge branch 'octeontx2-af-Add-RVU-Admin-Function-driver'

Sunil Goutham says:

====================
octeontx2-af: Add RVU Admin Function driver

Resource virtualization unit (RVU) on Marvell's OcteonTX2 SOC maps HW
resources from the network, crypto and other functional blocks into
PCI-compatible physical and virtual functions. Each functional block
again has multiple local functions (LFs) for provisioning to PCI devices.
RVU supports multiple PCIe SRIOV physical functions (PFs) and virtual
functions (VFs). PF0 is called the administrative / admin function (AF)
and has privileges to provision RVU functional block's LFs to each of the
PF/VF.

RVU managed networking functional blocks
 - Network pool allocator (NPA)
 - Network interface controller (NIX)
 - Network parser CAM (NPC)
 - Schedule/Synchronize/Order unit (SSO)

RVU managed non-networking functional blocks
 - Crypto accelerator (CPT)
 - Scheduled timers unit (TIM)
 - Schedule/Synchronize/Order unit (SSO)
   Used for both networking and non-networking use cases
 - Compression (upcoming in future variants of the silicon)

Resource provisioning examples
 - A PF/VF with NIX-LF & NPA-LF resources works as a pure network device
 - A PF/VF with CPT-LF resource works as a pure crypto offload device.

This admin function driver neither receives nor processes any data, i.e.
it does no I/O; it is a configuration-only driver.

PF/VFs communicate with AF via a shared memory region (mailbox). Upon
receiving requests from PF/VF, AF does resource provisioning and other
HW configuration. AF is always attached to host, but PF/VFs may be used
by host kernel itself, or attached to VMs or to userspace applications
like DPDK etc. So AF has to handle provisioning/configuration requests
sent by any device from any domain.

This patch series adds logic for the following
 - RVU AF driver with functional blocks provisioning support.
 - Mailbox infrastructure for communication between AF and PFs.
 - CGX (MAC controller) driver which communicates with firmware for
   managing physical ethernet interfaces. AF collects info from this
   driver and forwards the same to the PF/VFs using these interfaces.

This is the first set of patches out of 80+ patches.

Changes from v8:
 1 Removed unnecessary typecasts in entire series
   - Suggested by David Miller
 2 Added COMPILE_TEST to AF driver
   - Suggested by Arnd Bergmann
 3 Changed udelay() to usleep_range() in rvu_poll_reg
   - Suggested by Arnd Bergmann
 4 MSIX vector base IOMMU mapping is done using dma_map_resource()
   API instead of dma_map_single() as it accepts physical address.
   - Issue pointed by Arnd Bergmann

Changes from v7:
 1 Removed unnecessary typecasts in mbox infra code.
   - Suggested by David Miller
 2 Fixed MAINTAINERS patch
   - Suggested by Joe Perches

Changes from v6:
 Fixed ordering of local variables from longest to shortest line.
   - Suggested by David Miller

Changes from v5:
 Modified bitfield based command structures to bitmasks for communication
 with firmware, to address endianness issues.
   - Suggested by Arnd Bergmann

Changes from v4:
 1 Removed module author/version/description from CGX driver as it's now
   merged with AF driver module.
   - Suggested by Arnd Bergmann
 2 Added big-endian bitfields for CGX's kernel <=> firmware communication
   command structures.
   - Suggested by Arnd Bergmann

Changes from v3:
 Moved driver from drivers/soc to drivers/net/ethernet
   - Suggested by Arnd Bergmann
 https://patchwork.kernel.org/cover/10587635/

Changes from v2:
 No changes, submitted again with netdev mailing list in loop.
   - Suggested by Arnd Bergmann and Andrew Lunn

Changes from v1:
 1 Merged RVU admin function and CGX drivers into a single module
   - Suggested by Arnd Bergmann
 2 Pulled mbox communication APIs into a separate module to remove
   admin function driver dependency in a VM where AF is not attached.
   - Suggested by Arnd Bergmann
====================

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
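The "Changes from v5" note in the cover letter above replaces bitfield-based command structures with bitmasks. A minimal sketch of that approach follows; the field names, positions, and widths are hypothetical and do not reflect the real firmware interface. The point is that shifts and masks operate on the numeric value, so the encoding is the same on little-endian and big-endian hosts, whereas the layout of C bitfields is implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 64-bit command word layout (illustration only). */
#define CMD_ID_SHIFT   1
#define CMD_ID_WIDTH   6
#define CMD_ARG_SHIFT  8
#define CMD_ARG_WIDTH  16

#define FIELD_MASK(width) ((1ULL << (width)) - 1)

/* Pack the fields with explicit shifts and masks instead of bitfields. */
static uint64_t cmd_pack(unsigned int id, unsigned int arg)
{
    /* Layout sanity check: the two fields must not overlap. */
    assert(CMD_ID_SHIFT + CMD_ID_WIDTH <= CMD_ARG_SHIFT);

    return ((uint64_t)(id & FIELD_MASK(CMD_ID_WIDTH)) << CMD_ID_SHIFT) |
           ((uint64_t)(arg & FIELD_MASK(CMD_ARG_WIDTH)) << CMD_ARG_SHIFT);
}

static unsigned int cmd_id(uint64_t word)
{
    return (unsigned int)((word >> CMD_ID_SHIFT) & FIELD_MASK(CMD_ID_WIDTH));
}

static unsigned int cmd_arg(uint64_t word)
{
    return (unsigned int)((word >> CMD_ARG_SHIFT) & FIELD_MASK(CMD_ARG_WIDTH));
}
```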
5 years agoMAINTAINERS: Add entry for Marvell OcteonTX2 Admin Function driver
Sunil Goutham [Wed, 10 Oct 2018 12:44:35 +0000 (18:14 +0530)]
MAINTAINERS: Add entry for Marvell OcteonTX2 Admin Function driver

Added maintainers entry for Marvell OcteonTX2 SOC's RVU
admin function driver.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Register for CGX lmac events
Linu Cherian [Wed, 10 Oct 2018 12:44:34 +0000 (18:14 +0530)]
octeontx2-af: Register for CGX lmac events

Added support in RVU AF driver to register for
CGX LMAC link status change events from firmware
and manage them. The processing part will be added
in follow-up patches.

- Introduced eventqueue for posting events from cgx lmac.
  Queueing mechanism will ensure that events can be posted
  and firmware can be acked immediately and hence event
  reception and processing are decoupled.
- Events get added to the queue by notification callback.
  Notification callback is expected to be atomic, since it
  is called from interrupt context.
- Events are dequeued and processed in a worker thread.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
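The decoupling described in the list above can be sketched as a small fixed-size queue: the notification callback only copies the event into the queue and returns, and a worker drains it later. This is a single-threaded userspace illustration with hypothetical names; it omits the locking that a real interrupt-context producer and worker-thread consumer would need, and it is not the driver's actual data structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define EVQ_SIZE 8   /* hypothetical capacity */

struct link_event {
    int lmac_id;
    bool link_up;
};

struct event_queue {
    struct link_event slots[EVQ_SIZE];
    unsigned int head;   /* next slot to write */
    unsigned int tail;   /* next slot to read */
};

/* Called from the notification callback: never blocks, just enqueues. */
static bool evq_post(struct event_queue *q, struct link_event ev)
{
    if (q->head - q->tail == EVQ_SIZE)
        return false;                     /* queue full, event dropped */
    q->slots[q->head % EVQ_SIZE] = ev;
    q->head++;
    return true;
}

/* Called from the worker: dequeue one event for processing. */
static bool evq_pop(struct event_queue *q, struct link_event *out)
{
    assert(out != NULL);
    if (q->head == q->tail)
        return false;                     /* queue empty */
    *out = q->slots[q->tail % EVQ_SIZE];
    q->tail++;
    return true;
}
```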
5 years agoocteontx2-af: Add support for CGX link management
Linu Cherian [Wed, 10 Oct 2018 12:44:33 +0000 (18:14 +0530)]
octeontx2-af: Add support for CGX link management

CGX LMAC initialization, link status polling etc. are done
by low level secure firmware. For link management this patch
adds an interface or communication mechanism between firmware
and this kernel CGX driver.

- Firmware interface specification is defined in cgx_fw_if.h.
- Support to send/receive commands/events to/from firmware.
- events/commands implemented
  * link up
  * link down
  * reading firmware version

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Nithya Mani <nmani@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Set RVU PFs to CGX LMACs mapping
Linu Cherian [Wed, 10 Oct 2018 12:44:32 +0000 (18:14 +0530)]
octeontx2-af: Set RVU PFs to CGX LMACs mapping

Each enabled CGX LMAC is considered a physical
interface and RVU PFs are mapped to these. VFs of these
SRIOV PFs will be virtual interfaces and share the CGX LMAC
along with the PF.

This mapping info will be used later on for Rx/Tx pkt steering.

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Add Marvell OcteonTX2 CGX driver
Sunil Goutham [Wed, 10 Oct 2018 12:44:31 +0000 (18:14 +0530)]
octeontx2-af: Add Marvell OcteonTX2 CGX driver

This patch adds a basic template for Marvell OcteonTX2's
CGX ethernet interface driver. Just the probe.
RVU AF driver will use APIs exported by this driver
for various things like PF to physical interface mapping,
loopback mode, interface stats etc. Hence merged both
drivers into a single module.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Reconfig MSIX base with IOVA
Geetha sowjanya [Wed, 10 Oct 2018 12:44:30 +0000 (18:14 +0530)]
octeontx2-af: Reconfig MSIX base with IOVA

HW interprets RVU_AF_MSIXTR_BASE address as an IOVA, hence
create an IOMMU mapping for the physical address configured by
firmware and reconfig RVU_AF_MSIXTR_BASE with IOVA.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Configure block LF's MSIX vector offset
Sunil Goutham [Wed, 10 Oct 2018 12:44:29 +0000 (18:14 +0530)]
octeontx2-af: Configure block LF's MSIX vector offset

Firmware configures a certain number of MSIX vectors for each
enabled RVU PF/VF. When a block LF is attached to a PF/VF, the number
of MSIX vectors needed by that LF is set aside (out of the PF/VF's
total MSIX vectors) and the LF's msix_offset is configured in HW.

Also added support for a RVU PF/VF to retrieve that block LF's
MSIX vector offset information from AF via mbox.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Add RVU block LF provisioning support
Sunil Goutham [Wed, 10 Oct 2018 12:44:28 +0000 (18:14 +0530)]
octeontx2-af: Add RVU block LF provisioning support

Added support for a RVU PF/VF to request AF via mailbox
to attach or detach NPA/NIX/SSO/SSOW/TIM/CPT block LFs.
Also supports partial detachment and modifying current
LF attached count of a certain block type.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Scan blocks for LFs provisioned to PF/VF
Sunil Goutham [Wed, 10 Oct 2018 12:44:27 +0000 (18:14 +0530)]
octeontx2-af: Scan blocks for LFs provisioned to PF/VF

Scan all RVU blocks to find any 'LF to RVU PF/VF' mapping done by
low level firmware. If any are found, mark them as used in the respective
block's LF bitmap and also save the mapped PF/VF's PF_FUNC info.

This is done to avoid reattaching a block LF to a different RVU PF/VF.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Convert mbox msg id check to a macro
Aleksey Makarov [Wed, 10 Oct 2018 12:44:26 +0000 (18:14 +0530)]
octeontx2-af: Convert mbox msg id check to a macro

With tens of mailbox messages expected to be handled in the future,
checking the message id could become a lengthy switch case. Hence
added a macro to auto-generate the switch case for each msg id.

Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
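One common way to generate a switch case per message id from a single list is the X-macro pattern, sketched below. The message names, ids, and helper names are hypothetical, and the driver's real macro may take more arguments per message than shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message list: one M(name, id) entry per mailbox message. */
#define MBOX_MESSAGES(M)   \
    M(READY,  0x001)       \
    M(ATTACH, 0x002)       \
    M(DETACH, 0x003)

/* Expand the list into enum constants. */
#define AS_ENUM(name, id) MBOX_MSG_##name = (id),
enum mbox_msg_id { MBOX_MESSAGES(AS_ENUM) };

/* Expand the same list into switch cases, so adding a message to
 * MBOX_MESSAGES automatically extends the check. */
#define AS_CASE(name, id) case (id): return #name;
static const char *mbox_id2name(int id)
{
    switch (id) {
    MBOX_MESSAGES(AS_CASE)
    default:
        return NULL;
    }
}

/* Returns 0 and stores the name for a known id, -1 otherwise. */
static int mbox_check_msg(int id, const char **name_out)
{
    assert(name_out != NULL);
    *name_out = mbox_id2name(id);
    return *name_out ? 0 : -1;
}
```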
5 years agoocteontx2-af: Add mailbox IRQ and msg handlers
Sunil Goutham [Wed, 10 Oct 2018 12:44:25 +0000 (18:14 +0530)]
octeontx2-af: Add mailbox IRQ and msg handlers

This patch adds support for mailbox interrupt and message
handling. Mapped mailbox region and registered a workqueue
for message handling. Enabled mailbox IRQ of RVU PFs
and registered an interrupt handler. When the IRQ is triggered,
work is added to the mbox workqueue for msgs to get processed.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Add mailbox support infra
Aleksey Makarov [Wed, 10 Oct 2018 12:44:24 +0000 (18:14 +0530)]
octeontx2-af: Add mailbox support infra

This patch adds mailbox support infrastructure APIs.
Each RVU device has a dedicated 64KB mailbox region
shared with its peer for communication. RVU AF has
a separate mailbox region shared with each of the RVU PFs
and a RVU PF has a separate region shared with each of
its VFs.

This set of APIs is used by this driver (RVU AF) and
other RVU PF/VF drivers, e.g. netdev, crypto etc.

Signed-off-by: Aleksey Makarov <amakarov@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: Lukasz Bartosik <lbartosik@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Gather RVU blocks HW info
Sunil Goutham [Wed, 10 Oct 2018 12:44:23 +0000 (18:14 +0530)]
octeontx2-af: Gather RVU blocks HW info

This patch gathers NPA/NIX/SSO/SSOW/TIM/CPT RVU blocks'
HW info like number of LFs. Important register offsets are
saved for later use to avoid code duplication for each block.
A bitmap is allocated for each of the blocks which later
on will be used to allocate a LF for a RVU PF/VF.

Also added RVU NIX/NPA block registers and few registers
of other blocks.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
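A per-block bitmap used to hand out LFs, as described above, might look like the following userspace sketch. It assumes at most 64 LFs per block so that a single 64-bit word suffices; the struct and function names are hypothetical and unrelated to the driver's own helpers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-block LF bitmap: bit i set means LF i is in use. */
struct lf_bitmap {
    uint64_t used;
    unsigned int max;   /* number of LFs the block implements (<= 64) */
};

/* Allocate the lowest free LF, or return -1 if none is left. */
static int lf_alloc(struct lf_bitmap *b)
{
    assert(b->max <= 64);
    for (unsigned int i = 0; i < b->max; i++) {
        if (!(b->used & (1ULL << i))) {
            b->used |= 1ULL << i;
            return (int)i;
        }
    }
    return -1;
}

/* Release an LF so it can be allocated again. */
static void lf_free(struct lf_bitmap *b, unsigned int lf)
{
    if (lf < b->max)
        b->used &= ~(1ULL << lf);
}
```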
5 years agoocteontx2-af: Reset all RVU blocks
Sunil Goutham [Wed, 10 Oct 2018 12:44:22 +0000 (18:14 +0530)]
octeontx2-af: Reset all RVU blocks

Go through all BLKADDRs and check which ones are implemented
on this silicon and do a HW reset of each implemented block.
Also added all RVU AF and PF register offsets.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoocteontx2-af: Add Marvell OcteonTX2 RVU AF driver
Sunil Goutham [Wed, 10 Oct 2018 12:44:21 +0000 (18:14 +0530)]
octeontx2-af: Add Marvell OcteonTX2 RVU AF driver

This patch adds a basic template for Marvell OcteonTX2's
resource virtualization unit (RVU) admin function (AF)
driver. Just the driver registration and probe.

Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoqed: Add support for virtual link.
Sudarsana Reddy Kalluru [Wed, 10 Oct 2018 12:00:12 +0000 (05:00 -0700)]
qed: Add support for virtual link.

Currently driver registers to physical link notifications (of the device)
from Management firmware (MFW). Driver doesn't get notified if there's a
change in the virtual link e.g., link-flap on the peer PF interface.
Virtual link indication from MFW reflects the per PF link status instead
of the physical link.

The patch adds driver support for,
  - Advertising the virtual link support to MFW.
  - Handling the virtual link notification from MFW.

Please consider applying it to 'net-next'.

Signed-off-by: Sudarsana Reddy Kalluru <Sudarsana.Kalluru@cavium.com>
Signed-off-by: Tomer Tayar <Tomer.Tayar@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agocxgb4: Add thermal zone support
Ganesh Goudar [Tue, 9 Oct 2018 13:44:13 +0000 (19:14 +0530)]
cxgb4: Add thermal zone support

Add thermal zone support to monitor ASIC's temperature.

Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agonet/mlx4_en: Use minimal rx and tx ring sizes on kdump kernel
Alaa Hleihel [Tue, 9 Oct 2018 09:06:52 +0000 (12:06 +0300)]
net/mlx4_en: Use minimal rx and tx ring sizes on kdump kernel

When memory is limited (on kdump kernel), reduce size of rx and tx rings.
Also reduce the number of rx rings.

Signed-off-by: Alaa Hleihel <alaa@mellanox.com>
Signed-off-by: Tariq Toukan <tariqt@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
5 years agoMerge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
David S. Miller [Tue, 9 Oct 2018 06:42:44 +0000 (23:42 -0700)]
Merge git://git./linux/kernel/git/bpf/bpf-next

Alexei Starovoitov says:

====================
pull-request: bpf-next 2018-10-08

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) sk_lookup_[tcp|udp] and sk_release helpers from Joe Stringer which allow
BPF programs to perform lookups for sockets in a network namespace. This would
allow programs to determine early on in processing whether the stack is
expecting to receive the packet, and perform some action (eg drop,
forward somewhere) based on this information.

2) per-cpu cgroup local storage from Roman Gushchin.
Per-cpu cgroup local storage is very similar to simple cgroup storage
except all the data is per-cpu. The main goal of the per-cpu variant is to
implement super fast counters (e.g. packet counters), which require
neither lookups nor atomic operations in a fast path.
An example of these hybrid counters is in selftests/bpf/netcnt_prog.c

3) allow HW offload of programs with BPF-to-BPF function calls from Quentin Monnet

4) support more than 64-byte key/value in HW offloaded BPF maps from Jakub Kicinski

5) rename of libbpf interfaces from Andrey Ignatov.
libbpf is maturing as a library and should follow good practices in
library design and implementation to play well with other libraries.
This patch set brings consistent naming convention to global symbols.

6) relicense libbpf as LGPL-2.1 OR BSD-2-Clause from Alexei Starovoitov
to let Apache2 projects use libbpf

7) various AF_XDP fixes from Björn and Magnus
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
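Item 2 of the pull request above motivates per-cpu storage with fast counters that need neither lookups nor atomic operations. The plain C sketch below illustrates the general idea only and is not BPF code: each CPU increments its own slot, and a reader sums all slots. The CPU count and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_NR_CPUS 4   /* hypothetical CPU count */

/* One slot per CPU; a CPU only ever writes its own slot. */
struct pkt_counter {
    uint64_t slot[EXAMPLE_NR_CPUS];
};

/* Fast path: plain increment, no lock, no atomic, no hash lookup. */
static void counter_add(struct pkt_counter *c, unsigned int cpu, uint64_t n)
{
    assert(cpu < EXAMPLE_NR_CPUS);
    c->slot[cpu] += n;
}

/* Slow path: the reader sums the per-cpu slots to get the total. */
static uint64_t counter_sum(const struct pkt_counter *c)
{
    uint64_t total = 0;

    for (unsigned int i = 0; i < EXAMPLE_NR_CPUS; i++)
        total += c->slot[i];
    return total;
}
```

The trade-off is that reads become more expensive and only approximately consistent while writers are active, which is usually acceptable for statistics.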