git.osdn.net Git - uclinux-h8/linux.git/log

net: mscc: ocelot: use separate flooding PGID for broadcast

In preparation of offloading the bridge port flags which have
independent settings for unknown multicast and for broadcast, we should
also start reserving one destination Port Group ID for the flooding of
broadcast packets, to allow configuring it individually.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: felix: restore multicast flood to CPU when NPI tagger reinitializes

ocelot_init sets up PGID_MC to include the CPU port module, and that is
fine, but the ocelot-8021q tagger removes the CPU port module from the
unknown multicast replicator. So after a transition from the default
ocelot tagger towards ocelot-8021q and then again towards ocelot,
multicast flooding towards the CPU port module will be disabled.

Fixes: e21268efbe26 ("net: dsa: felix: perform switch setup for tag_8021q")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: act as passthrough for bridge port flags

There are multiple ways in which a PORT_BRIDGE_FLAGS attribute can be
expressed by the bridge through switchdev, and not all of them can be
emulated by DSA mid-layer API at the same time.

One possible configuration is when the bridge offloads the port flags
using a mask that has a single bit set - therefore only one feature
should change. However, DSA currently groups together unicast and
multicast flooding in the .port_egress_floods method, which limits our
options when we try to add support for turning off broadcast flooding:
do we extend .port_egress_floods with a third parameter which b53 and
mv88e6xxx will ignore? But that means that the DSA layer, which
currently implements the PRE_BRIDGE_FLAGS attribute all by itself, will
see that .port_egress_floods is implemented, and will report that all 3
types of flooding are supported - not necessarily true.

Another configuration is when the user specifies more than one flag at
the same time, in the same netlink message. If we were to create one
individual function per offloadable bridge port flag, we would limit the
expressiveness of the switch driver of refusing certain combinations of
flag values. For example, a switch may not have an explicit knob for
flooding of unknown multicast, just for flooding in general. In that
case, the only correct thing to do is to allow changes to BR_FLOOD and
BR_MCAST_FLOOD in tandem, and never allow mismatched values. But having
a separate .port_set_unicast_flood and .port_set_multicast_flood would
not allow the driver to possibly reject that.

Also, DSA doesn't consider it necessary to inform the driver that a
SWITCHDEV_ATTR_ID_BRIDGE_MROUTER attribute was offloaded, because it
just calls .port_egress_floods for the CPU port. When we'll add support
for the plain SWITCHDEV_ATTR_ID_PORT_MROUTER, that will become a real
problem because the flood settings will need to be held statefully in
the DSA middle layer, otherwise changing the mrouter port attribute will
impact the flooding attribute. And that's _assuming_ that the underlying
hardware doesn't have anything else to do when a multicast router
attaches to a port than flood unknown traffic to it. If it does, there
will need to be a dedicated .port_set_mrouter anyway.

So we need to let the DSA drivers see the exact form that the bridge
passes this switchdev attribute in, otherwise we are standing in the
way. Therefore we also need to use this form of language when
communicating to the driver that it needs to configure its initial
(before bridge join) and final (after bridge leave) port flags.

The b53 and mv88e6xxx drivers are converted to the passthrough API and
their implementation of .port_egress_floods is split into two: a
function that configures unicast flooding and another for multicast.
The mv88e6xxx implementation is quite hairy, and it turns out that
the implementations of unknown unicast flooding are actually the same
for 6185 and for 6352:

behind the confusing names actually lie two individual bits:
NO_UNKNOWN_MC -> FLOOD_UC = 0x4 = BIT(2)
NO_UNKNOWN_UC -> FLOOD_MC = 0x8 = BIT(3)

so there was no reason to entangle them in the first place.

Whereas the 6185 writes to MV88E6185_PORT_CTL0_FORWARD_UNKNOWN of
PORT_CTL0, which has the exact same bit index. I have left the
implementations separate though, for the only reason that the names are
different enough to confuse me, since I am not able to double-check with
a user manual. The multicast flooding setting for 6185 is in a different
register than for 6352 though.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: switchdev: pass flags and mask to both {PRE_,}BRIDGE_FLAGS attributes

This switchdev attribute offers a counterproductive API for a driver
writer, because although br_switchdev_set_port_flag gets passed a
"flags" and a "mask", those are passed piecemeal to the driver, so while
the PRE_BRIDGE_FLAGS listener knows what changed because it has the
"mask", the BRIDGE_FLAGS listener doesn't, because it only has the final
value. But certain drivers can offload only certain combinations of
settings, like for example they cannot change unicast flooding
independently of multicast flooding - they must be both on or both off.
The way the information is passed to switchdev makes drivers not
expressive enough, and unable to reject this request ahead of time, in
the PRE_BRIDGE_FLAGS notifier, so they are forced to reject it during
the deferred BRIDGE_FLAGS attribute, where the rejection is currently
ignored.

This patch also changes drivers to make use of the "mask" field for edge
detection when possible.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: dsa: configure better brport flags when ports leave the bridge

For a DSA switch port operating in standalone mode, address learning
doesn't make much sense since that is a bridge function. In fact,
address learning even breaks setups such as this one:

   +---------------------------------------------+
   |                                             |
   | +-------------------+                       |
   | |        br0        |    send      receive  |
   | +--------+-+--------+ +--------+ +--------+ |
   | |        | |        | |        | |        | |
   | |  swp0  | |  swp1  | |  swp2  | |  swp3  | |
   | |        | |        | |        | |        | |
   +-+--------+-+--------+-+--------+-+--------+-+
          |         ^           |          ^
          |         |           |          |
          |         +-----------+          |
          |                                |
          +--------------------------------+

because if the switch has a single FDB (can offload a single bridge)
then source address learning on swp3 can "steal" the source MAC address
of swp2 from br0's FDB, because learning frames coming from swp2 will be
done twice: first on the swp1 ingress port, second on the swp3 ingress
port. So the hardware FDB will become out of sync with the software
bridge, and when swp2 tries to send one more packet towards swp1, the
ASIC will attempt to short-circuit the forwarding path and send it
directly to swp3 (since that's the last port it learned that address on),
which it obviously can't, because swp3 operates in standalone mode.

So DSA drivers operating in standalone mode should still configure a
list of bridge port flags even when they are standalone. Currently DSA
attempts to call dsa_port_bridge_flags with 0, which disables egress
flooding of unknown unicast and multicast, something which doesn't make
much sense. For the switches that implement .port_egress_floods - b53
and mv88e6xxx, it probably doesn't matter too much either, since they
can possibly inject traffic from the CPU into a standalone port,
regardless of MAC DA, even if egress flooding is turned off for that
port, but certainly not all DSA switches can do that - sja1105, for
example, can't. So it makes sense to use a better common default there,
such as "flood everything".

It should also be noted that what DSA calls "dsa_port_bridge_flags()"
is a degenerate name for just calling .port_egress_floods(), since
nothing else is implemented - not learning, in particular. But disabling
address learning, something that this driver is also coding up for, will
be supported by individual drivers once .port_egress_floods is replaced
with a more generic .port_bridge_flags.

Previous attempts to code up this logic have been in the common bridge
layer, but as pointed out by Ido Schimmel, there are corner cases that
are missed when doing that:
https://patchwork.kernel.org/project/netdevbpf/patch/20210209151936.97382-5-olteanv@gmail.com/

So, at least for now, let's leave DSA in charge of setting port flags
before and after the bridge join and leave.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: don't print in br_switchdev_set_port_flag

For the netlink interface, propagate errors through extack rather than
simply printing them to the console. For the sysfs interface, we still
print to the console, but at least that's one layer higher than in
switchdev, which also allows us to silently ignore the offloading of
flags if that is ever needed in the future.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: bridge: offload all port flags at once in br_setport

If for example this command:

ip link set swp0 type bridge_slave flood off mcast_flood off learning off

succeeded at configuring BR_FLOOD and BR_MCAST_FLOOD but not at
BR_LEARNING, there would be no attempt to revert the partial state in
any way. Arguably, if the user changes more than one flag through the
same netlink command, this one _should_ be all or nothing, which means
it should be passed through switchdev as all or nothing.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: switchdev: propagate extack to port attributes

When a struct switchdev_attr is notified through switchdev, there is no
way to report informational messages, unlike for struct switchdev_obj.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Reviewed-by: Grygorii Strashko <grygorii.strashko@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2: Fix condition.

Fixes: 93efb0c656837 ("octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()")
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'ipa-cleanups'

Alex Elder says:

====================
net: ipa: some more cleanup

Version 3 of this series uses dev_err_probe() in the second patch,
as suggested by Heiner Kallweit.

Version 2 was sent to ensure the series was based on current
net-next/master, and added copyright updates to files touched.

The original introduction is below.

This is another fairly innocuous set of cleanup patches.

The first was motivated by a bug found that would affect IPA v4.5.
It maintain a new GSI address pointer; one is the "raw" (original
mapped) address, and the other will have been adjusted if necessary
for use on newer platforms.

The second just quiets some unnecessary noise during early probe.

The third fixes some errors that show up when IPA_VALIDATION is
enabled.

The last two just create helper functions to improve readability.

====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: introduce gsi_channel_initialized()

Create a simple helper function that indicates whether a channel has
been initialized. This abstacts/hides the details of how this is
determined.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: introduce ipa_table_hash_support()

Introduce a new function to abstract the knowledge of whether hashed
routing and filter tables are supported for a given IPA instance.

IPA v4.2 is the only one that doesn't support hashed tables (now
and for the foreseeable future), but the name of the helper function
is better for explaining what's going on.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: fix register write command validation

In ipa_cmd_register_write_valid() we verify that values we will
supply to a REGISTER_WRITE IPA immediate command will fit in
the fields that need to hold them.  This patch fixes some issues
in that function and ipa_cmd_register_write_offset_valid().

The dev_err() call in ipa_cmd_register_write_offset_valid() has
some printf format errors:
  - The name of the register (corresponding to the string format
    specifier) was not supplied.
  - The IPA base offset and offset need to be supplied separately to
    match the other format specifiers.
Also make the ~0 constant used there to compute the maximum
supported offset value explicitly unsigned.

There are two other issues in ipa_cmd_register_write_valid():
  - There's no need to check the hash flush register for platforms
    (like IPA v4.2) that do not support hashed tables
  - The highest possible endpoint number, whose status register
    offset is computed, is COUNT - 1, not COUNT.

Fix these problems, and add some additional commentary.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: use dev_err_probe() in ipa_clock.c

When initializing the IPA core clock and interconnects, it's
possible we'll get an EPROBE_DEFER error. This isn't really an
error, it's just means we need to be re-probed later.

Use dev_err_probe() to report the error rather than dev_err().
This avoids polluting the log with these "error" messages.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ipa: use a separate pointer for adjusted GSI memory

This patch actually fixes a bug, though it doesn't affect the two
platforms supported currently.  The fix implements GSI memory
pointers a bit differently.

For IPA version 4.5 and above, the address space for almost all GSI
registers is adjusted downward by a fixed amount.  This is currently
handled by adjusting the I/O virtual address pointer after it has
been mapped.  The bug is that the pointer is not "de-adjusted" as it
should be when it's unmapped.

This patch fixes that error, but it does so by maintaining one "raw"
pointer for the mapped memory range.  This is assigned when the
memory is mapped and used to unmap the memory.  This pointer is also
used to access the two registers that do *not* sit in the "adjusted"
memory space.

Rather than adjusting *that* pointer, we maintain a separate pointer
that's an adjusted copy of the "raw" pointer, and that is used for
most GSI register accesses.

Signed-off-by: Alex Elder <elder@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'mac80211-next-for-net-next-2021-02-12' of git://git./linux/kernel/git/jberg/mac80211-next

Johannes Berg says:

====================
Last set of updates:
* more minstrel work from Felix to reduce the
probing overhead
* QoS for nl80211 control port frames
* STBC injection support
* and a couple of small fixes
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-pf: Fix out-of-bounds read in otx2_get_fecparam()

Code at line 967 implies that rsp->fwdata.supported_fec may be up to 4:

967: if (rsp->fwdata.supported_fec <= FEC_MAX_INDEX)

If rsp->fwdata.supported_fec evaluates to 4, then there is an
out-of-bounds read at line 971 because fec is an array with
a maximum of 4 elements:

954         const int fec[] = {
955                 ETHTOOL_FEC_OFF,
956                 ETHTOOL_FEC_BASER,
957                 ETHTOOL_FEC_RS,
958                 ETHTOOL_FEC_BASER | ETHTOOL_FEC_RS};
959 #define FEC_MAX_INDEX 4

971: fecparam->fec = fec[rsp->fwdata.supported_fec];

Fix this by properly indexing fec[] with rsp->fwdata.supported_fec - 1.
In this case the proper indexes 0 to 3 are used when
rsp->fwdata.supported_fec evaluates to a range of 1 to 4, correspondingly.

Fixes: d0cf9503e908 ("octeontx2-pf: ethtool fec mode support")
Addresses-Coverity-ID: 1501722 ("Out-of-bounds read")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: Fix spelling mistake "recievd" -> "received"

There is a spelling mistake in the text in array rpm_rx_stats_fields,
fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge tag 'wireless-drivers-next-2021-02-12' of git://git./linux/kernel/git/kvalo/wireless-drivers-next

Kalle Valo says:
====================
wireless-drivers-next patches for v5.12

Second set of patches for v5.12. Last time there was a smaller pull
request so unsurprisingly this time we have a big one. mt76 has new
hardware support and lots of new features, iwlwifi getting new
features and rtw88 got NAPI support. And the usual cleanups and fixes
all over.

Major changes:

ath10k

* support setting SAR limits via nl80211

rtw88

* support 8821 RFE type2 devices

* NAPI support

iwlwifi

* add new FW API support

* support for new So devices

* support for RF interference mitigation (RFI)

* support for PNVM (Platform Non-Volatile Memory, a firmware data
file) from BIOS

mt76

* add new mt7921e driver

* 802.11 encap offload support

* support for multiple pcie gen1 host interfaces on 7915

* 7915 testmode support

* 7915 txbf support

brcmfmac

* support for CQM RSSI notifications

wil6210

* support for extended DMG MCS 12.1 rate
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

rxrpc: Fix dependency on IPv6 in udp tunnel config

As udp_port_cfg struct changes its members with dependency on IPv6
configuration, the code in rxrpc should also check for IPv6.

Fixes: 1a9b86c9fd95 ("rxrpc: use udp tunnel APIs instead of open code in rxrpc_open_socket")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Vadim Fedorenko <vfedorenko@novek.ru>
Acked-by: David Howells <dhowells@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mptcp-genl-events'

Mat Martineau says:

====================
mptcp: Add genl events for connection info

This series from the MPTCP tree adds genl multicast events that are
important for implementing a userspace path manager. In MPTCP, a path
manager is responsible for adding or removing additional subflows on
each MPTCP connection. The in-kernel path manager (already part of the
kernel) is a better fit for many server use cases, but the additional
flexibility of userspace path managers is often useful for client
devices.

Patches 1, 2, 4, 5, and 6 do some refactoring to streamline the netlink
event implementation in the final patch.

Patch 3 improves the timeliness of subflow destruction to ensure the
'subflow closed' event will be sent soon enough.

Patch 7 allows use of the GENL_UNS_ADMIN_PERM flag on genl mcast groups
to mandate CAP_NET_ADMIN, which is important to protect token information
in the MPTCP events. This is a genetlink change.

Patch 8 adds the MPTCP netlink events.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: add netlink event support

Allow userspace (mptcpd) to subscribe to mptcp genl multicast events.
This implementation reuses the same event API as the mptcp kernel fork
to ease integration of existing tools, e.g. mptcpd.

Supported events include:
1. start and close of an mptcp connection
2. start and close of subflows (joins)
3. announce and withdrawals of addresses
4. subflow priority (backup/non-backup) change.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: avoid lock_fast usage in accept path

Once event support is added this may need to allocate memory while msk
lock is held with softirqs disabled.

Not using lock_fast also allows to do the allocation with GFP_KERNEL.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: pass subflow socket to a few helpers

Pass the first/initial subflow to the existing functions so they can
pass this on to the notification handler that is added later in the
series.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: move subflow close loop after sk close check

In case mptcp socket is already dead the entire mptcp socket
will be freed. We can avoid the close check in this case.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: schedule worker when subflow is closed

When remote side closes a subflow we should schedule the worker to
dispose of the subflow in a timely manner.

Otherwise, SF_CLOSED event won't be generated until the mptcp
socket itself is closing or local side is closing another subflow.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: split __mptcp_close_ssk helper

Prepare for subflow close events:

When mptcp connection is torn down its enough to send the mptcp socket
close notification rather than a subflow close event for all of the
subflows followed by the mptcp close event.

This splits the helper: mptcp_close_ssk() will emit the close
notification, __mptcp_close_ssk will not.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

mptcp: move pm netlink work into pm_netlink

Allows to make some functions static and avoids acquire of the pm
spinlock in protocol.c.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'mptcp-selftests'

Mat Martineau says:

====================
mptcp: Selftest enhancement and fixes

This is a collection of selftest updates from the MPTCP tree.

Patch 1 uses additional 'ss' command line parameters and 'nstat' to
improve output when certain MPTCP tests fail.

Patches 2 & 3 fix a copy/paste error and some output formatting.

Patch 4 makes sure tests still pass if certain connection-related
packets are retransmitted.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mptcp: fail if not enough SYN/3rd ACK

If we receive less MPCapable SYN or 3rd ACK than expected, we now mark
the test as failed.

On the other hand, if we receive more, we keep the warning but we add a
hint that it is probably due to retransmissions and that's why we don't
mark the test as failed.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/148
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mptcp: display warnings on one line

Before we had this in case of SYN retransmissions:

  (...)
  # ns4 MPTCP -> ns2 (10.0.1.2:10034      ) MPTCP (duration  1201ms) [ OK ]
  # ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration  1242ms) [ OK ]
  # ns4 MPTCP -> ns2 (10.0.2.1:10036      ) MPTCP ns2-60143c00-cDZWo4 SYNRX: MPTCP -> MPTCP: expect 11, got
  # 13
  # (duration  6221ms) [ OK ]
  # ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration  1427ms) [ OK ]
  # ns4 MPTCP -> ns3 (10.0.2.2:10038      ) MPTCP (duration   881ms) [ OK ]
  (...)

Now we have:

  (...)
  # ns4 MPTCP -> ns2 (10.0.1.2:10034      ) MPTCP (duration  1201ms) [ OK ]
  # ns4 MPTCP -> ns2 (dead:beef:1::2:10035) MPTCP (duration  1242ms) [ OK ]
  # ns4 MPTCP -> ns2 (10.0.2.1:10036      ) MPTCP (duration  6221ms) [ OK ] WARN: SYNRX: expect 11, got 13
  # ns4 MPTCP -> ns2 (dead:beef:2::1:10037) MPTCP (duration  1427ms) [ OK ]
  # ns4 MPTCP -> ns3 (10.0.2.2:10038      ) MPTCP (duration   881ms) [ OK ]
  (...)

So we put everything on one line, keep the durations and "OK" aligned
and removed duplicated info to short the warning.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mptcp: fix ACKRX debug message

Info from received MPCapable SYN were printed instead of the ones from
received MPCapable 3rd ACK.

Fixes: fed61c4b584c ("selftests: mptcp: make 2nd net namespace use tcp syn cookies unconditionally")
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

selftests: mptcp: dump more info on errors

Even if that may sound completely unlikely, the mptcp implementation
is not perfect, yet.

When the self-tests report an error we usually need more information
of what the scripts currently report. iproute allow provides
some additional goodies since a few releases, let's dump them.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'hns3-cleanups'

Huazhong Tan says:

====================
net: hns3: some cleanups for -next

To improve code readability and maintainability, the series
refactor out some bloated functions in the HNS3 ethernet driver.

change log:
V2: remove an unused variable in #5

previous version:
V1: https://patchwork.kernel.org/project/netdevbpf/cover/1612943005-59416-1-git-send-email-tanhuazhong@huawei.com/
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>

net: hns3: refactor out hclge_rm_vport_all_mac_table()

hclge_rm_vport_all_mac_table() is bloated, so split it into
separate functions for readability and maintainability.

Signed-off-by: Hao Chen <chenhao288@hisilicon.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclgevf_set_rss_tuple()

To make it more readable and maintainable, split
hclgevf_set_rss_tuple() into two parts.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclge_set_rss_tuple()

To make it more readable and maintainable, split
hclge_set_rss_tuple() into two parts.

Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: split out hclgevf_cmd_send()

hclgevf_cmd_send() is bloated, so split it into separate
functions for readability and maintainability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: split out hclge_cmd_send()

hclge_cmd_send() is bloated, so split it into separate
functions for readability and maintainability.

Signed-off-by: Yufeng Mo <moyufeng@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: split out hclge_dbg_dump_qos_buf_cfg()

hclge_dbg_dump_qos_buf_cfg() is bloated, so split it into
separate functions for readability and maintainability.

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclgevf_get_rss_tuple()

To improve code readability and maintainability, separate
the flow type parsing part and the converting part from
bloated hclgevf_get_rss_tuple().

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclge_get_rss_tuple()

To improve code readability and maintainability, separate
the flow type parsing part and the converting part from
bloated hclge_get_rss_tuple().

Signed-off-by: Jian Shen <shenjian15@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclge_set_vf_vlan_common()

To improve code readability and maintainability, separate
the command handling part and the status parsing part from
bloated hclge_set_vf_vlan_common().

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: use ipv6_addr_any() helper

Use common ipv6_addr_any() to determine if an addr is ipv6 any addr.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: clean up hns3_dbg_cmd_write()

As more commands are added, hns3_dbg_cmd_write() is going to
get more bloated, so move the part about command check into
a separate function.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclgevf_cmd_convert_err_code()

To improve code readability and maintainability, refactor
hclgevf_cmd_convert_err_code() with an array of imp_errcode
and common_errno mapping, instead of a bloated switch/case.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: hns3: refactor out hclge_cmd_convert_err_code()

To improve code readability and maintainability, refactor
hclge_cmd_convert_err_code() with an array of imp_errcode
and common_errno mapping, instead of a bloated switch/case.

Signed-off-by: Peng Li <lipeng321@huawei.com>
Signed-off-by: Huazhong Tan <tanhuazhong@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

nl80211: add documentation for HT/VHT/HE disable attributes

These were missed earlier, add the necessary documentation
and, while at it, clarify it.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Link: https://lore.kernel.org/r/20210212105023.895c3389f063.I46dea3bfc64385bc6f600c50d294007510994f8f@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211/mac80211: Support disabling HE mode

Allow user to disable HE mode, similar to how VHT and HT
can be disabled. Useful for testing.

Signed-off-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20210204144610.25971-1-greearb@candelatech.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: add STBC encoding to ieee80211_parse_tx_radiotap

This patch adds support for STBC encoding to the radiotap tx parse
function. Prior to this change adding the STBC flag to the radiotap
header did not encode frames with STBC.

Signed-off-by: Philipp Borgers <borgers@mi.fu-berlin.de>
Link: https://lore.kernel.org/r/20210125150744.83065-1-borgers@mi.fu-berlin.de
[use u8_get_bits/u32_encode_bits instead of manually shifting]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: minstrel_ht: remove sample rate switching code for constrained devices

This was added to mitigate the effects of too much sampling on devices that
use a static global fallback table instead of configurable multi-rate retry.
Now that the sampling algorithm is improved, this code path no longer performs
any better than the standard probing on affected devices.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210127055735.78599-6-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: minstrel_ht: show sampling rates in debugfs

This makes it easier to see what rates are going to be tested next

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210127055735.78599-5-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: minstrel_ht: significantly redesign the rate probing strategy

The biggest flaw in current minstrel_ht is the fact that it needs way too
many probing packets to be able to quickly find the best rate.
Depending on the wifi hardware and operating mode, this can significantly
reduce throughput when not operating at the highest available data rate.

In order to be able to significantly reduce the amount of rate sampling,
we need a much smarter selection of probing rates.

The new approach introduced by this patch maintains a limited set of
available rates to be tested during a statistics window.

They are split into distinct categories:
- MINSTREL_SAMPLE_TYPE_INC - incremental rate upgrade:
  Pick the next rate group and find the first rate that is faster than
  the current max. throughput rate
- MINSTREL_SAMPLE_TYPE_JUMP - random testing of higher rates:
  Pick a random rate from the next group that is faster than the current
  max throughput rate. This allows faster adaptation when the link changes
  significantly
- MINSTREL_SAMPLE_TYPE_SLOW - test a rate between max_prob, max_tp2 and
  max_tp in order to reduce the gap between them

In order to prioritize sampling, every 6 attempts are split into 3x INC,
2x JUMP, 1x SLOW.

Available rates are checked and refilled on every stats window update.

With this approach, we finally get a very small delta in throughput when
comparing setting the optimal data rate as a fixed rate vs normal rate
control operation.

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210127055735.78599-4-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: minstrel_ht: reduce the need to sample slower rates

In order to more gracefully be able to fall back to lower rates without too
much throughput fluctuations, initialize all untested rates below tested ones
to the maximum probabilty of higher rates.
Usually this leads to untested lower rates getting initialized with a
probability value of 100%, making them better candidates for fallback without
having to rely on random probing

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210127055735.78599-3-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: minstrel_ht: update total packets counter in tx status path

Keep the update in one place and prepare for further rework

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210127055735.78599-2-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: minstrel_ht: use bitfields to encode rate indexes

Get rid of a lot of divisions and modulo operations
Reduces code size and improves performance

Signed-off-by: Felix Fietkau <nbd@nbd.name>
Link: https://lore.kernel.org/r/20210127055735.78599-1-nbd@nbd.name
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: initialize reg_rule in __freq_reg_info()

Sparse started warning on this function because we can potentially
return an uninitialized value. The reason is that if the caller
passes a min_bw value that is higher then the last value in bws[], we
will not go into the loop and reg_rule will remain initialized. This
cannot happen because the only caller of this function uses either 1
or 20 in min_bw, but the function will be more robust if we
pre-initialize the value.

Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Link: https://lore.kernel.org/r/iwlwifi.20210204154439.6c884ea7281c.I257278d03b0c1ae0aa6631672cfa48f1a95d5996@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: fix potential overflow when multiplying to u32 integers

The multiplication of the u32 variables tx_time and estimated_retx is
performed using a 32 bit multiplication and the result is stored in
a u64 result. This has a potential u32 overflow issue, so avoid this
by casting tx_time to a u64 to force a 64 bit multiply.

Addresses-Coverity: ("Unintentional integer overflow")
Fixes: 050ac52cbe1f ("mac80211: code for on-demand Hybrid Wireless Mesh Protocol")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Link: https://lore.kernel.org/r/20210205175352.208841-1-colin.king@canonical.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

mac80211: enable QoS support for nl80211 ctrl port

This patch unifies sending control port frames
over nl80211 and AF_PACKET sockets a little more.

Before this patch, EAPOL frames got QoS prioritization
only when using AF_PACKET sockets.

__ieee80211_select_queue only selects a QoS-enabled queue
for control port frames, when the control port protocol
is set correctly on the skb. For the AF_PACKET path this
works, but the nl80211 path used ETH_P_802_3.

Another check for injected frames in wme.c then prevented
the QoS TID to be copied in the frame.

In order to fix this, get rid of the frame injection marking
for nl80211 ctrl port and set the correct ethernet protocol.

Please note:
An erlier version of this path tried to prevent
frame aggregation for control port frames in order to speed up
the initial connection setup a little. This seemed to cause
issues on my older Intel dvm-based hardware, and was therefore
removed again. Future commits which try to reintroduce this
have to check carefully how hw behaves with aggregated and
non-aggregated traffic for the same TID.
My NIC: Intel(R) Centrino(R) Ultimate-N 6300 AGN, REV=0x74

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Markus Theil <markus.theil@tu-ilmenau.de>
Link: https://lore.kernel.org/r/20210206115112.567881-1-markus.theil@tu-ilmenau.de
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

cfg80211: remove unused callback

The ieee80211 class registers a callback which actually does nothing.
Given that the callback is optional, and all its accesses are protected
by a NULL check, remove it entirely.

Signed-off-by: Matteo Croce <mcroce@microsoft.com>
Link: https://lore.kernel.org/r/20210208113356.4105-1-mcroce@linux.microsoft.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>

rtw88: 8822c: update RF_B (2/2) parameter tables to v60

Update RTL8822C devices' RF_A tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-9-pkshih@realtek.com

rtw88: 8822c: update RF_B (1/2) parameter tables to v60

Update RTL8822C devices' RF_B tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-8-pkshih@realtek.com

rtw88: 8822c: update RF_A parameter tables to v60

Update RTL8822C devices' RF_A tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-7-pkshih@realtek.com

rtw88: 8822c: update MAC/BB parameter tables to v60

Update RTL8822C devices' MAC/BB tables to v60.
The new parameters fix incorrect RSSI report under 2.4G link.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-6-pkshih@realtek.com

rtw88: replace tx tasklet with work queue

Replace tasklet so we can do tx scheduling in parallel. Since throughput
is delay-sensitive in most cases, we allocate a dedicated, high priority
wq for our needs.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-5-pkshih@realtek.com

rtw88: add napi support

Use napi to reduce overhead on rx interrupts.

Driver used to interrupt kernel for every Rx packet, this could
affect both system and network performance. NAPI is a mechanism that
uses polling when processing huge amount of traffic, by doing this
the number of interrupts can be decreased.

Network performance can also benefit from this patch. Since TCP
connection is bidirectional and acks are required for every several
packets. These ack packets occupie the PCI bus bandwidth and could
lead to performance degradation.

When napi is used, GRO receive is enabled by default in the mac80211
stack. So mac80211 won't pass every RX TCP packets to the kernel TCP
network stack immediately. Instead an aggregated large length TCP packet
will be delivered.

This reduces the tx acks sent and gains rx performance. After the patch,
the Rx throughput increases about 25Mbps in 11ac.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Tested-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-4-pkshih@realtek.com

rtw88: add rts condition

Since we set the IEEE80211_HW_HAS_RATE_CONTROL flag, so use_rts in
ieee80211_tx_info will never be set in the ieee80211_xmit_fast path.
Add length check for skb to decide whether rts is needed.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-3-pkshih@realtek.com

rtw88: add dynamic rrsr configuration

Register rrsr determines the response rate we send.
In field tests, using rate higher than current tx rate could lead
to difficulty for the receiving end to receive management/control
frames. Calculate current modulation level by tx rate then cross out
rate higher than those.

Signed-off-by: Po-Hao Huang <phhuang@realtek.com>
Signed-off-by: Ping-Ke Shih <pkshih@realtek.com>
Reviewed-by: Brian Norris <briannorris@chromium.org>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/20210209070755.23019-2-pkshih@realtek.com

iwlwifi: remove incorrect comment in pnvm

We use this driver as a backport that also runs on older kernels (as
part of the backports project). So we use some checks to backport or
prevent code from compiling in incompatible kernel version.

When I took one of the PNVM patches from the backport, I accidentally
left the comment that a certain part of the code doesn't work in older
kernels. This obviously should never be valid for the mainline.
Remove this comment.

Reported-by: Kalle Valo <kvalo@codeaurora.org>
Signed-off-by: Luca Coelho <luciano.coelho@intel.com>
Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
Link: https://lore.kernel.org/r/iwlwifi.20210211223049.40d545a0fa89.I04793aaa5312b926335c8db32131f000432df511@changeid

Merge branch 'sock-rx-qmap'

Tariq Toukan says:

====================
Compile-flag for sock RX queue mapping

Socket's RX queue mapping logic is useful also for non-XPS use cases.
This series breaks the dependency between the two, introducing a new
kernel config flag SOCK_RX_QUEUE_MAPPING.

Here we select this new kernel flag from TLS_DEVICE, as well as XPS.
====================

Acked-by: Jakub Kicinski <kuba@kernel.org>

net/mlx5: Remove TLS dependencies on XPS

No real dependency on XPS, but on RX queue mapping, which
is being selected by TLS_DEVICE.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/tls: Select SOCK_RX_QUEUE_MAPPING from TLS_DEVICE

Compile-in the socket RX queue mapping field and logic when TLS_DEVICE
is enabled. This allows device drivers to pick the recorded socket's
RX queue and use it for streams distribution.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net/sock: Add kernel config SOCK_RX_QUEUE_MAPPING

Use a new config SOCK_RX_QUEUE_MAPPING to compile-in the socket
RX queue field and logic, instead of the XPS config.
This breaks dependency in XPS, and allows selecting it from non-XPS
use cases, as we do in the next patch.

In addition, use the new flag to wrap the logic in sk_rx_queue_get()
and protect access to the sk_rx_queue_mapping field, while keeping
the function exposed unconditionally, just like sk_rx_queue_set()
and sk_rx_queue_clear().

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

tcp: Sanitize CMSG flags and reserved args in tcp_zerocopy_receive.

Explicitly define reserved field and require it and any subsequent
fields to be zero-valued for now. Additionally, limit the valid CMSG
flags that tcp_zerocopy_receive accepts.

Fixes: 7eeba1706eba ("tcp: Add receive timestamp support for receive zerocopy.")
Signed-off-by: Arjun Roy <arjunroy@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Soheil Hassas Yeganeh <soheil@google.com>
Suggested-by: David Ahern <dsahern@gmail.com>
Suggested-by: Leon Romanovsky <leon@kernel.org>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>

r8169: handle tx before rx in napi poll

Cleaning up tx descriptors first increases the chance that
rtl_rx() can allocate new skb's from the cache.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: fix dev_ifsioc_locked() race condition

dev_ifsioc_locked() is called with only RCU read lock, so when
there is a parallel writer changing the mac address, it could
get a partially updated mac address, as shown below:

Thread 1 Thread 2
// eth_commit_mac_addr_change()
memcpy(dev->dev_addr, addr->sa_data, ETH_ALEN);
// dev_ifsioc_locked()
memcpy(ifr->ifr_hwaddr.sa_data,
dev->dev_addr,...);

Close this race condition by guarding them with a RW semaphore,
like netdev_get_name(). We can not use seqlock here as it does not
allow blocking. The writers already take RTNL anyway, so this does
not affect the slow path. To avoid bothering existing
dev_set_mac_address() callers in drivers, introduce a new wrapper
just for user-facing callers on ioctl and rtnetlink paths.

Note, bonding also changes slave mac addresses but that requires
a separate patch due to the complexity of bonding code.

Fixes: 3710becf8a58 ("net: RCU locking for simple ioctl()")
Reported-by: "Gong, Sishuai" <sishuai@purdue.edu>
Cc: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: mvpp2: fix interrupt mask/unmask skip condition

The condition should be skipped if CPU ID equal to nthreads.
The patch doesn't fix any actual issue since
nthreads = min_t(unsigned int, num_present_cpus(), MVPP2_MAX_THREADS).
On all current Armada platforms, the number of CPU's is
less than MVPP2_MAX_THREADS.

Fixes: e531f76757eb ("net: mvpp2: handle cases where more CPUs are available than s/w threads")
Reported-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: Stefan Chulski <stefanc@marvell.com>
Reviewed-by: Russell King <rmk+kernel@armlinux.org.uk>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'am65-cpsw-nuss-switchdev-driver'

Vignesh Raghavendra says:

====================
net: ti: am65-cpsw-nuss: Add switchdev driver

This series adds switchdev support for AM65 CPSW NUSS driver to support
multi port CPSW present on J721e and AM64 SoCs.
It adds devlink hook to switch b/w switch mode and multi mac mode.

v2:
Rebased on latest net-next
Update patch 1/4 with rationale for using devlink
====================

docs: networking: ti: Add driver doc for AM65 NUSS switch driver

J721e, J7200 and AM64 have multi port switches which can work in multi
mac mode and in switch mode. Add documentation explaining how to use
different modes.

Borrowed from:
Documentation/networking/device_drivers/ethernet/ti/cpsw_switchdev.rst

Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ti: am65-cpsw-nuss: Add switchdev support

J721e, J7200 and AM64 have multi port switches which can work in multi
mac mode and in switch mode. Add support for configuring this HW in
switch mode using devlink and switchdev notifiers.

Support is similar to existing CPSW switchdev implementation of TI's 32 bit
platform like AM33/AM43/AM57.

To enable switch mode:
devlink dev param set platform/8000000.ethernet name switch_mode value true cmode runtime

All configuration is implemented via switchdev API and notifiers.
Supported:
      - SWITCHDEV_ATTR_ID_PORT_PRE_BRIDGE_FLAGS
      - SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS
      - SWITCHDEV_ATTR_ID_PORT_STP_STATE
      - SWITCHDEV_OBJ_ID_PORT_VLAN
      - SWITCHDEV_OBJ_ID_PORT_MDB
      - SWITCHDEV_OBJ_ID_HOST_MDB

Hence AM65 CPSW switchdev driver supports:
     - FDB offloading
     - MDB offloading
     - VLAN filtering and offloading
     - STP

Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ti: am65-cpsw-nuss: Add netdevice notifiers

Register netdevice notifiers in order to receive notification when
individual MAC ports are added to the HW bridge.

Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: ti: am65-cpsw-nuss: Add devlink support

AM65 NUSS ethernet switch on K3 devices can be configured to work either
in independent mac mode where each port acts as independent network
interface (multi mac) or switch mode.

Add devlink hooks to provide a way to switch b/w these modes.

Rationale to use devlink instead of defaulting to bridge mode is that
SoC use cases require to support multiple independent MAC ports with no
switching so that users can use software bridges with multi-mac
configuration (e.g: to support LAG, HSR/PRP, etc). Also, switching
between multi mac and switch mode requires significant Port and ALE
reconfiguration, therefore is easier to be made as part of mode change
devlink hooks. It also allows to keep user interface similar to what
was implemented for the previous generation of TI CPSW IP
(on AM33/AM43/AM57 SoCs).

Signed-off-by: Vignesh Raghavendra <vigneshr@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'bcm4908_enet-post-review-fixes'

Rafał Miłecki says:

====================
bcm4908_enet: post-review fixes

V2 of my BCM4908 Ethernet patchset was applied to the net-next.git and
it was later that is received some extra reviews. I'm sending patches
that handle pointed out issues.

David: earler I missed that V2 was applied and I sent V3 and V4 of my
inital patchset. Sorry for that. I think it's the best to ignore V3 and
V4 I sent and proceed with this fixes patchset instead.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

net: broadcom: bcm4908_enet: fix endianness in xmit code

Use le32_to_cpu() for reading __le32 struct field filled by hw.

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: broadcom: bcm4908_enet: fix received skb length

Use ETH_FCS_LEN instead of magic value and drop incorrect + 2

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: broadcom: bcm4908_enet: fix minor typos

1. Fix "ensable" typo noticed by Andrew
2. Fix chipset name in the struct net_device_ops variable

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: broadcom: bcm4908_enet: drop "inline" from C functions

It seems preferred to let compiler optimize code if applicable.
While at it drop unused enet_umac_maskset().

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: broadcom: bcm4908_enet: drop unneeded memset()

dma_alloc_coherent takes care of zeroing allocated memory

Suggested-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

net: broadcom: rename BCM4908 driver & update DT binding

compatible string was updated to match normal naming convention so
update driver as well

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: bcm4908-enet: include ethernet-controller.yaml

It should be /included/ by every Ethernet controller binding. It adds
support for various generic properties.

Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

dt-bindings: net: rename BCM4908 Ethernet binding

Rob pointed out that a normal convention is "brcm,bcm4908-enet" so
update whole binding to match it.

Suggested-by: Rob Herring <robh@kernel.org>
Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'for-upstream' of git://git./linux/kern
el/git/bluetooth/bluetooth-next

Johan Hedberg says:

====================
pull request: bluetooth-next 2021-02-11

Here's the main bluetooth-next pull request for 5.12:

- Add support for advertising monitor offliading using Microsoft
vendor extensions
- Add firmware download support for MediaTek MT7921U USB devices
- Suspend-related fixes for Qualcomm devices
- Add support for Intel GarfieldPeak controller
- Various other smaller fixes & cleanups

Please let me know if there are any issues pulling. Thanks.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

Merge branch 'marvell-cn10k'

Geetha sowjanya says:

====================
Add Marvell CN10K support

The current admin function (AF) driver and the netdev driver supports
OcteonTx2 silicon variants. The same OcteonTx2's
Resource Virtualization Unit (RVU) is carried forward to the next-gen
silicon ie OcteonTx3, with some changes and feature enhancements.

This patch set adds support for OcteonTx3 (CN10K) silicon and gets
the drivers to the same level as OcteonTx2. No new OcteonTx3 specific
features are added.

Changes cover below HW level differences
- PCIe BAR address changes wrt shared mailbox memory region
- Receive buffer freeing to HW
- Transmit packet's descriptor submission to HW
- Programmable HW interface identifiers (channels)
- Increased MTU support
- A Serdes MAC block (RPM) configuration

v5-v6
Rebased on top of latest net-next branch.

v4-v5
Fixed sparse warnings.

v3-v4
Fixed compiler warnings.

v2-v3
Reposting as a single thread.
Rebased on top latest net-next branch.

v1-v2
Fixed check-patch reported issues.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cn10k: MAC internal loopback support

MAC on CN10K silicon support loopback for selftest or debug purposes.
This patch does necessary configuration to loopback packets upon receiving
request from LMAC mapped RVU PF's netdev via mailbox.

Also MAC (CGX) on OcteonTx2 silicon variants and MAC (RPM) on
OcteonTx3 CN10K are different and loopback needs to be configured
differently. Upper layer interface between RVU AF and PF netdev is
kept same. Based on silicon variant appropriate fn() pointer is
called to config the MAC.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cn10k: Add RPM Rx/Tx stats support

RPM supports below list of counters as an extension to existing counters
*  class based flow control pause frames
*  vlan/jabber/fragmented packets
*  fcs/alignment/oversized error packets

This patch adds support to display supported RPM counters via debugfs
and define new mbox rpm_stats to read all support counters.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cn10k: Add RPM LMAC pause frame support

Flow control configuration is different for CGX(Octeontx2)
and RPM(CN10K) functional blocks. This patch adds the necessary
changes for RPM to support 802.3 pause frames configuration on
cn10k platforms.

Signed-off-by: Rakesh Babu <rsaladi2@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <Sunil.Goutham@cavium.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-pf: cn10k: Get max mtu supported from admin function

CN10K supports max MTU of 16K on LMAC links and 64k on LBK
links and Octeontx2 silicon supports 9K mtu on both links.
Get the same from nix_get_hw_info mbox message in netdev probe.

This patch also calculates receive buffer size required based
on the MTU set.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cn10K: Add MTU configuration

OcteonTx3 CN10K silicon supports bigger MTU when compared
to 9216 MTU supported by OcteonTx2 silicon variants. Lookback
interface supports upto 64K and RPM LMAC interfaces support
upto 16K.

This patch does the necessary configuration and adds support
for PF/VF drivers to retrieve max packet size supported via mbox

This patch also configures tx link credit by considering supported
fifo size and max packet length for Octeontx3 silicon.

This patch also removes platform specific name from the driver name.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cn10k: Add support for programmable channels

NIX uses unique channel numbers to identify the packet sources/sinks
like CGX,LBK and SDP. The channel numbers assigned to each block are
hardwired in CN9xxx silicon.
The fixed channel numbers in CN9xxx are:

0x0 | a << 8 | b            - LBK(0..3)_CH(0..63)
0x0 | a << 8                - Reserved
0x700 | a                   - SDP_CH(0..255)
0x800 | a << 8 | b << 4 | c - CGX(0..7)_LMAC(0..3)_CH(0..15)

All the channels in the above fixed enumerator(with maximum
number of blocks) are not required since some chips
have less number of blocks.
For CN10K silicon the channel numbers need to be programmed by
software in each block with the base channel number and range of
channels. This patch calculates and assigns the channel numbers
to efficiently distribute the channel number range(0-4095) among
all the blocks. The assignment is made based on the actual number of
blocks present and also contiguously leaving no holes.
The channel numbers remaining after the math are used as new CPT
replay channels present in CN10K. Also since channel numbers are
not fixed the transmit channel link number needed by AF consumers
is calculated by AF and sent along with nix_lf_alloc mailbox response.

Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

octeontx2-af: cn10k: Add RPM MAC support

OcteonTx2's next gen platform the CN10K has RPM MAC which has a
different serdes when compared to CGX MAC. Though the underlying
HW is different, the CSR interface has been designed largely inline
with CGX MAC, with few exceptions though. So we are using the same
CGX driver for RPM MAC as well and will have a different set of APIs
for RPM where ever necessary.

This patch adds initial support for CN10K's RPM MAC i.e. the driver
registration, communication with firmware etc. For communication with
firmware, RPM provides a different IRQ when compared to CGX.
The CGX and RPM blocks support different features. Currently few
features like ptp, flowcontrol and higig are not supported by RPM. This
patch adds new mailbox message "CGX_FEATURES_GET" to get the list of
features supported by underlying MAC.

RPM has different implementations for RX/TX stats. Unlike CGX,
bar offset of stat registers are different. This patch adds
support to access the same and dump the values in debugfs.

Signed-off-by: Hariprasad Kelam <hkelam@marvell.com>
Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: Sunil Goutham <sgoutham@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>